Special Reports

Mobile DNA in Old World Monkeys: A Glimpse Through the Rhesus Macaque Genome

Science  13 Apr 2007:
Vol. 316, Issue 5822, pp. 238-240
DOI: 10.1126/science.1139462


The completion of the draft sequence of the rhesus macaque genome allowed us to study the genomic composition and evolution of transposable elements in this representative of the Old World monkey lineage, a group of diverse primates closely related to humans. The L1 family of long interspersed elements appears to have evolved as a single lineage, and Alu elements have evolved into four currently active lineages. We also found evidence of elevated horizontal transmissions of retroviruses and the absence of DNA transposon activity in the Old World monkey lineage. In addition, ∼100 precursors of composite SVA (short interspersed element, variable number of tandem repeat, and Alu) elements were identified, with the majority being shared by the common ancestor of humans and rhesus macaques. Mobile elements compose roughly 50% of primate genomes, and our findings illustrate their diversity and strong influence on genome evolution between closely related species.

Old World monkeys (OWMs) are among the primate groups most closely related to humans. Rhesus macaques (Macaca mulatta), along with other OWMs, have been extensively used in biomedical studies (1). An improved understanding of their genomic architecture could hold important implications for medicine, evolutionary understanding, and beyond. Similar to the human and chimpanzee genomes, roughly 50% of the rhesus macaque genome consists of various repetitive sequences (2–4). The majority of these repeats are mobile elements, which can be divided into class I retrotransposons (6) and class II DNA transposons (5). Related transposable elements are further categorized into families, with each family further classified into subfamilies on the basis of their sequence relationships. The insertion of mobile elements can alter gene expression (7), generate genomic deletions (8), and even create new genes and gene families (9). Existing repetitive elements can also mediate recombination between similar elements at different genomic locations (ectopic recombination) (10). In addition, the GC-rich nature of certain mobile elements {e.g., Alu and SVA [short interspersed element (SINE), variable number of tandem repeat (VNTR), and Alu] elements} can introduce new CpG islands through their insertion (3). Despite the overall similarity in retrotransposon mobilization activity in the OWM and hominoid (human and ape) lineages, mobile elements have continued to evolve independently in both lineages. Close examination of the overall mobile-element composition in OWMs, with the rhesus macaque genome used as a reference, allows an understanding of their lineage-specific expansion and illustrates their overall contribution to genome evolution.

DNA transposons, which mobilize through a cut-and-paste mechanism, have no detected lineage-specific copies and therefore appear to have been inactive in the rhesus macaque lineage since its divergence from the human lineage. The paucity of DNA transposon mobilization in mammals, and in amniotes in general, is noteworthy by comparison with other organisms (e.g., plants) and may result from the relative difficulty of horizontal transfer into animals' germ lines (11).

Similar to the human genome, the rhesus macaque genome contains over half a million recognizable copies of endogenous retroviruses (ERVs) and their nonautonomous derivatives, with the great majority being present or fixed before the hominoid-OWM split (12). We found evidence for at least eight instances of horizontal transmission of ERVs in the OWM lineage resulting in 2750 extant copies (table S1 and SOM Text). This is much higher than in the human lineage, where there is evidence for only one or two invading elements leaving fewer than 10 extant copies (13). Five of the eight horizontally transmitted ERVs belong to class I retroviruses, and the remaining three belong to class II retroviruses (shown in red letters in Fig. 1). Apart from these new invasions, at least seven ERV families already entered the genome before the hominoid-OWM split and remained active afterward. There are over 3500 copies of these ERV subfamilies in the OWM lineage, similar to the number of lineage-specific ERV copies in humans.

Fig. 1.

Phylogenetic tree of retroviruses based on full-length Pol proteins. Common infectious retroviruses and endogenous retroviruses, present in fish, birds, mammals (nonprimate), and primates, were included in the analysis. Color identifications for each group are shown in the upper right corner. Asterisks and circles show deep-rooted branches with >95 and >75% bootstrap values, respectively. The ERVs identified in this study that invaded the OWM genome horizontally (i.e., through external germline infection) are indicated with red letters. For all ERVs shown in blue letters, the original insertion occurred in the common ancestor of humans and rhesus macaques (i.e., vertically) and is present in both genomes. All ERVs indicated with blue letters also generated new insertions in the OWM lineage. The scale bar indicates 10% divergence in the amino acid sequence.

The L1PA (primate A) family of long interspersed elements (LINEs) represents the dominant active L1 lineage throughout primate evolution. In our analysis, L1PA5 was the most commonly recovered L1 subfamily, and ∼19,000 L1PA5 elements specific to the OWM lineage were identified in the rhesus macaque genome. Most of these elements represent insertion events that occurred along the OWM lineage leading to rhesus macaques and are therefore present in multiple OWM species (fig. S2). A total of 32 OWM-specific L1 subfamilies were identified with the use of diagnostic substitutions present in these elements (table S2). To investigate the relationship of L1s, we constructed a median-joining network with their consensus sequences (Fig. 2 and SOM Text) and estimated the age of each subfamily (table S2). The network results indicated that the OWM-specific L1 lineage rooted with the L1PA6 consensus sequence, and several lineages roughly followed a sequential order, with little overlap in their amplification period. The sequential evolution of L1 elements appears to follow a general trend seen in mammalian L1s (14) and may result from amplification competition between two distinct L1 lineages (15). Altogether, we identified nine putative retrotransposition-competent L1s in the rhesus macaque genome, and they belonged to the L1CER-3 or L1CER-4 subfamilies; each L1 subfamily name is identified by “CER” (which stands for Cercopithecidae, indicating the origin of the consensus sequence) and an Arabic numeral indicating its lineage (12). Nine is considerably lower than the number of potentially active L1 elements in the human genome, which has 80 to 100 active copies (16). Nevertheless, it is likely that additional retrotransposition-competent L1 elements will be recovered in more refined drafts of the rhesus macaque genome.
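The subfamily assignment described above relies on diagnostic substitutions. As an illustration only, the following Python sketch assigns an aligned element to a subfamily by checking its bases at diagnostic positions. The subfamily names, positions, and bases in `DIAGNOSTIC_SITES` are hypothetical placeholders, not values from table S2, and `assign_subfamily` is an assumed helper rather than the pipeline used in this study.

```python
from typing import Dict

# Hypothetical diagnostic sites: subfamily -> {0-based alignment position: base}.
# These values are placeholders for illustration, not data from table S2.
DIAGNOSTIC_SITES: Dict[str, Dict[int, str]] = {
    "subfamily_A": {2: "G", 7: "T"},
    "subfamily_B": {2: "A", 7: "T", 10: "C"},
}


def assign_subfamily(aligned_seq: str,
                     sites: Dict[str, Dict[int, str]] = DIAGNOSTIC_SITES) -> str:
    """Return the subfamily whose diagnostic bases all match the element.

    If several subfamilies match, the one defined by the most diagnostic
    sites wins (the most specific match). Returns "unassigned" if none match.
    """
    best_name, best_count = "unassigned", 0
    for name, positions in sites.items():
        matches_all = all(
            pos < len(aligned_seq) and aligned_seq[pos] == base
            for pos, base in positions.items()
        )
        if matches_all and len(positions) > best_count:
            best_name, best_count = name, len(positions)
    return best_name
```

The sketch assumes each element has already been aligned to a common coordinate system, so that a given position refers to the same site in every copy.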

Fig. 2.

Median-joining network of OWM-specific L1 subfamilies. Subfamilies are represented by circles, with the circle size symbolizing the relative size of each subfamily. The length of the lines corresponds to the number of substitutions. The scale of a single substitution is shown in the upper left corner. Broken lines indicate segments not drawn to scale. Gray circles represent the subfamilies belonging to the L1CER-3 lineage, which include an 18–base pair (bp) duplication in their 3′ untranslated region (3′UTR), and green-edged circles contain intact full-length L1 elements. The dashed line and red arrow represent two alternative pathways for the origin of the L1CER-4 subfamily. The subfamilies in the blue and pink ovals share the same diagnostic mutations but do not share the 18-bp duplication. My, million years.

Retrotransposon-mediated DNA sequence transduction is a process whereby a retrotransposon carries a flanking genomic sequence during its mobilization that can result in exon or gene duplication (17). Three L1 elements with 5′ transduced exon–derived sequences were identified in the rhesus macaque genome. Moreover, detailed analysis indicated that one of the three insertions occurred in an exon of another gene (table S3 and SOM Text). These three events empirically demonstrate that exon-derived sequences can be transferred via 5′ L1–mediated transduction within primate genomes and that 5′ transduction constitutes a second mechanism of retrotransposon-mediated “exon shuffling.”

Alu elements are the most successful SINEs in primate genomes (18), and ∼110,000 Alu insertions are specific to OWMs. Fourteen different OWM lineage–specific AluY subfamilies were identified, with estimated copy numbers (table S4), and a median-joining network analysis grouped them into four lineages (Fig. 3). All subfamilies were estimated to have originated after the hominoid-OWM divergence, which is congruent with our phylogenetic analyses showing that all of these Alu subfamilies are restricted to OWMs (SOM Text). The simultaneous retrotransposition activity of multiple Alu subfamilies is similar to that in the human genome, and the activity of multiple “source genes” may have contributed to the amplification success of Alu elements despite their reliance on L1 enzymatic machinery for mobilization (19).
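One common way to date a subfamily is to divide the mean divergence of its copies from the subfamily consensus by a neutral substitution rate. The Python sketch below shows that calculation under stated assumptions; it is not presented as the dating method used in this study. The function names are hypothetical, and the substitution rate must be supplied by the user.

```python
from typing import Sequence


def divergence(seq: str, consensus: str) -> float:
    """Fraction of mismatched sites between an aligned copy and the consensus.

    Alignment columns with a gap ("-") in either sequence are skipped.
    """
    compared = 0
    mismatches = 0
    for a, b in zip(seq, consensus):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites between sequence and consensus")
    return mismatches / compared


def estimate_age_my(copies: Sequence[str], consensus: str,
                    rate_per_site_per_my: float) -> float:
    """Estimate subfamily age in million years as mean divergence / rate.

    rate_per_site_per_my is an assumed neutral substitution rate
    (substitutions per site per million years) chosen by the caller.
    """
    if not copies:
        raise ValueError("at least one copy is required")
    mean_div = sum(divergence(c, consensus) for c in copies) / len(copies)
    return mean_div / rate_per_site_per_my
```

For example, two copies with divergences of 0 and 0.1 from the consensus have a mean divergence of 0.05; with an assumed rate of 0.005 substitutions per site per million years, the estimate is about 10 million years.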

Fig. 3.

Median-joining network of OWM-specific Alu subfamilies. Subfamilies are represented by circles. The length of the lines corresponds to the number of substitutions, and the scale of a single substitution is shown in the upper left corner. Broken lines indicate segments not drawn to scale. Gray circles represent all subfamilies belonging to the AluYRb lineage containing a 12-bp deletion. Red-edged circles denote the youngest Alu subfamily within each lineage, and the blue-edged circle indicates the AluY subfamily consensus sequence.

About 100 precursors of SVA were identified in the rhesus macaque genome. The variable number of tandem repeat (VNTR) regions of these elements share >90% identity with the VNTR unit in hominoid SVA elements (20), although they have no sequence homology with other components of SVA elements. Thus, these elements appear to have contributed a portion of the genetic material required to form the SVA composite retrotransposon family in hominoids. The majority of these elements are shared between human and rhesus macaque, indicating that these elements were active before the divergence of hominoids and OWMs. The low number of lineage-specific elements (∼20 in the OWM lineage) suggests a very low retrotransposition rate of SVA precursor elements over the past 25 million years.

Composing nearly half of all sequenced primate genomes, mobile elements, especially retrotransposons, are major components of genomic variation and a driving force of primate evolution. Although the overall number of mobile elements is similar in the human, chimpanzee, and rhesus macaque genomes (2–4), a large fraction of the elements inserted independently into different locations within each genome and thus shaped the genomes differently (21). Whereas most retrotransposon insertions remain neutral in the genome, many insertions can have deleterious effects of varying severity. Mobile elements can cause genetic diseases not only by direct gene disruption or by the deletion of exonic sequence upon insertion but also by mediating subsequent recombination between existing retrotransposons. Indeed, more than 118 human genetic disorders are caused by retrotransposons, including hemophilia B, breast cancers, and congenital muscular dystrophy [see (22) and (23) for reviews]; retrotransposons are likely to have a similar impact on the rhesus genome. Yet, retrotransposons are also responsible for creating a variety of genomic novelties. They are involved in mediating gene duplication, exon shuffling, and RNA-editing–mediated exonization (9, 17, 24). All these mechanisms can contribute to new gene formation, as well as potentially altering DNA methylation patterns and contributing to X chromosome inactivation in females (25, 26). In addition, retrotransposons provide highly valuable genetic systems for primate population and phylogenetic studies, because they have a known ancestral (i.e., insertion-absent) state, and the chance that the same type of element would integrate at precisely the same location in multiple individuals is essentially zero (i.e., the insertions are identical by descent) (27, 28). Altogether, understanding the mobile-element landscape in primates is not only important for biologists but also crucial for biomedical researchers using primate animal models.
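The use of retrotransposon insertions as phylogenetic markers, noted above, can be illustrated with a small Python sketch. Because the insertion-absent state is ancestral and shared insertions are treated as identical by descent, the set of species carrying an insertion indicates where on the tree it arose. The species names, the grouping into OWMs and hominoids, and the function `classify_insertion` are assumptions made for this example and are not data or methods from this study.

```python
from typing import Dict, Set


def classify_insertion(presence: Dict[str, bool],
                       owm: Set[str], hominoids: Set[str]) -> str:
    """Classify one insertion locus from its presence/absence pattern.

    presence maps species name -> True if the insertion is present.
    Absence is treated as the ancestral state.
    """
    carriers = {species for species, present in presence.items() if present}
    if not carriers:
        return "absent in all sampled species (ancestral state)"
    if carriers <= owm:
        return "OWM-specific"
    if carriers <= hominoids:
        return "hominoid-specific"
    return "shared (consistent with insertion before the hominoid-OWM split)"


# Hypothetical sampling used only to show the call pattern.
OWM_SPECIES = {"rhesus macaque", "baboon"}
HOMINOID_SPECIES = {"human", "chimpanzee"}

example_locus = {
    "rhesus macaque": True,
    "baboon": True,
    "human": False,
    "chimpanzee": False,
}
result = classify_insertion(example_locus, OWM_SPECIES, HOMINOID_SPECIES)
```

In this example the insertion is carried only by the two OWM species, so `result` is `"OWM-specific"`.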

Supporting Online Material


Materials and Methods

SOM Text

Figs. S1 and S2

Tables S1 to S7


Source Code

References and Notes