Research Article

Horizontal Transfer of Entire Genomes via Mitochondrial Fusion in the Angiosperm Amborella

Science  20 Dec 2013:
Vol. 342, Issue 6165, pp. 1468-1473
DOI: 10.1126/science.1246275

Shaping Plant Evolution

Amborella trichopoda is understood to be the most basal extant flowering plant and its genome is anticipated to provide insights into the evolution of plant life on Earth (see the Perspective by Adams). To validate and assemble the sequence, Chamala et al. (p. 1516) combined fluorescent in situ hybridization (FISH), genomic mapping, and next-generation sequencing. The Amborella Genome Project (10.1126/science.1241089) was able to infer that a whole-genome duplication event preceded the evolution of this ancestral angiosperm, and Rice et al. (p. 1468) found that numerous genes in the mitochondrion were acquired by horizontal gene transfer from other plants, including almost four entire mitochondrial genomes from mosses and algae.


We report the complete mitochondrial genome sequence of the flowering plant Amborella trichopoda. This enormous, 3.9-megabase genome contains six genome equivalents of foreign mitochondrial DNA, acquired from green algae, mosses, and other angiosperms. Many of these horizontal transfers were large, including acquisition of entire mitochondrial genomes from three green algae and one moss. We propose a fusion-compatibility model to explain these findings, with Amborella capturing whole mitochondria from diverse eukaryotes, followed by mitochondrial fusion (limited mechanistically to green plant mitochondria) and then genome recombination. Amborella’s epiphyte load, propensity to produce suckers from wounds, and low rate of mitochondrial DNA loss probably all contribute to the high level of foreign DNA in its mitochondrial genome.

Many of the fundamental properties of eukaryotes arose from horizontal evolution on a grand scale, that is, the endosymbiotic origin of the mitochondrion and plastid from bacterial progenitors (1). Since their birth, however, mitochondrial and plastid genomes seem to have been little affected by horizontal gene transfer (HGT). The most notable exception involves land plants, especially flowering plants (angiosperms), in which HGT is common in the mitochondrial genome but unknown in plastids (2–10).

To gain insight into the causes and consequences of HGT in mitochondrial DNA (mtDNA), we sequenced the mitochondrial genome of Amborella trichopoda because polymerase chain reaction–based sampling had shown it to be rich in foreign genes (4). This large shrub is endemic to rain forests of New Caledonia and is probably sister to all other angiosperms, a divergence dating back about 200 million years (11, 12).

Overall Genome Properties

The Amborella mitochondrial genome assembled as five autonomous, circular-mapping chromosomes of lengths 3179, 244, 187, 137, and 119 kb, giving a total genome size of 3,866,039 base pairs (bp) (Fig. 1A and figs. S1 to S4) (13). The five chromosomes are distinct in sequence but similar in base composition (45 to 47% G + C), stoichiometry, and HGT properties (Fig. 1A and figs. S2 and S4). Stoichiometry was assessed by sequencing coverage and Southern blot analysis of 32 individuals from three populations (fig. S5) (13).
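As a quick consistency check on the figures above, the five chromosome lengths can be summed and compared with the reported total. This is only an illustrative sketch: the variable names are ours, and the per-chromosome lengths are the rounded kb values given in the text, so the sum can agree with the 3,866,039-bp total only to the nearest kb.

```python
# Chromosome lengths in kb, as rounded in the text (variable names are illustrative).
chromosome_kb = [3179, 244, 187, 137, 119]
reported_total_bp = 3_866_039

total_kb = sum(chromosome_kb)
# Share of the genome carried by the largest chromosome.
largest_share = max(chromosome_kb) / total_kb

print(f"summed length: {total_kb} kb")
print(f"reported total: {reported_total_bp / 1000:.0f} kb")
print(f"largest chromosome: {largest_share:.0%} of the genome")
```

The rounded lengths sum to 3866 kb, matching the reported total, and the largest chromosome accounts for roughly four-fifths of the genome.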

Fig. 1 Foreign DNA in the Amborella mitochondrial genome.

(A) Map of its five chromosomes shown linearized and abutted (see arrows). Numbers give unified genome coordinates in kb. Shown are regions of inferred organelle origin whose ancestry was assignable (see key: mt, mitochondrial; pt, plastid). Full-height boxes indicate genes. Half-height boxes indicate tracts of native and angiosperm-HGT DNA. Labeled black lines indicate horizontally transferred mitochondrial genomes (Figs. 3 and 4) or partial genomes. M1 to M4 mark a moss-derived genome. A1, A2, B1, B2, and C1 to C4 mark three green algal–derived genomes. D marks the seven fragments of a partial genome from a fourth green algal donor. Oxalidales, Santalales, Fagales, and Ricinus mark angiosperm tracts whose donors were identified to at least the level of taxonomic order. The pie chart depicts the roughly eight genome equivalents of organelle DNA present in Amborella mtDNA. Genome equivalents: mt moss, 1.0; mt green algal, 3.3; mt angiosperm, 2.0; mt native, 1.0; pt IGT, 0.8. See table S11 for photo credits and names of the plants shown. (B) Detailed view of three 150-kb regions of Amborella mtDNA. Histograms show the angiosperm score (13). Triangles indicate intergenic regions of species-specific identity to Ricinus and Bambusa (fig. S21). Gene names are given only for well-supported cases of angiosperm HGT.

As described in the next three sections, Amborella mtDNA possesses an extensive and diverse collection of foreign sequences, corresponding to about six genome equivalents of mtDNA acquired from mosses, angiosperms, and green algae. Multigene HGT has been described in two other lineages of plant mtDNA (8, 10), but not on a scale approaching Amborella. The Amborella mitochondrial genome also contains a large amount (138 kb) of plastid DNA (ptDNA) (Fig. 1A, fig. S2, and table S1).
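The "about six genome equivalents" of foreign mtDNA can be recovered from the per-source values listed in the Fig. 1 legend. The sketch below simply adds those values; the dictionary keys and variable names are our own labels, not terms from the analysis pipeline.

```python
# Genome equivalents as listed in the Fig. 1 legend (keys are illustrative labels).
genome_equivalents = {
    "mt moss": 1.0,
    "mt green algal": 3.3,
    "mt angiosperm": 2.0,
    "mt native": 1.0,
    "pt IGT": 0.8,
}

# Foreign mitochondrial DNA excludes the native genome and plastid-derived IGT DNA.
foreign_keys = ("mt moss", "mt green algal", "mt angiosperm")
foreign_mt = sum(genome_equivalents[k] for k in foreign_keys)
total = sum(genome_equivalents.values())

print(f"foreign mtDNA: {foreign_mt:.1f} genome equivalents")
print(f"all organelle DNA: {total:.1f} genome equivalents")
```

The foreign categories sum to 6.3 and all categories to 8.1, in line with the "about six" and "roughly eight" genome equivalents quoted in the text and legend.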

Multichromosomal mitochondrial genomes in plants were only recently discovered (14, 15) and mostly involve large (>1 Mb) genomes, with Silene genomes of 6.7 and 11.3 Mb dwarfing Amborella in size and chromosome number (15). These three mitochondrial genomes are the largest completely assembled organelle genomes, larger than many bacterial genomes and even some nuclear genomes. However, the processes responsible for their expansion differ in that Silene genomes possess no readily discernible foreign mtDNA and relatively little ptDNA (15).

HGT from Mosses

Amborella mtDNA contains four regions, of lengths 48, 40, 9, and 4 kb, acquired from moss mtDNA (Fig. 1A and fig. S2). With one exception, the 41 protein and ribosomal RNA (rRNA) genes from these four regions were placed phylogenetically, almost always strongly, as sister to the moss Physcomitrella (Fig. 2, A to D, and figs. S8 and S9). Gene order in the four regions (Fig. 3 and fig. S6) is highly similar to both Physcomitrella and Anomodon (mosses that are themselves identical in gene order and content) (16) and extremely different from angiosperms. The mosslike regions in Amborella also harbor the same 27 introns and largely the same set of intergenic sequences as moss mtDNAs (fig. S6) (13).

Fig. 2 Maximum likelihood evidence for HGT in Amborella mtDNA.

(A to D) Mitochondrial gene trees of land plants and green algae reveal diverse donors in Amborella mtDNA. Colors are as in Fig. 1. See fig. S8 for outgroups. Bootstrap values ≥50% are shown. The number after each Amb (Amborella) sequence corresponds to its left-most coordinate in kb (figs. S2 and S4). Scale bars correspond to 0.1 [(A) to (D)] or 0.01 [(E) to (H)] substitutions per site. Bold branches are reduced in length by 50%. (E to H) Plastid gene trees of angiosperms showing strong support for HGT to the level of taxonomic order: light blue, Santalales [(E) and (F)]; brown, Oxalidales (G); violet, Fagales (H). Amborella labels: Amb plastid, gene in Amborella plastid; Amb IGT, gene in mitochondrion via IGT; red Amb, gene in mitochondrion via HGT. Outgroups are not shown, but see fig. S19 for more taxon-rich analyses, including outgroups. rps7 denotes the rps7-rps12-trnV-rrnS cluster.

Fig. 3 A nearly full-length moss mitochondrial genome in Amborella mtDNA.

Colored boxes and arrows indicate the position and relative orientation, respectively, of the seven blocks of synteny between the mitochondrial genome of the moss Anomodon (top) and the four moss-derived regions in Amborella mtDNA (M1 to M4) (Fig. 1A). Selected genes are shown; see figs. S2 and S6 for all genes.

The four moss regions contain one, and only one, copy of 61 of the 65 genes present in sequenced moss mtDNAs (Fig. 3 and fig. S6) (13). Taking into account six inferred deletions and duplications larger than 100 bp, the 101.8 kb of moss DNA in Amborella reconstructs to a hypothetical donor genome of 106.0 kb, compared with the 104.2- and 105.3-kb genomes in Physcomitrella and Anomodon, respectively. We infer, therefore, that Amborella captured an entire mitochondrial genome (13) from a moss with nearly identical mtDNA architecture to those of Physcomitrella and Anomodon. This foreign genome subsequently rearranged into four pieces, with a few gene-order changes and 11 gene losses, truncations, and/or partial duplications, all of which are associated with rearrangement breakpoints (Figs. 1A and 3, figs. S2 and S6, and table S2).

HGT from Green Algae

The Amborella mitochondrial genome contains an average of three green algal–derived copies of each protein and rRNA gene commonly found in green algal mtDNAs (Figs. 1A and 2, A to D; figs. S2, S4, S8, S10, and S11; and table S3). Many of these genes are clustered in two large tracts of lengths 83 and 61 kb. The 83-kb tract (B1 + A2 in Fig. 1A) contains two copies of a 10-gene cluster (each marked by 10 red arrows in the top comparison of Fig. 4), with all 10 “duplicates” highly divergent from each other. The 61-kb tract (B2 + A1 in Fig. 1A) lacks these 10 genes and instead contains highly divergent duplicates of two genes that are absent from the 83-kb tract. A single hypothesized recombination event between these two tracts (Figs. 1A and 4) accounts for the above duplications, with the initial, 92- and 52-kb regions each containing a nearly complete set of green algal mitochondrial genes and no extra copies (fig. S11). We conclude that the 83-kb and 61-kb tracts arose by acquisition of whole mitochondrial genomes (designated the A and B genomes) from two green algae, followed by a single recombination between them and a few gene losses (13). Additionally, the two inferred donor genomes are phylogenetically distinct: Whenever Amborella has three or more green algal copies of a given gene, the A-genome copy is separated by a relatively long branch from a well-supported clade containing the other green algal copies (Fig. 2, A and D, and fig. S8). Furthermore, the two regions assigned to the A genome have a lower noncoding G + C composition (39%) than the two B-genome regions (47%) (table S4).
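The duplication pattern described above is what a single reciprocal exchange between two genomes with different gene orders would be expected to produce. The toy sketch below uses six hypothetical genes (g1 to g6), made-up gene orders and breakpoints, and a helper named `recombine` that is ours; it is not a reconstruction of the actual A and B genomes. It only illustrates that one crossover can leave each product with duplicates of genes that the other product lacks, while conserving total content (in the text, 92 + 52 kb = 83 + 61 kb).

```python
from collections import Counter


def recombine(a, b, cut_a, cut_b):
    """Single reciprocal exchange; returns the two products (B1 + A2, B2 + A1)."""
    a1, a2 = a[:cut_a], a[cut_a:]
    b1, b2 = b[:cut_b], b[cut_b:]
    return b1 + a2, b2 + a1


def duplicated(genes):
    """Genes present more than once in a tract."""
    return sorted(g for g, n in Counter(genes).items() if n > 1)


# Hypothetical donor genomes: same gene set, different gene order.
genome_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
genome_b = ["g3", "g4", "g5", "g6", "g1", "g2"]

tract_1, tract_2 = recombine(genome_a, genome_b, cut_a=4, cut_b=4)

for name, tract in (("B1 + A2", tract_1), ("B2 + A1", tract_2)):
    missing = sorted(set(genome_a) - set(tract))
    print(name, tract, "duplicated:", duplicated(tract), "missing:", missing)

# Tract lengths quoted in the text: the exchange conserves the total.
assert 92 + 52 == 83 + 61
```

In this toy case the first product carries two copies of g5 and g6 and lacks g1 and g2, while the second shows the reverse, mirroring qualitatively the complementary duplications and absences reported for the 83-kb and 61-kb tracts.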

Fig. 4 Pairwise comparisons of the green algal B-genome donor to Amborella with the A- and C-genome donors.

Brackets on the A, B, and C genomes indicate their fragmentation in Amborella (Fig. 1A). Blocks of two or more genes with identical order in a comparison are colored the same, regardless of gene orientation. Open boxes mark genes present in both genomes but not part of a syntenic block. Bullets mark genes present in only one genome.

Most of the remaining green algal mtDNA in Amborella, comprising tracts of lengths 49, 18, 16, and 2 kb (Fig. 1A and fig. S2), also appears, on the basis of synteny and genome reconstruction (Fig. 4 and fig. S11), to be derived by whole-genome transfer (from donor C). Seven of the eight remaining, mostly short tracts of green algal mtDNA (Fig. 1A) can tentatively be reconstructed as resulting from the transfer and/or retention of about one-third of a genome from a fourth green algal donor (donor D); alternatively, the D regions may result from multiple HGT events. Although the B, C, and D genomes are relatively similar in sequence (Fig. 2, A to D, and fig. S8), their many differences in gene order (Figs. 1A and 4) and intron content (e.g., cox1 has two introns in the D genome but none in B) rule out the possibility that they result from only one or two transfers followed by large-scale duplication within Amborella. We therefore conclude that Amborella acquired its ~3.3 genome equivalents of green algal mtDNA (Fig. 1A) through at least four transfers, including three whole-genome transfers.

The multiple copies of each green algal gene present in Amborella almost always ally, usually strongly, with the trebouxiophyte Coccomyxa (Fig. 2, A to D, and fig. S8). Likewise, gene order within the A, B, and C genomes is most similar to that of Coccomyxa (fig. S7). The B, C, and D copies of each gene invariably form a strongly supported clade (Fig. 2, A to D, and figs. S8 and S10), with the B + C genomes sister to the A genome in gene-loss phylogeny (fig. S12). Thus, Amborella probably acquired its green algal mtDNA from the Coccomyxa subgroup of trebouxiophytes. Because members of this subgroup often live as lichen photobionts, and lichens commonly grow on Amborella (Fig. 5), its algal genomes may have been acquired from lichens.

Fig. 5 Ecological setting of HGT in Amborella.

(A and B) Prostrate branches of Amborella with suckers (green arrows) and epiphytes, including mosses, liverworts, ferns, and angiosperms. (C to F) Amborella leaves and branches covered predominantly with lichens [(C) and (F)], leafy liverworts (D), and mosses (E). See table S11 for photo credits.

HGT from Angiosperms

Amborella mtDNA contains 150 angiosperm-like copies (full or partial) of the 49 protein and rRNA genes likely present in the ancestral angiosperm mitochondrial genome (fig. S13) (17) [see (13) for how trans-spliced genes are counted (table S5)]. We designated 82 of these copies as foreign, 63 as native, and 5 as uncertain (table S3). Angiosperm-specific phylogenetic analyses provided strong support for 26 (32%) of the foreign assignments and 16 (25%) of the natives and lower support for an additional 20 foreign and 22 native assignments (figs. S14 to S16 and tables S6 and S9). The lower support values reflect the generally poor resolution in many of the trees (fig. S14), which is a consequence of low substitution rates in most angiosperm mtDNAs (18).
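The counts and percentages in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the dictionary layout and names are ours.

```python
# Angiosperm-like gene copies by inferred origin, as stated in the text.
copies = {"foreign": 82, "native": 63, "uncertain": 5}
# Assignments with strong and with lower phylogenetic support.
strong = {"foreign": 26, "native": 16}
weaker = {"foreign": 20, "native": 22}

total_copies = sum(copies.values())
print(f"total angiosperm-like copies: {total_copies}")

for origin in ("foreign", "native"):
    strong_share = strong[origin] / copies[origin]
    any_support = strong[origin] + weaker[origin]
    print(
        f"{origin}: {strong[origin]}/{copies[origin]} strongly supported "
        f"({strong_share:.0%}); {any_support} with some phylogenetic support"
    )
```

The three categories sum to the 150 copies reported, and the strong-support shares round to the 32% and 25% quoted above.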

Four other lines of evidence were used to distinguish foreign from native angiosperm genes and intergenic DNA. First, the extent of cytidine to uridine (C-to-U) RNA editing, which is much higher in Amborella than in all examined eudicots and monocots (table S7) (13), provided evidence for native versus foreign origin for many of the 150 angiosperm genes in Amborella mtDNA (13). Second, six genes were exceptionally divergent relative to all other genes analyzed phylogenetically (fig. S15), suggesting that they came from angiosperms with much higher mtDNA substitution rates than Amborella (fig. S17) (13, 18). Third, levels of sequence identity to other angiosperm mtDNAs were measured on a genome-wide basis to define native as well as angiosperm-HGT regions (13). Finally, native (or angiosperm-HGT) sequences defined by the above four criteria and located within 5 kb of each other were combined into continuous native (or angiosperm-HGT) tracts (13).
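The final step, combining nearby sequences of the same class into continuous tracts, is essentially interval merging with a gap tolerance. A minimal sketch follows, assuming each classified sequence is a (start, end) pair in kb and that "within 5 kb" means the gap between consecutive sequences is at most 5 kb. The function name, the coordinates, and this reading of the rule are our assumptions; the actual procedure is described in the supplementary methods (13).

```python
def merge_into_tracts(segments, max_gap_kb=5.0):
    """Merge (start, end) segments whose separating gap is <= max_gap_kb."""
    tracts = []
    for start, end in sorted(segments):
        if tracts and start - tracts[-1][1] <= max_gap_kb:
            # Close enough to the previous tract: extend it.
            tracts[-1][1] = max(tracts[-1][1], end)
        else:
            # Too far away (or first segment): start a new tract.
            tracts.append([start, end])
    return [tuple(t) for t in tracts]


# Hypothetical coordinates (kb) for sequences assigned to one class.
segments = [(10.0, 12.5), (15.0, 20.0), (40.0, 41.0), (44.5, 50.0), (70.0, 71.0)]
tracts = merge_into_tracts(segments)
print(tracts)
```

The sketch handles one class at a time; a fuller version would presumably also need to avoid merging across an intervening sequence of the other class, which is not modeled here.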

These analyses identified 753 kb of DNA as having been acquired from other angiosperms (Fig. 1A and figs. S2 and S4). This DNA contains an average of 2.0 copies of the 32 protein and rRNA genes that are virtually always present in angiosperm mtDNA (table S3) (17) and thus corresponds to roughly two genome equivalents of foreign angiosperm mtDNA. Most (86%) of the 753 kb is intergenic, consistent with the high proportion of intergenic mtDNA in angiosperms (11, 13). About half of the 753 kb shares ≥90% sequence identity with one or more sequenced angiosperm mitochondrial genomes (fig. S4). This far surpasses the level of highly conserved mtDNA in other angiosperms (fig. S18) (13). The 753-kb estimate is probably conservative owing to the limited number of angiosperm mtDNAs available for comparison (13).

Angiosperm Donors

One class of plastid-derived DNA played a key role in donor identification. Phylogenetic analysis shows that most of the 138 kb of ptDNA present in Amborella mtDNA was acquired through intracellular gene transfer (IGT), that is, from the Amborella plastid genome (Fig. 2, E to H, and fig. S19). Analysis of the remaining 10 kb of ptDNA, which probably entered Amborella from foreign mitochondria, identified donors with much greater specificity than did the mitochondrial gene analyses (13). Four of the HGT plastid regions identified Fagales, Oxalidales, or the predominantly parasitic Santalales as the donor, while a fifth pointed to Magnoliidae (Fig. 2, E to H, and fig. S18). A santalalean origin is also supported by four of the five mitochondrial genes for which multiple Santalales have been sampled (fig. S14, nad1b, and fig. S20). The exceptionally high and specific similarity of two featureless regions to Ricinus communis or Bambusa oldhamii (Fig. 1B and fig. S21) identified transfers from these lineages. Finally, the exceptionally high divergence that diagnosed six angiosperm-like genes as foreign also suggests that they came from additional donors, with high mitochondrial substitution rates.

Because some angiosperm-HGT tracts in Amborella mtDNA are of mixed phylogenetic origin (Fig. 1) (13), some of its foreign DNA may be the product of serial, angiosperm-to-angiosperm-to-angiosperm HGT (13). In particular, the rbcL gene of santalalean origin (Fig. 2E) resides only 3 kb from the Bambusa-derived sequence on the same 27-kb foreign tract (Fig. 1B). Because all four genes of meaningful length on this tract evidently came from core eudicots (fig. S14), and because parasitic plants are especially active in mitochondrial HGT (5, 7–10), this tract probably came from a santalalean donor that had previously acquired Bambusa DNA through HGT (13). The presence of santalalean DNA in six, mostly long HGT tracts (Fig. 1A) suggests that a large portion of the foreign angiosperm DNA in Amborella came from Santalales. Indeed, RNA-editing data indicate that the 27-kb tract of putative santalalean origin may actually be part of a much larger (>105 kb) HGT tract (13).

A Graveyard of Foreign Genes

The 197 foreign mitochondrial protein genes in Amborella are predominantly pseudogenes, with only 50 (25%) of them having full-length, intact open reading frames (tables S2 and S8). The intact genes are predominantly short (figs. S22 and S23), suggesting that many of these have remained intact by chance; that is, they are pseudogenes that have yet to sustain an obvious pseudogene mutation. Consistent with this, many of these intact genes are not expressed properly.

On the basis of phylogenetic, RNA editing, and/or linkage evidence (table S9) (13), Amborella mtDNA is hypothesized to contain a functional, native copy of all but one (rpl10) of the 49 mitochondrial protein and rRNA genes inferred to be present in the ancestral angiosperm (fig. S13) (17). cDNA sequencing of 44 of the 48 native genes showed that, with one apparent exception, they are all transcribed and properly RNA edited (table S10) (13). In contrast, no transcripts were detected for many genes of foreign origin, and 13 of 14 transcribed genes of foreign angiosperm origin (eight of them intact) were poorly edited, suggesting that they are pseudogenes (table S10) (13, 19).

The strongest candidates for functional replacement of native genes are tRNA genes. Several native tRNA genes are missing from Amborella mtDNA (fig. S13). These, and even some of the native tRNA genes still present (20), may have been functionally replaced by some of its dozens of intact foreign tRNA genes (figs. S2 and S4) (13). This would not be surprising, because cognate tRNAs of diverse origin (plastid, nuclear, bacterial) often replace native tRNAs in plant mitochondrial translation (6, 11, 20, 21). Moreover, even a modest number of tRNA gene replacements could have led to the fixation, through genetic hitchhiking, of a considerable portion of the foreign mtDNA in Amborella.

In summary, the great majority of the foreign mitochondrial genes in Amborella are unlikely to be functional. Given its six genomes' worth of foreign mitochondrial genes, Amborella mtDNA serves as a striking example of neutral evolution.

Ancient Transfers, Remarkably Intact

Our ability to date the many mitochondrial HGTs in Amborella is limited. However, the extensive pseudogene decay of its foreign DNA (tables S2 and S8)—in conjunction with low mitochondrial substitution rates in angiosperms (including Amborella) (fig. S17) (18) and low rates of pseudogene decay (19)—suggests that most transfers are probably millions of years old (13).

Angiosperm mitochondrial genomes typically experience high rates of DNA gain, loss, and rearrangement (13, 17). Amborella mtDNA seems, however, less prone to lose and rearrange DNA. Relative to their many pseudogene mutations, the four moss and green algal whole-genome transfers are surprisingly intact with respect to overall sequence content and arrangement. Only 11% of the protein-coding sequence content inferred to be present at the time of these four transfers has been deleted, mostly due to a few single- or multigene deletions (Fig. 4, fig. S6, and tables S2 and S8) (13). The green algal A and B genomes are both intact syntenically except for a single, mutual recombination event, whereas the C and moss genomes have each been fragmented into just four segments (Figs. 1, 3, and 4). In typical angiosperm mtDNAs, comparably old and large tracts of largely nonfunctional DNA would be expected to have mostly been lost by now, and what remained to be more highly rearranged (13, 17).

Mitochondrial Fusion Drives and Limits Mitochondrial HGT

Two mechanisms have been proposed to account for the relatively high frequency of HGT in land plant mitochondria and its absence from plastids of land plants, including Amborella (6, 8, 9). First, plant mitochondria are transformation competent (22), whereas no such evidence has been reported for plastids. Second, plant mitochondria regularly fuse in vivo, whereas plastids do not (23, 24). Three aspects of the horizontally acquired DNA in Amborella argue that its entry into the mitochondrion was driven principally, if not entirely, by mitochondrial fusion—that is, this DNA entered predominantly in large pieces, including whole genomes (13), is limited to other mitochondrial genomes (13) and is limited to green algae and land plants.

Why are the many Amborella donors limited to green plants, as opposed to, for instance, fungi, given their pervasive interactions with plants as mycorrhizal partners, endophytes, epiphytes, and pathogens? We propose that this reflects a phylogenetically deep incompatibility in the mechanism of mitochondrial fusion. The mechanism of mitochondrial fusion in fungi and animals is fundamentally the same, involving a core machinery of dynamin-related guanosine triphosphatases that are absent from green plants (25–27). This absence, combined with evidence for differences in the physiological requirements for fusion, has prompted speculation that mitochondrial fusion occurs by a different mechanism in angiosperms than in animals and fungi (24, 27, 28). Our data provide evolutionary support for this hypothesis and also lead us to propose that mitochondrial fusion occurs in a fundamentally similar manner across land plants and green algae (Fig. 6). This model explains why, despite presumably broad phylogenetic exposure to foreign mitochondria, the vast majority of HGT in the mitochondrion of Amborella, and of other plants (2–10, 13), is restricted to other plant mitochondria.

Fig. 6 Evolutionary model of mitochondrial fusion compatibilities.

Green and orange indicate different mechanisms of mitochondrial fusion (24, 25, 27, 28), due to either highly divergent evolution from a common ancestral mechanism or independent origins of fusion. See table S11 for photo credits.

Capture of Foreign Mitochondria

Biological vectors large enough to mobilize entire mitochondria, such as pollen (9, 29), insects, and fungi, could account for some of the mitochondrial HGT in Amborella (bacteria and viruses are presumably too small to transfer an entire mitochondrion). However, in light of its ecology and development, nonvectored processes involving direct contact between Amborella and potential donors probably predominate. Amborella grows in montane rainforests, often covered by a diversity of epiphytes, mostly bryophytes (including mosses) and lichens (a potential source of its green algal genomes), and sometimes even other angiosperms (Fig. 5). Amborella is often wounded and responds by producing abundant suckers (Fig. 5, A and B). Wounding can break cells belonging to both Amborella and the organisms growing on and within it. We postulate that some of the broken Amborella cells are healed and incorporated into a new meristem, that is, a new germline arising thanks to the totipotency of plant cells. Indeed, plant meristems often form in direct response to wounding and may be especially active in “massive mitochondrial fusion” (24). Given the ease of both mitochondrial membrane fusion and mitochondrial genome recombination, those healed cells that have taken up a mitochondrion from another green plant could well incorporate a portion of the foreign mitochondrial genome. A fraction of these transfers could then become fixed.

The wounding-HGT model applies not only to plants that live on Amborella but also to parasites. The Santalales—probably the major source of foreign angiosperm mtDNA in Amborella—are also the major group of parasitic plants in New Caledonia and the largest group of parasitic angiosperms worldwide (30, 31).

Concluding Remarks

The Amborella mitochondrial genome has both captured other mitochondrial genomes whole and retained them in remarkably intact form for ages. Its assemblage of foreign mtDNA probably reflects a range of factors—ecological, developmental, and molecular—that promote the capture of foreign mtDNA and retard its loss and rearrangement. This genome highlights the potential scale of neutral evolution and is thus relevant to current debates on the issue of “junk DNA” in nuclear genomes (32). The greatest importance of this genome is mechanistic: It provides compelling support for mitochondrial fusion as the key that unlocks mitochondrial HGT and for fusion incompatibility as a major barrier to phylogenetically unconstrained mitochondrial “sex.”

Supplementary Materials

Materials and Methods

Figs. S1 to S23

Tables S1 to S12

References (33–76)

References and Notes

  1. Supplementary materials for this article are available on Science Online.
  2. Acknowledgments: We thank E. Dalin, J. Gummow, and P. Lowry for assistance; R. Wing and the Arizona Genomics Institute for Amborella bacterial artificial chromosome sequences; the North and South Environmental Services of New Caledonia for collecting permits; M. Moore, P. Soltis, and D. Soltis for two unpublished plastid-genome sequences; and those individuals (see table S11) who supplied the photographs for figures. This work was supported by NIH-RO1-GM-76012 (J.D.P. and E.B.K.), the U.S. Department of Energy–Joint Genome Institute Community Sequencing Program under contract DE-AC02-05CH11231 (J.D.P., E.B.K., and J.L.B.), NSF-GRF-112955 (A.O.R.), NSF-DBI-0638595 (C.W.D.), and the METACyt Initiative of Indiana University, funded by the Lilly Endowment. The data reported in this paper are deposited in GenBank under accessions KF754799-KF754803 and KF798319-KF798355.