The Root of Angiosperm Phylogeny Inferred from Duplicate Phytochrome Genes


Science  29 Oct 1999:
Vol. 286, Issue 5441, pp. 947-950
DOI: 10.1126/science.286.5441.947


An analysis of duplicate phytochrome genes (PHYA and PHYC) is used to root the angiosperms, thereby avoiding the inclusion of highly diverged outgroup sequences. The results unambiguously place the root near Amborella (one species, New Caledonia) and resolve water lilies (Nymphaeales, ∼70 species, cosmopolitan), followed by Austrobaileya (one species, Australia), as early branches. These findings bear directly on the interpretation of morphological evolution and diversification within angiosperms.

The evolution of flowering plants fundamentally altered the biosphere. Deciphering the causes and consequences of their origin and radiation requires knowledge of phylogeny, especially the order in which branches diverged near the root of the tree (1). However, the root of angiosperms has remained unresolved, as different lines of evidence have suggested many disparate alternatives (2). This ambiguity stems in part from uncertainty surrounding the identity of their closest relatives (3–5), and from the great differences between angiosperms and all other living lines of seed plants. Such differences render homology assessments exceptionally difficult in morphological analyses, and may lead to “long branch attraction” in molecular analyses, which occurs when convergent nucleotide substitutions cause the spurious connection of highly diverged sequences (6). In analyses of angiosperms, distant outgroup sequences might connect (perhaps with confidence) to the most divergent angiosperm sequence or sequences. For this reason it has been suggested that the angiosperm root may never be resolved using nucleotide sequence variation (7).

We used analysis of duplicate genes to root the angiosperm phylogeny without outgroups. This approach has been used to root the entire tree of life, for which outgroups are unknown (8). It has seldom been used elsewhere (9) but might be of general use, especially for rooting clades that are highly diverged from all known relatives. We reasoned that simultaneous analysis of members of a gene pair that duplicated along the line leading to extant angiosperms should yield an unrooted network of two identical (or very similar) gene subtrees connected by a single branch. If the subtrees are congruent, rooting the network along the connecting branch allows the network to fit into a rooted species tree without requiring additional hypotheses of gene duplication, sorting, or horizontal transfer events (2).
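The rooting logic described above can be illustrated with a minimal Python sketch. It is not the authors' procedure (their trees were inferred in PAUP*); it only shows how, given an unrooted network, one might locate the single branch that separates all copies of one gene from all copies of the other. The toy network, the node names (`a0`, `c0`, and so on), the leaf-label convention `GENE_Taxon`, and the helper names are assumptions made for illustration.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Turn a list of undirected edges into an adjacency map."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def leaves_on_side(adj, start, blocked):
    """Leaves reachable from `start` without crossing the edge to `blocked`."""
    seen = {start, blocked}
    stack = [start]
    leaves = set()
    while stack:
        node = stack.pop()
        if len(adj[node]) == 1:          # degree-1 nodes are leaves
            leaves.add(node)
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return leaves

def find_duplication_branch(edges, gene_of):
    """Return the edge that splits the leaves cleanly into two gene copies,
    or None if the two gene subtrees are not separated by a single branch."""
    adj = build_adjacency(edges)
    all_leaves = {n for n in adj if len(adj[n]) == 1}
    for u, v in edges:
        side_u = leaves_on_side(adj, u, v)
        side_v = all_leaves - side_u
        genes_u = {gene_of(leaf) for leaf in side_u}
        genes_v = {gene_of(leaf) for leaf in side_v}
        if len(genes_u) == 1 and len(genes_v) == 1 and genes_u != genes_v:
            return (u, v)
    return None

# Hypothetical toy network (not the published data): three taxa, two gene copies.
toy_edges = [
    ("a0", "PHYA_Amborella"), ("a0", "a1"),
    ("a1", "PHYA_Nymphaea"), ("a1", "PHYA_other"),
    ("a0", "c0"),                         # branch joining the two gene subtrees
    ("c0", "PHYC_Amborella"), ("c0", "c1"),
    ("c1", "PHYC_Nymphaea"), ("c1", "PHYC_other"),
]

def gene_of(leaf):
    """Assumed label convention: 'GENE_Taxon'."""
    return leaf.split("_")[0]

root_branch = find_duplication_branch(toy_edges, gene_of)
print(root_branch)  # ('a0', 'c0')
```

Placing the root on the returned branch roots each gene subtree at the node where it attaches. In this toy case the Amborella copy is then sister to the remaining sequences in both subtrees.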

Phylogenetic analyses of phytochrome genes in green plants suggest that the phytochrome gene pair, PHYA and PHYC, diverged along the branch leading to angiosperms (2, 10). PHYA and PHYC are found in most angiosperms examined, whereas only one gene lineage related to this pair is known from other seed plants (10, 11). We obtained and analyzed PHYA and PHYC sequences from 26 angiosperms (12) representing most taxa previously suggested to be early-diverging lineages. Parsimony analysis yielded six shortest unrooted networks. The strict consensus of these networks best fits a rooted species tree in which Amborella is separate from all other angiosperms (Fig. 1). This rooting is strongly supported in both PHYA and PHYC subtrees (92% and 83% bootstrap support, respectively). Nymphaea + Cabombaceae (water lilies) and Austrobaileya diverge next in both subtrees; Austrobaileya branches first in the PHYA subtree with moderate support (66%), whereas Nymphaea + Cabombaceae, or a clade including all three taxa, branches first in the PHYC subtree.
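For readers unfamiliar with how a parsimony tree length ("steps") is counted, the following Python sketch applies Fitch's small-parsimony algorithm to a single unordered character on a fixed binary tree. It is a textbook illustration only: the heuristic search the authors ran in PAUP* explores many topologies and sums such counts over all sites. The nested-tuple tree encoding, the taxon names `a` through `d`, and the character states are invented for the example.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one character on a binary tree.

    `tree` is a nested tuple of leaf names; `states` maps leaf name -> state.
    """
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):         # leaf: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                        # children agree: no change needed
            return common
        steps += 1                        # children disagree: one change
        return left | right

    visit(tree)
    return steps

# Hypothetical site pattern: taxa a and b share one nucleotide, c and d another.
site = {"a": "A", "b": "A", "c": "G", "d": "G"}

tree_1 = (("a", "b"), ("c", "d"))         # groups taxa with matching states
tree_2 = (("a", "c"), ("b", "d"))         # splits them apart

print(fitch_steps(tree_1, site))  # 1
print(fitch_steps(tree_2, site))  # 2
```

The first topology explains the site with fewer changes, so parsimony prefers it for this character. For unordered characters the count does not depend on where the tree is rooted, which is why the analysis can be carried out on unrooted networks.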

Figure 1

One of the six most parsimonious networks of PHYA and PHYC from 26 angiosperms (1104 nucleotide sites, of which 634 are parsimony-informative). Heuristic parsimony analysis [100 random taxon addition replicates with tree bisection and reconnection swapping in PAUP* 4.0 (33)] yielded trees of 6423 steps [retention index (RI) = 0.50; consistency index (CI) = 0.24, excluding autapomorphies]. Identical components in the PHYA and PHYC subtrees are labeled A through Q. Bootstrap percentages (from 500 replicates with the same search parameters, but using 10 random addition replicates) are above branches. Arrows indicate branches that collapse in the strict consensus.

Among the most parsimonious networks, the one shown in Fig. 1 maximizes identical components (A through Q) in the PHYA and PHYC subtrees. The remaining angiosperms diverge into a clade composed of magnoliids (H), within which winteroids (J) are sister to Piperales (K), and another clade (D) containing eudicots (G) and monocots + Chloranthus (E). Magnoliales (P) and Laurales (M) either form a clade (PHYA subtree) or are paraphyletic with respect to winteroids + Piperales (PHYC subtree). Bootstrap support for winteroids (J) and monocots (F) is relatively strong in the PHYA subtree (96% and 80%, respectively) but moderate in the PHYC subtree (66% and 78%), whereas support for Laurales (M) is strong in the PHYC subtree (100%) but weak in the PHYA subtree (57%). Conflicts between subtrees apparently are not significant (13).
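The notion of "identical components" shared by the two gene subtrees can be made concrete with a short Python sketch. It assumes each rooted subtree is written as a nested tuple of species names (gene prefixes stripped), collects every clade as a set of species, and intersects the two collections. The two toy topologies below are invented to mimic the kind of disagreement described in the text; they are not the published trees, and `other1` and `other2` are placeholder taxa.

```python
def collect_clades(tree):
    """Return every clade of a rooted nested-tuple tree as a frozenset of leaves."""
    found = set()

    def visit(node):
        if isinstance(node, str):         # leaf
            return frozenset([node])
        members = frozenset().union(*(visit(child) for child in node))
        found.add(members)                # record the clade at this node
        return members

    visit(tree)
    return found

# Hypothetical subtrees that differ only in which lineage branches second.
phya_toy = ("Amborella", ("Austrobaileya", ("Nymphaea", ("other1", "other2"))))
phyc_toy = ("Amborella", ("Nymphaea", ("Austrobaileya", ("other1", "other2"))))

shared = collect_clades(phya_toy) & collect_clades(phyc_toy)
for clade in sorted(shared, key=len):
    print(sorted(clade))
```

Here three clades are shared, including the one containing everything except Amborella, while the clades that depend on the relative order of Nymphaea and Austrobaileya are found in only one subtree. Choosing, among equally parsimonious networks, the one with the most shared clades corresponds to maximizing identical components.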

The duplicate gene analysis clearly resolves the angiosperm root along the branch to Amborella, and agreements between gene subtrees are many and strong. We therefore combined PHYA and PHYC data from each species for analysis, and rooted the resulting network by Amborella. Two most parsimonious trees resulted, which are identical except for the relative position of Hernandia and Hedycarya (Fig. 2). All clades in the consensus tree correspond to components resolved in one or both gene subtrees, and most are better supported. The two branches above Amborella are resolved with reasonably strong support; Nymphaea + Cabombaceae branch first, followed by Austrobaileya (bootstrap values of 80% and 86% for the remaining angiosperms, respectively).

Figure 2

Strict consensus of the two most parsimonious trees from analysis of concatenated PHYA and PHYC sequences (2208 nucleotide sites, of which 1042 are parsimony-informative), rooted at Amborella based on the duplicate gene analysis (Fig. 1). Heuristic parsimony analysis [100 random taxon addition replicates with TBR swapping in PAUP* 4.0 (33)] yielded trees of 6175 steps (RI = 0.36; CI = 0.38, excluding autapomorphies). Bootstrap percentages (from 500 replicates with the same search parameters, but using 20 random addition replicates) are above branches. Components identical with those found in both PHYA and PHYC subtrees are labeled A through Q. The arrow indicates the branch that collapses in the strict consensus.

Together, our phytochrome analyses provide insights into angiosperm phylogeny that were not evident in earlier studies. Other analyses have suggested at least nine different angiosperm rootings (2, 14), many with substantially different implications for angiosperm evolution. Only an analysis of 18S rDNA sequences (15) suggested a rooting near Amborella, but the result was equivocal. Results similar to ours are now being reported from analyses of large data sets that combine chloroplast, mitochondrial, and ribosomal sequences (16–18), or that combine many chloroplast sequences (19). This concordance implies that the question of the angiosperm root has been convincingly answered.

The conclusion that Amborella, water lilies, and Austrobaileya [which occurs in a clade with Illiciaceae and Schisandraceae (14–18) and Trimenia (17, 20)] form a grade at the base of the angiosperms has major implications for early evolution in flowering plants. Factors that may have contributed to their diversification include carpel closure (21), self-incompatibility (22), and the evolution of vessels (23). Our results imply that the carpels of the first angiosperms were sealed by secretions, and that postgenital fusion of epidermal layers evolved later (24). In our trees, Chloranthus and Nelumbo could represent reversions to closure by secretion; however, the placement of Chloranthus is only weakly supported. The conclusion that the first angiosperms were self-compatible plants (25) should be reexamined. Self-incompatibility has been noted in Austrobaileya and Illicium, implying either that it originated in the clade including these plants (probably early in angiosperm evolution) or that it was retained from the first angiosperms. Self-incompatibility may occur within water lilies, but the data are ambiguous (25). The condition in Amborella, with mostly unisexual flowers, is unknown, but possibly could be determined if functional stamens were found in otherwise carpellate flowers (26, 27). Our results suggest that xylem vessels evolved after the origin of the branch containing Amborella, which is vesselless. Vessels with small pores in the pit membranes are reported from most, but not all, water lilies (28), implying either that vessels originated (28) or that they were lost several times within water lilies. If the first water lilies had such vessels, they may have been transitional to typical vessels that lack pit membranes at maturity, perhaps through intermediates that retain remnants of pit membrane, as seen in Illiciaceae and Chloranthaceae (29). The absence of vessels in Winteraceae and the Trochodendronales may represent losses (30).

Our finding that the earliest branches within angiosperms are not species-rich could imply massive undetected extinction within these lineages. More likely it supports the conclusion that a shift in diversification rate did not coincide directly with the origin of flowering plants, but occurred later (31). Our results suggest that the first angiosperms probably were woody plants, in which case one or more shifts to the herbaceous habit may have fueled the major radiation of angiosperms.

  • * To whom correspondence should be addressed. E-mail: smathews{at}

