Research Article

Whole-genome analyses resolve early branches in the tree of life of modern birds


Science  12 Dec 2014:
Vol. 346, Issue 6215, pp. 1320-1331
DOI: 10.1126/science.1253451
  1. Fig. 1 Genome-scale phylogeny of birds.

    The dated TENT inferred with ExaML. Branch colors denote well-supported clades in this and other analyses. All BS values are 100% except where noted. Names on branches denote orders (-iformes) and English group terms (in parentheses); drawings are of the specific species sequenced (names in table S1 and fig. S1). Order names are according to (36, 37) (SM6). To the right are superorder (-imorphae) and higher unranked names. In some groups, more than one species was sequenced, and these branches have been collapsed (noncollapsed version in fig. S1). Text color denotes groups of species with broadly shared traits, whether by homology or convergence. The arrow indicates the K-Pg boundary at 66 Ma, with the Cretaceous period shaded at left. The gray dashed line represents the approximate end time (50 Ma) by which nearly all neoavian orders diverged. Horizontal gray bars on each node indicate the 95% credible interval of divergence time in millions of years.

  2. Fig. 2 Metatable analysis of species trees.

    Results for different genomic partitions, methods, and data types are consistent with or contradict clades in our TENT ExaML, TENT MP-EST*, and exon-only trees and previous studies of morphology (15), DNA-DNA hybridization (24), mitochondrial genes (14), and nuclear genes (17). Letters (A to DD and a to e) denote clade nodes highlighted in Fig. 3, A and B, of the ExaML and MP-EST* TENT trees. Each column represents a species tree; each row represents a potential clade. Blue-green signifies the monophyly of a clade, and shades show the level of its BS (0 to 100%). Red, rejection of a clade; white, missing data. We used a 95% cut-off (instead of a standard 75%) for strong rejection due to higher support values with genome-scale data. The threshold for the mitochondrial study was set to 99% due to Bayesian posterior probabilities yielding higher values than BS. An expanded metatable showing partitioned ExaML, unbinned MP-EST, and additional codon tree analyses is shown in fig. S2.

  3. Fig. 3 Evidence of ILS.

    (A) Cladogram of ExaML TENT avian species tree, annotated for nodes from Fig. 2 (letters), for branches with less than 100% BS without and with (parentheses) third codon positions, for strong (>75% BS) intron gene tree incongruence and congruence, and for indel congruence on all branches (except the root). Thin branch lines represent those not present in the MP-EST* TENT of (B). (Inset) ExaML branch lengths in substitution units (expanded view in fig. S7). Color coding of branches and species is as in Fig. 1. (B) Cladogram of MP-EST* TENT species tree, annotated similarly as in the ExaML TENT in (A). Thin branch lines represent those not present in the ExaML TENT of (A). (C) Percent of intron gene trees rejecting (≥75% BS) branches in the ExaML TENT species tree relative to branch lengths. Letters denote nodes in (A) that either have less than 100% support or are different from the MP-EST* TENT in (B). (D) Percent of intron gene trees supporting (≥75% BS) branches in the ExaML TENT species tree relative to branch lengths. (E) Indel hemiplasy [the inverse of percent of retention index (RI) = 1.0 indels that support the branch; see SM9] correlated with ExaML TENT branch length (log transformed). r², correlation coefficient. (F) Indel hemiplasy correlated with ExaML and MP-EST TENT internal branch divergence times in millions of years (log transformed). Plotting with internal branch times was necessary to compare both trees (SM9). (G) TE hemiplasy with owls among the core landbirds. Line color, shared TE tree topology; line thickness, relative proportion of TEs that support a specific topology (total numbers shown in the owl node). Circles at end of lines indicate loss of the TE allele in that species after ILS, as the sequence assembly contains an empty TE insertion site (SM10). Only topologies with two or more TEs are shown. (H) TE hemiplasy with songbirds among the core landbirds.

  4. Fig. 4 Species trees inferred from concatenation of different genomic partitions.

    (A) Intron tree. (B) UCE tree. (C) Exon c12 tree. (D) Exon c123 tree. The tree with the highest likelihood for each ExaML analysis is shown. Color coding of branches and species is as in Fig. 1 and fig. S1. Thick branches denote those present in the ExaML TENT. Numbers give the percent of BS.

  5. Fig. 5 Comparisons of total support among species trees and gene trees.

    (A) Average BS across all branches of species trees from varying input data as in Fig. 2, ordered left to right from lowest to highest values. (B) Numbers of incompatible branches (out of 45 internal), at different support thresholds, with the ExaML TENT tree, ordered left to right from most to least compatible (expanded analysis in fig. S6). (C) Analyses of intron, exon, and UCE gene tree congruence and incongruence with nodes in the ExaML TENT, MP-EST* TENT, and other species trees. Names and letters for clades are as in Figs. 2 and 3. “Missing” denotes the case in which an ortholog is not present for any taxa or is present for only one taxon, and hence monophyly cannot be determined. “Partially missing” indicates the case in which some taxa are missing but at least two of the taxa are present, and thus we can still categorize it as either monophyletic or not. For further details, see SM7.

  6. Fig. 6 Life history incongruence in protein-coding trees.

    (A) Species tree inferred from low–base composition variance exons (n = 830 genes) graphed with branch length, third codon position GC (GC3) content (heatmap), and log of body mass (numbers on branches). (B) Species tree inferred from high–base composition variance exons (n = 830 genes), graphed similarly as in (A). The %GC3 scale is higher and ~10 times wider for the high-variance genes, and the branch lengths are ~3 times longer [black scales at the bottom of (A) and (B)]. Color coding of species’ names is as in Fig. 1. Cladograms of trees in (A) and (B) are in fig. S16, A and B. (C and D) Correlations of branch length with GC content (C) and body mass (D) of the low-variance and high-variance exons. Correlations were still significant after independent contrast analyses for phylogenetic relationships (SM11). (E and F) Relative chromosome positions of the low-variance (E) and high-variance (F) exons normalized between 0 and 1 for all chicken chromosomes and separated into 100 bins (bars). The height of each bar represents the number of genes in that specific relative location. The two distributions in (E) and (F) are significantly different (P < 2.2 × 10⁻¹⁶, Wilcoxon rank sum test on grouped values). For further details, see SM11.
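
The "Missing" and "Partially missing" categories defined for Fig. 5C can be illustrated with a minimal Python sketch. It assumes a rooted gene tree summarized as the set of leaf names under each internal node (root included). The function name `classify_clade`, its return labels, and the taxon names are hypothetical, and bootstrap support thresholds are ignored. It is an illustration of the definitions in the caption, not the procedure of SM7.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def classify_clade(
    clade: Set[str],
    gene_taxa: Set[str],
    gene_clades: Iterable[FrozenSet[str]],
) -> Tuple[str, bool]:
    """Return (category, partially_missing) for one clade in one gene tree.

    clade       -- taxa that define the clade in the species tree
    gene_taxa   -- taxa for which the gene tree has an ortholog
    gene_clades -- leaf sets under each internal node of the gene tree
                   (assumed input format, root included)
    """
    present = clade & gene_taxa
    # Zero or one clade member present: monophyly cannot be determined.
    if len(present) < 2:
        return ("missing", False)
    # Some clade members are absent, but at least two remain.
    partially_missing = len(present) < len(clade)
    # Monophyletic if the present members form exactly one leaf set.
    if frozenset(present) in set(gene_clades):
        return ("monophyletic", partially_missing)
    return ("not monophyletic", partially_missing)


# Hypothetical gene tree ((A,B),(D,E)) that lacks taxon C.
example_clades = [
    frozenset({"A", "B"}),
    frozenset({"D", "E"}),
    frozenset({"A", "B", "D", "E"}),
]
example_result = classify_clade({"A", "B", "C"}, {"A", "B", "D", "E"}, example_clades)
print(example_result)
```

In this example the clade (A, B, C) is scored as monophyletic but partially missing, because C has no ortholog in the gene tree while A and B still group together.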
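
The binning behind Fig. 6, E and F, can be sketched as follows, assuming that each gene position is divided by the length of its chromosome and that a position exactly at the chromosome end falls in the last bin. The function name, input layout, and example coordinates are hypothetical; SM11 describes the actual procedure.

```python
from typing import Dict, List, Sequence, Tuple


def relative_position_histogram(
    genes: Sequence[Tuple[str, int]],
    chrom_lengths: Dict[str, int],
    n_bins: int = 100,
) -> List[int]:
    """Count genes per bin of relative chromosome position (0 to 1).

    genes         -- (chromosome name, position in bp) pairs
    chrom_lengths -- chromosome name -> length in bp
    """
    counts = [0] * n_bins
    for chrom, pos in genes:
        # Normalize so every chromosome spans the interval [0, 1].
        rel = pos / chrom_lengths[chrom]
        # Clamp so that rel == 1.0 lands in the last bin.
        idx = min(int(rel * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical coordinates on two chromosomes of different length.
example_genes = [("chr1", 0), ("chr1", 50), ("chr2", 200)]
example_lengths = {"chr1": 100, "chr2": 200}
example_counts = relative_position_histogram(example_genes, example_lengths)
```

Pooling all chromosomes on the same 0-to-1 scale is what allows genes from chromosomes of very different sizes to share one histogram, with bar height equal to the gene count in each bin.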
