Technical Comments

Pleistocene Speciation and the Mitochondrial DNA Clock

Science  11 Dec 1998:
Vol. 282, Issue 5396, pp. 1955
DOI: 10.1126/science.282.5396.1955a

John Klicka and Robert M. Zink (1) used pairwise mitochondrial DNA (mtDNA) distance data and a 2% per million year (My) “mtDNA clock” to examine whether Late Pleistocene (≤250,000 years ago) glaciations may have been an important mechanism of speciation in North American songbirds. They conclude that sequence divergence values and corresponding estimates of times of evolutionary divergence between presumptive sister pairs of North American songbirds were sufficiently large to reject a Late Pleistocene Origins model (LPO model, 2) for most species. Furthermore, they conclude that the majority of North America's “youngest” species of songbirds originated in the Late Pliocene or Early Pleistocene, which would suggest that Pleistocene glaciation, in general, did not play an important role in shaping patterns of speciation in this group. These conclusions are not supported by the data in the report (1).

There are three major problems with the report by Klicka and Zink: (i) the authors apparently assumed that dates of divergence can be accurately estimated by dividing observed mtDNA divergence values, uncorrected for saturation (superimposed substitutions) (3), by an uncorrected rate (2% per My) of mtDNA evolution; (ii) they did not provide a measure of error associated with their estimated dates; and (iii) they did not provide a test of their implicit assumption that a molecular clock holds for their data.

To estimate dates of divergence from DNA sequence distance data under the assumption of a molecular clock, the number of substitutions that have occurred since two sequences diverged must be estimated under an appropriate model of nucleotide substitution. This applies to the taxonomic group for which dates are to be estimated (that is, North American passerines) and also to the group or groups on which the rate calibration (the “clock”) is based. Otherwise, because observed sequence divergence does not accumulate linearly over time, the rate of substitution will be underestimated, and estimated dates of divergence will be biased. Only when all of the distances are adequately corrected for superimposed substitutions will the effect of time (and saturation) be factored out (4).
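The effect of superimposed substitutions can be illustrated with a deliberately simpler correction than the one applied in this comment. The sketch below uses the Jukes-Cantor distance and its gamma-distributed-rates variant, swapped in only because they have closed forms. The function names, the example values of p, and the shape parameter alpha are illustrative assumptions, not quantities taken from the report.

```python
import math

def jc_distance(p):
    """Jukes-Cantor corrected distance (substitutions per site) from the
    observed proportion of differing sites p. Requires p < 0.75."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def gamma_jc_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates among sites.
    alpha is the gamma shape parameter; a smaller alpha means stronger
    among-site rate heterogeneity and a larger correction."""
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)

if __name__ == "__main__":
    alpha = 0.2  # hypothetical shape parameter, for illustration only
    for p in (0.01, 0.05, 0.10, 0.15):
        print(f"observed={p:.3f}  JC={jc_distance(p):.4f}  "
              f"gamma-JC={gamma_jc_distance(p, alpha):.4f}")
```

In both functions the corrected distance exceeds the observed one, and the gap widens as p grows, which is why uncorrected distances do not accumulate linearly with time.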

To illustrate this point, we obtained Klicka and Zink's original cytochrome b sequence data [available for 21 of the 35 species pairs they examined; sources “d,” “k,” and “l” in table 1 of (1)] (5). We then used likelihood ratio tests (LRTs) to determine the best-fit model of nucleotide substitution for the 21 species represented by at least 1000 base pairs (bp) from the cytochrome b gene [source “k” in table 1 of (1)] (6). The best fit model, the gamma-HKY85 (6), was then used to correct for superimposed substitutions for all 21 species pairs for which cytochrome b data were provided (5).

We then used this same approach to determine the best-fit model and to correct for superimposed substitutions in the cytochrome b gene of two groups for which a 2% per My mtDNA clock has been proposed: primates (great apes plus humans), the group on which the original 2% per My mtDNA clock was based (7), and galliform birds (chicken Gallus gallus plus the partridge genus Alectoris) (8), one of the avian groups cited by Klicka and Zink as having a 2% per My rate of mtDNA evolution. With the use of published estimates for the date of divergence between chimpanzee and human (7), and between chicken and partridge (8), we were able to estimate a corrected rate of substitution for the cytochrome b gene in these two groups and, in the spirit of Klicka and Zink's study, use the corrected rate of substitution to re-estimate dates of divergence for the 21 species pairs of North American songbirds for which cytochrome b data were made available (5).

For primates and galliform birds the rate of substitution estimated for the cytochrome b gene under the best-fit model (the gamma-HKY85 model in each case) was 0.0278 and 0.0252 substitutions per site per lineage per My, respectively, or more than 2.5 times faster than the rate predicted by a 2% per My mtDNA clock (0.01 substitutions per site per lineage per My) (9). This increase in the estimated rate of substitution is directly related to saturation. The best fit model uses a gamma distribution to incorporate among-site rate heterogeneity and predicts that the actual number of substitutions that have occurred since the divergence of chimpanzees and humans, and since the divergence of chicken and Alectoris, is much greater than the number that can be directly observed, and considerably more than the number predicted by models of nucleotide substitution that do not address among-site rate heterogeneity (10).
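Because this comparison mixes a pairwise clock (2% per My) with per-lineage rates, it may help to spell out the conversion. A per-lineage rate r corresponds to a pairwise divergence rate of 2r, since both lineages accumulate substitutions after the split. The table below only restates the figures given above in both units; the symbol r is introduced here for illustration.

```latex
\[
\begin{array}{lccc}
\text{calibration} & r\ (\text{per lineage, per My}) & 2r\ (\text{pairwise, per My}) & r / 0.01 \\
\hline
\text{2\% per My clock} & 0.0100 & 2.00\% & 1.00 \\
\text{primates}         & 0.0278 & 5.56\% & 2.78 \\
\text{galliform birds}  & 0.0252 & 5.04\% & 2.52
\end{array}
\]
```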

Correcting for saturation under the gamma-HKY85 model has a large effect on estimated dates of divergence and temporal patterns of speciation in North American songbirds (Fig. 1). Whereas a scenario based on uncorrected distances and a 2% per My mtDNA clock would suggest that 86% of the 21 avian species pairs diverged before the Pleistocene, corrected distances and a corrected rate of 0.025 substitutions per site per lineage per My predict that 76% of these species pairs have mtDNA divergences estimated to have occurred within the Pleistocene (11). Furthermore, because mtDNA haplotypes diverge within a common ancestral population before species are formed, species are younger than the ages estimated for the coalescence of their haplotypes (typically by several hundred thousand years in birds) (12). Therefore, based on our corrections, a Pleistocene origin can be safely ruled out for only a few of the 21 species pairs for which cytochrome b data were made available; the majority of avian species pairs would have divergences in the early-to-middle Pleistocene, a time of major glacial activity in North America (13). Approximately 10% of these divergences would fall within the last few hundred-thousand years.

Figure 1

Frequency distributions of estimated dates of mtDNA divergence for 21 presumptive sister pairs of North American songbirds based on: (top) uncorrected mtDNA distances (5) and a 2% per My mtDNA clock; and (bottom) corrected values of mtDNA sequence divergence estimated under the best-fit gamma-HKY85 model and a corrected rate of substitution for the cytochrome b gene of 0.025 substitutions per site per lineage per My (9,16). Dates based on corrected values of sequence divergence are considerably younger than those based on uncorrected values (17).

The second problem we address is that Klicka and Zink do not provide a measure of error associated with their estimated dates of divergence. The error inherent in estimating dates of divergence using a molecular clock tends to be quite large (14). For example, based on the confidence limits depicted by Hillis et al. (14) for dates of divergence estimated by regression from the original primate 2% per My mtDNA clock (7), every estimated date of divergence reported by Klicka and Zink would have a 95% confidence interval that includes zero years (that is, the present). Therefore, even if we were to overlook the issue of saturation and accept Klicka and Zink's estimated dates, there would still be little statistical support for rejecting the LPO model.

Finally, Klicka and Zink's conclusions depend on their assumption that a molecular clock holds for their data, yet they did not provide a test of this assumption. We used a likelihood ratio test (LRT) to test for a molecular clock in the 21 species of North American songbirds represented by ≥1000 bp of cytochrome b data. The results call for rejection of the molecular clock hypothesis (15). Therefore, even if Klicka and Zink had addressed the issues of saturation and error outlined above, their study would still not be valid because the songbird sequences they examined are not evolving in a clock-like manner.
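The general form of a likelihood ratio test for a molecular clock can be sketched as follows. The test statistic is twice the difference in log-likelihood between the unconstrained tree and the clock-constrained (rooted) tree, compared against a chi-square distribution with n - 2 degrees of freedom for n taxa. The log-likelihood values in the example are hypothetical placeholders, not the values obtained in this analysis, and the function names are assumptions made for illustration.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df,
    using the closed-form finite sums for even and odd df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0                  # i = 0 term of the sum
        for i in range(1, df // 2):
            term *= half / i                    # half**i / i!
            total += term
        return math.exp(-half) * total
    # Odd df = 2m + 1: erfc term plus a sum over i = 1..m
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)    # i = 1 term
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)                # Gamma(i + 1.5) = (i + 0.5) * Gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def clock_lrt(lnl_clock, lnl_no_clock, n_taxa):
    """Return (statistic, degrees of freedom, p-value) for the clock LRT."""
    stat = 2.0 * (lnl_no_clock - lnl_clock)
    df = n_taxa - 2
    return stat, df, chi2_sf(stat, df)

if __name__ == "__main__":
    # Hypothetical log-likelihoods for a 21-taxon data set
    stat, df, p = clock_lrt(lnl_clock=-5030.0, lnl_no_clock=-5010.0, n_taxa=21)
    print(f"LRT statistic={stat:.2f}, df={df}, p={p:.4f}")
```

Because the clock-constrained likelihood is computed on a rooted tree, the outcome of such a test depends on where the root is placed.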

Accurately estimating dates of divergence from molecular data is, at best, a challenging process. Saturation, error, and differences in the rate of molecular evolution among lineages must be addressed before strong biological conclusions can be drawn from evolutionary dates based on molecular clocks. Although the LPO model may not accurately reflect temporal patterns of speciation in North American songbirds, it cannot be rejected on the basis of the report by Klicka and Zink.


Response: In our report (1), we challenged the conventional notion that a previously defined set of North American songbird (order Passeriformes) species pairs originated as a consequence of being isolated during the last one [100,000 years before the present (B.P.)] or two (250,000 years B.P.) cycles of North American glaciations (2). Mitochondrial DNA (mtDNA) sequence divergences calculated for 35 such pairs of sister species differed on average by 5.1%. This value is an order of magnitude greater than the amount of divergence expected of species that originated within the last 250,000 years (the Late Pleistocene as we defined it). On this basis, and two other lines of evidence (3), we rejected the prevailing model of “Late Pleistocene Origins” (LPO) for this particular group of birds. We are gratified that Arbogast and Slowinski's “reanalysis” supports our main conclusion. Even accepting their recalibration, for the moment, 19 of the 21 (90%) species pairs that they examined diverged over 1 million years ago [figure 1 (bottom) of the comment]. There are, however, problems with their analysis.
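The order-of-magnitude statement follows from simple arithmetic under the assumptions of our report (uncorrected distances and a pairwise rate of 2% per My). The subscripted symbols below are introduced only for this illustration.

```latex
\begin{align*}
d_{\text{expected}} &= 0.02\ \text{My}^{-1} \times 0.25\ \text{My} = 0.005 = 0.5\% \\
\frac{d_{\text{observed}}}{d_{\text{expected}}} &= \frac{5.1\%}{0.5\%} \approx 10 \\
t_{\text{implied}} &= \frac{5.1\%}{2\%\ \text{My}^{-1}} = 2.55\ \text{My}
\end{align*}
```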

Our original conclusions were derived from three main assumptions regarding North American songbirds of recent origin: (i) that rates of sequence evolution are constant (clock-like) among species pairs; (ii) that uncorrected molecular distances provide a reasonably accurate measure of molecular evolution; and (iii) that the 2% per My rate (4) of evolution is a reasonable divergence rate (5) for this set of species. Arbogast and Slowinski contend that these assumptions are biased such that our study fails to provide a test of the LPO. We disagree.

The first challenge raised by Arbogast and Slowinski concerns our assumption that our sequences are evolving in a clock-like manner. They purport to demonstrate that our data show among-taxon rate heterogeneity, thus invalidating our study. In fact, the highly significant likelihood ratio test (LRT) that Arbogast and Slowinski report is an erroneous result of enforcing the molecular clock assumption on an improperly rooted phylogeny. When rooted correctly the results of the LRT for a molecular clock are not significant (6), indicating that the assumption of homogeneous rates among taxa is valid for this data set (7, 8). Thus, their conclusion that among-taxon rate heterogeneity negates our test of the LPO is spurious.

A second disagreement concerns the time at which significant levels of saturation (multiple substitutions over time at the same base position) occur. The maximum likelihood model Arbogast and Slowinski used (gamma-HKY85) is presumed to estimate more accurately the true number of substitutions that have accrued between two DNA sequences since they diverged from a common ancestor. For example, the average uncorrected divergence for the three Passerina buntings in our common data set is 6.63%. The gamma-corrected divergence estimate for these same three closely related species averages 8.76%, an increase of 32% due entirely to putative saturation effects. As another example, the human-chimpanzee observed distance of 11% has a gamma-HKY85 distance of 27.4%. Thus, the method used by Arbogast and Slowinski would indicate that saturation is substantial, if not enormous, even at relatively low levels of divergence (9).
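The size of the correction in these two examples can be checked directly from the quoted distances. The helper below is a hypothetical name used only for this illustration; it computes the relative increase of the corrected distance over the uncorrected one.

```python
def relative_increase(uncorrected, corrected):
    """Fractional increase of the corrected distance over the uncorrected one."""
    return (corrected - uncorrected) / uncorrected

if __name__ == "__main__":
    # Distances in percent, as quoted in the text
    examples = {
        "Passerina buntings": (6.63, 8.76),
        "human-chimpanzee": (11.0, 27.4),
    }
    for name, (raw, corrected) in examples.items():
        print(f"{name}: {100 * relative_increase(raw, corrected):.0f}% increase")
```

On these figures the bunting distances rise by about a third, whereas the human-chimpanzee distance rises roughly two and a half fold.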

This conclusion, however, conflicts with empirical evidence. Of the mtDNA distance estimates obtained in our study, 94% (33 of 35) differed by less than 10%. Most studies suggest that saturation would not bias rate calibrations until uncorrected sequence divergences exceed this value (10) and all plots of avian mtDNA genetic distances (by codon position) versus time, of which we are aware, are linear within this range. For avian cytochrome b data, the evidence suggests that “progress toward transition saturation accelerates between 10% and 18% divergence” (11). More compelling, a plot of distances derived from cytochrome b (α = 0.22) versus those for a nuclear intron (β-fibrinogen intron 7, α = 0.89) for woodpeckers (order Piciformes) is linear to approximately 13% (12). Because noncoding introns are thought to be relatively less biased measures of time, this correlation is strong support for the linear relationship of cytochrome b divergence and time at the evolutionary level we considered. In contrast, the maximum likelihood model of Arbogast and Slowinski suggests significant non-linearity by 5% sequence divergence. The discrepancy between the empirical evidence and their conclusions is likely due, in part, to the maximum likelihood model that they used overestimating saturation, at least at low levels of molecular divergence (13).

A third issue concerns calibration—“setting” the clock. Arbogast and Slowinski derive a “universal” vertebrate substitution rate of roughly 5% per My from primate and galliform data sets. Arbogast and Slowinski did not justify the extrapolation of substitution and rate parameters derived from older and unrelated taxa onto recently evolved songbirds (14). That primates display significantly heterogeneous rates of mtDNA evolution has been established elsewhere (8). Sequences not evolving in a clock-like manner would seem to be a questionable source with which to calibrate a general vertebrate clock. The partridge (Alectoris)-chicken (Gallus) calibration also does not inspire confidence. Curiously, Arbogast and Slowinski used the original partridge data of E. Randi (15), but not the age of the fossil that Randi, after examination of all available data, considered correct for calibration purposes. It would seem that Arbogast and Slowinski chose, from among a range of potential values (8 to 20 My), the “fossil” date that yielded a substitution rate most similar to the one obtained for primates. The date that they did choose to represent the time of partridge-chicken divergence (17 My B.P.) in fact represents an indirect “provisional estimate” (16) that was obtained from restriction mapping of nuclear genes. Calibrations based on other molecular markers are generally considered inappropriate (17).

If possible, calibrations should be derived from within the group of organisms for which they are used (17). Arbogast and Slowinski do not mention the only relevant calibration available (18), that of the Hawaiian honeycreepers. This study has relevance to our work in that (i) it considers songbirds of similar body size and generation length (19), (ii) these species have recent origins, and (iii) the calibration dates (emergence times for three different islands) are recent and well established. With the use of cytochrome b sequence data and similar analytical methods (a maximum likelihood model with a gamma correction) to obtain divergence estimates, Fleischer et al. (18) obtained a substitution rate of 0.008 per site per lineage per million years (1.6% per My), a rate very different from those (over 5%) derived from primates and fowl by Arbogast and Slowinski. This songbird calibration suggests that the plot of divergence values [figure 1 (top) of the comment] would be pushed slightly to the right (older), not to the left as Arbogast and Slowinski's reanalysis [their figure 1 (bottom)] would indicate (20). In sum, the difference in the two histograms (figure 1 in the comment) stems from (i) recomputed mtDNA distances corrected for saturation, and (ii) a calibration of these distances based on primates/fowl. Neither aspect is correct.
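The direction of this shift can be illustrated with the mean divergence of 5.1% from our report. The sketch below is only indicative: it applies the honeycreeper rate, which was derived from gamma-corrected distances, to an uncorrected mean, and the function name is a hypothetical one chosen for this example.

```python
def divergence_time_my(pairwise_fraction, per_lineage_rate):
    """Time since divergence in My: pairwise distance divided by the
    pairwise rate, which is twice the per-lineage rate."""
    return pairwise_fraction / (2.0 * per_lineage_rate)

if __name__ == "__main__":
    mean_divergence = 0.051              # 5.1% mean divergence from our report
    calibrations = {
        "2% per My clock": 0.010,        # per-lineage rate implied by 2% per My
        "honeycreeper (18)": 0.008,      # per-lineage rate from Fleischer et al.
    }
    for name, rate in calibrations.items():
        print(f"{name}: {divergence_time_my(mean_divergence, rate):.2f} My")
```

A slower calibration yields an older date for the same distance, which is the sense in which the histogram would move to the right.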

Arbogast and Slowinski note that stochastic error associated with a molecular clock may be nontrivial. A general regression of separation times on sequence divergence for birds is lacking for the reasons they suggest. Although regression error values are typically large, this is, in part, a statistical artifact resulting from an inadequate number of calibration points (that is, accurate fossil dates). We agree that more independent and recent fossil calibrations are needed, but this discussion detracts from our main focus on songbird diversification during the most recent 250,000 years. It is difficult to envision a plausible clock correction that would compress 5% sequence divergence into the last 250,000 years. However, a relevant regression would be constructed using songbird divergences. In their study of Hawaiian honeycreepers, Fleischer et al. (18) compared gamma-corrected cytochrome b distances with island emergence times in a regression analysis. The tight linearity of their plot (Mantel matrix r = 0.995, P = 0.018) implies the existence of a predictable rate of molecular evolution in recently evolved songbirds with a higher degree of precision than Arbogast and Slowinski recovered by using the primate regression (21).

The conclusions that can be supported by the reanalysis in the comment differ little from our own. Both analyses falsify the LPO model of speciation, and both support (see figure 1 of the comment) our contention that (1, p. 1668) “the majority of the ‘youngest’ songbird species have late Pliocene or early Pleistocene origins.” Arbogast and Slowinski state that we (1) suggest that “Pleistocene glaciation, in general, did not play an important role in shaping patterns of speciation in this group.” In fact we (1, p. 1668) stated, “Periodic glacial cycles may have strongly influenced the diversification of the North American songbird fauna…” albeit over a more extended period (22). We stand by our original conclusion (1) that the LPO model of North American songbird evolution is not correct.

