A Molecular Clock for Malaria Parasites


Science  09 Jul 2010:
Vol. 329, Issue 5988, pp. 226-229
DOI: 10.1126/science.1188954


The evolutionary origins of new lineages of pathogens are fundamental to understanding emerging diseases. Phylogenetic reconstruction based on DNA sequences has revealed the sister taxa of human pathogens, but the timing of host-switching events, including the human malaria pathogen Plasmodium falciparum, remains controversial. Here, we establish a rate for cytochrome b evolution in avian malaria parasites relative to its rate in birds. We found that the parasite cytochrome b gene evolves about 60% as rapidly as that of host cytochrome b, corresponding to ~1.2% sequence divergence per million years. This calibration puts the origin of P. falciparum at 2.5 million years ago (Ma), the initial radiation of mammalian Plasmodium at 12.8 Ma, and the contemporary global diversity of the Haemosporida across terrestrial vertebrates at 16.2 Ma.

The rate of nucleotide substitution in DNA sequences can provide a molecular clock useful for inferring absolute times in phylogenetic trees (1). This rate can be estimated by direct observation over reasonable time periods, as with several viral parasites of humans (2, 3) and experimental populations of Drosophila (4). Relatively slow nucleotide substitution precludes this approach for malaria parasites, for which calibration is indirect. For some specialized parasites, phylogenetic analyses have revealed codivergence of host and parasite evolutionary lineages, which permits calibration of genetic distance in one relative to the other (5–7). In contrast, Plasmodium and other haemosporidian parasites of terrestrial vertebrates exhibit widespread host switching, often across considerable host taxonomic distance (8–11). Cospeciation cannot, therefore, provide a means of clock calibration.

In spite of evident host switching, biologists have used the ages of host phylogenetic ancestral nodes to calibrate the rate of nucleotide substitution in Plasmodium and to estimate the ages of Plasmodium lineages. For example, Ollomo et al. (12) suggested that a Plasmodium lineage newly discovered in chimpanzees diverged from another chimpanzee pathogen, Plasmodium reichenowi, 21 ± 9 million years ago (Ma) on the basis of placing the P. reichenowi–P. falciparum divergence coincident with the human-chimp divergence 4 to 7 Ma. In another analysis, Hayakawa et al. (13) calibrated amino acid substitutions in three mitochondrial genes based on host-parasite codivergence of P. gonderi (a parasite of African primates) and a clade of malaria parasites of southeast Asian primates, including humans. This calibration yielded a divergence time of either 2.5 ± 0.6 million years (My) or 4.0 ± 0.9 My for P. falciparum and P. reichenowi, depending on the dating of the split between lineages of Asian and African macaques, on one hand, and Asian and African colobine monkeys, on the other hand. However, as Rich et al. (14) point out, humans could have acquired P. falciparum any time after the split of the human-chimpanzee lineage “by a single host transfer, which may have occurred as early as 2 to 3 million years ago, or as recently as 10,000 years ago.”

Here, we calibrate the mitochondrial cytochrome b nucleotide substitution rate in haemosporidian parasites of birds, relating it to the rate of cytochrome b evolution in avian hosts (15). We assume that nucleotide substitution is clocklike, so that the number of substitutions is binomially distributed, and that the distribution of switching times is uniform over the age of the endemic host taxon (Fig. 1) (supporting online text, appendix S1). Our approach requires identification of endemic parasite lineages, which in turn depends on thorough sampling. In our survey of avian haemosporidians in the West Indies (16–19), we have screened extensive samples of small land birds from all the major islands, except Cuba. Moreover, many host species are endemic to individual islands. We identified seven endemic parasite lineages (appendix S2) and measured the genetic distances, based on cytochrome b, between these parasite lineages and their sister lineages, and between their hosts and the sister taxon of each host (Fig. 1). The variable of interest is the ratio (k) of the rates of nucleotide substitution between the pairs of parasite and host sequences.

Fig. 1

An approach to estimating a calibration for the rate of haemosporidian nucleotide substitution. We assume that a parasite can switch to a new host at any point with equal probability during the host’s independent evolutionary history. Although the range of switching times corresponds to the age of the contemporary host taxon, the range of genetic distances relative to the host is equal to the ratio of the parasite-to-host nucleotide substitution rate. Endemic parasites limited to a single host are suitable for analysis because their divergence from their sister taxon in a different host represents the historical event of host switching (for alternatives, see appendix S1).

The probability (p) of a nucleotide substitution at a single position over time is equal to the rate of substitution (r) × time (t), or p = rt. The mean number of substitutions is the number of nucleotides (n) times the probability, or np, and the variance is np(1 – p); for small numbers of nucleotide substitutions (p near 0), this approaches a Poisson distribution with variance np; the probability of multiple events is low enough to be ignored (20).
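The binomial-to-Poisson approximation described above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the sequence length, rate, and elapsed time are hypothetical values chosen only to make p small.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x substitutions among n sites."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Poisson probability of x events with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)


# Hypothetical illustration values (not taken from the paper).
n = 1000   # number of nucleotide positions
r = 0.01   # substitutions per site per My
t = 2.0    # elapsed time in My
p = r * t  # per-site substitution probability, p = rt

mean = n * p                    # expected substitutions, np
var_binomial = n * p * (1 - p)  # binomial variance, np(1 - p)
var_poisson = n * p             # Poisson variance, np

# Largest pointwise gap between the two distributions over 0..60 substitutions.
max_gap = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, mean)) for x in range(61)
)

print(f"mean={mean:.1f}  binomial var={var_binomial:.2f}  Poisson var={var_poisson:.1f}")
print(f"largest pmf difference: {max_gap:.5f}")
```

With p this small, the two variances differ by only the factor (1 – p), which is why the Poisson form is a convenient stand-in.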

Because we assume that a parasite can switch to a new host any time after the ancestral host lineage splits, the expected mean for the number of substitutions separating the parasite sequences (N) is

$$E(N) = \frac{1}{t}\int_{0}^{t} k\,n\,r\,\tau\,d\tau = \frac{k\,n\,r\,t}{2} \qquad (1)$$

where k is the ratio of the rate of substitution (parasite/host) and τ is the time since the host switch, distributed uniformly between 0 and t. Accordingly, N = knrt/2 and k = 2N/nrt.
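The relation N = knrt/2 follows from averaging over uniformly distributed switching times, and a short simulation makes this concrete. This is a sketch under stated assumptions: the function name, the "true" ratio, and the host substitution count are hypothetical, and the code averages expected substitution counts rather than sampling individual substitutions.

```python
import random


def mean_parasite_substitutions(k, host_subs, draws=100_000, seed=0):
    """Average expected parasite substitutions over uniform switching times.

    host_subs stands for n*r*t, the host substitutions accumulated over the
    full age t of the host lineage. A switch that leaves the parasite a
    uniformly drawn fraction u of that age to diverge yields
    k * host_subs * u expected parasite substitutions.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        u = rng.random()  # fraction of the host age available to the parasite
        total += k * host_subs * u
    return total / draws


# Hypothetical illustration values (not taken from the paper).
K_TRUE = 0.6
HOST_SUBS = 50.0

mean_n = mean_parasite_substitutions(K_TRUE, HOST_SUBS)
k_recovered = 2 * mean_n / HOST_SUBS  # k = 2N / nrt

print(f"mean parasite substitutions: {mean_n:.2f} (theory: {K_TRUE * HOST_SUBS / 2:.2f})")
print(f"recovered k: {k_recovered:.3f}")
```

The factor of 2 in k = 2N/nrt arises because, on average, a parasite lineage has had only half the age of its host lineage in which to diverge.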

For the seven comparisons of host and parasite genetic divergence considered in this analysis, nrt is the estimated number of host substitutions [average 48.94 base pairs (bp), with correction for within-species variation]; the corrected parasite distances (N) averaged 15.23 bp; thus, k = 2 × 15.2/48.9 = 0.62 (appendix S3). According to this estimate, the rate of substitution in the parasite lineages is 62% of that in the host lineages. The average ratio of the parasite divergence to the host divergence for each of the seven comparisons was 0.292 ± 0.119 SD (0.048 SEM; 95% confidence limits, 0.197 to 0.387). The value of k estimated from these parasite/host ratios was 2 × 0.292 = 0.584 ± 0.096 SEM [95% confidence interval (CI), 0.394 to 0.774]. Thus, in seven comparisons of genetic distance in avian haemosporidian cytochrome b sequences, ranging from 1 to 3% sequence divergence, the estimated rate of nucleotide substitution was close to 60% of the host rate. (See appendix S4 for an analysis of the variance in genetic distances among pairs of host and parasite taxa.)
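The arithmetic behind the two estimates of k can be reproduced from the summary values quoted above. In this sketch, the normal multiplier z = 1.96 is an assumption; the text does not state how the confidence limits were computed, although this choice comes close to the reported values.

```python
# Summary values quoted in the text.
HOST_SUBS = 48.94      # mean corrected host substitutions (bp), i.e. nrt
PARASITE_SUBS = 15.23  # mean corrected parasite substitutions (bp), i.e. N
MEAN_RATIO = 0.292     # mean parasite/host divergence ratio over 7 comparisons
SEM_RATIO = 0.048      # reported standard error of that mean

Z = 1.96  # assumed normal multiplier for a 95% interval

# Estimate 1: pooled means, k = 2N / nrt.
k_pooled = 2 * PARASITE_SUBS / HOST_SUBS

# Estimate 2: mean of per-comparison ratios, doubled.
k_ratio = 2 * MEAN_RATIO
k_sem = 2 * SEM_RATIO
k_ci = (2 * (MEAN_RATIO - Z * SEM_RATIO), 2 * (MEAN_RATIO + Z * SEM_RATIO))

print(f"k from pooled means: {k_pooled:.2f}")
print(f"k from mean ratio:   {k_ratio:.3f} +/- {k_sem:.3f} SEM")
print(f"approximate 95% CI:  {k_ci[0]:.3f} to {k_ci[1]:.3f}")
```

The interval computed this way differs from the reported 0.394 to 0.774 only in the third decimal place, which is consistent with rounding of the SEM.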

An important assumption in our calibration is that switching times are distributed uniformly over the age of the host lineage. Alternatively, new host lineages might be available immediately for “colonization,” and parasites switch quickly to the new host. In this case, the ages of endemic parasite lineages would be skewed toward the host age, with few host/parasite genetic distance ratios at low values. In our data, the ratios of parasite to host genetic distances are broadly spread between 0.14 and 0.45 (Fig. 2) and are consistent with an even distribution of switching times (appendix S5).

Fig. 2

Rank-ordered ratios of parasite-to-host genetic distances are consistent with a uniform distribution.

Assuming the rate of nucleotide substitution in Haemosporida is 0.584 times that of their avian hosts, the parasite rate can be obtained directly from calibrations of the host rate. In the case of birds, a generally agreed-upon average value for the rate of nucleotide divergence in cytochrome b is ~0.021 My−1 (21, 22). Fifty-eight percent of this rate is ~0.012 (1.2%) genetic divergence My−1 (0.002 SEM, 0.0079 to 0.0155 CI), which is equivalent to 0.83 My per 0.01 (1%) sequence divergence.
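Converting a sequence divergence into an age is a single division by the calibrated rate. The sketch below uses the rate and confidence limits quoted above; the function name and the example divergence of 0.03 are illustrative assumptions, not values taken from Table 1.

```python
K = 0.584          # parasite/host rate ratio from the text
HOST_RATE = 0.021  # avian cytochrome b divergence per My, from the text

# Point estimate and confidence limits of the parasite rate as quoted in the text.
PARASITE_RATE = 0.012
RATE_LOWER = 0.0079
RATE_UPPER = 0.0155


def age_from_divergence(divergence, rate=PARASITE_RATE):
    """Age in My implied by a sequence divergence (proportion) at a given rate."""
    return divergence / rate


unrounded_rate = K * HOST_RATE                # about 0.0123 per My
my_per_percent = age_from_divergence(0.01)    # about 0.83 My per 1% divergence

# Hypothetical divergence, for illustration only.
example_divergence = 0.03
age = age_from_divergence(example_divergence)
age_youngest = age_from_divergence(example_divergence, RATE_UPPER)  # faster clock
age_oldest = age_from_divergence(example_divergence, RATE_LOWER)    # slower clock

print(f"rate = {unrounded_rate:.4f} per My; {my_per_percent:.2f} My per 1% divergence")
print(f"3% divergence: {age:.1f} My (range {age_youngest:.1f} to {age_oldest:.1f} My)")
```

Note that the upper rate limit gives the younger age bound and the lower rate limit gives the older one.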

To calculate absolute divergence times for the major clades of the Haemosporida, we used two estimates of the depths of the major nodes. First, we produced DNA distance matrices for subsets of lineages using the F84 model of nucleotide substitution (appendices S6 and S7). Clades were identified on the basis of a maximum likelihood (ML) phylogenetic tree of 54 species, or lineages, of representative haemosporidian parasites, including the sister pairs analyzed here (Fig. 3A). Ages of the basal nodes within designated clades were calculated as the means of the distances between all pairs of species or lineages descending from the two branches emanating from the node. Second, we produced a phylogenetic tree under a strict clock assumption, which resulted in branch lengths proportional to time. The resulting phylogeny is rooted between the clades of mammalian and avian-reptilian parasites and has a topology similar to that of the ML tree rooted at this point (Fig. 3B).
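The first approach, in which a node's depth is the mean distance over all pairs of lineages drawn from opposite sides of the node, can be sketched as follows. The lineage labels and distances are hypothetical and stand in for F84 distances; the 0.012 per My rate is the calibration reported earlier in the text.

```python
from itertools import product
from statistics import mean

PARASITE_RATE = 0.012  # calibrated divergence per My, from the text


def node_depth(distances, left, right):
    """Mean pairwise distance between lineages on opposite sides of a node.

    distances maps frozenset({a, b}) to the distance between lineages a and b.
    left and right list the lineages descending from the node's two branches.
    """
    return mean(distances[frozenset((a, b))] for a, b in product(left, right))


# Hypothetical distances between lineages A, B (one branch) and C, D (the other).
distances = {
    frozenset(("A", "C")): 0.030,
    frozenset(("A", "D")): 0.034,
    frozenset(("B", "C")): 0.028,
    frozenset(("B", "D")): 0.032,
}

depth = node_depth(distances, left=["A", "B"], right=["C", "D"])
node_age = depth / PARASITE_RATE  # age in My

print(f"node depth: {depth:.3f}; estimated age: {node_age:.2f} My")
```

Only cross-branch pairs enter the average; distances within a branch (A to B, or C to D) reflect younger nodes and are excluded.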

Fig. 3

Phylogenetic trees for representative haemosporidian cytochrome b sequences. (A) Tree produced by maximum likelihood optimization under a GTR + Γ model of nucleotide evolution. (B) Tree produced under a GTR + Γ model of nucleotide evolution using a strict clock. Letters in (B) indicate aged nodes (see Table 1). Calibration pairs in both panels are indicated in red.

The estimated node depths in the clock-enforced (ultrametric) tree match reasonably well those calculated independently from sequence distances (Table 1), particularly with respect to the origins of the major clades: rodent versus Old World monkey (OWM) (ultrametric, 9.3 My, and F84, 8.7 My); basal node in mammalian Plasmodium (11.9 and 12.8, respectively); avian Haemoproteus versus Plasmodium (9.3 and 9.0); avian-reptilian versus mammalian Haemosporida (16.9 and 16.2). Although the estimated age of the split between P. gonderi and the Plasmodium parasites of Asian Old World monkeys was similar (2.8 and 2.8), estimated ages for many of the other mammalian nodes were younger in the clock-enforced tree, including nodes within the rodent malarias and the node ancestral to P. falciparum and P. reichenowi (1.2 and 2.5) (appendix S7).

Table 1

Genetic distances based on the mitochondrial cytochrome b gene and estimated ages of principal nodes in a phylogenetic tree of the Haemosporida. Genetic distances were obtained in Phylip-3.69 (program dnadist.exe) using the following default settings: F84 model, Ts/Tv = 2.0, homogeneous substitution rate. The depth of each node was calculated as the average pairwise distance between sequences on either side of the node. Upper and lower confidence limits are based on the 95% CI calculated for the ratio of parasite-to-host nucleotide substitution. No standard deviation (SD) is available for (B) and (G) because only a single sequence from each side of the split was used to estimate the genetic distance.


The divergence of the human P. falciparum from the chimpanzee P. reichenowi dates to either 2.5 Ma (F84 distance) or 1.2 Ma (ultrametric distance), which is considerably more recent than the estimated divergence time of their hosts, on the basis of fossil evidence and corroborated by molecular dating (4 to 7 Ma) (23). The divergence times between the parasites of African and Asian Old World monkeys, estimated—if we assume codivergence—from primate fossil evidence at 6 and 10 My (24) and used by Hayakawa et al. (13) to calibrate age on their parasite phylogenetic tree, date to 2.8 Ma with both F84 and ultrametric distances.

The basal node in the phylogeny of contemporary haemosporidian parasites in the genera Plasmodium and Haemoproteus can be dated to between 16 and 17 Ma, well after the evolutionary diversification of their hosts and also after the fossil evidence of the parasites in dipteran vectors preserved in amber (appendix S8). Thus, the history of the contemporary Haemosporida is one of rapid diversification and spread through the terrestrial vertebrate classes (10, 25). In addition, both the avian-reptilian and mammalian parasite clades have long stems, indicative of considerable pruning (extinction) of lineages since their origin. With respect to cytochrome b [but not other genes such as the mitochondrial cytochrome oxidase I and the apicoplast caseinolytic protease C (ClpC)], the divergence between bird-reptile and mammal parasite clades involved substantial protein evolution (i.e., nonsynonymous nucleotide substitution) possibly associated with the shift between nucleated and nonnucleated erythrocytes (26). Shifts between avian and reptilian hosts have occurred more recently and likely several times (27, 28); birds and reptiles both have nucleated erythrocytes.

The age estimates for nodes in the parasite phylogeny emphasize that a new disease might emerge in a host soon after its origin and at any time thereafter. The lineage of Plasmodium falciparum evidently has infected the ancestors of humans for several million years and likely was relatively benign through much of that period, as is the case of most haemosporidian parasites (29, 30). The recent expansion of the P. falciparum population, evidenced by its low genetic diversity (31), and the emergence of malaria as a major disease in humans, almost certainly was associated with the origins of agriculture and increasing population density, as well as large-scale movements of humans and introduction of the parasite to susceptible human populations (29, 30, 32).

The haemosporidian parasites of terrestrial vertebrates apparently began to diversify ~20 Ma, possibly displacing other types of parasites in the phylum Apicomplexa and, through host switching, bridging several hundred million years of vertebrate evolution. Because of their prevalence and broad distribution among terrestrial vertebrates, haemosporidian parasites make an excellent model system for investigating host-parasite coevolutionary relationships, host switching, and emerging diseases. A time calibration for the evolution of the group now provides a context for haemosporidian evolution with respect to host diversification, biogeographic distribution, and environmental change.

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 and S2


References and Notes

  1. Materials and methods are available as supporting material on Science Online.
  2. We thank S. S. Renner and two anonymous reviewers for constructive comments on the manuscript. The study was supported by NSF DEB 0542390 to R.E.R., who also acknowledges the generous support of the Curators of the University of Missouri and the Alexander von Humboldt Foundation.