Recent Origin of Plasmodium falciparum from a Single Progenitor


Science  20 Jul 2001:
Vol. 293, Issue 5529, pp. 482-484
DOI: 10.1126/science.1059878


Genetic variability of Plasmodium falciparum underlies its transmission success and thwarts efforts to control disease caused by this parasite. Genetic variation in antigenic, drug resistance, and pathogenesis determinants is abundant, consistent with an ancient origin of P. falciparum, whereas DNA variation at silent (synonymous) sites in coding sequences appears virtually absent, consistent with a recent origin of the parasite. To resolve this paradox, we analyzed introns and demonstrated that these are deficient in single-nucleotide polymorphisms, as are synonymous sites in coding regions. These data establish the recent origin of P. falciparum and further provide an explanation for the abundant diversity observed in antigen and other selected genes.

Plasmodium falciparum causes the most virulent form of human malaria, resulting in 200 million to 300 million infections and 1 million to 3 million deaths annually (1). Genetic variation within this human pathogen facilitates its transmission and pathogenesis and limits efforts to combat the disease. The extent and age of this variation, however, are controversial (2, 3). Genetic variation in proteins for antigenic determinants (4), drug resistance (5–8), and pathogenesis is abundant (9–13), whereas DNA variation at silent (synonymous) sites in coding sequences appears virtually absent (14). Nevertheless, microsatellite variation within and among subpopulations is widespread (15, 16). These discrepancies could be reconciled if all extant P. falciparum derived from a single progenitor that spread through the human population within the past few thousand years (14). Alternatively, codon usage may be so constrained that synonymous mutations are eliminated by selection. To distinguish between these possibilities, we analyzed 25 introns from eight independent isolates and found only eight single-nucleotide polymorphisms (SNPs), five of which occur within microsatellite repeats. In contrast, microsatellite polymorphisms are common within introns. Our results support the recent progenitor hypothesis and imply a high mutation rate for the creation of microsatellite repeats.

We chose introns for our analysis because introns are subject to selective constraints that differ from those for codon usage. Apart from pseudogenes, introns are among the most rapidly evolving sequences in eukaryotes (17), and they are of general utility in studies of population structure (18). We chose to analyze introns from general metabolic or housekeeping genes on chromosome 2 (19) and chromosome 3 (20), the first chromosomes completely sequenced, and examined these regions in each of eight independent isolates from diverse geographic regions including Africa, Honduras, Southeast Asia, and Papua New Guinea (21). For each intron, the target sequence was amplified by the polymerase chain reaction and the products were cloned and sequenced (21). To guard against polymerase incorporation error, we sequenced each intron in both directions from each of three clones derived from each of three independent amplifications (21).

The results demonstrate that microsatellite variation in P. falciparum is widespread within introns (Table 1), which is consistent with previous results (16, 22). Across all introns there are 71 microsatellite repeats, each defined as a region of eight or more tandem repeats of a sequence 1 to 8 base pairs (bp) in length. Among the microsatellite repeats, 36 (51%) are monomorphic and 35 (49%) are polymorphic with two or more alleles in the sample [Web fig. 1 (21)]. The tremendous amount of genetic diversity generated by the alteration of these repetitive sequences is illustrated by the microsatellite genotype of each isolate with respect to these polymorphisms [Web fig. 2 (21)]. The genotype of each isolate is unique, even among contemporary isolates from the same geographic region. The potential for microsatellite diversity in P. falciparum is also evident from the number of distinct alleles for each polymorphic microsatellite repeat (21). These data support a high rate of microsatellite mutation within introns, presumably as a consequence of replication slippage (23, 24).
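The working definition of a microsatellite repeat given above (eight or more tandem copies of a 1 to 8 bp unit) can be illustrated with a short script. This is only a sketch under stated assumptions: the function name `find_microsatellites` is hypothetical, only perfect (uninterrupted) repeats are detected, and the scan is non-overlapping; the text does not describe the procedure actually used to annotate repeats.

```python
import re

# A unit of 1-8 bp (shortest unit tried first), followed by at least
# 7 further copies of the same unit, i.e. 8 or more tandem copies in total.
MS_PATTERN = re.compile(r"([ACGT]{1,8}?)\1{7,}")


def find_microsatellites(seq):
    """Return (start, unit, copies) for each perfect tandem repeat.

    Hypothetical helper: start is 0-based, unit is the repeated motif,
    copies is the number of tandem copies (always >= 8).
    """
    hits = []
    for match in MS_PATTERN.finditer(seq.upper()):
        unit = match.group(1)
        copies = (match.end() - match.start()) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # Toy intron fragment: an (AT)8 repeat flanked by unique sequence.
    print(find_microsatellites("GC" + "AT" * 8 + "GC"))
```

Comparing the copy number found at the same locus across isolates would then distinguish monomorphic from polymorphic repeats.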

Table 1

Summary of microsatellite (MS) repeat polymorphisms and SNPs observed in P. falciparum. The number of MS repeats and SNPs for each of 25 introns predicted within nine housekeeping genes is shown. The length of the intron analyzed is indicated in base pairs (bp).


In contrast to the microsatellite variation, SNPs within the introns are rare (Table 1). Altogether, we observed eight SNPs in the introns. Among these, five were located within microsatellite polymorphisms. Across all of the 4217 bp of intron sequence (counting with respect to the 3D7 reference sequence), only 800 bp (19%) are located in polymorphic microsatellites. The excess of SNPs in the microsatellite repeats is, therefore, highly significant (P = 0.008, Fisher's exact test). This finding strongly suggests that the process of replication slippage that generates microsatellite variation (24) also increases the rate of single-nucleotide substitutions. Therefore, we have ignored the five SNPs associated with microsatellite polymorphisms in estimating the time since the most recent common ancestor of all extant P. falciparum. The excess SNPs within polymorphic microsatellite repeats may also explain the relatively high frequency of synonymous polymorphisms within amino acid repeat sequences in certain proteins, such as the circumsporozoite protein (24). Introduction of SNPs within repetitive sequence occurs in both introns and exons, but selective constraints on coding sequences result in fewer polymorphisms within coding regions of the genome. Our findings suggest that antigenic variation associated with these repeated amino acid sequences has arisen within P. falciparum, rather than by lateral transfer or some other mechanism.
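The Fisher's exact test reported above can be approximately reproduced from the counts in the text. The sketch below makes several assumptions that the text does not state: each base pair is treated as an independent site, the 2 × 2 table contrasts polymorphic-microsatellite sequence (800 bp, 5 SNPs) with the remaining intron sequence (3417 bp, 3 SNPs), and the test is one-sided. The function name `fisher_one_sided` is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the
    count in the top-left cell given fixed row and column totals.
    """
    row1 = a + b          # sites in polymorphic microsatellites
    col1 = a + c          # total SNP sites
    total = a + b + c + d
    upper = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return favorable / comb(total, col1)


if __name__ == "__main__":
    # Rows: polymorphic microsatellite sequence vs. the rest of the introns.
    # Columns: SNP sites vs. non-SNP sites (assumed layout, see lead-in).
    p = fisher_one_sided(5, 800 - 5, 3, 3417 - 3)
    print(f"one-sided P = {p:.4f}")
```

Under these assumptions the result should fall close to the reported P = 0.008; a two-sided test or a different table layout would give a somewhat different value.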

Discounting the intron sequence located in microsatellite polymorphisms, we sequenced 3417 bp in each of eight isolates (total of 27,336 bp) and identified only one certain SNP. The remaining two SNPs are found in one small intron predicted by GlimmerM (25) and only in the D6 isolate, where there is evidence of alternative splicing [Web fig. 3 (21)] within the aspartyl protease gene. Therefore, we present the statistical analysis both with and without the D6 SNPs. Combining our data with those from Rich et al. (14), and assuming that the rate of nucleotide substitution in unique intron sequence and monomorphic microsatellites is equal to that for fourfold-degenerate sites in coding regions, we estimate the age of the most recent common ancestor (MRCA) of all extant P. falciparum to be in the range of 3200 to 7700 years [the estimate was obtained with equation 1 (14)]. The smaller estimate is based on the number of synonymous substitutions in an estimated 55 million years (My) since the divergence between P. falciparum and P. berghei (a rodent parasite), and the larger estimate is based on the number of synonymous substitutions in an estimated 7 My since the divergence between P. falciparum and P. reichenowi (26) (a chimpanzee parasite) (14). If the two suspect SNPs are also included, the estimated age of the MRCA, on the basis of the same assumptions, is in the range of 9500 to 23,000 years. These results are consistent with the hypothesis that all extant P. falciparum derived from a recent common ancestor. Our data are also consistent with the finding that mitochondrial DNA is virtually identical in nucleotide sequence among diverse isolates of P. falciparum (27) and with historical biogeography (28–30). Our estimates coincide with the establishment of slash-and-burn agriculture in the African rainforest less than 6000 years ago, which could have provided suitable expansion conditions for the mosquito vectors of P. falciparum (31) and adequate human population size to maintain transmission.

Although a single recent common ancestor is the most parsimonious interpretation of all the data, alternative explanations have been proposed [see (2, 3) for discussion]. These include a series of severe population bottlenecks, which seems to be inconsistent with the high level of microsatellite variation; or a series of selective sweeps, which would predict that regions of the genome left unaffected may contain a high frequency of SNPs that were present in the ancestral population. To date, there has been insufficient sequence analysis to reveal regions with excess SNPs. Some observations are inconsistent with a single common ancestor. For example, MSP1 alleles are widely divergent from one another, suggesting an ancient origin of falciparum malaria (32). Our data do not directly address MSP1; however, others (33) have suggested that a mechanism of replication slippage followed by immune selection could result in the current allelic variants. Our data are not only consistent with this model, but also provide evidence for an increased frequency of SNPs in regions of polymorphic microsatellites. An alternative explanation is that multiple tandem paralogs of MSP1 were present in progenitors and that biological and immune selection resulted in the selection of different MSP1 genes. There is evidence for expansion and contraction of the var gene family, another biologically and immunologically selected gene (11–13).

The relative lack of SNPs in P. falciparum is seemingly inconsistent with the ability of the parasite to evade chemotherapeutic and immunological interventions by the rapid evolution of drug resistance and antigenic variation. Inasmuch as most of the genes examined so far in P. falciparum have been studied because they are associated with drug resistance or antigenic determinants, it is not surprising that most amino acid replacements so far reported are associated with adaptation to drug therapy or the immunological status of the host (34). Our data go well beyond these classes of genes in implying that P. falciparum may be unique among pathogens in having essentially no nucleotide polymorphisms apart from those that can be explained by recent mutation and positive selection. Identification of SNPs throughout the genome of P. falciparum may, therefore, reveal whether changes in regulatory or protein-coding sequences have been predominant in the evolution of human pathogenesis, drug resistance, and immune evasion.

  • * These authors contributed equally to this work.

  • To whom correspondence should be addressed. E-mail: dfwirth{at}

