Success and Virulence in Toxoplasma as the Result of Sexual Recombination Between Two Distinct Ancestries


Science  05 Oct 2001:
Vol. 294, Issue 5540, pp. 161-165
DOI: 10.1126/science.1061888


Toxoplasma gondii is a common human pathogen causing serious, even fatal, disease in the developing fetus and in immunocompromised patients. Despite its ability to reproduce sexually and its broad geographic and host range, Toxoplasma has a clonal population structure comprised principally of three lines. We have analyzed 15 polymorphic loci in the archetypal type I, II, and III strains and found that polymorphism was limited to, at most, two rather than three allelic classes and no polymorphism was detected between alleles in strains of a given type. Multilocus analysis of 10 nonarchetypal isolates likewise clustered the vast majority of alleles into the same two distinct ancestries. These data strongly suggest that the currently predominant genotypes exist as a pandemic outbreak from a genetic mixing of two discrete ancestral lines. To determine if such mixing could lead to the extreme virulence observed for some strains, we examined the F1 progeny of a cross between a type II and III strain, both of which are relatively avirulent in mice. Among the progeny were recombinants that were at least 3 logs more virulent than either parent. Thus, sexual recombination, by combining polymorphisms in two distinct and competing clonal lines, can be a powerful force driving the natural evolution of virulence in this highly successful pathogen.

Most parasitic protozoa and many prokaryotic pathogens possess a “clonal” population genetic structure consisting of independently propagating, genetically quite divergent clonal lineages (1–4). For Toxoplasma gondii, a widespread zoonotic pathogen with a well-described sexual cycle in cats, the vast majority of strains (>94%) fall into one of three distinct clonal lines (referred to as “type” I, II, or III) rather than showing the mixing expected of a sexual population (5, 6). Meiotic recombination in natural populations of Toxoplasma has thus not been considered a major force driving strain diversity and variation.

The three clonal types are apparently not minor or random polymorphic states of no phenotypic consequence: type I lineage strains are highly virulent in outbred mice and perhaps humans (7), whereas type II and III lineage strains are relatively avirulent (8–11). These two avirulent lines do, however, predominate and readily establish chronic infections in animals and humans (5, 11). Occasionally, however, unusual strains are isolated that, based on limited restriction fragment length polymorphism (RFLP) analyses, appear to possess a shuffled combination of alleles similar to those found in the three major lineages (5, 7, 10).

The true extent of polymorphism between the three types (“intertypic”) and within a given type (“intratypic”) could not previously be estimated, because most polymorphism data were from isoenzyme or RFLP analyses, which do not give this level of detail. We have sequenced two loci (BSR4 and SAG3) from six representatives of each of the three major types as well as 13 loci from one representative for each type. The 18 strains come from a variety of hosts (seven species) and three continents, although the majority of isolates are of pan-American origin (12). BSR4 and SAG3 are single-copy genes encoding surface antigens that are immunogenic in natural infections (13, 14) and therefore are expected to possess higher levels of genetic polymorphism. We have chosen loci encoding antigenic proteins to increase the probability of detecting significant differences. Our analyses, and those from the literature, show that “housekeeping” genes possess much less polymorphism and are thus not so informative when looking for relatedness within the species (15–19).

For BSR4, all 18 strains have one or the other of only two sequences that differ at 44 positions over the 1.2 kilobases (kb) sequenced, with types I and II possessing one allele and type III the other (Fig. 1A). The complete lack of intratypic polymorphism is remarkable given the considerable breadth of geographic and host-species sampling. For the SAG3 locus, no intratypic variation and only two sequence classes were found among these same 18 isolates (20). In this case, however, types I and III shared essentially the same allele, whereas the type II allele was the outlier. Types I and III did show minor intertypic variation with three unique polymorphisms (relative to the consensus) in the type I allele and four in the type III allele; in contrast, the type II allele possesses 24 polymorphisms.
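The scale of the BSR4 dimorphism can be checked with simple arithmetic. The following is a minimal sketch, assuming that "1.2 kb" corresponds to roughly 1200 compared sites; the function names are illustrative and are not part of the original analysis.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_divergence(n_differences: int, n_sites: int) -> float:
    """Return the percentage of compared sites that differ."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return 100.0 * n_differences / n_sites


# Assumption: 1.2 kb is taken as 1200 sites; 44 differences are reported in the text.
bsr4_divergence = percent_divergence(44, 1200)
print(f"BSR4 A vs E divergence: {bsr4_divergence:.2f}%")
```

Under that assumption the two BSR4 sequences differ at a few percent of sites, which sits within the ∼1% to ∼5% range reported for most loci.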

Figure 1

Allele relationships by sequence and phylogenetic analysis. (A) Polymorphic nucleotide sites at BSR4 for 28 geographically disparate isolates of Toxoplasma gondii (37–39). The “A” allele was found in all six type I and six type II strains as well as ELG and P80. The “E” allele was found in all six type III strains as well as SSI, P89, and TONT. The annotated numerical positions refer to the numbered sites in the published sequence (BSR4, GenBank accession AF015290). Sites demarcated by an asterisk indicate unique polymorphic nucleotide positions; R = replacement for nonsynonymous polymorphic sites that result in an amino acid substitution within the protein coding region; S = silent for synonymous substitutions. Dashes indicate identity with the consensus sequence. (B) Analysis of 13 strains for SAG1/SAG2/SAG4 and 28 strains for SAG3/BSR4 using PAUP4.0b5. The set of 13 strains included a representative of each of the three archetypal strains [types I (RH), II (PRU), and III (CEP)], as well as five strains (P80, P89, SSI, TONT and ELG) that contain different combinations of the canonical A and E alleles and a further five strains representing the most divergent strains known. The set of 28 strains used for the SAG3 and BSR4 loci included the same 13 plus an additional five for each of types I, II, and III (none of which showed intratypic variation). Alignment was generated with Neighbor-Joining Distance program defaults (i.e., gap weight = 3.0; gap length weight = 0.1). Statistical support was generated by the Bootstrap Resampling method (1000 resamplings). Bootstrap values are shown above branch lines. Data are presented as a phylogram using midpoint rooting with horizontal distances between groups representing the percentage of nucleotide sequence divergence between samples. The bold line in each phylogram indicates where five or more strains are identical and highlights the dimorphism that predominates at each locus.

This total lack of intratypic variation agrees with several studies in which a smaller number of isolates and limited regions were analyzed for SAG1, SAG2, and GRA4 (21–24). In the case of SAG1 and GRA4 coding regions, our expanded sequencing (Table 1) shows that type II and III strains share the identical allele, and type I is the clear outlier with 15 and 24 polymorphisms for the two genes, respectively. For SAG2, the situation is similar to that seen for SAG3, with the type II allele having virtually all of the polymorphisms (n = 12) and the type I and III sequences differing by only one nucleotide (Table 1). Recently, the coding region of GRA6 was sequenced, and in those strains we believe to be types I, II, or III (23 in total), again, only two allelic classes were identified (25). Studies on introns and housekeeping genes in a number of natural isolates also concluded that within-lineage allelic diversity is virtually absent (15–19), with the exception of a deletion in the ROP1 locus in a single lab strain (26). The overall lack of intratypic variation, even for the immunogens BSR4 and SAG3, strongly suggests that these three clonal lineages of Toxoplasma have emerged as the dominant strains only relatively recently.

Table 1

Allelic dimorphism in the three major lineages of Toxoplasma gondii (37–39). Consensus is the nucleotide sequence common to at least two of the three archetypal strains. Each subscript identifies the number of unique polymorphisms relative to the consensus. For all 15 loci there are, at most, only two allelic classes, and these have been designated as “A” and “E” where the A allele is defined as the allelic class shared by at least two of the parasite types. Sequences where variation exists within the A allelic class were either reamplified for re-sequencing and/or checked against the Toxoplasma EST database for confirmation of the identified polymorphism.


To determine the full extent of this dimorphism, we sequenced 15 mostly unlinked loci from archetypal type I, II, and III lineage strains (∼65 kb of total sequence) (Table 1). Again, strict dimorphism was found at 7 of the 13 additional loci with only two sequences being found in the three types. For four of the remaining six loci, the pattern was essentially dimorphic with two of the three sequences sharing an almost identical allele whereas the third was markedly different. This is identified by the subscript that denotes the number of polymorphisms different from consensus. For example, at GRA3, the type II allele has 20 differences from consensus, whereas types I and III each differ by just two nucleotides. We refer to these nearly identical alleles (defined as having <0.5% variation from consensus) as being of the same allelic class. By this definition, the remaining two loci, SRS3 and GRA2, are of a single allelic class in all three strains. Thus, all 15 loci have, at most, only two allelic classes which we have designated as “A” and “E,” where the A allele is defined as the allelic class shared by at least two of the parasite types. For those six antigenic loci where there is variation in the A allele between types, it is limited to an average of two nucleotides from consensus (<0.2%). In contrast, polymorphism between the A and E alleles is substantial, ranging from ∼1% to ∼5% for 11 of the loci. At two loci, SRS1 and SRS2, the polymorphism is only just enough (0.5%) to warrant assigning an E allele, but this designation is reinforced by the total absence of variation in the A allele among the other two types. Note that the overall ratio of substitutions in the entire coding region that are silent versus replacements (81:157; Table 1) is about what would be expected by chance genetic drift without selection for or against change in the protein sequence.
Overall, these results indicate that there were two genetically distinct founding lines or populations in the evolution of the three currently predominant lines.
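The allelic-class rule described above (alleles within 0.5% of consensus belong to the shared class) can be written as a short sketch. This is an illustration only: the locus length of 1000 nucleotides used for GRA3 is a hypothetical placeholder because the sequenced length is not stated here, and the function name is invented for this example. The sketch also simplifies the definition by treating the consensus-like class as "A", which holds when at least two types share it.

```python
from typing import Dict

# Threshold from the text: alleles with <0.5% variation from consensus share a class.
A_CLASS_THRESHOLD_PERCENT = 0.5


def assign_allelic_classes(diffs_from_consensus: Dict[str, int],
                           locus_length: int) -> Dict[str, str]:
    """Label each strain type 'A' (near consensus) or 'E' (divergent) at one locus."""
    if locus_length <= 0:
        raise ValueError("locus_length must be positive")
    classes = {}
    for strain_type, n_diff in diffs_from_consensus.items():
        percent = 100.0 * n_diff / locus_length
        classes[strain_type] = "A" if percent < A_CLASS_THRESHOLD_PERCENT else "E"
    return classes


# Differences from consensus at GRA3 as given in the text.
# Assumption: locus_length=1000 is a hypothetical value, not the real sequenced length.
gra3_classes = assign_allelic_classes({"I": 2, "II": 20, "III": 2}, locus_length=1000)
print(gra3_classes)
```

With these inputs, types I and III fall in the shared class and type II carries the divergent allele, matching the pattern described for GRA3.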

In order to accurately estimate the number of crosses involved in the emergence of these three successful lines, extensive polymorphism data for all three types across most or all of an entire chromosome will be required. Previous analyses of RFLP data for 36 loci gave some suggestion of this dimorphism, although the resolution of those data precluded any firm conclusions except that type II and III strains are apparently more similar to each other than either is to type I (27, 28). The data we present do not support that latter association, although too few informative loci have been examined to strongly refute it. Our data do, however, clearly indicate that differences between strains are largely due to segregation rather than drift, and so phylogenetic trees for individual loci should not be used to infer their relatedness.

The predominance of types I, II, and III among isolates collected worldwide could be the result of the species having undergone a recent population bottleneck out of which only a few highly related clones have emerged. For one of the most common protozoan parasites on Earth (at least in terms of warm-blooded vertebrates), it is hard to imagine such a bottleneck arising through chance environmental factors. Instead, our data support the theory that today's most prevalent strains comprise remarkably successful, recombinant genotypes that rapidly and effectively came to dominate the niches examined.

Strains possessing different genotypes from the three predominant lineages have been reported (5, 6, 9, 29). These latter “rare” strains, isolated primarily from exotic species or geographically remote regions, might collectively represent a panmictic, diverse gene pool on which the species relies for its ability to expand into new niches. To examine this hypothesis and to determine the extent of the genetic complexity embodied by these rare strains, we sequenced genes from 10 additional isolates, each of which possesses a novel genotype as determined by isoenzyme analyses (6, 29). Five antigenic loci (SAG1, SAG2A, SAG3, SAG4, BSR4) were selected, and sequencing of polymerase chain reaction (PCR)–amplified genomic DNA from these latter strains (∼50 kb in total) shows that for half of these rare strains (TONT, SSI, P80, P89, and ELG), the A and E allelic classes are the only ones identified, except that in these strains, they exist in some new, shuffled combination relative to types I, II, and III (Fig. 1B). This is consistent with these rare strains being recombinants between types I, II, and III, and/or the less successful sibling progeny or cousins from the mating(s) that gave rise to the three archetypal lineages.

For the remaining five strains (RUB, MAS, CASTELLS, VAND, and COUGAR), the A and E allelic classes again predominate, but examples of a truly novel allele (i.e., >0.5% polymorphism from either A or E) were also found at some of the loci examined (Fig. 1B). Inclusion of these more “exotic” isolates in the phylogenetic analyses, however, still clustered the vast majority of alleles (with the exception of some from MAS and COUGAR) into the same two distinct lineages even though their inclusion sometimes obscured the strict dichotomy. At the most polymorphic locus, BSR4, 21 different single-nucleotide polymorphisms were detected in these five strains (i.e., compared to the A or E alleles), and the majority of these polymorphisms (17 of 21) were possessed by the two most divergent strains: COUGAR and MAS (Fig. 1B). The COUGAR allele clearly shares a common ancestry with the A allele, and the VAND allele is closest to the E allele (Fig. 1B). The alleles in RUB and CASTELLS appear to be chimeric between A and E, but evolution of their sequence can be envisaged by genetic drift. For MAS, however, the most parsimonious explanation for the relationship between alleles at this locus requires meiotic recombination. In the related apicomplexan parasite Plasmodium falciparum, the demonstration of high-frequency intragenic recombination at antigenic loci exists as a powerful example of genetic sex driving the evolution of diversity among alleles (30–32). These results show that although drifted or chimeric alleles do exist in the most exotic strains at the most variable loci, the dimorphic trend is generally still observed and thus, even the exotic strains seem to derive from intermixing between the proposed two ancestral lineages.

Differences in virulence between the major strain types could be the result of a gradual selection for mutations and/or reassortment of existing alleles following a genetic cross (i.e., without any mutation of those alleles). The latter phenomenon would be analogous to, but very different from, hybrid vigor in diploids, whereby the impact of alleles that are deleterious when homozygous is reduced by out-breeding. In Toxoplasma, where all vegetatively growing forms are haploid, such a phenomenon cannot occur. Interactions between alleles at different loci, however, could confer altered biological properties to the progeny, including virulence. To test if this might be the explanation for why the archetypal strains are so different, despite sharing a very limited, common gene pool, we assessed the infectious properties of 16 F1 progeny from a cross between the nonvirulent ME49 (type II) and CEP (type III) strains (33).

CBA/CaJ mice were infected intraperitoneally with 10³ tachyzoites of each of the F1 strains and the two parental lines. The inoculum used was below the median lethal dose (LD50) for the parental strains, and so all mice were expected to survive the acute stage. Although this was indeed the case for the parental strains and most of the F1 progeny, high mortality was observed during acute infection for the S23 and CL11 strains (Table 2). The S23 strain was, in fact, up to 3 logs more virulent than either parent, and even inocula containing, in theory, a single parasite were lethal. The CL11 strain was also significantly more virulent (P < 0.005) than either of the parents, although, with an LD50 of ∼500, it is not so virulent as S23. There were no differences in invasion rate, growth rate, or time to lyse a culture between the virulent F1 progeny and their avirulent siblings (20). Mice infected with S23 showed pathology similar to infections with the highly virulent type I strains: high parasite loads in the peritoneal cavity and various organs including the liver, spleen, and lung (12). In mice infected with the nonvirulent sibling S22, no tachyzoites were detected in the peritoneal cavity, nor was significant inflammation seen in any of the organs examined. The two seropositive S23 survivors possessed a normal cyst burden, thereby ruling out an inability of this strain to differentiate from the disseminating tachyzoite to the cystogenic bradyzoite form as a probable explanation for its heightened virulence.
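The "3 logs" comparison is a ratio of lethal doses on a log10 scale. A minimal sketch of that arithmetic follows, under two stated assumptions: the parental LD50 is represented by its lower bound of 10³ parasites (the inoculum was below the parental LD50), and the S23 LD50 is taken as approximately 1 parasite because single-parasite inocula were lethal. The function name is illustrative.

```python
import math


def log10_fold_virulence(ld50_reference: float, ld50_test: float) -> float:
    """Return log10 of how many times lower the test LD50 is than the reference."""
    if ld50_reference <= 0 or ld50_test <= 0:
        raise ValueError("LD50 values must be positive")
    return math.log10(ld50_reference / ld50_test)


# Assumptions: parental LD50 set to its lower bound (1000); S23 LD50 approximated as 1.
parent_ld50_lower_bound = 1000.0
s23_logs = log10_fold_virulence(parent_ld50_lower_bound, 1.0)
# CL11 LD50 of about 500 is the value given in the text.
cl11_logs = log10_fold_virulence(parent_ld50_lower_bound, 500.0)
print(f"S23: at least {s23_logs:.1f} logs; CL11: at least {cl11_logs:.2f} logs")
```

Because the parental value is only a lower bound, both results are minimum estimates of the increase in virulence.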

Table 2

Mortality in CBA/CaJ mice after intraperitoneal infection with ME49, CEP or the F1 progeny of a cross between them (33). F1 (x14) indicates the cumulative results for 14 different F1 progeny (CL12, CL13, CL16, CL18, CL19, S21, S22, and S25-30) none of which resulted in any death using a dose of 1000 tachyzoites and a total of 85 mice. F1-S23 and F1-CL11 correspond to the 2 of 16 F1 progeny that showed a difference in virulence compared to their siblings and parents. na, not applicable.


Because Toxoplasma is haploid and neither parent showed the phenotype, it is highly unlikely that a single locus is responsible for the heightened virulence of S23 and CL11. This possibility cannot be excluded, however, because a hypothetical virulence gene that is phenotypically nonapparent in one parental background could be epistatic to differences in the genotype at one or more other loci. To gain some insight into which genes might be responsible for the virulence, we compared the published genotype of the virulent S23 and CL11 strains to those of the other F1 progeny (33). No single region is clearly associated with virulence. Comparing S23 and CL11 to the avirulent S22 and CL12 strains, however, identified at least two regions, one on chromosome III and the other on chromosome IV, as possibly involved. The presence of only one of these two regions is clearly not sufficient because, for example, the CEP chromosome III found in S23 is also found in several avirulent F1 progeny (i.e., CL19, S21, S25, S26, and S30). One or more genes in these regions, however, could interact with loci on chromosome IV derived from the ME49 parent to give the virulent phenotype.
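The genotype comparison described above amounts to scanning markers for those at which the virulent progeny share one parental allele and the avirulent comparison strains carry the other. The sketch below illustrates that scan. The marker names and every genotype call in it are made-up toy data, chosen only to mimic the pattern described in the text; they are not the published genotypes (33), and the function is not the authors' procedure.

```python
from typing import Dict, List


def candidate_markers(genotypes: Dict[str, Dict[str, str]],
                      virulent: List[str],
                      avirulent: List[str]) -> List[str]:
    """Return markers where all virulent strains share one parental allele
    and no avirulent comparison strain carries that allele."""
    markers = list(next(iter(genotypes.values())).keys())
    hits = []
    for marker in markers:
        virulent_alleles = {genotypes[s][marker] for s in virulent}
        avirulent_alleles = {genotypes[s][marker] for s in avirulent}
        if len(virulent_alleles) == 1 and virulent_alleles.isdisjoint(avirulent_alleles):
            hits.append(marker)
    return hits


# HYPOTHETICAL toy genotypes (parent of origin per marker); not real data.
toy_genotypes = {
    "S23":  {"chrIII_m1": "CEP",  "chrIV_m1": "ME49", "chrV_m1": "CEP"},
    "CL11": {"chrIII_m1": "CEP",  "chrIV_m1": "ME49", "chrV_m1": "ME49"},
    "S22":  {"chrIII_m1": "ME49", "chrIV_m1": "CEP",  "chrV_m1": "CEP"},
    "CL12": {"chrIII_m1": "ME49", "chrIV_m1": "CEP",  "chrV_m1": "ME49"},
}
hits = candidate_markers(toy_genotypes, ["S23", "CL11"], ["S22", "CL12"])
print(hits)
```

A scan of this kind only nominates regions. As the text notes, a candidate region found in several avirulent progeny cannot be sufficient on its own, so any hit would need to be checked against the full set of F1 genotypes.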

Another possible explanation for the results presented here is that soluble factors secreted from the two parents interact to produce the virulence. To determine if this sort of trans-acting phenomenon is operating, an equal mix of the two parents was used as the inoculum (totaling 10³ parasites) and the resulting infection was monitored in mice. No mortality was seen with this mix compared to the same number of either parent alone, indicating that it is not the mixture of strain types, per se, that leads to hypervirulence.

The results presented here indicate that through random genetic reassortment, two avirulent strains of Toxoplasma gondii can give rise to highly virulent progeny. The possibility that this is the result of a chance mutation arising in the progeny cannot be formally excluded, but several facts strongly argue against it. First, the recombinant progeny were frozen after their initial expansion and thawed only for these experiments. Second, a change in virulence was seen in 2 of 16 F1 progeny and their phenotypes were different. And, third, continuous passage of nonvirulent strains over many years has never produced the magnitude of a change in virulence seen with S23. Instead, it would appear that virulence is the result of some combination of alleles at two or more loci. In combination with the population data described above, these results show that recombination can rapidly generate progeny possessing significantly altered biological properties. This suggests that type I strains and the other highly virulent natural recombinant strains so far isolated, including the recently described type IV strain (5, 7, 9), do not possess discrete genes or “pathogenicity islands” that inevitably lead to heightened virulence, but rather they have some combination of alleles that cooperate to confer increased pathogenicity, each of which on its own is not an intrinsic virulence allele.

These results are similar to the situation seen with another haploid pathogen, the influenza virus, where once adapted to mice, a mixed infection with two nonvirulent strains can yield neurovirulent reassortants (34). This is the first example of this phenomenon, however, in a nonviral pathogen, and these data have important implications for the nature of virulence and the role of sexual recombination in the evolution of new strains able to take advantage of an ever-changing spectrum of hosts. This differs from the proposed clonal propagation theory held for certain other protozoa, which predicts that extensive genetic divergence exists between clonal lineages, as is the case in Trypanosoma cruzi (2) and perhaps Trypanosoma brucei rhodesiense (35). Instead, our data favor a selective sweep hypothesis whereby a few strains, drawing on a remarkably limited (essentially dimorphic) gene pool, emerge with a more potent assortment of alleles from rare crosses. These recombinants rapidly and effectively come to dominate amidst a background of strains with less optimal (for a given time and place) genotypes, similar to what has been described for Trypanosoma brucei brucei (35). Note that a mixed infection with different strains of Toxoplasma can produce in excess of 10⁸ recombinant F1 progeny from a single cat (36) and so occasionally such infections can have a dramatic effect on the population biology of the species.

  • * Present address: Unité d'Immunologie Moléculaire des Parasites, CNRS URA 1960, Institut Pasteur, 25 Rue du Dr Roux, 75724 Paris Cedex 15, France.

  • Present address: Institute of Parasitology, University of Zürich, Winterthurerstrasse 266a, CH-8057, Zürich, Switzerland.

  • Present address: Department of Biomedical Sciences and Pathobiology, Virginia Polytechnic Institute, Blacksburg, VA 24061, USA.

  • § To whom correspondence should be addressed. E-mail: John.Boothroyd{at} (J.C.B.); ysuzuki{at}

