Research Article

A Comprehensive Survey of the Plasmodium Life Cycle by Genomic, Transcriptomic, and Proteomic Analyses


Science  07 Jan 2005:
Vol. 307, Issue 5706, pp. 82-86
DOI: 10.1126/science.1103717

Abstract

Plasmodium berghei and Plasmodium chabaudi are widely used model malaria species. Comparison of their genomes, integrated with proteomic and microarray data, with the genomes of Plasmodium falciparum and Plasmodium yoelii revealed a conserved core of 4500 Plasmodium genes in the central regions of the 14 chromosomes and highlighted genes evolving rapidly because of stage-specific selective pressures. Four strategies for gene expression are apparent during the parasites' life cycle: (i) housekeeping; (ii) host-related; (iii) strategy-specific related to invasion, asexual replication, and sexual development; and (iv) stage-specific. We observed posttranscriptional gene silencing through translational repression of messenger RNA during sexual development, and a 47-base 3′ untranslated region motif is implicated in this process.

Rodent malaria parasite species provide model systems that allow issues to be addressed that are impossible with the human-infectious species Plasmodium falciparum and P. vivax (1). Three closely related species, P. chabaudi, P. yoelii, and P. berghei, are in common use in the laboratory. Comparative sequencing and analysis of the genomes of such model species, in addition to the complete genome sequence of P. falciparum (2), provide insights into the evolution of Plasmodium genes and gene families (3).

The malaria parasite differentiates into a series of morphologically distinct forms in the vertebrate and mosquito hosts. It alternates between morphologically related invasive stages (sporozoite, merozoite, and ookinete) and replicative stages (pre-erythrocytic, erythrocytic-schizont, and oocyst) interposed by a single phase of sexual development that mediates transmission from the human host to the anopheline vector (1). This report integrates genome sequence analyses of P. berghei and P. chabaudi with transcriptome and proteome data for P. berghei, allowing the categorization of protein expression, the analysis of regulation mechanisms for gene expression, and the identification of species-specific gene families and genes under selective pressure.

Genome sequencing and annotation. Partial shotgun sequencing (4) of the genomes of P. c. chabaudi (AS clone) and P. berghei (ANKA clone) generated assemblies of ∼17 and ∼18 Mb, respectively (Table 1 and table S1). Orthologous genes of these two genomes and of P. y. yoelii (3) and P. falciparum (2) were inferred through bidirectional BLAST searches (Table 2). Combining the gene predictions of the three rodent parasites revealed that 4391 genes had orthologs in P. falciparum. These orthologs represent a universal Plasmodium gene set (table S2), which was mainly distributed across the central “core” regions of the 14 P. falciparum chromosomes. For example, in the core region of P. falciparum chromosome 2, 144 of 158 genes had rodent parasite orthologs (Fig. 1), whereas in the subtelomeric regions, only 3 of 65 genes showed (low) homology to rodent parasite genes (figs. S1 to S14). In addition to BLAST analysis, we manually examined the orthology of gene models on the basis of the conservation of gene order between the rodent parasites and P. falciparum, resulting in the identification of an additional 109 orthologs (table S3). There were no orthologs in the rodent parasite genomes for 736 P. falciparum genes, and 161 of these were located in the core regions (table S3). The other 575 are located in the subtelomeric regions, and Markov clustering (5) of these P. falciparum–specific genes revealed that almost half could be assembled into 12 distinct gene families (Fig. 1 and table S4). Only five subtelomeric gene families are obviously shared between all the sequenced Plasmodium species (table S4) (6). Previous studies have shown that a subtelomeric gene family of P. vivax, the P. vivax interspersed repeats (vir) (7), has related gene families in P. berghei (bir), P. chabaudi (cir), and P. yoelii (yir) (8, 9), and we suggest pir (Plasmodium interspersed repeats) to collectively describe the families. 
The bir and cir families code for highly variable proteins that share ∼30% sequence identity at the amino acid level. The copy number appears to be much higher in P. y. yoelii (>800 copies) compared to P. berghei (180 copies) and P. c. chabaudi (138 copies).
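The bidirectional BLAST step described above can be illustrated with a minimal sketch of reciprocal-best-hit pairing. This is not the authors' pipeline (their procedure is given in the supporting material); the tuple format, gene identifiers, and scores below are hypothetical stand-ins for parsed BLAST output.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples,
    a hypothetical simplification of tabular BLAST output.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy data: made-up identifiers and scores, for illustration only.
pb_vs_pf = [
    ("PB1", "PF1", 250.0),
    ("PB1", "PF2", 90.0),
    ("PB2", "PF2", 180.0),
    ("PB3", "PF1", 120.0),
]
pf_vs_pb = [
    ("PF1", "PB1", 245.0),
    ("PF2", "PB2", 175.0),
    ("PF2", "PB1", 88.0),
]

pairs = reciprocal_best_hits(pb_vs_pf, pf_vs_pb)
print(pairs)  # [('PB1', 'PF1'), ('PB2', 'PF2')]
```

In this toy case PB3 finds PF1 as its best hit, but PF1 prefers PB1, so PB3 is left without an inferred ortholog. Genes in that position are the kind that a follow-up check of conserved gene order could still recover.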

Fig. 1.

Schematic map of P. falciparum chromosome 2. Arrow heads and boxes represent genes and their orientation on the DNA molecule. Thin and thick vertical lines represent 1-kb and 10-kb intervals, respectively. P. falciparum genes with orthologs in the rodent malaria parasite genomes are marked in shades of blue according to their degree of similarity, from light blue (indicating 40% identity) through dark blue (indicating 100% identity); white genes show <40% identity to their closest ortholog. Weak orthologs not detected by reciprocal BLAST analyses are indicated in dark gray and in light gray if the gene is absent in all rodent malaria parasite genomes. P. falciparum genes with no detectable ortholog are classified as follows: orange, var, rif, and stevor gene families; yellow, centrally located expanded gene families shared with the rodent malaria parasite; red, all other P. falciparum orphan genes. A full list of these genes and their classification can be found in table S3. Shaded areas of the map indicate the boundaries of the conserved chromosome core. Transcriptome and proteome data are marked above each gene where available. Transcripts that are up-regulated in asexual and gametocyte stages are shown as red or green horizontal lines, respectively; yellow lines denote genes that are up-regulated in both stages. Protein expression data are indicated by use of a bar code in which shading of each level indicates the following: top bar, sporozoite; second bar, oocyst; third bar, ookinete; fourth bar, gametocyte; and lowest bar, asexual stages. The identifier for every fifth gene (e.g., PFI0025c) is indicated. Schematic maps of all the P. falciparum chromosomes are shown in figs. S1 to S14.

Table 1.

Genome summary statistics. A more detailed set of statistics is given in table S1.

Statistic                  P. berghei   P. c. chabaudi   P. y. yoelii   P. falciparum
Size (bp)                  17,996,878   16,866,661       23,125,449     22,853,764
No. contigs                7,497        10,679           5,687          93
Average contig size (bp)   2,400        1,580            4,066          213,586
Sequence coverage                                                       14.5×
No. protein coding genes   5,864*       5,698*           5,878          5,268

* An excessive number of gene models were predicted for P. berghei and P. c. chabaudi because of the fragmented nature of the genome sequence data for these species. Thus, the gene numbers indicated are for gene predictions where orthologs were identified in other Plasmodium species only.

Table 2.

Genome comparisons between the four sequenced Plasmodium species. Av., average; P.b., P. berghei; P.c., P. chabaudi; P.f., P. falciparum; P.y., P. yoelii.

Statistic                     P.b. vs. P.y.   P.b. vs. P.c.   P.c. vs. P.y.   P.y. vs. P.f.   P.b. vs. P.f.   P.c. vs. P.f.
Av. protein identity (%)      88.2            83.2            84.6            61.2            62.9            61.9
Av. nucleotide identity (%)   91.3            87.1            88.1            69.6            70.3            70.1
Median dN                     0.05            0.07            0.06            0.29            0.26            0.26
Median dS                     0.026           0.49            0.53            49.4            26.1            26.5
Median dN/dS†                 0.16            0.13            0.11            0.008           0.009           0.009
No. orthologous gene pairs    3153            4641*           3318            3375            3890            3842

* The high number of orthologs inferred between P. c. chabaudi and P. berghei compared to pairwise comparisons of the other species most likely reflects the method of automated annotation of both genomes, which used identical gene-finding algorithms (4).

† The median dN/dS value is the median of the dN/dS values of all gene pairs; it is not calculated from the median dN and median dS values for each comparison. The median dN/dS values for comparisons with P. falciparum are low because of the saturation of synonymous changes in the alignments, resulting in high dS values.
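The distinction drawn in the Table 2 footnote, between the median of per-gene dN/dS ratios and the ratio of the median dN to the median dS, can be made concrete with a small sketch. The (dN, dS) values below are made up for illustration and are not taken from the study.

```python
from statistics import median

# Made-up (dN, dS) values for five hypothetical ortholog pairs.
gene_pairs = [(0.02, 0.40), (0.05, 0.50), (0.06, 0.20), (0.10, 1.00), (0.30, 0.60)]

# Median of the per-gene ratios (the quantity the footnote describes).
per_gene_ratios = [dn / ds for dn, ds in gene_pairs]
median_of_ratios = median(per_gene_ratios)

# Ratio of the two column medians (the quantity the footnote rules out).
ratio_of_medians = median(dn for dn, _ in gene_pairs) / median(ds for _, ds in gene_pairs)

print(round(median_of_ratios, 3), round(ratio_of_medians, 3))
```

The two summaries generally differ, which is why the median dN/dS row of a table like this cannot be reproduced by dividing the median dN row by the median dS row.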

Selective pressure. Comparison of orthologous genes of different species through models of nucleotide sequence evolution can be used to investigate variable (and positive or negative) selective pressures (10, 11). We determined the relative number of synonymous (dS) versus nonsynonymous (dN) substitutions between orthologs of P. berghei and P. chabaudi. In general, we found that orthologous gene pairs are under purifying selection pressure (and have dN/dS < 1) and that the observed ratios of median values for genes of rodent parasites (Table 2) were similar to those reported for Caenorhabditis elegans/C. briggsae and mouse/human comparisons (10, 12). This strong divergence from dN/dS = 1 suggests that most rodent malaria parasite gene models code for proteins and are not mispredictions or pseudogenes. The distribution of dN/dS ratios of genes containing transmembrane (TM) domains or signal peptides (SPs) (i.e., genes whose products may be extracellular) was shifted toward higher values than that of genes encoding cytoplasmic proteins lacking these domains (Fig. 2A), indicating reduced purifying or increased diversifying pressure on SP/TM-containing proteins, possibly as a result of selective pressure from the host. When these data are correlated with expression data from the transcriptome and proteome analysis (table S5), we observe a significant difference between the dN/dS values in SP/TM-containing and non–SP/TM-containing genes in blood-stage proteins but not vector-stage proteins (Fig. 2B). This indicates that diversifying selection might result from selective pressure from the host adaptive immune response, although some parasite proteins expressed in the vector are also clearly under diversifying selection. Annotated genes with the highest dN/dS values include many genes that one would expect to play a role in host-parasite interactions, such as reticulocyte binding protein (0.81), rhoptry-associated protein (0.94), and erythrocyte binding antigen (0.78).
We have compared our data set with the data generated by a recent study of selection that measured codon volatility in P. falciparum (13). There are 15 P. berghei genes with a dN/dS ratio > 1 that have detectable orthologs in P. falciparum. Not all of these have scores indicating a high volatility, a result consistent with the facts that selection will be operating at different levels in different species and that volatility and dN/dS values measure selection over different time scales.

    Fig. 2.

    dN:dS ratios between pairs of orthologous genes in P. berghei and P. c. chabaudi and a comparison of genes containing SP or TM domains versus those lacking such domains. (A) Frequency distribution for all ortholog pairs. Open bars represent orthologs containing SP or TM domains; solid bars represent orthologs lacking such domains. (B) Analysis of distributions for all orthologs confirmed to be transcribed using transcriptome data or expressed using proteome data (table S2), partitioned according to their expression in mammalian or mosquito phases of the life cycle. The D variable represents the Kolmogorov-Smirnov test output statistic.
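The D variable mentioned in the Fig. 2 legend is the Kolmogorov-Smirnov statistic, the largest vertical gap between two empirical cumulative distributions. The sketch below computes a two-sample D in plain Python; the dN/dS values are made up, and the function name `ks_statistic` is a hypothetical helper, not something from the study.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: the largest absolute difference
    between the empirical cumulative distribution functions."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The maximum gap can only occur at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up dN/dS values, purely illustrative.
with_sp_tm = [0.10, 0.25, 0.40, 0.55, 0.80]
without_sp_tm = [0.05, 0.08, 0.12, 0.20, 0.30]

d = ks_statistic(with_sp_tm, without_sp_tm)
print(round(d, 2))  # 0.6
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap). Converting D to a significance level additionally depends on the two sample sizes, which this sketch does not attempt.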

    Gene expression. The asexual blood stage cycle of P. berghei takes 22 to 24 hours and gametocyte development 30 hours. Gametocytes are morphologically discernable from the asexual trophozoites only after 18 hours (fig. S18). Transcriptome data were obtained from three time points during the G1 phase (rings and young and mature trophozoites) and from two time points during the S/M phase (immature and mature schizonts), as well as from purified immature (24-hour) and mature (30-hour) gametocytes. The transcription profile of these stages was compared through a series of pairwise hybridizations to a P. berghei genome survey sequence (GSS) amplicon DNA microarray (4). Proteome data were collected from mixed asexual blood stages (containing both invasive and replicative stages), gametocytes during blood stage development, ookinetes, oocysts (days 9 to 12 postinfection), and salivary gland sporozoites and analyzed by multidimensional protein identification technology (14). The proteome analysis resulted in the identification of 1836 parasite proteins with high confidence (tables S6 to S8) and >5000 parasite proteins with relaxed filtering (4). By comparing expression data for the different life cycle stages, we could categorize proteins into the following four strategies of gene expression: (i) housekeeping, (ii) host-related expression, (iii) strategy-specific expression, and (iv) stage-specific expression.

    Housekeeping. Of the 1836 proteins detected, 136 were expressed in at least four of the five stages analyzed (table S8). Given the lower number of proteins identified in the oocyst (277 proteins) and the sporozoite (134 proteins) compared to the other stages analyzed (733 to 1139 proteins), our analysis will have excluded some of the 301 proteins detected in asexual blood stages, gametocytes, and ookinetes (Fig. 3C). Recognizing that these 301 proteins were detected in both vertebrate and mosquito stages, we anticipate that some of these will also be expressed in oocysts and sporozoites.

    Fig. 3.

    Different strategies of protein and gene expression during the malaria life cycle. Venn diagrams illustrate the overlap in proteins detected in the life stages involved in (A) invasion, (B) replication, and (C) sexual development. The total number of proteins detected in each stage is shown in parentheses. Blue numbers represent proteins detected exclusively in the stages shown; red numbers represent proteins detected in the combination of stages shown out of the three stages included in each of the Venn diagrams (i.e., these proteins could also be shared with stages not shown in the figure). ND, not done. (D) A Venn diagram representing the comparison of the gametocyte transcriptome with the proteomes of the gametocyte and the ookinete. The numbers indicate the individual transcripts and proteins in each analysis. The number of gametocyte proteins includes proteins identified in P. berghei during this study and proteins identified from P. falciparum gametocytes (21). The bold number 9 in the intersect indicates the number of gametocyte transcripts found exclusively as ookinete proteins as a result of this study. (E) A WebLogo (29) representation of the 47-base motif found within 500 base pairs (bp) downstream of the open reading frames of six of the nine implicated translationally repressed transcripts for which 3′UTR sequence was available. The point size of the letter is proportional to the frequency of the appearance of each nucleotide at each position.

    Host-related expression. The proteome and transcriptome data sets revealed that enzymes of the tricarboxylic acid cycle, oxidative phosphorylation, and many other mitochondrial proteins were up-regulated in the gametocyte when compared to the asexual blood stages and were even more abundant in the ookinete (fig. S16 and table S8). These observations suggest that, as in trypanosomes (15), mitochondrial activity increases in the gametocyte as a preadaptation to life in the mosquito vector, and are consistent with the more complex organization of mitochondria in gametocytes (1, 16). Mitochondrial activity apparently continues to increase in the ookinete.

    Strategy-specific expression. Strategy-specific protein expression is related to invasion, asexual replication, or sexual development. We uniquely detected 966 proteins in invasive zoite (merozoite, ookinete, or sporozoite)–containing preparations, of which 234 were shared between at least two of the three invasive stages but not with the replicative or sexual stages (Fig. 3A). Gliding motility typifies the invasive stages of apicomplexans, and many proteins with a (putative) role in this process were detected. Micronemes and rhoptries are secretory organelles specific to the invasive stages. Although 10 known rhoptry proteins were detected in blood stages and sporozoites, these rhoptry proteins were absent from ookinetes. In contrast, most known micronemal protein families were detected in all zoites but with clear stage-specific expression of different family members. Perforin-like proteins, first described in the micronemes of P. yoelii sporozoites (17), contain a membrane attack complex/perforin (MACPF)–like domain and were found both in ookinetes and sporozoites but not in merozoites. We suggest a role for these molecules in parasite entry to and/or egress from target cells, given the role of MACPF-like domains in the formation of pores. Both the ookinete and sporozoite can traverse through several host cells (18, 19), whereas a merozoite enters a target cell only once. Our data therefore support the concept that microneme proteins mediate motility and disruption of the host cell plasma membrane and that the rhoptry proteins are essential to genesis of the parasitophorous vacuole and host cell survival.

    We uniquely detected 472 proteins in replicative stages, i.e., blood stages and oocysts (Fig. 3B). As expected, and consistent with findings in P. falciparum (20–22), the majority of these genes encode proteins involved in cell growth or division, DNA replication, transcription, translation, and protein metabolism. The more detailed transcriptome analysis of blood-stage gene expression confirmed a cell cycle–related timing of transcription of these genes during the G1 and S/M phases (figs. S18 and S19) and revealed that 215 and 355 genes were up-regulated in the G1 and the S/M phases, respectively.

    During the first 18 hours of development, gametocytes and asexual trophozoites share the same features of the G1 phase of growth. Subsequently, the gametocytes differentiate into either males that prepare for DNA replication and mitosis or females that prepare for postzygotic growth. Transcriptome analysis demonstrated that 58% of the G1 proteins (125 genes) and 59.4% of the S/M proteins (199 genes) were also up-regulated in gametocytes (fig. S19), and the proteome data also emphasized the similarity between protein expression in asexual blood stages and gametocytes (514 proteins were shared between these stages) (table S8). Despite these similarities, the described unique morphologies indicate that sexual development is a fundamental developmental switch. This is shown by the specific up-regulation of transcription of 977 genes (4) (table S10 and fig. S19), including many of the known gametocyte-specific genes, and by the detection of 127 unique proteins in the proteome (Fig. 3C).

    Stage-specific expression. Just over half (948) of the proteins detected in the proteome analysis were found in one stage only, suggesting that stage-specific specialization is substantial. However, many of these stage-specific proteins belong to protein families whose expression is strategy-specific, reflecting both conserved mechanisms of parasite development between different stages and subtle molecular adaptations dictated by specific parasite-host interactions. For example, gene families encoding proteins that contain MACPF-like or TSP/vWA domains are examples of strategy (invasion)–specific expression whose members are stage-specifically expressed. Unexpectedly, the PIR family belongs to this category: Members of the BIR protein family were detected in all stages, but 92% were exclusive to a single stage (fig. S15 and table S8). Peptides were found matching 34 of ∼180 predicted P. berghei genes, and transcription of bir genes was detected in both the asexual blood stage and gametocytes (tables S9 and S10). Although pir are thought to play a role in immune evasion of the blood stages by antigenic variation (7), ∼9% of the total BIR repertoire in our analysis was expressed only in the mosquito stages, suggesting that these proteins may have other key functions.
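The housekeeping and stage-specific criteria used above (detection in at least four of the five stages, and detection in exactly one stage) amount to counting the stages in which each protein appears. The following sketch shows that bookkeeping under stated assumptions: the stage labels follow the text, but the protein identifiers and detections are made up.

```python
from collections import Counter

STAGES = ("asexual", "gametocyte", "ookinete", "oocyst", "sporozoite")

# Made-up detections: stage -> set of hypothetical protein identifiers.
detected = {
    "asexual": {"p1", "p2", "p3", "p5"},
    "gametocyte": {"p1", "p2", "p4"},
    "ookinete": {"p1", "p2", "p6"},
    "oocyst": {"p1", "p5"},
    "sporozoite": {"p1", "p2", "p7"},
}

# Number of stages in which each protein was detected.
stage_counts = Counter(p for stage in STAGES for p in detected[stage])

housekeeping = sorted(p for p, n in stage_counts.items() if n >= 4)
stage_specific = sorted(p for p, n in stage_counts.items() if n == 1)

print(housekeeping)    # ['p1', 'p2']
print(stage_specific)  # ['p3', 'p4', 'p6', 'p7']
```

A protein such as `p5`, seen in two stages, falls into neither class. Stages with few identified proteins (here, the toy oocyst set) will also push genuinely broad proteins below the four-stage threshold, which is the caveat raised in the housekeeping paragraph.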

    Posttranscriptional gene silencing (PTGS). It has been proposed that transcripts in Plasmodium are essentially produced when needed (22), the so-called “transcripts to go” model (23). However, it has been established that the abundant transcripts for P28 in developing and mature female gametocytes are in a state of translational repression (TR) (24), one mechanism by which PTGS is exercised. In addition, RNA binding proteins (Puf proteins) (25) that play a role in TR are found in Plasmodium and are specifically up-regulated in gametocytes and sporozoites (20, 26). Therefore, we compared the gametocyte transcriptome with the proteomes of both gametocytes and ookinetes to determine if additional gametocyte-specific transcripts might be subject to TR. Nine new genes were identified for which transcripts were detected in gametocytes but with protein products specific to the ookinete stage (Fig. 3D and table S11). The analysis of the 3′ untranslated regions (UTRs) of seven of these genes (for two genes, there was insufficient 3′UTR sequence for analysis) and the 3′UTRs of Pbs28 and Pbs25 by the motif identifier program MEME (27) revealed a 47-base motif found in six of the analyzed sequences within 1 kb of the 3′ end of the stop codon (Fig. 3E and fig. S17) (E value = 4.8e+002). Puf proteins bind to a UUGU motif in 3′UTR regions (25, 28), and the 3′UTR regions of all seven candidates and Pbs28 were enriched for this motif (P ≤ 0.001), which was found as a submotif in the 47-base motif. The 47-base motif was used to search the entire P. berghei genome database with MAST (27), and 20 additional genes were identified that had the same motif within 1 kb of their 3′UTR (E < e–05), giving a total of 29 TR candidates. Of these, 22 had orthologs in P. falciparum. Eighteen are up-regulated in gametocytes (16 genes) and/or sporozoites (5 genes), but only two were observed in gametocyte proteomes (table S11). Analysis of 1 kb downstream of the stop codon of 20 of these P. falciparum orthologs, including pfs25 and pfs28, failed to identify a sequence analogous to the P. berghei motif. Nevertheless, visual inspection identified numerous UUGU motifs at analogous positions. This lack of sequence similarity of the predicted 3′UTR binding motif is consistent with the significant sequence diversity in the predicted gene models of the Puf orthologs of P. falciparum and P. y. yoelii (25). The paucity of annotated transcription factors (2, 28) and the phased expression of blood-stage transcripts have led to the proposal that PTGS is a major mechanism of the regulation of gene expression in Plasmodium (28). Our data suggest that, at least in the gametocyte and possibly the sporozoite, TR may be an important component of these regulatory mechanisms.
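Counting UUGU submotifs in the sequence downstream of a stop codon, as described above, can be sketched as follows. This is a minimal illustration only: it does not reproduce the MEME or MAST analyses or the enrichment test, and the helper names, coordinates, and toy sequence are hypothetical.

```python
def downstream_window(genomic_seq, stop_codon_end, length=1000):
    """Return up to `length` bases immediately after the stop codon.

    `stop_codon_end` is the 0-based index of the first base after the
    stop codon (a hypothetical coordinate convention for this sketch).
    """
    return genomic_seq[stop_codon_end:stop_codon_end + length]


def count_motif(sequence, motif="UUGU"):
    """Count occurrences of `motif`, including overlapping ones.

    DNA input is converted to RNA letters by replacing T with U.
    """
    rna = sequence.upper().replace("T", "U")
    count = 0
    start = rna.find(motif)
    while start != -1:
        count += 1
        start = rna.find(motif, start + 1)
    return count


# Made-up sequence: a 9-base toy "gene" ending in a stop codon (TAA),
# followed by a short made-up downstream region.
toy = "ATGAAATAA" + "ACTTGTAATTGTTGTCC"
utr = downstream_window(toy, stop_codon_end=9)
n_uugu = count_motif(utr)
print(n_uugu)  # 3
```

Overlapping matches are counted because adjacent UUGU sites can share a U (as in UUGUUGU). Whether a given count represents enrichment depends on a background model for the genome's base composition, which is outside the scope of this sketch.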

    The integration and initial analysis of the four data sets presented here has permitted insights concerning genome evolution, expression of multigene families, and mechanisms of posttranscriptional gene regulation in rodent malaria parasites. This initial overview will be developed further and, as demonstrated here, will continue to emphasize the value of model systems for the study of orthologous features of human malaria parasites.

    Supporting Online Material

    www.sciencemag.org/cgi/content/full/307/5706/82/DC1

    Materials and Methods

    Figs. S1 to S19

    Tables S1 to S11

    References and Notes
