Report

The Trypanosoma cruzi Proteome


Science  15 Jul 2005:
Vol. 309, Issue 5733, pp. 473-476
DOI: 10.1126/science.1110289

Abstract

To complement the sequencing of the three kinetoplastid genomes reported in this issue, we have undertaken a whole-organism, proteomic analysis of the four life-cycle stages of Trypanosoma cruzi. Peptides mapping to 2784 proteins in 1168 protein groups from the annotated T. cruzi genome were identified across the four life-cycle stages. Protein products were identified from >1000 genes annotated as “hypothetical” in the sequenced genome, including members of a newly defined gene family annotated as mucin-associated surface proteins. The four parasite stages appear to use distinct energy sources, including histidine by the stages present in the insect vector and fatty acids by intracellular amastigotes.

Trypanosoma cruzi exists in four morphologically and biologically distinct forms during its cycle of development in mammals and insects (Fig. 1). Metacyclic trypomastigotes develop in the hind gut of triatomine insect vectors and initiate infection in a wide variety of animal species, including humans. In host cells, trypomastigotes convert to replicative amastigote forms that reside in the host-cell cytoplasm. After multiple rounds of binary fission, the aflagellate amastigotes convert into flagellated trypomastigotes that burst from the host cell and circulate in the bloodstream. There, the trypomastigotes can invade other host cells and thus spread the infection throughout the body. Alternatively, trypomastigotes acquired during a blood meal convert to epimastigote forms, which replicate in the insect gut before eventually differentiating into infective metacyclic trypomastigote forms. Drugs for the treatment of T. cruzi infection are inadequate, and vaccines are lacking. Like other trypanosomatids, T. cruzi appears to regulate protein expression primarily posttranscriptionally through variations in mRNA stability or the translational efficiency of mRNAs (1). This limits the use of DNA microarrays (2–5) and makes proteomic analysis especially attractive for examining global changes in protein expression during development in T. cruzi.

Fig. 1.

Life cycle and summary of the major findings of proteome analysis in T. cruzi. T. cruzi trypomastigotes circulate in the blood of infected hosts, including humans, but must enter host cells (often muscle cells) and convert to amastigote forms to replicate. Triatomine bug vectors become infected by ingesting trypomastigotes during the course of a blood meal on infected mammalian hosts. Conversion of the trypomastigotes into epimastigotes, replication of these epimastigotes, and their eventual transformation into metacyclic trypomastigotes occur in the insect gut. Metacyclic trypomastigotes initiate new infection in mammals when infected insects are ingested or when parasites are deposited in the insect feces, usually during a blood meal.

Metacyclic trypomastigotes, amastigotes, trypomastigotes, and epimastigotes of T. cruzi were isolated, and proteins were extracted from whole-cell or subcellular lysates (fig. S1) (6). Peptides generated by digestion of the whole-cell or subcellular lysates were independently separated and analyzed at least in duplicate by offline multidimensional liquid chromatography, online reverse-phase liquid chromatography and tandem mass spectrometry (LC-MS/MS) (6, 7). A total of 602 tryptic peptide samples were analyzed, generating 139,147 tandem mass spectra. Because of differences in protein recovery from the four life-cycle stages, trypomastigote and amastigote stages are undersampled relative to metacyclic trypomastigotes and epimastigotes (table S1). A total of 5720 unique peptides were matched with high confidence to 1168 protein groups containing 2784 total proteins using the Mascot search engine and PROVALT parsing and clustering tools (7), as described in (6) (table S2). The approach of grouping protein isoforms (8, 9) is particularly important in T. cruzi because the genome contains multiple, nonidentical copies of many genes, including a number of large gene families with hundreds of distinct members (10). In addition, the T. cruzi CL Brener strain used for the sequencing project is a hybrid of two genotypes and thus has multiple distinct alleles for most genes. Table 1 summarizes the proteins assigned to each life-cycle stage. About 30% (838 of 2784) of the identified proteins, including most of the proteins previously documented or expected to be produced in the greatest abundance, were detected in all life-cycle stages.
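The idea of collapsing closely related isoforms into protein groups can be illustrated with a minimal sketch. It assumes, purely for illustration, that two proteins belong to the same group whenever they share at least one identified peptide, and that this relation is applied transitively; the actual PROVALT clustering rules are not given in this report and may well differ. The function name, peptide strings, and protein identifiers below are hypothetical.

```python
from collections import defaultdict

def group_proteins(peptide_hits):
    """Cluster proteins that share at least one identified peptide.

    peptide_hits maps a peptide sequence to the set of protein IDs it matches.
    Returns a sorted list of sorted protein groups (connected components).
    This transitive rule is an illustrative assumption, not the PROVALT method.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for proteins in peptide_hits.values():
        members = sorted(proteins)
        for p in members:
            find(p)  # register proteins matched only by unshared peptides
        for other in members[1:]:
            union(members[0], other)

    groups = defaultdict(list)
    for p in list(parent):
        groups[find(p)].append(p)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical example: two peptides link three isoforms; a fourth protein stands alone.
example_hits = {
    "PEPTIDEA": {"isoform_1", "isoform_2"},
    "PEPTIDEB": {"isoform_2", "isoform_3"},
    "PEPTIDEC": {"unrelated_4"},
}
example_groups = group_proteins(example_hits)
print(example_groups)
```

In a genome with many nonidentical gene copies and two haplotypes, a rule of this kind keeps near-identical alleles from being counted as independent identifications, at the cost of merging proteins that share only a short conserved peptide.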

Table 1.

Protein group and protein identifications for each developmental stage.

Protein groups (proteins)   Amastigote    Trypomastigote   Metacyclic trypomastigote   Epimastigote
29 (49)                     X             -                -                           -
21 (41)                     X             X                -                           -
44 (161)                    X             X                X                           -
335 (838)                   X             X                X                           X
27 (84)                     X             X                -                           X
65 (110)                    X             -                X                           -
146 (538)                   X             -                X                           X
24 (50)                     X             -                -                           X
43 (125)                    -             X                -                           -
47 (122)                    -             X                X                           -
53 (93)                     -             X                X                           X
12 (22)                     -             X                -                           X
187 (315)                   -             -                X                           -
92 (162)                    -             -                X                           X
43 (74)                     -             -                -                           X
Total 1168 (2784)           691 (1871)    582 (1486)       969 (2339)                  732 (1861)
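The row-to-stage assignment in Table 1 can be checked against the column totals. The short sketch below encodes each row with the stage combination that reproduces the per-stage totals in the final row of the table; this assignment is an inference from the arithmetic, not something stated separately in the text, and the variable and function names are illustrative only.

```python
# Rows of Table 1 as (protein groups, proteins, stages in which detected).
# Stage codes: A amastigote, T trypomastigote,
#              M metacyclic trypomastigote, E epimastigote.
# The stage assignment per row is inferred from the column totals.
TABLE1 = [
    (29, 49, "A"),
    (21, 41, "AT"),
    (44, 161, "ATM"),
    (335, 838, "ATME"),
    (27, 84, "ATE"),
    (65, 110, "AM"),
    (146, 538, "AME"),
    (24, 50, "AE"),
    (43, 125, "T"),
    (47, 122, "TM"),
    (53, 93, "TME"),
    (12, 22, "TE"),
    (187, 315, "M"),
    (92, 162, "ME"),
    (43, 74, "E"),
]

def stage_totals(rows):
    """Sum protein groups and proteins over every row that includes each stage."""
    totals = {stage: [0, 0] for stage in "ATME"}
    for groups, proteins, stages in rows:
        for stage in stages:
            totals[stage][0] += groups
            totals[stage][1] += proteins
    return {stage: tuple(v) for stage, v in totals.items()}

totals = stage_totals(TABLE1)
overall = (sum(r[0] for r in TABLE1), sum(r[1] for r in TABLE1))
shared_fraction = 838 / overall[1]  # proteins detected in all four stages

print(totals)
print(overall)
print(round(100 * shared_fraction, 1))
```

With this assignment the per-stage sums equal the reported totals for all four stages, and the proteins detected in every stage make up just over 30% of all identifications.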

Although there are limitations in the ability of shotgun proteome LC-MS/MS analysis to detect precise changes in protein levels, it is possible to track the relative abundance of proteins in the four T. cruzi stages using measures of protein coverage (11). Among the top-scoring proteins in all four T. cruzi proteomes are many housekeeping proteins that are also among the highest ranking proteins in yeast (12). However, many other highly abundant proteins in the T. cruzi proteome are either absent in the yeast genome or are expressed at very different relative levels in these two eukaryotes (see specific examples in supporting online text). Table 2 summarizes some of the major protein groups and families identified in the T. cruzi proteome. These data reflect a combination of the relative abundance of the proteins comprising each group, the size of gene families, and the ease with which certain proteins can be detected by LC-MS/MS analysis (supporting online text). Of the 2784 total proteins identified in this analysis, 1008 are from genes annotated as “hypothetical,” validating these as bona fide genes in T. cruzi. More than half of these hypothetical genes have orthologs in the Leishmania major and/or T. brucei genomes.

Table 2.

Major protein families and functional classes.

Number of identified proteins
Protein functional classes
    Ribosomal 212
    Proteasome/Ubiquitin 67
    Heat shock/Chaperonins 61
    Translation/Transcription 49
    Histones 36
Gene families
    Trans-sialidase 223
    RHS 399
    GP63 29
    Cysteine protease 30
    MASP 9
    Mucins 0
Hypothetical genes
    Hypothetical 155
    Hypothetical conserved 505
    Hypothetical to be annotated 348

T. cruzi trypomastigotes circulate in the blood, where they are exposed to host immune effector molecules, including specific antibodies. Unlike the related African trypanosomes, T. cruzi trypomastigotes do not undergo antigenic variation but instead express on their surface multiple members of several large families of molecules; the best characterized of these are the mucin and trans-sialidase (ts) families (13). Thirty of the 50 top-scoring proteins detected exclusively in trypomastigotes are ts family members. Likewise, the amastigote and metacyclic stages appear to express subsets of ts molecules unique to each stage, whereas no ts expression was detected in the epimastigote proteome (Fig. 2 and table S2). Trans-sialidase enzymatic activity is reportedly present in only a small subset of the >1000 ts proteins encoded in the T. cruzi genome, and it has been linked to the presence of Tyr342 in the catalytic N-terminal region and SAPA repeats in the C terminus (13). Among the 223 ts proteins detected in the proteome are the products of all 15 genes predicted to encode enzyme-active ts. The production of a large number of nonenzymatic ts family members coincident with these ts enzymes may deflect immune responses away from the enzymatically active targets or may provide a pool of altered peptides that could antagonize T cell responses (14).

Fig. 2.

Stage-specific detection of ts proteins. Cumulative protein scores based on summing the Mascot scores for all high-confidence peptides are used to display the stage-regulated expression of ts proteins detected in the proteomes. Peptides matching to 223 members of the ts family clustered into 47 protein groups; only the top-scoring protein for each protein group is shown. Most ts proteins are detected exclusively in one stage. A, amastigote; T, trypomastigote; M, metacyclic trypomastigote; E, epimastigote.
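The scoring described in the Fig. 2 legend can be sketched in a few lines: peptide scores are summed per protein and stage, and one representative is chosen per protein group. The sketch assumes the input has already been filtered to high-confidence peptide matches and that the representative is the member with the highest score summed over all stages; the tie-breaking rule, function names, protein identifiers, and score values are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def cumulative_scores(peptide_matches):
    """Sum peptide scores for each (protein, stage) pair.

    peptide_matches: iterable of (protein_id, stage, score) tuples,
    assumed to contain only high-confidence matches.
    """
    totals = defaultdict(float)
    for protein, stage, score in peptide_matches:
        totals[(protein, stage)] += score
    return dict(totals)

def top_scoring_member(group, totals):
    """Return the group member with the highest score summed over all stages.

    Ties are broken alphabetically (an arbitrary, illustrative choice).
    """
    def protein_total(protein):
        return sum(s for (p, _stage), s in totals.items() if p == protein)
    return max(sorted(group, reverse=True), key=protein_total)

# Hypothetical peptide matches for two ts family members in one protein group.
example_matches = [
    ("ts_1", "T", 40.0),
    ("ts_1", "T", 35.0),
    ("ts_2", "T", 50.0),
    ("ts_2", "A", 10.0),
]
example_totals = cumulative_scores(example_matches)
example_top = top_scoring_member(["ts_1", "ts_2"], example_totals)
print(example_totals)
print(example_top)
```

Plotting only the top-scoring member of each group, as the legend describes, avoids showing dozens of near-identical family members that are supported by the same peptides.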

In addition to the ts and mucin families, the T. cruzi genome contains several other high-copy multigene families (Table 2 and supporting online text). We detected expression of several mucin-associated surface proteins (MASPs), a gene family first discovered as part of the sequencing and annotation effort (10), predominantly in the trypomastigote proteome. Like proteins from the other multigene families in T. cruzi, many MASP family members have predicted signal sequences and glycosylphosphatidylinositol anchor addition sites and thus are likely to be surface expressed. Nine MASP gene family proteins were identified in our analysis, each by only a single peptide match. This result suggests either that MASPs are not as abundantly expressed as the trans-sialidase proteins or that, like the mucins, MASPs have extensive posttranslational modifications that complicate their detection by shotgun proteomics. However, detection of the MASPs in the relatively undersampled trypomastigote stage suggests that they are not minor constituents of the T. cruzi proteome.

The transition from trypomastigote to amastigote can be stimulated extracellularly by simulating the low pH environment of the phagosomal/lysosomal compartment that T. cruzi initially encounters upon cell entry (15), making early time points in the transformation process to the amastigote stage amenable to transcriptome and proteome analysis. The results from this proteome analysis of amastigotes are in agreement (with one exception) with the restricted data set generated by comparison of trypomastigotes and early-stage amastigotes using DNA microarray analysis (3) (table S4), further supporting the quality of this analysis. In addition to the expression of a distinct subset of trans-sialidase–family genes, many of which are related to the amastigote surface protein 2 molecule previously reported to be preferentially expressed in amastigotes (16) (Fig. 2), the transition of trypomastigotes to amastigotes also appears to be accompanied by a dramatic shift from carbohydrate- to lipid-dependent energy metabolism (table S3). This is demonstrated by the virtual absence of glucose transporters and the detection of enzymes that oxidize fatty acids to give acetyl coenzyme A. Enzymes of the citric acid cycle, which oxidize acetyl coenzyme A to carbon dioxide and water, are also abundant in amastigotes. Amastigotes are likely to be dependent on gluconeogenesis for the synthesis of glycoproteins and glycoinositolphospholipids (GIPLs), and aspartate aminotransferases (4698.t00001, 4779.t00007) specific to amastigotes may be important in this process. These proteins lack the mitochondrial targeting signal present on the aspartate aminotransferase expressed in all stages (6015.t00007) and thus likely reside in the cytoplasm. Mitochondrially produced oxaloacetate, after transamination, may be transported to the cytosol by a malate/aspartate shuttle and then converted by the cytosolic aspartate aminotransferase and a phosphoenolpyruvate carboxykinase into phosphoenolpyruvate, the substrate for gluconeogenesis.

In addition to several heat-shock proteins and kinases, among the other proteins detected preferentially or exclusively in amastigotes is a group involved in endoplasmic reticulum (ER) to Golgi trafficking, including rab1 (4703.t00005), sec23 (8726.t00010), and sec31 (6890.t00029). The detection of this set of proteins involved in vesicular trafficking in amastigotes but not in the more highly sampled metacyclic and epimastigote stages suggests a more active trafficking process or the preferential use of selected rab and sec proteins in amastigotes (table S3). We also extend the data on the selective expression in amastigotes and epimastigotes of several ABC transporters (7164.t00003, 8319.t00008) that are hypothesized to have a role in cargo selection and/or vesicular transport in trypanosomes (17). A putative lectin (6865.t00003) with homology to ERGIC (ER Golgi intermediate compartment), a protein involved in cargo selection in coat protein complex II vesicles, is also detected in trypomastigotes and amastigotes but not in metacyclic or epimastigote forms.

In contrast to both T. brucei and L. major, the T. cruzi genome encodes enzymes capable of catalyzing the conversion of histidine to glutamate. The first two enzymes in this pathway, histidine ammonia-lyase (6869.t00022) and urocanate hydratase (4881.t00011), are abundant in the insect stages but nearly undetectable in the mammalian stages (only a single spectrum matching histidine ammonia-lyase in amastigotes), consistent with the functioning of this pathway primarily in epimastigotes and metacyclic trypomastigotes. This expression pattern is notable, given that histidine is the dominant free amino acid in both the excreta and the hemolymph of Rhodnius prolixus (18, 19), a well-studied vector for T. cruzi. The abundance of histidine in this and other bloodfeeding insects likely reflects the high histidine content of hemoglobin (20). Thus, T. cruzi epimastigotes seem particularly adapted among the kinetoplastids to take advantage of this plentiful energy source in the gut of its insect vector. This is analogous to the use of proline as an energy source by T. brucei (21).

The transformation of epimastigotes to metacyclic trypomastigotes is accompanied by the production of a number of key enzymes and substrates important in antioxidant defense in T. cruzi. The H2O2 and peroxynitrite detoxifying enzymes ascorbate peroxidase (6846.t00006, 4731.t00003) (22) and the mitochondria-localized tryparedoxin peroxidase (8115.t00003) are both elevated after epimastigote-to-metacyclic conversion, as are tryparedoxin (5824.t00003), the substrate for tryparedoxin peroxidase, and the enzymes trypanothione synthase (8070.t00009, 7998.t00005) and iron superoxide dismutase (5781.t00004), responsible for synthesis of trypanothione and for the conversion of superoxide anion to hydrogen peroxide, respectively. These changes are consistent with a preadaptation of metacyclic forms to withstand the potential respiratory burst of phagocytic cells in the mammalian host. Enzymes of the pentose-phosphate shunt aid this process through the production of the nicotinamide adenine dinucleotide phosphate required for the reduction of trypanothione. Also noticeable in the transition of epimastigotes into metacyclic trypomastigotes is a substantial decrease in the representation of ribosomal proteins in the metacyclic proteome; 37 of the 50 highest scoring proteins in the epimastigote proteome that are not detected in the metacyclic trypomastigote proteome are ribosomal proteins. A reduction in the capacity for protein production would be consistent with the stationary, nonreplicating status of metacyclic trypomastigotes. DNA microarray analysis has also documented a substantial down-regulation of ribosomal protein expression in metacyclic forms in L. major (23).

A search for peptides with modifications (e.g., acetylations, methylations, or phosphorylations) resulted in 8 additional protein identifications and the detection of modifications on 81 previously identified proteins (supporting online text and table S5). To identify additional genes potentially missed in the annotations provided by the T. cruzi sequencing consortium, a database of approximately 817,000 open reading frames (ORFs) of >50 amino acids was constructed and screened using spectra that failed to match proteins predicted by the annotated genome (6). This analysis yielded 79 new genes, new alleles, or modifications to existing gene annotations (table S6). Sixty-six of these ORFans are new alleles of annotated genes or corrections to existing annotations, which suggests that the prediction models and annotations of The Institute for Genomic Research–Seattle Biomedical Research Institute–Karolinska Institutet Trypanosoma cruzi Sequencing Consortium (TSK-TSC) have been extremely efficient in accurately predicting genes. In all cases, these new annotations map to the “coding” strand of DNA among genes that are part of polycistronic units. This result is consistent with the model of kinetoplastid genes being clustered in large transcriptional units on the coding strand of DNA (24). Strand-switch regions separate these clusters and allow for changing of the coding strand at sites of transcription initiation. Thus, although transcriptional activity on the “noncoding” DNA strand has been documented (25), the proteome does not provide evidence for translation of those alternative strand transcripts.
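The construction of an ORF database of the kind described above can be sketched as a six-frame translation that keeps every open stretch longer than 50 amino acids. The sketch assumes the standard genetic code and a stop-to-stop definition of an ORF (no start codon required); the report does not specify either choice here, so both are assumptions, and the function names and the toy sequence are illustrative only.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate codon by codon; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_orfs(dna, min_len=51):
    """Yield (strand, frame, peptide) for stop-to-stop ORFs.

    min_len=51 corresponds to ORFs of more than 50 amino acids.
    """
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            for segment in translate(seq[frame:]).split("*"):
                if len(segment) >= min_len:
                    yield strand, frame, segment

# Toy sequence with a lowered length cutoff so the output is easy to inspect.
toy = "ATGGCCAAATAAGGG"
toy_orfs = list(six_frame_orfs(toy, min_len=5))
print(toy_orfs)
```

Spectra that fail to match the annotated proteins can then be searched against the translated segments, and the strand label of any matching ORF shows whether it lies on the same strand as the neighboring annotated genes.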

High-throughput proteome analyses are inherently incomplete, because the available methodologies do not have sufficient dynamic range to identify and quantify all proteins expressed in an organism. In this analysis, nearly 50% of all the spectra matching to proteins mapped to the 67 most abundant protein groups. A higher number of lower abundance proteins can likely be revealed by depleting these highly abundant proteins before whole proteome analysis. Analysis of the proteomes of T. cruzi reveals the operation of several previously undocumented stage-specific pathways that could be appropriate targets for drug intervention. Among the most interesting of these are the proposed pathways for energy generation in amastigotes and epimastigotes. Additionally, the identification of the proteins expressed in abundance in trypomastigotes and amastigotes of T. cruzi provides a substantial new resource of candidates for vaccine development. This proteome analysis of T. cruzi also validates the high quality of the gene predictions generated by the TSK-TSC by confirming the expression of >1000 hypothetical genes and at the same time revealing <15 genes missed in the initial annotation.

Supporting Online Material

www.sciencemag.org/cgi/content/full/309/5733/473/DC1

Materials and Methods

SOM Text

Figs. S1 and S2

Tables S1 to S6

References and Notes