Report

Genomic Minimalism in the Early Diverging Intestinal Parasite Giardia lamblia


Science  28 Sep 2007:
Vol. 317, Issue 5846, pp. 1921-1926
DOI: 10.1126/science.1143837

Abstract

The genome of the eukaryotic protist Giardia lamblia, an important human intestinal parasite, is compact in structure and content, contains few introns or mitochondrial relics, and has simplified machinery for DNA replication, transcription, RNA processing, and most metabolic pathways. Protein kinases comprise the single largest protein class and reflect Giardia's requirement for a complex signal transduction network for coordinating differentiation. Lateral gene transfer from bacterial and archaeal donors has shaped Giardia's genome, and previously unknown gene families, for example, cysteine-rich structural proteins, have been discovered. Unexpectedly, the genome shows little evidence of heterozygosity, supporting recent speculations that this organism is sexual. This genome sequence will not only be valuable for investigating the evolution of eukaryotes, but will also be applied to the search for new therapeutics for this parasite.

Giardia lamblia (syn. G. intestinalis, G. duodenalis) is the most prevalent parasitic protist in the United States, where its incidence may be as high as 0.7% (1). World-wide, giardiasis is common among people with poor fecal-oral hygiene, and major modes of transmission include contaminated water supplies or sexual activity. Flagellated giardial trophozoites attach to epithelial cells of the small intestine, where they can cause disease without triggering a pronounced inflammatory response. There are no known virulence factors or toxins, and variable expression of surface proteins may allow evasion of host immune responses and adaptation to different host environments. Trophozoites can differentiate into infectious cysts that are transmitted through feces.

Unusual features of this enigmatic protist include the presence of two similar, transcriptionally active diploid nuclei and the absence of mitochondria and peroxisomes. Giardia is a member of the Diplomonadida, which includes both free-living (e.g., Trepomonas) and parasitic species. The phylogenetic position of diplomonads and related excavate taxa is perplexing. Ribosomal RNA (rRNA), vacuolar ATPase (adenosine triphosphatase), and elongation factor phylogenies identify Giardia as a basal eukaryote (2–4). Other gene trees position diplomonads as one of many eukaryotic lineages that diverged nearly simultaneously with the opisthokonts and plants. Discoveries of a mitochondrial-like cpn60 gene and a mitosome imply that the absence of respiring mitochondria in Giardia may reflect adaptation to a microaerophilic life-style rather than divergence before the endosymbiosis of the mitochondrial ancestor (5, 6). Because of its impact on human disease and its relevance to understanding the evolution of eukaryotes, we embarked upon a genome analysis of G. lamblia.

The genome of G. lamblia WB clone C6 (ATCC50803) is ∼11.7 Mb in size, distributed on five chromosomes. The edited draft genome sequence contains 306 contigs on 92 scaffolds (Supporting Online Material). The genome is compact. We identified 6470 open reading frames (ORFs) with a mean intergenic distance of 372 base pairs (bp) (Table 1). Approximately 77% of the assembled sequence defines ORFs, of which 1800 overlap and 1500 more are within 100 nucleotides (nt) of an adjacent ORF. Serial analysis of gene expression (SAGE) and cDNA sequences provided transcriptional evidence for 4787 of these ORFs (Supporting Online Material).
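The compactness described above can be illustrated with a back-of-envelope calculation. The sketch below is not part of the original analysis: it takes the quoted genome size, ORF count, and coding fraction at face value, and the function and variable names are ours. Because many ORFs overlap, the implied mean ORF length is only a rough approximation.

```python
# Back-of-envelope genome compactness, using figures quoted in the text.
GENOME_SIZE_BP = 11.7e6   # ~11.7 megabases (approximate)
NUM_ORFS = 6470
CODING_FRACTION = 0.77    # ~77% of the assembled sequence defines ORFs


def genome_density(genome_bp, n_orfs, coding_fraction):
    """Return (bp of genome per ORF, approximate mean ORF length in bp)."""
    bp_per_orf = genome_bp / n_orfs
    # Ignores ORF overlaps, so this is only a rough estimate.
    mean_orf_len = coding_fraction * genome_bp / n_orfs
    return bp_per_orf, mean_orf_len


bp_per_orf, mean_orf_len = genome_density(GENOME_SIZE_BP, NUM_ORFS, CODING_FRACTION)
print(f"{bp_per_orf:.0f} bp of genome per ORF")
print(f"~{mean_orf_len:.0f} bp mean ORF length (ignoring overlaps)")
```

Under these assumptions there is roughly one ORF per 1.8 kb of sequence, which is consistent with the short mean intergenic distance reported above.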

Table 1.

Comparison of eukaryotic genome content and organization.


Although the total number of ORFs is similar to that of yeast, many specific giardial pathways appear simple in comparison with those of other eukaryotic organisms. Giardia's genome encodes a simplified form of many cellular processes: fewer and more basic subunits, incorporation of single-domain bacterial and archaeal-like enzymes, and a limited metabolic repertoire commonly observed in parasites. We did not detect these missing components in searches of assembled and unassembled reads; however, they may be highly divergent and difficult to recognize. Others may be nonessential or functionally redundant with other proteins in the same or another pathway. The host may provide essential metabolic products for an incomplete pathway, but this is a highly improbable explanation for missing structural proteins or subunits of core machinery.

DNA synthesis, transcription, RNA processing, and cell cycle machinery are simple (Fig. 1). The occurrence of only two origin recognition complex proteins (Orc4 and Orc1/Cdc6) in Giardia and the absence of regulatory initiation proteins (e.g., Cdt1, Dpb11, Cdc45, MCM10, and Geminin) are comparable to Archaea. Giardia has three replicative B-type DNA polymerases (Polα, Polδ, and Polϵ). The occurrence of four subunits in Giardia's Polα/primase complex is typical of other eukaryotes, whereas the compositions of Polϵ and Polδ resemble the corresponding polymerases in Archaea. Most giardial DNA polymerase accessory proteins are typically eukaryotic.

Fig. 1.

Comparison of selected multiprotein complexes between Giardia and the yeast Saccharomyces cerevisiae. Initiation of replication: Multiple initiator proteins assemble at the origins of replication in S. cerevisiae during the cell cycle. Giardia has fewer origin recognition proteins (Orc) and most of the initiators of the pre-initiator complex. Initiation of transcription: Transcription in S. cerevisiae is initiated by the pre-initiation complex (PIC) consisting of the RNAPII core complex (12 subunits) and general transcription factors containing several subunits: TFIIA (2), TFIIB (1), TFIID (TBP plus 14 TAFs), TFIIF (3), TFIIE (2), TFIIH (10), and the Mediator (24). These factors recognize DNA elements in the promoter, including the upstream activating sequence (UAS), the TATA box, the initiator element (INR), and the downstream promoter element (DPE). Giardia promoters have an AT-rich initiator element and lack many of the general transcription factors. Polyadenylation: The polyadenylation complex in S. cerevisiae recognizes an A/U-rich sequence, and it contains at least 25 proteins and the largest subunit of RNAPII with its C-terminal domain (CTD). The preferred polyadenylation signal in Giardia is AGTAAY, and Giardia has very few of the yeast polyadenylation proteins and a diverged CTD. Yth1 corresponds to CPSF30 and Ysh1 corresponds to CPSF73 in mammals.

Relative to Saccharomyces, Giardia has retained most of the RNA polymerase I (RNAPI), RNAPII, and RNAPIII core peptides. Seven proteins are missing, but six of these are unique subunits that occur in only one RNAP (7). Moreover, Giardia contains only 4 of the 12 transcription initiation factors present in Saccharomyces. The absence of polymerase core peptides is unlikely to be due to our failure to recognize highly diverged homologs in Giardia, because the missing proteins represented RNAP-specific elements rather than a random sampling of both shared and unique RNAP subunits. Absence of homologs to many of the unique subunits required for transcription is consistent with an evolutionary model hypothesizing that class-specific polymerase subunits arose after the divergence of diplomonads.

A single intron with a noncanonical 5′ splice site was identified in a 2Fe-2S ferredoxin gene, along with components of the spliceosome (8). We generated trophozoite cDNAs and examined alignments of conserved proteins to identify other possible introns. We found three candidates, in genes for ribosomal protein L7A, a dynein light-chain protein, and an unknown protein. Two were confirmed by reverse transcription–polymerase chain reaction, and the RPL7A intron was independently reported (9). These new candidates show canonical GT/AG splice sites and contain an AC-repeat motif, [AC]CT[GA]AC[AC]CACAG (fig. S1). The AC-repeat motif closely resembles that common to Trichomonas introns [ACTAACACACAG (10)], suggesting a shared splicing mechanism. An intron has also been reported in the excavate Carpediemonas (11).
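The degenerate AC-repeat motif quoted above maps directly onto a regular expression, which makes its relationship to the Trichomonas motif easy to check. The following is a minimal sketch, assuming plain DNA strings as input; the helper name `find_motif` and the toy intron sequence are hypothetical and are not taken from the paper's data.

```python
import re

# Degenerate motif quoted in the text: [AC]CT[GA]AC[AC]CACAG
INTRON_MOTIF = re.compile(r"[AC]CT[GA]AC[AC]CACAG")


def find_motif(seq):
    """Return 0-based start positions of every motif occurrence in seq."""
    seq = seq.upper()
    # A lookahead is used so that overlapping occurrences are also reported.
    return [m.start() for m in re.finditer(f"(?={INTRON_MOTIF.pattern})", seq)]


trichomonas_motif = "ACTAACACACAG"             # motif quoted for Trichomonas introns
toy_intron = "GTATGTTTTT" + "CCTGACCCACAG"     # hypothetical sequence, not real data

print(find_motif(trichomonas_motif))
print(find_motif(toy_intron))
```

The Trichomonas sequence is one of the eight strings the giardial pattern accepts, which is the sense in which the two motifs are alike.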

Giardia's machinery for RNA processing is less complex than that of other eukaryotes, but the presumed polyadenylation signal (AGUAAA) (12) resembles that of other eukaryotes (AAUAAA). Searches for Giardia sequences that are similar to the many polyadenylation factors in yeast and other eukaryotes identified relatively few homologs (Fig. 1). Giardia has a relative paucity of enzymes for posttranslational modification. Like Plasmodium, it lacks the vast majority of genes encoding glycosyltransferases and so makes the shortest N-glycan precursor yet identified, dolichol-PP-GlcNAc2 (13). Giardia, like Trypanosoma and Archaea, has a single-subunit oligosaccharyltransferase for transferring N-glycans from the lipid precursor to the peptide (14), compared with eight in yeast and humans. Unlike most eukaryotes, Giardia has an N-glycan–independent quality-control system for protein folding (e.g., chaperones, protein disulfide isomerases, and peptidyl-prolyl cis-trans isomerases) and protein degradation. Giardia has fewer nucleotide sugar transporters than any other eukaryotic genome, including just one for uridine 5′-diphosphate (UDP)–GlcNAc (15). Giardia is missing the set of glycosyltransferases that typically modify N- and O-linked glycans in the Golgi lumen. Instead, Giardia has a cytosolic glycosyltransferase, rare among protists, which adds O-linked GlcNAc to Ser and Thr of cytosolic proteins (15).

Giardia has a conventional endoplasmic reticulum (ER) with conserved chaperones (BiP, Hsp90, DnaJ), but is unusual in having five protein disulfide isomerases, each with only a single active site (16), and in lacking the Ero1 protein that drives disulfide formation in the ER lumen. Membrane transport in Giardia is unlike that of other parasitic protozoa (17, 18). Despite the highly polarized cell structure, there is no conclusive evidence for a stacked Golgi apparatus or cisternae for posttranslational maturation of secretory cargo except in encysting trophozoites. Only a few Rabs, SNAREs (soluble N-ethylmaleimide–sensitive factor attachment protein receptors), and a small number of adaptor protein (AP) complexes participate in vesicle docking and membrane fusion. Unlike all other eukaryotes that have at least three AP complexes, Giardia encodes only two. The presence of only two APs with no indication of pseudogenes or orphan subunits argues for a simple membrane transport system in Giardia.

Two rounds of cytokinesis, accompanied by a single round of nuclear division, occur during excystation. Giardia's transcriptionally equivalent nuclei must synchronously divide in trophozoites and form quadrinucleate, 16N cysts (19). The presence of homologs to yeast Cin8, polo kinase, aurora kinase, and antiparallel microtubule bundling proteins suggests that the necessary spindle apparatus machinery is present. We identified giardial homologs of several mitotic exit network (MEN) proteins, indicating that regulation of cytokinesis in Giardia may be similar to that of yeast in which MEN coordinates nuclear division with cytokinesis. Homologs of actin, cyclin-dependent kinases, and the mitotic cyclins A and B are present in Giardia. However, the lack of myosin indicates that the actin-myosin cleavage furrow previously found in all eukaryotes is not present in Giardia. Possibly a nonmyosin, adhesion-dependent cytokinesis mechanism exists in Giardia, as in some mutants of Dictyostelium (20).

Like many other microaerophilic eukaryotic parasites, Giardia exhibits a limited metabolic repertoire. There are essentially no homologs for enzymes in the Krebs cycle and, except for well-known scavenging pathways, no evidence of vestigial genes associated with purine and pyrimidine biosynthesis. Amino acid metabolism is even more limited, although all tRNA synthetases are present. For lipid metabolism, the Giardia genome contains enzymes capable of limited fatty acid extension and sphingomyelin assembly, as well as phospholipid headgroup exchange and modification. Although not sufficient for de novo synthesis of lipids, these enzymes allow for remodeling of membrane components.

The glycolytic enzymes involved in hexose processing, interconversion, and phosphorylation to fructose-1,6-bisphosphate are more similar to bacterial than to higher eukaryal homologs (Fig. 2) (21). Some of these bacterial-like proteins share similarity with genes in Entamoeba and Trichomonas (table S3). Yet the predicted origins of the sequences appear to be independent of each other and are not associated with a particular bacterial group.

Fig. 2.

Glucose, pentose-phosphate, and arginine metabolism in Giardia. Color coding denotes similarity to archaeal homolog (red), bacterial homolog (purple), or eukaryal homolog (blue). Black indicates that no homolog was found. Abbreviations and Enzyme Commission numbers: 6PGL, 6-phosphogluconolactonase, 3.1.1.31; ACYP, acylphosphatase, 3.6.1.7; ADI, arginine deiminase, 3.5.3.6; ARG-S, arginyl-tRNA synthetase, 6.1.1.19; CK, carbamate kinase, 2.7.2.2; DERA, deoxyribose-phosphate aldolase, 4.1.2.4; ENO, enolase, 4.2.1.11; FBA, fructose-bisphosphate aldolase, 4.1.2.13; G6PD, glucose-6-phosphate dehydrogenase, 1.1.1.49; GAPDH, glyceraldehyde-3-phosphate dehydrogenase, 1.2.1.12; GCK, glucokinase, 2.7.1.2; GNPDA, glucosamine-6-phosphate deaminase, 3.5.99.6; GNPNAT, glucosamine 6-phosphate N-acetyltransferase, 2.3.1.4; GPI, glucose-6-phosphate isomerase, 5.3.1.9; NOS, nitric oxide synthase, 1.14.13.39; OCD, ornithine cyclodeaminase, 4.3.1.12; OCT, ornithine carbamoyltransferase, 2.1.3.3; ODC, ornithine decarboxylase, 4.1.1.17; PFK, phosphofructokinase (pyrophosphate-based), 2.7.1.90; PGAM, phosphoglycerate mutase, 5.4.2.1; PGD, phosphogluconate dehydrogenase, 1.1.1.44; PGK, phosphoglycerate kinase, 2.7.2.3; PGM, phosphoglucomutase, 5.4.2.2; PGM3, phosphoacetylglucosamine mutase, 5.4.2.3; PK, pyruvate kinase, 2.7.1.40; PRO-S, prolyl-tRNA synthetase, 6.1.1.15; PRPPS, phosphoribosylpyrophosphate synthetase, 2.7.6.1; RBKS, ribokinase, 2.7.1.15; RPE, ribulose-phosphate 3-epimerase, 5.1.3.1; RPI, ribose-5-phosphate isomerase, 5.3.1.6; TKT, transketolase, 2.2.1.1; TPI, triose phosphate isomerase, 5.3.1.1; UAE, UDP-N-acetylglucosamine 4-epimerase, 5.1.3.7; UAP, UDP-N-acetylglucosamine diphosphorylase, 2.7.7.23.

Giardia metabolizes arginine by the anaerobic arginine dihydrolase pathway (Fig. 2), originally described in bacteria but unknown in eukaryotes other than Trichomonas (22). Arginine deiminase, ornithine carbamoyltransferase, and carbamate kinase generate ammonia, ornithine, and adenosine 5′-triphosphate (ATP), and all three archaeal-like enzymes are highly expressed. Trophozoites thus deprive host intestinal epithelial cells of arginine for nitric oxide biosynthesis and thereby dampen innate defenses (23, 24). During encystation, Giardia synthesizes UDP-GalNAc from fructose-6-phosphate by an unusual, five-enzyme bacterial-like pathway (Fig. 2). Many eukaryotes use the first enzyme, glucosamine-6-phosphate isomerase, to generate glucosamine-6-phosphate from fructose-6-phosphate and ammonia for glycolysis. Instead, Giardia uses ammonia from arginine metabolism to drive the synthesis of glucosamine-6-phosphate for cyst wall polysaccharide biosynthesis. Although Giardia is microaerophilic and consumes oxygen, it lacks the conventional enzymes superoxide dismutase and catalase for detoxifying reactive oxygen species (25).

Motility and attachment to host cells are essential for the parasitic life-style of Giardia. The microtubule cytoskeleton organizes Giardia's eight basal bodies and flagella, as well as other structures unique to the genus, including the ventral disk and median body (table S4). The giardial cytoskeleton undergoes dramatic changes throughout the life cycle. General signaling proteins (protein kinase A, Erk kinase, calmodulin) and a protein phosphatase localize to the basal bodies, paraflagellar dense rods, and disk. The basal bodies may act as a control center that coordinates the other cytoskeletal structures during growth and differentiation. The microtubule system is well conserved and includes all five tubulin forms, proteins involved in microtubule modification, organization, and assembly (centrins, tubulin-specific chaperones, tubulin tyrosine ligase). There are coding regions for microtubule motor proteins, including kinesins and 12 dynein heavy chains.

The most notable departure from conserved cytoskeletal structure is the absence of cytoplasmic dynein and the divergent nature of the microfilament cytoskeleton. The genome contains a single actin gene, yet does not encode other classical microfilament proteins. On the basis of sequence similarities, the three genes encoding actin-related proteins participate in chromatin remodeling, rather than cytoskeletal structure. The absence of classic microfilament-associated proteins extends to actin modification, organization, and assembly proteins. In contrast to studies that used heterologous antibodies (26, 27), permissive searches of the Giardia genome failed to identify actin-associated proteins, myosins, or any members of the microfilament-specific motor protein family (28). Trichomonas, which may be a sister lineage, also lacks myosin. Either novel, divergent proteins substitute functionally for the missing proteins or altered cytoskeletal dynamics accommodate their absence. Giardia contains several unusual cytoskeletal protein families including α-giardins (annexin homologs), β-giardins (striated fiber assemblin homologs), the GASP-180 family (29), and several microtubule-associated coiled-coil proteins.

Giardia has 276 putative protein kinases (fig. S2) including members from 43 of the 61 primordial kinase subfamilies present in widely diverged eukaryotes (ciliates, fungi/metazoa, plants, Dictyostelium). Trichomonas also has a greatly expanded kinome, which might reflect their putative sister relationship or commonalities in the parasitic life-style. Giardia has no tyrosine-specific or histidine kinases. Most notable is that 180 (∼70%) of the putative giardial protein kinases belong to the NIMA (Never in Mitosis Gene A)–Related Kinase (NEK) family, and that 137 of them are predicted to be catalytically inactive. By contrast, most organisms have fewer than 10 NEK kinases.

This non-NEK kinome is the most compact known from any eukaryote, and so it is of specific functional and evolutionary interest in defining the minimal eukaryotic kinome. Broad-spectrum signal transduction proteins gain specificity by localization to specific cellular target structures. Entamoeba histolytica, another intestinal protozoan parasite, has >80 putative transmembrane kinases (30), but in stark contrast, only four predicted giardial kinases have transmembrane domains. Giardial kinases may have other means of targeting; many have either ankyrin repeats (29), coiled-coil domains, or both, which may allow for specific localization within the cell. Protein dephosphorylation is also critical in signal transduction networks. Giardia has ∼32 predicted protein phosphatases, but only one is predicted to be membrane associated.

Giardial protein sequences commonly show insertions of amino acids when compared to their homologs in other organisms (fig. S3). We generated protein alignments for 1518 proteins and scored the alignments for the presence of insertions in the giardial protein relative to others. We found in-frame amino acid insertions in 44 ORFs (not attributable to alignment ambiguities) with an average of 1.5 insertions per ORF. The insertions ranged in size from 8 (our lower cutoff value) to 101 amino acids, with an average of 20. To determine whether this was an unusually high frequency, we examined 54 protein alignments, for which sequences were available from several other eukaryotes (Chlamydomonas, Cryptococcus, Dictyostelium, Encephalitozoon, Entamoeba, Leishmania, Mus, Phytophthora, Plasmodium, Saccharomyces, Thalassiosira, Trichomonas, and Trypanosoma; Supporting Online Material). Giardia sequences showed 15 insertions in 11 of the 54 proteins; the number of insertions detected for the other organisms ranged from 0 to 6 (Plasmodium) (Table 2). Sequence analysis of giardial cDNAs that overlap many of these insertions demonstrates that they do not represent introns. The functions of these unusual insertions remain to be determined, although when we experimentally deleted an insertion in giardial aurora kinase and measured protein production, we observed decreased protein stability (Supporting Online Material).
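The scoring procedure itself is described in the Supporting Online Material and is not reproduced here. As a hedged illustration of the general idea, the sketch below treats an insertion as a run of at least eight alignment columns in which only the target sequence has residues and every other sequence has a gap. The function name, the gap character, and the toy alignment are assumptions made for this example and may differ from the authors' method.

```python
def find_insertions(alignment, target, min_len=8, gap="-"):
    """Find runs of columns where only `target` has residues.

    alignment: dict of name -> aligned sequence (all the same length).
    Returns a list of (start_column, run_length), with 0-based columns.
    """
    target_seq = alignment[target]
    others = [s for name, s in alignment.items() if name != target]
    if any(len(s) != len(target_seq) for s in others):
        raise ValueError("aligned sequences must have equal length")

    runs, start = [], None
    for col, residue in enumerate(target_seq):
        target_only = residue != gap and all(s[col] == gap for s in others)
        if target_only and start is None:
            start = col                      # a candidate run begins
        elif not target_only and start is not None:
            if col - start >= min_len:       # keep runs at or above the cutoff
                runs.append((start, col - start))
            start = None
    # Handle a run that extends to the end of the alignment.
    if start is not None and len(target_seq) - start >= min_len:
        runs.append((start, len(target_seq) - start))
    return runs


# Toy alignment (hypothetical sequences, not real proteins).
toy = {
    "giardia": "MKV" + "QQQQQQQQQQ" + "LTA" + "GG" + "R",
    "yeast":   "MKV" + "----------" + "LTA" + "--" + "R",
    "mouse":   "MRV" + "----------" + "LSA" + "--" + "R",
}
print(find_insertions(toy, "giardia"))
```

In the toy alignment the 10-residue run is reported and the 2-residue run is ignored, mirroring the lower cutoff of 8 amino acids mentioned above.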

Table 2.

Amino acid insertions detected in alignments of conserved proteins.


Giardial trophozoites survive in an environment of host digestive enzymes and bile. A dense single molecular layer of a variant-specific surface protein (VSP) covers the membrane and likely protects the trophozoites. Clonal VSPs on individual trophozoites switch to new VSPs every 6 to 13 generations (31). VSPs vary in sequence and size; all are cysteine-rich (about 12%) with frequent CXXC motifs. Each has an N-terminal signal peptide and characteristic C terminus including a membrane-spanning region terminating in CRGKA and an extended polyadenylation signal. Unlike surface proteins associated with immune evasion in other parasitic protists (32), giardial VSP genes distribute to many noncontiguous locations on all chromosomes (Fig. 3), and they are activated or inactivated in situ with no evidence for associated rearrangement or sequence alteration. VSPs occur at only two of the telomeres where they are truncated by TTAGG telomeric repeats, suggesting that they are pseudogenes. We estimate Giardia's VSP repertoire at 235 to 275 genes (table S5). VSPs frequently cluster as two to nine genes in head-to-tail orientation. Intergenic distances between members of a cluster can be very short, with the 5′ end of one VSP overlapping with the 3′ end of a second.

Fig. 3.

Locations of VSP and other high-cysteine proteins on assembly scaffolds. From the top, scaffolds are from chromosome 5, chromosome 4, chromosome 3, and chromosome 2. Red lines indicate high-cysteine proteins (HCNCp, HCMp, HCp) and blue lines indicate VSPs. The x axis is scaled in kilobase pairs.

In addition to the VSPs, we found two other classes of cysteine-rich proteins (Fig. 3) (33). There are 61 HCMps (high-cysteine membrane proteins) with 10% or more cysteine and 20 or more CXXC or CXC motifs. They lack the CRGKA tail, and their single membrane-spanning domain diverges from the VSPs. No additional leucine-rich repeat cyst wall proteins (CWPs), beyond those previously identified, were found.
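The sequence features used to describe VSPs and HCMps (cysteine content, CXXC or CXC motifs, and the CRGKA tail) lend themselves to a simple screen. The sketch below is illustrative only and is not the classification procedure used for the genome annotation: it assumes that CXXC and CXC counts are pooled for the HCMp threshold, that overlapping motifs are each counted, and that a CRGKA C terminus plus at least one CXXC motif marks a VSP-like protein. The function names and toy sequences are hypothetical.

```python
import re


def cysteine_profile(protein):
    """Return (cysteine fraction, number of CXXC motifs, number of CXC motifs)."""
    protein = protein.upper()
    cys_fraction = protein.count("C") / len(protein) if protein else 0.0
    # Lookaheads count overlapping motifs such as CXXCXXC.
    n_cxxc = len(re.findall(r"(?=C..C)", protein))
    n_cxc = len(re.findall(r"(?=C.C)", protein))
    return cys_fraction, n_cxxc, n_cxc


def classify(protein):
    """Toy classifier returning 'VSP-like', 'HCMp-like', or 'other'."""
    cys_fraction, n_cxxc, n_cxc = cysteine_profile(protein)
    # Assumed VSP rule: characteristic CRGKA tail plus at least one CXXC motif.
    if protein.upper().endswith("CRGKA") and n_cxxc > 0:
        return "VSP-like"
    # HCMp thresholds from the text: >= 10% cysteine and >= 20 CXXC or CXC motifs.
    if cys_fraction >= 0.10 and (n_cxxc + n_cxc) >= 20:
        return "HCMp-like"
    return "other"


# Hypothetical toy sequences, not real giardial proteins.
toy_vsp = "M" + "ACGSCT" * 5 + "CRGKA"
toy_hcmp = "CAACGSTLKV" * 25

print(classify(toy_vsp))
print(classify(toy_hcmp))
print(classify("MKVLAAGT"))
```

A real screen would also need to check for the signal peptide and membrane-spanning region, which simple string matching cannot capture.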

Giardia encodes 149 proteins that are promising drug targets, as defined by Hopkins and Groom (34). As might be expected, these include a large subset of the kinases, e.g., TOR (target of rapamycin) (table S6).

When attached to the surface of the intestinal mucosa, Giardia trophozoites have ample opportunity to pick up genes from bacteria and to scavenge products of host and bacterial metabolism. Like that of both Trichomonas and Entamoeba, Giardia's genome contains many lateral gene transfer (LGT) candidates, indicating that LGT has played an important role in shaping Giardia's genome and metabolic pathways. We initially identified ORFs with similarity to bacterial or archaeal proteins at a BLAST significance level of e–10 or better within the top 10 hits. Of these, ∼100 had multiple bacterial or archaeal homologs at a significance level of e–30 or better within the top 20 matches (table S3). These include proteobacterial-like DnaK, cpn60, and cysteine sulfurtransferase (6, 35). Others are NADH (nicotinamide adenine dinucleotide, reduced) oxidase and group 3 alcohol dehydrogenase, derived by LGT from a Gram-positive coccus and a thermoanaerobic bacterium, respectively (36). Hybrid cluster protein, A-type flavoprotein, and glucosamine-6-phosphate isomerase were recently shown to be relics of LGT (37). As noted, many of the enzymes in the glycolytic and pentose phosphate pathways are more similar to bacterial than to eukaryal homologs. Several ORFs had a highly significant match to an Entamoeba and/or Trichomonas protein, with the remaining matches to bacteria or archaea. Although some of these are recognized LGT relics, the rest warrant closer examination.
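The two-stage screen for LGT candidates described above can be sketched as a pair of filters over ranked BLAST hits. This is a minimal illustration under stated assumptions, not the authors' pipeline: hits are assumed to be sorted best-first, each hit is assumed to carry a taxonomic domain label, and "multiple" is read as at least two. The `Hit` class, the domain labels, and the example hit lists are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    domain: str     # assumed labels: "bacteria", "archaea", or "eukaryota"
    evalue: float   # BLAST expectation value


PROKARYOTIC = {"bacteria", "archaea"}


def first_pass(hits, top_n=10, max_evalue=1e-10):
    """True if any of the top_n hits is prokaryotic at max_evalue or better."""
    return any(h.domain in PROKARYOTIC and h.evalue <= max_evalue
               for h in hits[:top_n])


def strong_candidate(hits, top_n=20, max_evalue=1e-30, min_hits=2):
    """True if at least min_hits prokaryotic hits in the top_n pass max_evalue."""
    n = sum(h.domain in PROKARYOTIC and h.evalue <= max_evalue
            for h in hits[:top_n])
    return n >= min_hits


# Hypothetical hit lists for two ORFs, sorted best-first.
orf_a = [Hit("bacteria", 1e-45), Hit("bacteria", 1e-38), Hit("eukaryota", 1e-12)]
orf_b = [Hit("eukaryota", 1e-80), Hit("bacteria", 1e-12)]

for name, hits in [("orf_a", orf_a), ("orf_b", orf_b)]:
    print(name, first_pass(hits), strong_candidate(hits))
```

Similarity thresholds of this kind only nominate candidates; as the text notes, confirming LGT requires closer phylogenetic examination.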

Cpn60, the iron-sulfur complex proteins, and DnaK are most similar to proteobacterial and mitochondrial homologs. The iron-sulfur cluster proteins and cpn60 are demonstrably targeted to the recently discovered mitosome, believed to be a relict mitochondrion (5). Other genes with homology to mitochondrially targeted genes are detectable, e.g., a mitochondrial protein peptidase homolog, but none have phylogenetic affinity specifically to the α-proteobacteria/mitochondrial lineage. Giardia is impoverished with respect to genes that are phylogenetically linked to α-proteobacteria, unlike other eukaryotes in which up to 20% of mitochondrially targeted proteins show such ancestry (38).

Phylogenetic inference alone cannot resolve Giardia's evolutionary history. Because so many of Giardia's genes may have been derived from horizontal transfer or be subject to accelerated evolution, only a subset can be used to infer phylogeny. Of the ∼1500 genes for which there are known homologs, only a handful included diverse eukaryotic taxa and generated robust trees, largely because the sequences could not be unambiguously aligned. We generated and examined trees for many conserved proteins, and selected ribosomal proteins for a multigene data set because they are an ancient family, whose nature—interaction with rRNAs and with all cellular proteins during their synthesis—constrains their divergence. Phylogenetic relationships were assessed with Bayesian and maximum-likelihood statistical procedures (Supporting Online Material).

The resulting tree (fig. S4) and an earlier analysis based on 100 genes (39) support the deep divergence of Giardia and Trichomonas in the eukaryotic tree. Only Encephalitozoon branches earlier in this tree. The preponderance of molecular data place microsporidia as derived relatives of fungi, on the basis of both gene trees and ultrastructural features (40). Giardia has no such affiliation with another eukaryotic lineage. Genome-scale data from other excavate taxa (41) are needed to resolve whether Giardia and Trichomonas branch deeply because that is their correct position or simply because of “long branch attraction.”

As discussed earlier, Giardia consistently shows a pattern of simplified molecular machinery, cytoskeletal structure, and metabolic pathways compared to later diverging lineages such as fungi and even Trichomonas or Entamoeba (Supporting Online Material; table S7 and fig. S5). A parsimonious explanation of this pattern is that Giardia never had many components of what may be considered “eukaryotic machinery,” not that it had and lost them through genome reduction as is evident for Encephalitozoon. Taking a whole-evidence approach, one sees that these data reflect early divergence, not a derived genome.

Because Giardia has two nuclei, a high level of heterozygosity could accumulate in the genome. Notably, heterozygosity in the genome was estimated to be less than 0.01%. We examined the two largest contigs, representing >1.2 Mbp (10% of the genome) containing 482 single-copy genes, for high-quality mismatches between individual reads and the consensus (table S8). We found only 25 in total, eight of which were in coding regions. This suggests that there may be a biological mechanism for maintaining genome fidelity and reducing heterozygosity between the four genome copies. Meiosis-associated proteins are present in Giardia (42), although they may have alternative functions.
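The reported counts can be checked against the stated heterozygosity bound with simple arithmetic. The sketch below is one plausible reading of the figures, assuming that every high-quality mismatch represents a heterozygous site and that the surveyed length is exactly 1.2 Mbp (the text gives ">1.2 Mbp", so the resulting rate is, if anything, an overestimate). The variable and function names are ours.

```python
MISMATCHES = 25        # high-quality read/consensus mismatches reported
SURVEYED_BP = 1.2e6    # ">1.2 Mbp": a lower bound, so the rate is an upper bound


def mismatch_rate_percent(mismatches, surveyed_bp):
    """Mismatches per surveyed base pair, expressed as a percentage."""
    return 100.0 * mismatches / surveyed_bp


rate = mismatch_rate_percent(MISMATCHES, SURVEYED_BP)
print(f"{rate:.4f}% of surveyed positions")
```

Under these assumptions the rate is about 0.002%, comfortably below the 0.01% bound quoted above.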

Giardia is an excellent functional and genomic model for other intestinal protozoan parasites whose complete life cycles cannot be replicated in the laboratory. In many pathways that require multiprotein complexes, it is notable that Giardia has fewer recognizable components than other organisms. Whether due to early divergence or genomic reduction, the genome gives valuable clues to the minimal components needed for complex cellular processes. The genome sequence has revealed much but also raised intriguing questions for further study, e.g., the number and distribution of introns and the composition of the giardial spliceosome, how Giardia maintains homozygosity across the separated nuclei, and the function of the novel genes and gene families discovered. The anticipated release of a draft genome from the related Spironucleus vortens, a commensal or opportunistic parasite of angelfish, will enable comparative genomics within the diplomonads and reveal which features of the giardial genome result from its obligate parasitic life-style and which reflect its basal evolutionary position.

Supporting Online Material

www.sciencemag.org/cgi/content/full/317/5846/1921/DC1

Materials and Methods

Figs. S1 to S5

Tables S1 to S8

References and Notes