Are There Bugs in Our Genome?


Science  08 Jun 2001:
Vol. 292, Issue 5523, pp. 1848-1850
DOI: 10.1126/science.1062241

For evolutionary biologists working on the exchange of genes between species (lateral gene transfer) [HN1], the most exciting news from the human genome sequencing project [HN2] has been the claim by the “public effort” (1) that between 113 and 223 genes have been transferred from bacteria to humans (or to one of our vertebrate ancestors) over the course of evolution [HN3]. We, and probably many others wanting to test whether this result is really solid (2), have been beaten to the punch by Salzberg and colleagues [HN4] (3). Their analysis, appearing on page 1903 of this week's issue, suggests that the actual number of bacterial genes in our genome may be lower than the predicted 223. These authors argue that there are other biologically plausible explanations besides lateral gene transfer that could account for the presence of bacterial genes in our genome.

The claim for lateral gene transfer from bacteria into vertebrates (as exemplified by our own species' genome) was based on similarity searches [HN5]. Such searches involve screening vertebrate genomes for sequences that are very similar to bacterial genes but are absent from other eukaryotic genomes. Genes shared by vertebrates and bacteria that are not found in other eukaryotes are considered to be probable bacteria-to-vertebrate transfers (BVTs). The 113 to 223 BVTs in question have significant similarity to bacterial sequences but no “comparable similarity” to genes in other completed eukaryotic genomes (1).
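The screening criterion described above can be sketched as a simple filter over similarity-search results. This is only an illustration of the logic, not the pipeline used in (1) or (3): the function name, the E-value cutoffs, and the toy hit table are all hypothetical.

```python
# Hypothetical sketch of a distribution-based screen for candidate BVTs.
# Each gene maps to its best similarity-search E-value per taxon group
# (a smaller E-value means a stronger match); None means no hit was found.

def candidate_bvts(best_hits, bacterial_cutoff=1e-10, eukaryote_cutoff=1e-10):
    """Return genes with a strong bacterial hit but no comparable hit
    in nonvertebrate eukaryotes. Both cutoffs are illustrative assumptions."""
    candidates = []
    for gene, hits in best_hits.items():
        bact = hits.get("bacteria")
        euk = hits.get("nonvertebrate_eukaryotes")
        has_bacterial_hit = bact is not None and bact <= bacterial_cutoff
        has_eukaryote_hit = euk is not None and euk <= eukaryote_cutoff
        if has_bacterial_hit and not has_eukaryote_hit:
            candidates.append(gene)
    return sorted(candidates)


# Toy data: gene names and E-values are invented for illustration.
toy_hits = {
    "geneA": {"bacteria": 1e-50, "nonvertebrate_eukaryotes": None},
    "geneB": {"bacteria": 1e-40, "nonvertebrate_eukaryotes": 1e-45},
    "geneC": {"bacteria": None, "nonvertebrate_eukaryotes": 1e-30},
    "geneD": {"bacteria": 1e-20, "nonvertebrate_eukaryotes": 1e-3},
}

print(candidate_bvts(toy_hits))
print(candidate_bvts(toy_hits, eukaryote_cutoff=1e-2))
```

In this toy table, loosening the eukaryote cutoff lets the weak eukaryotic match for geneD count as a homolog, so the candidate list shrinks. The outcome of such a screen therefore depends heavily on what is accepted as "comparable similarity."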

Salzberg et al. (3) now provide a careful reanalysis of these data (1) with a similar but more conservative approach that includes the addition of data from Celera's version of the human genome [HN6] (4). As in the original study (1), the investigators' goal was to detect possible transfer of genes by analyzing gene distribution across taxa. They found 135 genes in the public effort's data set of 31,780 protein-encoding sequences (Ensembl proteome) [HN7] and 89 genes in the Celera proteome of 26,544 proteins that were possible BVTs (3). This is similar to the public effort's conservative estimate of 113 possible bacterial genes in the human genome (1). Lateral gene transfer is not the only factor that could explain these results. For instance, differential gene loss (that is, random independent loss of genes in different eukaryotic lineages) may yield similar gene distribution patterns. The Salzberg et al. reanalysis demonstrates that the calculation of the number of bacterial genes in the human genome is highly dependent on how many nonvertebrate genomes were screened against the human genome. These authors found a downward trend in the number of BVTs observed when the human genome was screened against an increasing number of nonvertebrate genomes. Such a pattern is indeed consistent with differential gene loss in eukaryotes, and it is reasonable to assume that the downward trend might continue as more nonvertebrate genomes become available for screening. Indeed, after removing possible BVTs that are more similar to genes in eukaryotic genomes for which only partial sequences are available, the number of possible BVTs drops to 114 and 68 for the Ensembl and Celera proteomes, respectively.
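The dependence on the number of genomes screened can be illustrated with a small sketch. It assumes, hypothetically, that for each candidate gene we know the set of nonvertebrate genomes containing a homolog; the gene names, genome labels, and screening order below are invented placeholders, not data from (3).

```python
# Hypothetical sketch: how the count of candidate BVTs changes as more
# nonvertebrate genomes are screened. A gene stops being a candidate as
# soon as a homolog turns up in any genome screened so far.

def bvt_counts_by_genomes_screened(homolog_genomes, screening_order):
    """Return the number of remaining candidates after each added genome."""
    counts = []
    screened = set()
    for genome in screening_order:
        screened.add(genome)
        remaining = [
            gene
            for gene, present_in in homolog_genomes.items()
            if not (present_in & screened)
        ]
        counts.append(len(remaining))
    return counts


# Toy data (invented): genomes in which each candidate has a homolog.
toy_homologs = {
    "gene1": set(),
    "gene2": {"fly"},
    "gene3": {"plant"},
    "gene4": {"worm", "plant"},
    "gene5": set(),
}

print(bvt_counts_by_genomes_screened(toy_homologs, ["yeast", "worm", "fly", "plant"]))
```

Because a gene can only be removed from the candidate list, never added, the counts cannot rise as genomes are added. A gene lost independently in several eukaryotic lineages looks like a BVT until a lineage that retained it is sequenced.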

Another factor yielding false BVTs is differences in evolutionary rates among different vertebrate and nonvertebrate lineages. Salzberg et al. investigated the effect of differing evolutionary rates by relaxing the similarity criteria for inclusion of nonvertebrate homologs. Again, they found a reduction in the number of BVTs to 74 (Ensembl) and 56 (Celera). After additional trimming of the data—removing two mitochondrial genes, searching a further curated version of the Ensembl data, checking for annotation errors, and comparing the two data sets—the authors calculated the final number of possible BVTs to be 41 (Ensembl) or 46 (Celera).
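To put these successive reductions in perspective, the counts reported above can be expressed as a percentage of each proteome. The sketch below uses only the numbers quoted in this article; the stage labels and the helper name are ours, chosen for illustration.

```python
# Candidate BVT counts at each filtering stage, as quoted in the text,
# expressed as a percentage of the corresponding proteome.

proteome_size = {"Ensembl": 31780, "Celera": 26544}

stages = [
    ("initial screen", {"Ensembl": 135, "Celera": 89}),
    ("partial eukaryotic genomes considered", {"Ensembl": 114, "Celera": 68}),
    ("relaxed similarity criteria", {"Ensembl": 74, "Celera": 56}),
    ("final trimmed list", {"Ensembl": 41, "Celera": 46}),
]


def percent_of_proteome(count, dataset):
    """Return count as a percentage of the proteome, to two decimals."""
    return round(100 * count / proteome_size[dataset], 2)


for label, counts in stages:
    summary = ", ".join(
        f"{name}: {n} ({percent_of_proteome(n, name)}%)"
        for name, n in counts.items()
    )
    print(f"{label}: {summary}")
```

At every stage the candidates amount to well under 1% of either proteome.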

So, the original description of 223 BVTs is probably overenthusiastic. But even 41 (or 46) BVTs is sufficient cause for excitement. The statistical arguments of the sort Salzberg et al. present can never eliminate the possibility that some of these BVT candidates (or others that they eliminated with their parsimonious broad-brush approach) really are true BVTs. In fact, although the Salzberg study criticizes conclusions about evolution based on similarity searches, this study, too, is based on a similarity search approach! The best way to determine whether some of the genes in the final list are real BVTs is to construct molecular phylogenetic trees [HN8] for each of the possible instances. If a vertebrate gene sequence is nested within a robust cluster of bacterial sequences, the most probable explanation is that the vertebrate gene was laterally transferred from bacteria. Salzberg and colleagues mention that they have constructed phylogenetic trees for some genes (one is depicted in figure 2 of their paper), but they state that “most did not show patterns consistent with bacterial-to-vertebrate gene transfer.” Yet they don't tell us how many “most” is, nor do they state which genes do show patterns consistent with BVTs.
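The nesting criterion can be made concrete with a minimal sketch. It assumes a rooted tree written as nested Python lists with unique leaf names, and it ignores branch support entirely; the function names, the two-level rule, and the toy trees are hypothetical simplifications, not the method of any study cited here.

```python
# Hypothetical sketch: is a target sequence "nested within" a donor group?
# A tree is a nested list; leaves are strings with unique names.

def leaf_names(node):
    """Return all leaf names below a node."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaf_names(child)]


def path_to_leaf(node, target):
    """Return the nodes from the root down to the target leaf, or None."""
    if isinstance(node, str):
        return [node] if node == target else None
    for child in node:
        sub_path = path_to_leaf(child, target)
        if sub_path is not None:
            return [node] + sub_path
    return None


def is_nested_within(tree, target, group_of, donor_group, levels=2):
    """True if, walking up `levels` ancestors from the target, every
    sister clade consists only of donor_group leaves (assumed rule)."""
    path = path_to_leaf(tree, target)
    if path is None:
        raise ValueError(f"{target!r} is not a leaf of the tree")
    ancestors = path[-2::-1]  # parent first, root last
    if len(ancestors) < levels:
        return False
    child = path[-1]
    for ancestor in ancestors[:levels]:
        for sibling in ancestor:
            if sibling is child:
                continue
            if any(group_of[name] != donor_group for name in leaf_names(sibling)):
                return False
        child = ancestor
    return True


groups = {"vert": "vertebrate", "bact1": "bacteria", "bact2": "bacteria",
          "bact3": "bacteria", "euk_out": "eukaryote"}

# Vertebrate sequence sits inside a bacterial cluster.
nested_tree = ["euk_out", ["bact1", ["bact2", ["vert", "bact3"]]]]
# Vertebrate sequence is merely sister to a bacterial cluster.
sister_tree = ["euk_out", ["vert", ["bact1", "bact2"]]]

print(is_nested_within(nested_tree, "vert", groups, "bacteria"))
print(is_nested_within(sister_tree, "vert", groups, "bacteria"))
```

The second toy tree shows why a sister relationship alone is weaker evidence: the vertebrate and bacterial sequences group together, but neither lies inside the other, so the tree gives no indication of direction. A real analysis would also require strong support values on the relevant nodes.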

We have prepared phylogenetic trees for seven of the genes listed in the supplementary information provided by Salzberg et al. Among these, we found one probable case of lateral gene transfer between bacteria and vertebrates: the gene encoding a putative N-acetylneuraminate lyase [HN9] (see the figure). This gene was previously shown (very convincingly) to have been transferred from bacteria into the protozoan parasite Trichomonas vaginalis [HN10] (5). Indeed, the gene found in Trichomonas is nested within a robust bacterial cluster, which would be expected for a gene that has undergone lateral transfer. The vertebrate version of this gene clusters unequivocally with Vibrio cholerae and Yersinia pestis [HN11] genes, indicative of lateral gene transfer involving bacteria and eukaryotes that is independent of the bacteria-to-Trichomonas transfer. In this case, however, it is not possible to infer the direction of the transfer because neither the bacterial genes nor the vertebrate genes are nested within the other.

Gene swapping among friends and neighbors.

Phylogeny of the gene encoding N-acetylneuraminate lyase (Ensembl ID IGI_M1_ctg1425_20). Phylogenetic relationships for this gene among prokaryotes, protozoans, and vertebrates were estimated using TREE-PUZZLE (13) from an amino acid alignment generated by CLUSTALX (14). Bullets on the nodes indicate that bootstrap support values—obtained using neighbor joining of PAM-based distances estimated in PHYLIP (15)—and puzzle support values were above 95% [HN17]. Bacteria donated this gene to the protozoan parasite T. vaginalis. Vertebrates (human, mouse, and pig) together with two bacterial lineages (Vibrio and Yersinia) also show a branching pattern indicative of gene transfer, although it is not possible to infer the direction of the transfer.

Is lateral gene transfer between prokaryotes and eukaryotes an extremely rare event? Vertebrates are multicellular organisms, and so any evolutionarily stable incorporation of foreign DNA must take place in the germ cells that give rise to eggs and sperm. Unicellular eukaryotes, on the other hand, often live close to prokaryotes and frequently use them as food, which means that they have a much greater exposure to prokaryotic DNA than do vertebrate germ cells. Inevitably, this might lead to a gradual replacement of ancient eukaryotic genes with bacterial homologs (6). Indeed, pathogenic protozoa that live in environments rich in prokaryotes demonstrate that this does happen. T. vaginalis, a parasite of vertebrate epithelial cells, no doubt acquired its N-acetylneuraminate lyase gene from bacteria in its environment (see the figure) (5). The protozoan Entamoeba histolytica possesses fermentation enzymes that must have come from different anaerobic prokaryotes, and another protozoan, Giardia lamblia, [HN12] encodes an enzyme in the mevalonate pathway that is unquestionably of bacterial origin (7, 8).

The fact that the ancestors of mitochondria and chloroplasts [HN13] (DNA-containing cellular organelles) have contributed genes to the eukaryotic nucleus poses a serious problem for the detection of prokaryote-to-eukaryote lateral gene transfer. The endosymbiont bacteria that gave rise to mitochondria and chloroplasts have been a major source of bacterial genes in eukaryotic nuclear genomes, and their ancestral lineages are the α-proteobacteria and cyanobacteria, respectively. It seems sensible to infer that nuclear genes of similar ancestry are of endosymbiont origin, whereas those that cluster within other bacterial groups result from independent lateral transfers. However, independent transfers from α-proteobacteria and cyanobacteria subsequent to the original endosymbioses [HN14] surely have also occurred, and these lineages themselves have been recipients of transferred genes (9). Reconstructing the history of prokaryote genes in the eukaryotic nucleus will be a formidable but exciting challenge (10).

To determine the extent to which lateral gene transfer is an important evolutionary force within eukaryotic evolution, we need to move beyond BLAST-based analysis to large-scale phylogenetic analysis [HN15]. This is a realistic task: The methods are available, and several eukaryotic genome sequencing projects from a relatively broad range of eukaryotes are well under way [HN16] (11, 12). We would not be surprised to see eukaryotes distributed in multiple places in a prokaryotic background. Such a finding could only be accounted for by multiple lateral gene transfer events (independent of genes donated by organelles). We expect, however, that the vast majority of these transfer events happened before the evolution of multicellularity. Our multicellularity probably saved us from participating in the dirty business of lateral gene transfer so beloved by microbes.

HyperNotes: Related Resources on the World Wide Web

General Hypernotes

D. Glick's Glossary of Biochemistry and Molecular Biology is provided on the Web by Portland Press.

P. Gannon's Cell & Molecular Biology Online is a collection of annotated links to Internet resources.

The CMS Molecular Biology Resource is a compendium of electronic and Internet-accessible tools and resources for molecular biology, biotechnology, molecular evolution, biochemistry, and biomolecular modeling.

The Karolinska Institutet Library, Stockholm, Sweden, provides collections of Internet resources related to microbiology and cell biology and molecular biology and genetics.

The Internet Resource Guide for Zoology from BIOSIS includes links to Web resources on evolution and phylogeny.

Science's Functional Genomics Web site provides links to news, educational, and scientific resources in genomics and post-genomics.

BioWurld, hosted by the European Bioinformatics Institute, is a searchable index of resources in the fields of bioinformatics and molecular biology.

Deambulum, an Internet resource for molecular biology, biocomputing, medicine, and biology, is provided by Infobiogen.

Genomics: A Global Resource is a collection of links to Internet resources presented by the Pharmaceutical Research and Manufacturers of America. A genomics glossary is provided.

The U.S. National Center for Biotechnology Information (NCBI) is a resource for molecular biology information and databases.

The ExPASy Molecular Biology Server from the Swiss Institute of Bioinformatics provides protein databases, as well as a selection of proteomics tools and a large collection of WWW Links on proteins and molecular biology.

Celera's Genome News Network offers news articles and educational materials about genomes and genomic research. Primers on assembling and sequencing the genome are provided.

Cracking the Code of Life is a Web presentation from the Public Broadcasting Service's NOVA Online.

J. Kimball presents Kimball's Biology Pages, an online biology textbook and glossary.

U. Melcher, Department of Biochemistry and Molecular Biology, Oklahoma State University, provides a Web tutorial on molecular genetics.

The Weizmann Institute of Science, Rehovot, Israel, offers an introduction to bioinformatics and lecture notes for a course on bioinformatics and computational genomics.

M. Werner-Washburne, Department of Biology, University of New Mexico, offers lecture notes and Web links for a course on genomes and genomic analysis.

D. Smith, Division of Biology, University of California, San Diego, makes available lecture notes for a course on molecular biology. Lecture notes on whole genome analysis and bioinformatics are provided, as is an introduction to bioinformatics.

J. P. Gogarten, Department of Molecular and Cell Biology, University of Connecticut, provides lecture notes for a course on bioinformatics and lecture notes for a course on computer methods in molecular evolution.

S. Ward, Department of Molecular and Cellular Biology, University of Arizona, offers lecture notes for a course on bioinformatics and genomic analysis. Lecture notes on aligning sequences and constructing trees and lecture notes on phylogenetic analysis of sequences are included.

R. Davis, Department of Biology, College of Staten Island and City University of New York Graduate Center, provides lecture notes, readings, and Internet links for a course on bioinformatics and genomics.

M. Gribskov, Computational Biology Center, University of California, San Diego, provides lecture notes for a course on bioinformatics.

The Microbiology Group in the Division of Cell and Molecular Biology, Uppsala University, Sweden, makes available lecture notes on bioinformatics by P. Stolt for a course on microbial genetics.

D. Rand, Program in Biology, Brown University, provides lecture notes for a course on evolutionary biology. Presentations on molecular evolution, phylogenetic inference, and molecular systematics are included.

The Marine Biological Laboratory, Woods Hole, MA, makes available the papers presented at a workshop on molecular evolution.

The September-October 2000 issue of Emerging Infectious Diseases had an article by C. Fraser et al. titled “Comparative genomics and understanding of microbial biology.”

The 26 March 1999 issue of Science had a Perspective by J. Lake, R. Jain, and M. Rivera titled “Mix and match in the tree of life.” The 1 May 1998 issue had a Research News article by E. Pennisi titled “Genome data shake tree of life.”

Numbered Hypernotes

1. U. Melcher's molecular genetics tutorial includes an introduction to horizontal (lateral) gene transfer. The 22 July 2000 issue of Science News had an article by J. Travis titled “Pass the genes, please: Gene swapping muddles the history of microbes.” J. Brown, Department of Microbiology, North Carolina State University, offers lecture notes on lateral transfer for a course on microbial diversity. The VIRTUE Newsletter of the Virtual University Education program makes available a lecture presentation by K. Nelson on lateral gene transfer for a bioinformatics course. For a course on molecular evolution, A. Schurko, Department of Microbiology, University of Manitoba, provides lecture notes on horizontal gene transfer, its relation to phylogenetics, its impact on microbial evolution, and its relationship to the small subunit rRNA evolutionary tree. The 30 March 1999 issue of the Proceedings of the National Academy of Sciences had an article by R. Jain, M. Rivera, and J. Lake titled “Horizontal gene transfer among genomes: The complexity hypothesis.” The August 2000 issue of EMBO Reports had a viewpoint article by C. Kurland titled “Something for everyone: Horizontal gene transfer in evolution.” The 11 May 2001 Science special issue on the ecology and evolution of infection had a Viewpoint article by H. Ochman and N. Moran titled “Genes lost and genes found: Evolution of bacterial pathogenesis and symbiosis” that discussed lateral gene transfer.

2. The 16 February 2001 issue of Science and the 15 February 2001 issue of Nature were special issues on the human genome. NCBI provides a guide to online resources on the human genome, as well as an introduction to the draft human genome sequence. The Human Genome Program of the U.S. Department of Energy provides a Human Genome Project Information Web site.

3. The article by E. Lander et al. titled “Initial sequencing and analysis of the human genome” appeared in the 15 February 2001 issue of Nature; the section on gene content of the human genome included the discussion of transfer genes and a table of probable vertebrate-specific acquisitions of bacterial genes. The U.S. National Human Genome Research Institute issued a 12 February 2001 news release titled “International Human Genome Sequencing Consortium publishes sequence and analysis of the human genome” with a section about the transfer genes. GenomeWeb provides a 20 March 2001 news article by M. Jones titled “Did bacteria really transfer genes to humans? Scientists call HGP claim into question” and a 29 March 2001 article by S. Calvo titled “Lander defends HGP's horizontal gene transfer theory.”

4. S. Salzberg, O. White, J. Peterson, and J. Eisen are at The Institute for Genomic Research (TIGR), Rockville, MD. TIGR issued a press release about this research on 17 May 2001 when the report by Salzberg et al. was published on Science Express. Celera's Genome News Network makes available a 21 May 2001 article by E. Winstead titled “Researchers challenge recent claim that humans acquired 223 bacterial genes during evolution.” A 17 May 2001 Reuters news article by W. Dunham titled “Study rejects claim of bacterial genes in humans” is also available.

5. NCBI's BLAST (Basic Local Alignment Search Tool) Web site provides an introduction to similarity searching and a glossary. M. Werner-Washburne offers lecture notes and Web links on similarity searching for a course on genomes and genomic analysis. The Weizmann Institute of Science makes available lecture notes on similarity searching in databases for a course on bioinformatics and computational genomics. The Corso di Laurea in Biotecnologie at the Università degli Studi di Napoli Federico II makes available lecture notes on similarity searching for a bioinformatics course.

6. The Consensus Human Genome Web site is provided by Celera Genomics.

7. The Ensembl Human Genome Server is a joint project of the European Bioinformatics Institute and the Sanger Centre.

8. An Encyclopædia Britannica article on molecular evolution includes a section on molecular phylogeny. Kimball's Biology Pages offers a presentation on phylogenetic trees. The University of California Museum of Paleontology (UCMP) provides an introduction to phylogeny, a glossary of phylogenetic terms, and a collection of phylogeny Web links, as well as the special exhibit Journey Into Phylogenetic Systematics. A. Vierstraete, Department of Biology, University of Ghent, Belgium, offers a presentation on phylogenetics and phylogenetic trees. Phylogenetics: Computing Evolution is a section of a course on using computers in molecular biology offered by the New York University Medical Center. M. Kuhner, Department of Genetics, University of Washington, provides lecture notes on phylogenetic trees (parts one and two) for a course on evolutionary genetics. The Tree of Life, coordinated by D. Maddison, Department of Entomology, University of Arizona, is a collaborative Internet project containing information about phylogeny and biodiversity.

9. ExPASy's ENZYME nomenclature database has an entry for N-acetylneuraminate lyase. SWISS-PROT has an entry for N-acetylneuraminate lyase. EcoCyc, an encyclopedia of E. coli genes and metabolism, provides summary information on N-acetylneuraminate lyase. The Protein Data Bank has an entry with structural images for N-acetylneuraminate lyase.

10. An Encyclopædia Britannica article on protozoa is available. P. Tatner, Department of Biological Sciences, University of Paisley, UK, provides a tutorial on protozoa as part of the Biomedia Web presentation. The Parasites and Parasitological Resources Web page, maintained by the College of Biological Sciences, Ohio State University, provides information about Trichomonas vaginalis. P. Johnson, Department of Microbiology and Immunology, University of California, Los Angeles, offers lecture notes on T. vaginalis for a course on molecular parasitology. Medical Microbiology, a Web textbook edited by S. Baron, Graduate School of Biomedical Sciences, University of Texas Medical Branch, includes a chapter on T. vaginalis. P. Keeling, Department of Botany, University of British Columbia, makes available (in Adobe Acrobat format) the article by A. de Koning et al. titled “Lateral gene transfer and metabolic adaptation in the human parasite Trichomonas vaginalis” (5) that appeared in the November 2000 issue of Molecular Biology and Evolution. A. de Koning, Genetics Graduate Program, University of British Columbia, presents a summary of her thesis research on lateral gene transfer from bacteria to protists.

11. D. Fix, Department of Microbiology, Southern Illinois University, provides information about Vibrio cholerae and Yersinia pestis for a medical microbiology course. S. Baron's Medical Microbiology includes chapters with information on V. cholerae and Y. pestis. D. Portnoy, Department of Biochemistry and Molecular Biology, University of California, Berkeley, provides lecture notes on V. cholerae and Y. pestis for a course on bacterial pathogenesis. The Division of Vector-Borne Infectious Diseases, Centers for Disease Control and Prevention, provides information about Y. pestis. The Sanger Centre offers a Yersinia pestis Web page. The Department of Medical Microbiology, St. Bartholomew's and the Royal London School of Medicine and Dentistry, provides a Yersinia pestis genome project Web site; a selection of Internet links about Y. pestis and plague is included. K. Todar, Department of Bacteriology, University of Wisconsin, provides lecture notes on V. cholerae for a bacteriology course. TIGR provides information about the Vibrio cholerae genome.

12. P. Johnson offers lecture notes on Giardia and Entamoeba for a course on molecular parasitology. S. Baron's Medical Microbiology includes chapters with information on Entamoeba histolytica and Giardia lamblia. The Parasites and Parasitological Resources Web page provides information about E. histolytica and G. lamblia. The Foodborne Pathogenic Microorganisms and Natural Toxins Handbook (“Bad Bug Book”), made available by Center for Food Safety and Applied Nutrition of the U.S. Food and Drug Administration, provides information about G. lamblia and E. histolytica. The London School of Hygiene and Tropical Medicine hosts the Entamoeba Homepage. B. Soltys, Imaging Research Inc., makes available a presentation titled “Giardia lamblia: Cell biology and microscopy of one of the most primitive eukaryotes.” The Sogin Laboratory at the Marine Biological Laboratory provides information about the Giardia lamblia genome project. TIGR provides information about the Entamoeba histolytica genome project.

13. An Encyclopædia Britannica article on mitochondria and chloroplasts is available. Kimball's Biology Pages include introductions to mitochondria and chloroplasts. The MIT Biology Hypertextbook includes a section on the structure and function of organelles with information on mitochondria and chloroplasts.

14. U. Melcher's molecular genetics tutorial includes sections on interorganellar gene transfer and the endosymbiont theory for the origin of organelles. Kimball's Biology Pages include a presentation titled “Endosymbiosis and the origin of eukaryotes.” V. Gooch, Division of Science and Mathematics, University of Minnesota, Morris, offers lecture notes on endosymbiotic theory for a cell biology course. D. Keats, Department of Botany, University of the Western Cape, South Africa, offers a presentation about endosymbiosis theory titled “The origin of the eukaryote chloroplast” for a course on the classification and phylogeny of photosynthetic organisms. Georgia Geoscience Online makes available a student project by J. Brand on the endosymbiotic theory of eukaryote evolution, which was prepared for a course on historical geology. The 5 March 1999 issue of Science had a Review by M. Gray, G. Burger, and B. F. Lang titled “Mitochondrial evolution.” The presentation on eukaryotes in the Tree of Life includes a section titled “Symbioses, endosymbioses and the origin of the eukaryotic state.”

15. The March 1997 issue of Current Biology had an article by D. Hillis titled “Phylogenetic analysis.” B. Leander, Center for Ultrastructural Research, University of Georgia, makes available a student paper titled “The influence of molecular biology on phylogenetics,” which was prepared for a course on gene technology. The 18 July 2000 issue of the Proceedings of the National Academy of Sciences had an article by C. Woese titled “Interpreting the universal phylogenetic tree.” The National Human Genome Research Institute makes available a presentation (in Adobe Acrobat format) by E. Koonin titled “Comparison of complete genomes: Functional and evolutionary inferences,” which was prepared for a 1999 course covering important areas in the field of genome analysis. The 25 June 1999 issue of Science (a special issue on evolution) had a Review by W. F. Doolittle titled “Phylogenetic classification and the universal tree”; the 19 November 1999 issue had correspondence about the article titled “Lateral gene transfer, genome surveys, and the phylogeny of prokaryotes.”

16. TIGR provides a collection of links to genome sequencing projects completed and underway. GOLD (Genomes Online Database), sponsored by Integrated Genomics, Inc., is a Web resource providing links to information regarding complete and ongoing genome projects.

17. J. Felsenstein, Department of Genetics, University of Washington, provides a Web page for the PHYLIP package of programs for inferring phylogenies; also provided is an annotated collection of links to phylogeny programs. TREE-PUZZLE is a computer program to reconstruct phylogenetic trees from molecular sequence data by maximum likelihood. ClustalX is a Windows interface for the ClustalW multiple sequence alignment program. J. P. Gogarten provides lecture notes on using TREE-PUZZLE and ClustalW.

18. J. O. Andersson, W. F. Doolittle, and C. L. Nesbø are in the CIAR Program in Evolutionary Biology, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia.

