Fermentation, Hydrogen, and Sulfur Metabolism in Multiple Uncultivated Bacterial Phyla


Science  28 Sep 2012:
Vol. 337, Issue 6102, pp. 1661-1665
DOI: 10.1126/science.1224041

This article has a correction. Please see:


BD1-5, OP11, and OD1 bacteria have been widely detected in anaerobic environments, but their metabolisms remain unclear owing to lack of cultivated representatives and minimal genomic sampling. We uncovered metabolic characteristics for members of these phyla, and a new lineage, PER, via cultivation-independent recovery of 49 partial to near-complete genomes from an acetate-amended aquifer. All organisms were nonrespiring anaerobes predicted to ferment. Three augment fermentation with archaeal-like hybrid type II/III ribulose-1,5-bisphosphate carboxylase-oxygenase (RuBisCO) that couples adenosine monophosphate salvage with CO2 fixation, a pathway not previously described in Bacteria. Members of OD1 reduce sulfur and may pump protons using archaeal-type hydrogenases. For six organisms, the UGA stop codon is translated as tryptophan. All bacteria studied here may play previously unrecognized roles in hydrogen production, sulfur cycling, and fermentation of refractory sedimentary carbon.

Sequencing of total DNA recovered directly from natural systems (metagenomics) often reveals previously unknown genes (1, 2) and has the potential to yield near-complete genomes suitable for metabolic and phylogenetic analyses (3–5). Numerous bacteria are known exclusively through cultivation-independent recovery of their ribosomal RNA (rRNA) genes and thus are important targets for this approach (6). Here, we sequenced DNA from three microbial communities from an acetate-amended aquifer to reconstruct genomes of organisms that may contribute to biogeochemical cycling in anoxic subsurface environments. We recovered 49 genomes from members of candidate phyla widely encountered in rRNA microbial surveys and found evidence for metabolic strategies not previously described in Bacteria. Overall, this study contributes new insights into the physiology and diversity across several major branches of the tree of life.

Groundwater samples (A, C, and D) were collected 5, 7, and 10 days after the start of acetate addition to an anoxic aquifer in Colorado, USA (fig. S1) (7). From each sample, we recovered microbial cells that passed through a 1.2-μm prefilter and were retained on a 0.2-μm filter. The samples were immediately frozen on site for DNA extraction and for mass spectrometry–based proteomics to verify the activity of organisms in situ (7).

Illumina sequences from DNA extracted from each sample were coassembled using strategies optimized for community genomics (7). We reconstructed 16S rRNA genes using EMIRGE (8) and confirmed them by clone library analysis (7). By linking 16S rRNA (Fig. 1 and fig. S2) to phylogenetically informative genes in the assembly (fig. S3 and table S1), we demonstrated genomic sampling of organisms affiliated with the phylum-level groups OD1, OP11, and BD1-5 (7). Another bacterial group formed a monophyletic clade but wandered in its relative phylogenetic position across protein-coding phylogenetic analyses; we refer to it as the Peregrines (PERs). On the basis of protein-coding trees and partial 16S rRNA gene information, we suggest that PERs may represent a previously unknown phylum-level branch within the bacterial domain (7).

Fig. 1

Maximum likelihood 16S rRNA gene phylogenetic tree showing the placement of the uncultivated phyla recovered by EMIRGE (bold text). The closest representative to each EMIRGE sequence is denoted in parentheses (clone library or Silva accession number). The bar chart indicates the relative abundance of each sequence in the A, C, and D samples (maximum is 11%), with bootstrap support >80 noted. The 16S rRNA tree from all organisms and details are provided in fig. S2 and (7).

Genome fragments from the coassembly (termed ACD) were clustered into 87 organism “bins” from emergent self-organizing map analysis of their tetranucleotide sequence composition (9). Organism abundance ratios between samples were used to further refine binning (7). Here, we focus on 49 genomes from BD1-5, OP11, OD1, and PER, relevant to carbon, sulfur, and hydrogen cycling (additional analyses of these and other genomes will be reported separately). From the inventory of conserved, single-copy genes (7, 10), we estimated that 21 of the 49 genomes were >90% complete (Table 1). Note that the majority of these organisms each represented ~1% of the assembled community (table S2).
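The two quantities used in this paragraph, tetranucleotide composition and completeness from single-copy genes, can be illustrated with a minimal Python sketch. It is not the authors' pipeline (see refs. 7, 9, and 10 for that). The function names, the use of raw overlapping 4-mer frequencies without reverse-complement merging, and the marker-gene names in the usage note are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides, in a fixed order so vectors are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_freqs(seq):
    """Return a 256-element vector of overlapping 4-mer frequencies.

    Windows containing characters other than A, C, G, T are ignored.
    Assumption: no reverse-complement merging or further normalization.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def estimate_completeness(found_markers, expected_markers):
    """Fraction of expected single-copy marker genes detected in a bin."""
    expected = set(expected_markers)
    if not expected:
        raise ValueError("expected_markers must not be empty")
    return len(expected & set(found_markers)) / len(expected)
```

As a usage example, `tetranucleotide_freqs(fragment)` gives one composition vector per genome fragment; such vectors are the kind of input a clustering method like an emergent self-organizing map operates on. `estimate_completeness({"rpsC", "rplB"}, ["rpsC", "rplB", "recA", "gyrB"])` returns 0.5 for a hypothetical four-gene marker set; a bin would count as >90% complete when this fraction exceeds 0.9.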

Table 1

Overview of genome recovery. Individual genome completion information is in fig. S8.


Previously, only 33 protein-coding genes had been reported for the OD1 phylum (11). We recovered more than 24,000 OD1 gene sequences (with an average of 1119 genes per genome) for 21 species, on genome fragments up to 358 kilobase pairs. Phylogenetically, we resolved multiple OD1 lineages and recognized one sublineage, OD1-i (fig. S3). For OP11, there had been only one fragmented partial sampling (~270 kilobases) from a single cell (16S rRNA OP11 class “unclassified”) (12). Here, we recovered more than 25,000 genes (with an average of 1337 genes per genome) for 19 organisms from OP11 classes I and WCBH1-64 (Fig. 1). There has been no previous genomic sampling of PER or BD1-5 (7).

BD1-5 genes predicted using the standard bacterial genetic code were anomalously short (7), with a low overall coding density. We deduced, and confirmed using proteomic analyses (Fig. 2), that they use genetic code 4, where the normal stop codon (UGA) is translated as tryptophan (W). Recoding of UGA to W in Bacteria is rare but has been noted within the Firmicutes and Proteobacteria phyla (some Mollicutes and Alphaproteobacteria). It is often associated with small genomes and low GC content and may be a consequence of genome reduction (13). Our code 4 genomes are estimated to be <2 Mb and have low but variable GC contents (27 to 43%). One genotype, ACD80, is phylogenetically very distantly related to the other BD1-5 organisms (fig. S3), and it may ultimately be recognized as a separate lineage (7). Future sampling may resolve whether code 4 usage is an ancestral trait or arose independently.
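A short Python sketch shows why gene predictions come out anomalously short when a code 4 genome is read with the standard code: every in-frame TGA is treated as a stop and splits the gene. The sketch is a simplification made for illustration, not the gene-calling method used in the study. It measures stop-free runs in a single reading frame and does not require a start codon; the function name and the toy sequence are hypothetical.

```python
# Stop codons in DNA notation (UGA in RNA corresponds to TGA in DNA).
STANDARD_STOPS = {"TAA", "TAG", "TGA"}  # standard bacterial code
CODE4_STOPS = {"TAA", "TAG"}            # genetic code 4: TGA encodes Trp


def stop_free_runs(seq, stops, frame=0):
    """Lengths (in codons) of stop-free stretches in one reading frame."""
    seq = seq.upper()
    runs = []
    current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


# Toy gene with one in-frame TGA (hypothetical sequence).
toy_gene = "ATG" + "GCT" * 3 + "TGA" + "GCT" * 4 + "TAA"
standard_runs = stop_free_runs(toy_gene, STANDARD_STOPS)  # split at TGA
code4_runs = stop_free_runs(toy_gene, CODE4_STOPS)        # one continuous ORF
```

Under the standard stop set the toy gene breaks into two runs of 4 codons each, whereas under code 4 it is read as a single 9-codon stretch. Applied genome-wide, this effect would shift the distribution of predicted gene lengths toward small values.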

Fig. 2

Detection of alternative coding. (A) Histogram of average open reading frame (ORF) length achieved with ORF predictions using the standard bacterial genetic code. The peak with unusually small gene lengths is associated with ORFs that should have been predicted using code 4. (B) Peptides identified by proteomics were mapped onto proteins with code 4 predictions to verify that UGA codes for tryptophan (W).
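The logic of panel (B) can be sketched in Python under stated assumptions: a measured peptide supports code 4 if it matches the protein translated with UGA read as tryptophan but does not match the standard-code translation. This is an illustration only, not the proteomics workflow of the study. The function names and the example sequence are hypothetical, and real peptide identification involves spectrum matching rather than substring search.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order; "*" marks a stop codon.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Genetic code 4 differs at one position: TGA (index 14) encodes W.
CODE4_AA = STANDARD_AA[:14] + "W" + STANDARD_AA[15:]


def codon_table(amino_acids):
    """Map each DNA codon to its amino acid for a 64-letter code string."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


def translate(seq, table):
    """Translate frame 0 of a DNA sequence; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def peptide_supports_code4(dna, peptide):
    """True if the peptide matches only the code 4 translation."""
    standard = translate(dna, codon_table(STANDARD_AA))
    code4 = translate(dna, codon_table(CODE4_AA))
    return peptide in code4 and peptide not in standard


toy_dna = "ATGGCTTGAAAAGGTTAA"  # hypothetical gene with an in-frame TGA
```

For the toy sequence, the code 4 translation is `MAWKG*` and the standard translation is `MA*KG*`, so a peptide such as `AWKG` that spans the recoded position is informative, whereas `MA` matches both translations and is not.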

Genomic analyses indicate that all 49 organisms were nonrespiring. Given the lack of a tricarboxylic acid (TCA) cycle, subunits of the reduced form of nicotinamide adenine dinucleotide (NADH) dehydrogenase, and most other electron-transport chain complexes including terminal oxidases, we infer a strictly anaerobic fermentation-based lifestyle. All have a glycolysis (Embden-Meyerhof-Parnas) pathway and convert pyruvate to acetyl–coenzyme A (acetyl-CoA) without pyruvate dehydrogenase; instead, they use pyruvate-formate lyase (PFL in PER) or pyruvate ferredoxin oxidoreductase (PFOR in OD1, OP11, and BD1-5) (Fig. 3). Most generate adenosine triphosphate (ATP) by converting acetyl-CoA to acetate via two enzymes (acetate kinase and phosphate acetyltransferase) (Fig. 3B). However, the OD1-i are predicted to use a single enzyme, acetyl-CoA synthetase (EC: (Fig. 3A) for ATP generation, which is common in some fermentative sulfur-reducing Archaea (e.g., Pyrococcus spp.) but rare in Bacteria (14). OD1-i may reoxidize NADH produced during glycolysis by converting pyruvate to d-lactate and acetyl-CoA to ethanol (Fig. 3A). Excreted fermentation end products could sustain Geobacteraceae and sulfate-reducing bacteria that bloom when acetate is added to the aquifer to promote uranium bioremediation (fig. S2) (15).

Fig. 3

Metabolic models with detected genes (white box), genes with proteins confirmed by proteomics (green box), and genes missing from pathways (red box). For full gene information for box numbers see table S4. (A) OD1-i may produce acetate, ethanol, lactate, and hydrogen as fermentation end products. Hydrogen could be generated via NiFe type 4 membrane-bound hydrogenases (red outline) and cytoplasmic type 3b sulfhydrogenase (yellow outline). The expanded view shows type 4 hydrogenase-generated PMF and ATP synthesis (red line) and possible hydrogen cycling to type 3b hydrogenases (black dashed line). FDH, formate dehydrogenase; Fdox and Fdred, oxidized and reduced ferredoxin, respectively; G3P, glyceraldehyde-3-phosphate; MHC, multiheme c-type cytochrome; NADP+, nicotinamide adenine dinucleotide phosphate; Q/QH, ubiquinone–reduced ubiquinone; RNF, Rhodobacter nitrogen-fixing complex. (B) PERs produce acetate and formate from pyruvate (no formate dehydrogenases or hydrogenases were identified). PER genomes lack detectable mechanisms for recovering additional energy via membrane-potential coupled to ATP synthase but may use RuBisCO, analogous to Archaea, for salvaging ATP by shunting nucleotides through central-carbon metabolism. PEP, phosphoenolpyruvate.

The BD1-5, OP11, and OD1 phyla contain members that augment substrate-level phosphorylation from fermentation by coupling a membrane proton-motive force (PMF) (independent of a canonical electron-transport chain) to ATP generation via a F1Fo ATP synthase. In some BD1-5 and OP11, this appears to rely on proton-pumping by membrane-bound pyrophosphatases (H+-PPases). Similar H+-PPases studied in syntrophic bacteria (16) translocate protons across the membrane for energy conservation via ATP synthase. We find proteomic support for H+-PPase (BD1-5 and OP11), alcohol dehydrogenase (OD1), glycolysis, PFOR or PFL, and ATP synthase from all lineages; this confirmed fermentation in situ (Fig. 3 and table S3).

Some fermentative anaerobes produce H2 to dispose of excess reductant (17). In OD1 and OP11, we identified three Fe-only hydrogenases and 23 NiFe hydrogenases (table S5). Phylogenetic analyses of the NiFe hydrogenase catalytic subunits revealed that 17 are type 3b cytoplasmic hydrogenases most closely related to those of fermentative, sulfur-reducing Thermococcales Archaea (figs. S4 and S5) (18). The type 3b hydrogenases may produce H2 during fermentation or H2S when polysulfide is available. Alternatively, they may consume H2 to produce the reduced form of nicotinamide adenine dinucleotide phosphate (NADPH) for anabolic metabolism (18, 19). Proteomic detection of hydrogenase-related proteins in conjunction with proteins for fermentation, as well as the abundance of organisms with this capacity when sulfide was detected in groundwater (Figs. 1 and 3A), supports a role in H2 or H2S generation rather than H2 consumption. We also identified type 4 membrane-bound hydrogenase genes in some OD1 genomes (fig. S5). The dual hydrogenase system may function in intracellular hydrogen cycling, as in Thermococcales (7, 20–22), where PMF and H2 are produced by the membrane-bound hydrogenase, and H2 is shuttled to the cytoplasmic hydrogenase where it is oxidized to produce NADPH (Fig. 3A).

In ACD80 and two PER genomes, we identified putative ribulose-1,5-bisphosphate carboxylase-oxygenase (RuBisCO) genes potentially involved in CO2 assimilation. All residues for catalytic activity are present, except one for substrate binding that is absent in the ACD80 version (fig. S6). This finding and structural modeling (7) suggest carboxylase-oxygenase activity. Our RuBisCO sequences clade separately from the bacterial type II sequences and are most closely related to a sequence from Methanococcoides burtonii (MBR) and a global ocean sampling (GOS)–derived sequence of unknown affiliation that may represent a II/III hybrid RuBisCO (Fig. 4) (23). We expanded the membership of this clade to eight sequences by identifying three genes in publicly available methanogenic archaeal genomes (Methanohalophilus zhilinae, Methanosaeta concilii, and Methanohalophilus mahii) (table S1C). Our data show that the RuBisCO hybrid occurs in Bacteria.

Fig. 4

Maximum likelihood phylogenetic tree constructed for the RuBisCO large subunit. Together, the new sequences resolve a novel intermediate II/III RuBisCO lineage (black). Bootstrap values >80 are shown. The position of the node for the II/III hybrid clade is strongly supported, as it is present in >92% of all trees examined during bootstrap analysis in this and prior analyses (100%) (23).

The MBR type II/III RuBisCO function is comparable to the type III archaeal RuBisCO (23). It does not function in the Calvin-Benson-Bassham (CBB) pathway but fixes CO2 and contributes to adenosine monophosphate (AMP) recycling (7, 24). We predict that the bacterial type II/III RuBisCO genes share this function. Homologs of DeoA and E2b2, key enzymes of the CO2-fixing AMP-recycling pathway (24), were identified (Fig. 3B), and no essential CBB cycle enzymes (e.g., phosphoribulokinase) were detected. Salvaging purine-pyrimidine products to produce RuBisCO generates 3-phosphoglycerates and perhaps pyruvate (ACD80), which could ultimately be fermented for ATP production (7).

From this study and rRNA gene survey information indicating prevalence in anoxic, organic carbon-rich environments (25, 26), we predict widespread fermentation-based metabolism in the 49 OD1, OP11, BD1-5, and PER genomes sampled here. We find it intriguing that several pathways for anoxic carbon, hydrogen, and sulfur cycling in these organisms share features previously documented only in Archaea. Some OD1 may contribute to sulfur cycling, on the basis of their previous association with sulfur-rich environments (11, 12, 26–28). Given the absence of genes for sulfur respiration in our near-complete OD1-i genomes, the link may involve hydrogenase-mediated sulfur-reductase activity. Notably, these insights were obtained through cultivation-independent analyses and have contributed more near-complete (>90%) genomic sampling for OD1 than is available for almost half of all of the genomically characterized bacterial phyla (29).

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S9

Tables S1 to S6

References (30–58)

Database 1

References and Notes

  1. Materials and methods are available as supplementary materials on Science Online.
  2. Acknowledgments: We thank F. Larimer for RuBisCO analyses input; M. Shah for proteomic support; C. Thrash, P. Hugenholtz, and J. Eisen for manuscript suggestions; and the U.S. Department of Energy (DOE) Subsurface Biogeochemistry Program for funding, the DOE Knowledgebase Program for funding GGkbase, and EMBO for a fellowship to I.S. The Rifle, Colorado, Integrated Field Research Center Project is managed by Lawrence Berkeley National Laboratory for the U.S. DOE (contract no. DE-AC02-05CH11231). Portions of this work were performed in the Environmental Molecular Science Laboratory, a DOE national scientific user facility at Pacific Northwest National Laboratory. The sequences were deposited in the National Center for Biotechnology Information Sequence Read Archive (accession no. SRA050978.1). This Whole-Genome Shotgun project has been deposited at GenBank under the accession no. AMFJ00000000. The version described in this paper is the first version, AMFJ01000000. Genomic and proteomic data can be accessed via