Early-branching gut fungi possess a large, comprehensive array of biomass-degrading enzymes


Science  18 Feb 2016:
DOI: 10.1126/science.aad1431


The fungal kingdom is the source of almost all industrial enzymes in use for lignocellulose bioprocessing. We developed a systems-level approach that integrates transcriptomic sequencing (RNA-Seq), proteomics, phenotype and biochemical studies of relatively unexplored basal fungi. Anaerobic gut fungi isolated from herbivores produce a large array of biomass-degrading enzymes that synergistically degrade crude, untreated plant biomass, and are competitive with optimized commercial preparations from Aspergillus and Trichoderma. Compared to these model platforms, gut fungal enzymes are unbiased in substrate preference due to a wealth of xylan-degrading enzymes. These enzymes are universally catabolite repressed, and are further regulated by a rich landscape of noncoding regulatory RNAs. Additionally, we identified several promising sequence-divergent enzyme candidates for lignocellulosic bioprocessing.

Lignocellulosic biomass from plant matter is an abundant, renewable starting material for biofuel and industrial chemical production (1, 2). Industrial-scale processes require fungal enzymes to convert biomass into fermentable sugars. However, lignin must be removed from crude biomass with costly pretreatment processes (1) to permit enzymatic degradation and sugar release (3). The need for multiple enzyme production processes increases this cost further, as genetically modified fungal platforms such as Trichoderma reesei and Aspergillus nidulans overproduce limited subsets of enzymes that are unable to independently digest even pretreated substrates completely to sugars (table S1) (4). Economical chemical production will require a versatile, unbiased platform to produce all enzymes needed to hydrolyze diverse lignocellulose feedstocks into fermentable sugars without pretreatment.

Microbes found in the digestive tract of large herbivores are attractive enzyme platforms for lignocellulose processing (5). Among these are Neocallimastigomycota (anaerobic gut fungi), the primary colonizers of biomass in ruminants and the earliest-branching non-parasitic fungi still living (6). Although they account for approximately 8% of the gut microflora, they degrade up to 50% of the untreated biomass through invasive growth and enzyme secretion (7–9). Neocallimastigomycota contain a diverse repertoire of biomass-degrading enzymes (table S1) that degrade a range of feedstocks with equal efficiency (Fig. 1), making them rich untapped sources for new lignocellulolytic enzymes. However, their strict anaerobic lifestyle, complex nutritional requirements, and culture recalcitrance have severely hindered early attempts at isolation, exploitation, and molecular characterization (10).

Fig. 1 Anaerobic fungi degrade crude biomass.

(A) Relative growth of gut fungal isolates on crystalline cellulose and crude C3/C4 bioenergy crops (see table S3 for specific growth rates). (B) Relative xylan activity of cellulose-precipitated gut fungal secretions and commercial Trichoderma (Celluclast) and Aspergillus (Viscozyme) preparations. (C) Relative hemicellulose:cellulose activity (xylan versus carboxymethylcellulose [CMC]) of cellulose-precipitated gut fungal secretions and commercial preparations. Data represent mean ± SEM of > 3 samples.

We isolated three previously uncharacterized cultures from the feces of different herbivorous mammals with varied diets. Microscopy and ITS1 sequencing (11) verified that the isolates were distinct species, representing separate genera of Neocallimastigomycota (Anaeromyces robustus, Neocallimastix californiae, and Piromyces finnis). Each grew on C3/C4 grasses at rates comparable to those on soluble substrates (Fig. 1A). Anaeromyces had a clear preference for glucose and grew more slowly on switchgrass (~20% glucose growth rate). In contrast, the monocentric fungi, Piromyces and Neocallimastix, displayed limited substrate preference, with growth rates varying no more than 20% from the mean growth rate across all substrates. Similarly, these fungi had slight growth advantages on crude lignocellulose, growing up to 20% faster on reed canary grass (Phalaris arundinacea), an invasive species and bioenergy crop (12), than on glucose.
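The "no more than 20% from the mean" criterion for substrate preference can be made concrete with a short calculation. The Python sketch below computes the largest relative deviation of substrate-specific growth rates from their mean. The function name and the example rates are hypothetical placeholders, not values from table S3.

```python
def max_relative_deviation(growth_rates):
    """Return the largest |rate - mean| / mean across all substrates.

    growth_rates maps a substrate name to a specific growth rate
    (any consistent unit). Illustrative helper, not from the study.
    """
    rates = list(growth_rates.values())
    if not rates:
        raise ValueError("at least one growth rate is required")
    mean_rate = sum(rates) / len(rates)
    if mean_rate <= 0:
        raise ValueError("mean growth rate must be positive")
    return max(abs(rate - mean_rate) / mean_rate for rate in rates)


# Hypothetical growth rates; NOT data from the study.
example_rates = {"glucose": 0.8, "switchgrass": 1.0, "reed canary grass": 1.2}
deviation = max_relative_deviation(example_rates)
print(f"largest deviation from the mean growth rate: {deviation:.0%}")
```

Under this reading, an isolate shows limited substrate preference when the returned value stays at or below 0.20.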

We collected and purified the biomass-degrading enzymes from fungal supernatants by exploiting the ability of many cellulases to bind to cellulose. The purified extracts were then tested for hydrolytic activity against several cellulosic substrates and analogs (fig. S1). Gut fungal secretions were active against all tested substrates, demonstrating cellulase, β-glucosidase, and hemicellulase activities comparable to those from engineered preparations of Trichoderma and Aspergillus. Neocallimastigomycota, and Piromyces in particular, displayed as much as a 300% increase in xylan degradation activity when compared to commercial Aspergillus enzyme formulations (Fig. 1B). Gut fungi degrade cellulose at similar rates, demonstrating little preference for cellulose or hemicellulose (Fig. 1C), in agreement with their enzymatic distribution from genomic sequencing (table S1). This comprehensive array of biomass-degrading enzymes, and their inherent synergy, broadens the range of substrates that can be degraded effectively, making gut fungi better suited than later-diverging species to degrade diverse polymers found within crude plant biomass. More importantly, it is this synergy, and not enzyme diversity, that is responsible for the superior biomass degradation abilities of Piromyces, making it an intriguing model system for further study.

Transcripts encoding biomass degrading enzymes comprise ~2% of the gut fungal transcriptomes (data S1 to S3) containing diverse functions classified into distinct lignocellulolytic glycosyl hydrolase (GH) and other carbohydrate-active enzyme (CAZyme) domains (13) (Fig. 2A). The majority of these transcripts also encode non-catalytic dockerin domains thought to mediate self-assembly of an extracellular catalytic complex or cellulosome (Fig. 2, B and C) for synergistic degradation of lignocellulose (14). The hydrolytic capabilities of gut fungi on crude biomass are well explained by the functional expansions of many CAZyme families (table S1 and fig. S3). Neocallimastigomycota are rich in hemicellulases (notably GH10) and polysaccharide deacetylases, which allow gut fungi to remove hemicellulose and access the energy-rich cellulose core of plant biomass (15) in the absence of pretreatment. This process is greatly aided by pectin removal (16) with a number of polysaccharide lyases, carbohydrate esterases and GH88s, allowing the anaerobic fungi to readily degrade an array of lignin-rich C3/C4 bioenergy crops without pretreatment (Fig. 1A).

Fig. 2 Anaerobic fungi contain a wealth of biomass degrading machinery.

(A) Distribution of cellulolytic carbohydrate-active enzyme (CAZy) transcripts and their regulatory antisense in Piromyces. CAZymes are bolded while antisense are indicated in parentheses and plotted in a lighter shade. Other refers to pectinases and accessory enzymes that separate cellulose and hemicellulose from other cell wall constituents. (B) A proposed model for an extracellular catalytic complex for cellulose degradation. (C) CAZyme composition of the putative extracellular complex. Each square represents a single enzyme that encodes a CAZyme fused to at least one dockerin domain. PD = polysaccharide deacetylase (acetylxylan esterase), CE = carbohydrate esterase (excluding pectinesterases), RL = rhamnogalacturonate lyase. (D) Identity of predominant secreted gut fungal CAZymes in the cellulose-precipitated fraction. Bands were excised and mapped to the transcriptome by tandem MS (fig. S4).

Functional annotations of the transcriptome were validated within Piromyces, Anaeromyces, and Neocallimastix via a proteomic survey (data S5 to S7). Proteins secreted from Piromyces in the presence of reed canary grass were isolated by cellulose precipitation (Fig. 2D and fig. S4) and individually mapped using mass spectrometry (17) to over 50 cellulolytic transcripts including 25 GH families enriched in or specific to the anaerobic fungal lineage (GH9, GH45, GH48, GH10, GH11). Also present were the full complement of endoglucanases, exoglucanases and β-glucosidases needed to fully depolymerize cellulose (GH5, GH6, GH9, GH45, GH48) and hemicellulases (GH10, GH11) (table S2), with many transcripts containing dockerin domains for extracellular fungal cellulosome formation.

A pervasive feature of gut fungal transcriptomes is long noncoding antisense transcripts (asRNA) (data S1 to S3). At least 11% of the Piromyces transcriptome is noncoding and complementary to putative targets involved in a range of catalytic and developmental pathways, including biomass degradation (Fig. 2A and fig. S2). asRNA is functionally enriched (hypergeometric test) in a number of GO processes such as cellulose catabolic process (pval = 0.02), ribosome biogenesis (pval = 10⁻¹¹), RNA-dependent DNA replication (pval = 6 × 10⁻⁶), and amino acid transmembrane transport (pval = 0.003) (data S4). These results imply a role for asRNA regulation in fungal cellulose catabolism, and suggest that noncoding asRNA may be as critical for function in early-branching Neocallimastigomycota as they are in higher fungal lineages (18–20).
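A hypergeometric enrichment test of this kind asks how likely it is to see at least the observed number of annotated transcripts among the antisense targets if targets were drawn at random from the transcriptome. The sketch below is a minimal standard-library illustration of that upper-tail probability, not the authors' pipeline. The function name and the counts in the usage example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population: total number of transcripts considered
    annotated:  transcripts in the population carrying the GO term
    sample:     transcripts in the tested subset (e.g. asRNA targets)
    hits:       tested transcripts that carry the GO term
    """
    if not 0 <= annotated <= population:
        raise ValueError("annotated must lie between 0 and population")
    if not 0 <= sample <= population:
        raise ValueError("sample must lie between 0 and population")
    lowest = max(hits, 0)
    highest = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(lowest, highest + 1)
    )
    return tail / total


# Hypothetical counts for illustration only; NOT values from data S4.
p_value = hypergeom_enrichment_pvalue(
    population=1000, annotated=40, sample=100, hits=9
)
print(f"enrichment p-value: {p_value:.4f}")
```

Exact integer arithmetic keeps the tail sum free of rounding error until the final division, which is convenient for small p-values.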

To assess how the activities of biomass degrading enzymes are coordinated, we grew Piromyces cultures on lignocellulose and perturbed the system with a small pulse of glucose to induce catabolite repression, collecting RNA samples until the glucose was consumed (Fig. 3A). 374 transcripts showed more than a 2-fold change in expression (p ≤ 0.01) with a third of these transcripts containing cellulolytic domains (Fig. 3B). Among these regulated cellulolytic transcripts were all the MS-validated proteins expressed under growth on reed canary grass (table S2), with the exception of GH45 and XylA. Transcripts associated with biomass degradation were almost exclusively repressed in response to glucose, as expected, and reflected activity trends from cellulose isolated secretions (21). Expression levels of these transcripts returned to initial baselines once glucose was fully consumed (Fig. 3C and fig. S5). The regulatory patterns of these transcripts also revealed coordinated expression signatures of biomass degradation through cluster analysis (22).
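The selection rule used here (more than a 2-fold change relative to the uninduced state, with p ≤ 0.01) can be sketched as a simple filter. In the Python sketch below, p-values are taken as given inputs because the negative binomial test itself is not reimplemented. The pseudocount, function names, and example records are assumptions made for illustration.

```python
from math import log2


def log2_fold_change(abundance, baseline, pseudocount=1.0):
    """log2 ratio of abundance to the t = 0 baseline.

    The pseudocount (an assumption, not from the study) avoids
    division by zero for transcripts with no baseline signal.
    """
    return log2((abundance + pseudocount) / (baseline + pseudocount))


def strongly_regulated(records, min_fold=2.0, max_p=0.01):
    """Keep transcripts with |fold change| >= min_fold and p <= max_p.

    records maps a transcript name to (baseline, abundance, p_value).
    Returns a dict of transcript name -> log2 fold change.
    """
    threshold = log2(min_fold)
    selected = {}
    for name, (baseline, abundance, p_value) in records.items():
        change = log2_fold_change(abundance, baseline)
        if abs(change) >= threshold and p_value <= max_p:
            selected[name] = change
    return selected


# Hypothetical abundances and p-values; NOT data from the study.
example = {
    "transcript_a": (99.0, 24.0, 0.001),  # repressed and significant
    "transcript_b": (99.0, 80.0, 0.001),  # significant but small change
    "transcript_c": (9.0, 39.0, 0.5),     # large change, not significant
}
for name, change in strongly_regulated(example).items():
    print(name, round(change, 2))
```

Negative values indicate repression relative to t = 0, which is the direction reported for most biomass-degrading transcripts after the glucose pulse.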

Fig. 3 Anaerobic fungal biomass-degrading machinery is catabolically repressed.

(A) Exponential cultures of Piromyces were pulsed with 5 mg glucose. mRNA and secretome samples were collected during glucose depletion (yellow region). (B) Cluster analysis of genes strongly regulated by glucose. Transcript abundance data were compared to uninduced samples at t = 0 to calculate the log2 fold change in expression. Transcripts with large, significant regulation are displayed (p ≤ 0.01, negative binomial distribution; ≥2-fold change). Clusters were manually annotated based on the most common protein domains/BLAST hits. (C) Relative expression (FPKM) of biomass-degrading enzymes (table S1) and their corresponding activity (cellulosome fraction) on carboxymethylcellulose (CMC) (21). Data represent the mean ± SEM of ≥2 replicates.

Hierarchical cluster analysis revealed that glucose-regulated genes performing a common function grouped into 21 distinct clusters or regulons (Fig. 3B). Due to the functional enrichment of these regulons, divergent transcripts of unknown function that co-regulate with biomass degrading transcripts may be novel biomass degrading enzymes for biotechnology – here, we identified 17 such candidates from Piromyces (table S4). Biomass-degrading regulons were either hemicellulose/pectin degrading and rapidly repressed within 40 min, or contained a broad array of biomass degrading enzymes that responded more slowly at 3.5 hours (Fig. 3B and data S8). The faster regulatory response of hemicellulases is conserved in higher fungi (23, 24) and thought to be an adaptation to lignocellulose structure. Hemicellulose and pectin surround cellulose; thus, cellulases act only after the hemicellulases and pectinases remove this outer coating. Coordinating this expression leads to quicker regulation of hemicellulases and pectinases than of cellulases given a common regulatory input, in agreement with observation. Candidate mediators of this response include conserved orthologs of the fungal master carbon regulator (CreABC), xylose-sensitive transcription factors (Xlr-1/XlnR) and other conserved cellulolytic activators such as ACE1-2, ClbR, and Clr1-2 (table S5). Upregulated clusters, in contrast, contained an array of metabolic and housekeeping genes consistent with logarithmic growth, and protein expression genes such as chaperonins and rRNA processing proteins that likely mediated the cellular response to the sugar pulse (data S8).

To better understand the regulation of key biomass degrading enzymes, we analyzed expression as a function of substrate. Piromyces showed substantial remodeling of the transcriptome as carbon source was varied (~10% of all transcripts) reflecting changes in the biomass degrading machinery and internal processes of gut fungal cultures (fig. S6 and data S9). Among these were 194 of the differentially regulated transcripts from the glucose perturbation experiment described above. Overall, a 2-fold change in the expression of biomass degrading enzymes occurred during the switch from glucose to complex reed canary grass. This trend was mirrored in the activity of cellulose-precipitated secretions (Fig. 4A). Discernible changes in the composition of the biomass degradation machinery also accompanied variations in expression level (fig. S7).

Fig. 4 Anaerobic fungi degrade complex substrates with increasingly diverse enzymes.

(A) Relative expression (FPKM) of biomass-degrading enzymes (table S1) and their activity (cellulosome fraction) on carboxymethylcellulose (CMC). (B) Normalized enrichment scores of positively enriched specified gene sets relative to growth on glucose. Gene sets that contain genes that are expressed more highly in a given substrate are indicated (False Discovery Rate, FDR ≤ 10%; Kolmogorov–Smirnov distribution). Enrichment scores are directly proportional to expression level. Gene sets indicated in bold are analyzed in aggregate and in subsets (unbolded sets below). asRNA = antisense RNA that target CAZy domains (Fig. 2A), Cellulosome = dockerin-tagged transcripts. Figures represent the mean ± SEM of ≥ 2 replicates.

Gene set enrichment analysis (GSEA) (25) of the transcriptomes confirmed that the number and functional diversity of CAZyme domains increased as a function of substrate complexity (Fig. 4B) with insoluble substrates (filter paper, Avicel and reed canary grass) inducing fungal cellulosomes for enhanced degradation. Non-hemicellulosic substrates (cellobiose, filter paper, and Avicel) upregulated unneeded hemicellulases such as GH10 suggesting a common regulatory network for diverse enzymes. Nonetheless, the additional enzymes needed to degrade crude reed canary grass are independently regulated. Our analyses also revealed shifts between enzyme types for similar reactions (e.g., GH5 to GH9 as a β-glucosidase) as a function of substrate, demonstrating a highly tailored catabolic response.
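GSEA scores each gene set with a Kolmogorov–Smirnov-like running sum over a list of genes ranked by differential expression. The sketch below shows the unweighted form of that statistic only; it is an illustration of the idea, not the authors' analysis, and it omits the normalization and permutation-based FDR steps. The function name and gene identifiers are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-like enrichment score for one gene set.

    ranked_genes: genes ordered from most up- to most down-regulated
    gene_set:     collection of gene names forming the set
    Walks down the ranking, stepping up at set members and down
    otherwise, and returns the largest deviation from zero.
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(hits) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("ranking must contain both members and non-members")
    running = 0.0
    extreme = 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_misses
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Hypothetical ranking and gene set; NOT data from the study.
ranking = ["gh10_a", "gh11_b", "gh5_c", "hsp_d", "rib_e", "rib_f"]
hemicellulases = {"gh10_a", "gh11_b"}
print(enrichment_score(ranking, hemicellulases))
```

A positive score means the set members cluster near the top of the ranking (higher expression on the test substrate than on glucose); a negative score means they cluster near the bottom.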

Among the gene sets tested were clusters identified in the glucose perturbation experiment (Fig. 4B). Protein expression clusters (Fig. 3B) that were regulated by glucose were enriched on insoluble substrates, reaffirming their role in mediating expression of lignocellulolytic enzymes. Another regulon encoding diverse hemicellulases and a handful of cellulases (2 – hemicellulases) was central to all growth phenotypes other than glucose. This enzyme prevalence, even on non-polymeric carbohydrates, suggests that they play an integral role in the recognition of insoluble substrates (26): in the absence of glucose these enzymes are expressed at low levels to partially solubilize available cellulosic materials that can be recognized to trigger a more specific catabolic response. Consistent with this hypothesis is the 6-fold upregulation (pval ~0.02, negative binomial test) of the conserved transcription factor XlnR on reed canary grass and Avicel to better recognize solubilized sugars and induce fungal xylan degradation. This response is further regulated by asRNA targeting CAZyme domains as evidenced by their functional enrichment on Avicel (pval = 0.003, FDR = 0.03) and reed canary grass cultures (pval ~ 0, FDR = 0.003). An independent analysis using a hypergeometric statistical test confirms that antisense transcripts targeting CAZyme domains (cellulose catabolic process GO annotation) are functionally enriched among the regulated transcripts (pval ≈ 0.01) (data S10). Identities of the expressed asRNA, however, are substrate-specific to modify the catabolic response through a number of mechanisms (27) to conserve cellular resources (table S6).

Overall, our results show that anaerobic gut fungi tailor their hydrolytic response to lignocellulose, implying a coordination in catalysis between all expressed enzymes that may inform industrial hydrolysis strategies. The clear transcriptional signatures of these biomass degrading enzymes provide a route to identify hundreds of novel, sequence-divergent enzyme candidates with commercial potential from anaerobic microbial communities (28).

Supplementary Materials

Materials and Methods

Figs. S1 to S7

Tables S1 to S6

References (30–43)

Data S1 to S10

References and Notes

  1. Acknowledgments: We thank C. Ngan, C. Daum, E. Lindquist and K. Barry for supervising library construction, transcriptome sequencing and analysis, and project management. We thank K. Lee and L. Choe for proteomics support; P. Weimer for lignocellulosic substrates; the Broad Institute core facilities for sequencing and computational assistance. Sequence and cluster descriptions are included in data S1 to S3, S8, and S9 in the supplementary materials. Raw sequence data and transcriptomic profiles reported in this study are deposited under BioProject Accession No. PRJNA 291757. Expression data are deposited in the NCBI’s Gene Expression Omnibus (29) and are accessible through GEO Series accession number GSE64834. K. Solomon, C. Haitjema, J. Henske, and M. O’Malley are inventors on patent applications (UCSB 2014-075 and UCSB 2015-334) filed by The Regents of the University of California related to production of anaerobic fungal enzymes. This work was supported by the Office of Science (BER), U.S. Department of Energy (DE-SC0010352), the U.S. Department of Agriculture (Award 2011-67017-20459), and the Institute for Collaborative Biotechnologies through grant W911NF-09-0001. A portion of this research was performed under the JGI-EMSL Collaborative Science Initiative and used resources at the DOE Joint Genome Institute and the Environmental Molecular Sciences Laboratory, which are DOE Office of Science User Facilities. Both facilities are sponsored by the Office of Biological and Environmental Research and operated under Contract Nos. DE-AC02-05CH11231 (JGI) and DE-AC05-76RL01830 (EMSL). Author Contributions: K.V.S., C.H.H., D.A.T., M.K.T., and M.A.O. planned the experiments. M.A.O., J.K.H., C.H.H., M.K.T., and K.V.S. isolated pure cultures of gut fungi. K.V.S., C.H.H., J.K.H., and M.A.O. performed growth and transcriptomic experiments. A.L. analyzed transcriptome sequencing. C.H.H., H.M.B., S.O.P., and A.T.W. performed proteomic analyses; S.P.G. performed enzyme characterization. K.V.S., D.B.R., J.K.H., A.R., I.G., S.O.P., and S.P.G. facilitated bioinformatics analyses of the datasets. K.V.S., D.A.T., and M.A.O. wrote the manuscript.