Research Article

A metagenomic strategy for harnessing the chemical repertoire of the human microbiome


Science  13 Dec 2019:
Vol. 366, Issue 6471, eaax9176
DOI: 10.1126/science.aax9176


Prospecting for drugs in the microbiome

The microbiome is an important source of natural products that can profoundly influence health and disease in the host. Sugimoto et al. constructed a modular, probabilistic strategy called MetaBGC to uncover biosynthetic gene clusters (BGCs) in human microbiome samples (see the Perspective by Henke and Clardy). The authors found geographic and strain-specific distributions of BGCs. Zeroing in on two type II aromatic polyketides, they identified the native organisms, reconstructed the BGCs in Streptomyces, and characterized the products. When expressed in Bacillus subtilis, the products resembled currently used anticancer drugs and antibiotics. These polyketides were not cytotoxic but had inhibitory activity against oral Gram-positive bacteria, which may reflect the niche and ecology of the originating organisms.

Science, this issue p. eaax9176; see also p. 1309

Structured Abstract


The human microbiome has been correlated with several health and disease conditions, but the molecular mechanisms underlying these correlations remain largely unexplored. Biologically active small molecules that are produced by the human microbiome offer an important route for exploring these mechanisms because they often mediate important microbe-microbe and microbe-host interactions. In bacterial genomes, small-molecule biosynthetic genes are usually encoded in distinct clusters known as biosynthetic gene clusters (BGCs), which enables scientists to use computational tools for recognizing them and predicting their products. Here, we present a hybrid strategy that uses computational and synthetic biology tools for discovering microbiome-encoded small molecules.


Previous efforts to discover small-molecule BGCs from the human microbiome relied mainly on analyzing genomic data of sequenced bacterial isolates. Although this approach has revealed the enormous and largely untapped diversity of microbiome-encoded BGCs, it fails to report on the biosynthetic potential of members of the human microbiome that have not yet been cultured or isolated: the majority of species in metagenomic sequencing data. Therefore, we sought to develop a computational algorithm that discovers small-molecule BGCs directly in complex metagenomic sequencing data of the human microbiome: metagenomic identifier of biosynthetic gene clusters (MetaBGC). First, high-performance probabilistic models for identifying homologs of a biosynthetic enzyme of interest are built specifically for use with complex metagenomic datasets (MetaBGC-Build). Next, these models are used to identify biosynthetic genes in thousands of metagenomic datasets of the human microbiome at the single-read level (MetaBGC-Identify). Finally, identified biosynthetic reads are quantified in the entire cohort of samples (MetaBGC-Quantify) and clustered into biosynthetic read bins on the basis of their abundance profiles across samples (MetaBGC-Cluster). To evaluate the utility of this approach, we used it to discover BGCs for type II polyketides, a clinically relevant class of small molecules, directly from metagenomic sequencing data of the human microbiome.
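The final step described above, binning biosynthetic reads by their abundance profiles across samples, can be illustrated with a minimal sketch. This is not the MetaBGC implementation: the read names, the toy counts, the `min_corr` threshold, and the greedy Pearson-correlation grouping are all assumptions chosen for illustration, standing in for whatever clustering method the authors actually used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / sqrt(var_x * var_y)


def cluster_reads(profiles, min_corr=0.9):
    """Greedily group reads whose profile correlates with a bin's first member.

    profiles: dict mapping a read id to its per-sample abundance list.
    min_corr: hypothetical threshold, not a value taken from the paper.
    """
    bins = []
    for read_id, profile in profiles.items():
        for members in bins:
            if pearson(profiles[members[0]], profile) >= min_corr:
                members.append(read_id)
                break
        else:
            bins.append([read_id])  # no existing bin matched: start a new one
    return bins


# Toy abundance table (illustrative numbers only): 4 reads across 4 samples.
profiles = {
    "read_a": [10, 0, 5, 0],
    "read_b": [20, 0, 11, 1],
    "read_c": [0, 7, 0, 3],
    "read_d": [1, 15, 0, 6],
}
bins = cluster_reads(profiles)
print(bins)
```

The idea the sketch captures is that reads originating from the same BGC should rise and fall together across samples, so co-abundance can group them even when the producing organism has never been sequenced.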


We applied MetaBGC to 3203 metagenomic samples of the human microbiome originating from Western (subjects from the United States, Spain, and Denmark) and non-Western (subjects from China and Fiji) populations and from every major human body site (gut, mouth, skin, and vagina). Overall, we discovered 13 complete BGCs that potentially encode type II polyketides; eight of these were encoded by diverse bacterial isolates of the human microbiome in a strain-specific manner and five could not be assigned to any sequenced species. Type II polyketide BGCs are found in three major human body sites (gut, mouth, and skin), and at least six of them are transcribed under host-colonization conditions and widely distributed in different human populations (e.g., 46% of healthy subjects from the United States carry at least one BGC in their gut, oral, or skin microbiome). Next, we selected two of the identified BGCs for experimental characterization, one from the oral microbiome and another from the gut microbiome. We used a synthetic biology strategy in which metagenomically discovered BGCs are genetically engineered and expressed in various heterologous hosts without the need for cultivation of the native producer. Using this strategy, we successfully purified and solved the structures of five new type II polyketide molecules as the products of the two characterized BGCs. Finally, we show that two of the discovered molecules exert strong antibacterial activities against members of the human microbiome that occupy the same niche as their producer, implying a possible role in microbe-microbe competition.
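A prevalence figure such as the share of subjects carrying at least one BGC can be computed as sketched below. The subject identifiers, body-site keys, BGC labels, and detection table are hypothetical placeholders, not data from the study; the sketch only shows the counting rule (a subject counts once if any BGC is detected at any body site).

```python
def prevalence(detections):
    """Fraction of subjects with at least one BGC detected at any body site.

    detections: dict mapping subject id -> dict of body site -> set of BGC ids.
    """
    carriers = sum(
        1 for sites in detections.values()
        if any(bgcs for bgcs in sites.values())
    )
    return carriers / len(detections)


# Hypothetical detection table for four subjects (illustrative only).
detections = {
    "subject_1": {"gut": {"bgc_x"}, "oral": set(), "skin": set()},
    "subject_2": {"gut": set(), "oral": set(), "skin": set()},
    "subject_3": {"gut": set(), "oral": {"bgc_y", "bgc_z"}, "skin": set()},
    "subject_4": {"gut": set(), "oral": set(), "skin": set()},
}
rate = prevalence(detections)
print(f"{rate:.0%} of subjects carry at least one BGC")
```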


We developed a hybrid strategy that combines computational and experimental techniques for discovering and characterizing small-molecule BGCs directly from complex datasets of the human microbiome. Using this strategy, we discovered that a clinically relevant class of molecules, type II polyketides, is widely encoded in the human microbiome and that human microbiome–derived polyketides resemble clinically used ones in structure and biological activity. Our approach is generally applicable to other classes of small molecules and can be used to systematically unveil the chemical potential of the human microbiome, a goal that is useful for both mechanistic microbiome explorations and drug discovery.

Small-molecule discovery from the human microbiome.

A hybrid computational and synthetic biology approach was developed to discover small molecules encoded by the human microbiome. Biosynthetic gene clusters were discovered at the single metagenomic read level from thousands of samples using a new algorithm, MetaBGC, and expressed in a synthetic biology platform to yield previously unknown small molecules.


Extensive progress has been made in determining the effects of the microbiome on human physiology and disease, but the underlying molecules and mechanisms governing these effects remain largely unexplored. Here, we combine a new computational algorithm with synthetic biology to access biologically active small molecules encoded directly in human microbiome–derived metagenomic sequencing data. We discover that members of a clinically used class of molecules are widely encoded in the human microbiome and that they exert potent antibacterial activities against neighboring microbes, implying a possible role in niche competition and host defense. Our approach paves the way toward a systematic unveiling of the chemical repertoire encoded by the human microbiome and provides a generalizable platform for discovering molecular mediators of microbiome-host and microbiome-microbiome interactions.
