Research Article

Structure elucidation of colibactin and its DNA cross-links

Science  06 Sep 2019:
Vol. 365, Issue 6457, eaax2685
DOI: 10.1126/science.aax2685

Double warhead does DNA damage

Strains of the human gut bacterium Escherichia coli carrying the clb gene cluster produce a secondary metabolite dubbed colibactin and have been provocatively linked to colorectal cancer in some models. Colibactin has been difficult to isolate in full, but pieces of the structure have been worked out, including an electrophilic warhead. Xue et al. found that colibactin contains two conjoined warheads, which is consistent with its ability to alkylate and cross-link DNA. Chemical synthesis and comparison to cell coculture confirm the structure and properties of this unstable and potentially carcinogenic metabolite.

Science, this issue p. eaax2685

Structured Abstract


Research on the human microbiome has revealed extensive correlations between bacterial populations and host physiology and disease states. However, moving past correlations to understanding causal relationships between the bacteria in our bodies and our health remains a challenge. A well-studied human-bacteria relationship is that of certain gut Escherichia coli strains whose presence correlates with colorectal cancer in humans. These E. coli damage host DNA and cause tumor formation in animal models, and this genotoxic phenotype is thought to derive from a secondary metabolite—known as colibactin—that is synthesized by the bacteria. Because colibactin’s biosynthetic pathway is only partially resolved, the complete structure of colibactin has remained unknown for more than a decade. Similarly, because colibactin is unstable and is produced in vanishingly small quantities, it has yet to be isolated and characterized by means of standard spectroscopic methods.


Determining colibactin’s chemical structure and related biological activity will allow researchers to determine whether the metabolite is the causal agent underlying many colorectal cancers. To that end, we used an interdisciplinary approach to overcome the challenges that have impeded determination of colibactin’s structure. Inspired by an earlier study that showed that colibactin-producing bacteria cross-link DNA, we used DNA as a probe to isolate colibactin from bacterial cultures. Using a combination of isotope labeling and tandem mass spectrometry analysis, we deduced the structure of the colibactin residue when bound to two nucleobases. This information allowed us to then identify and characterize colibactin in bacterial extracts and to identify plausible biosynthetic precolibactin precursors. Last, we developed a method to recreate colibactin in the laboratory and thereby confirm these structure-function relationships.


Colibactin is formed through the union of two complex biosynthetic intermediates. This coupling generates a nearly symmetrical structure that contains two electrophilic cyclopropane warheads. We found that each of these residues undergoes ring-opening through nucleotide addition, a finding consistent with earlier studies of truncated colibactin derivatives and with the observation that colibactin-producing bacteria cross-link DNA. Using genome editing techniques, we showed that the production of colibactin’s precursor, precolibactin 1489, requires every biosynthetic gene in the colibactin gene cluster, which implicates it as a product of the long-elusive and now completed biosynthetic pathway. Because natural colibactin remains nonisolable, the chemical synthetic route to colibactin we developed will allow researchers to probe for causal relationships between the metabolite and inflammation-associated colorectal cancer.


These studies reveal the structure of colibactin, which accounts for the entire gene cluster encoding its biosynthesis, a goal that has remained beyond reach for more than a decade. The complete identity of colibactin has been a missing link in determining whether and how often colibactin is the causal agent underlying colorectal cancers. The interdisciplinary approach we used—marrying chemical synthesis, metabolomics, and probe-mediated natural product capture—may be applicable toward other spectroscopically intractable metabolites that are implicated in disease phenotypes but are currently undetected in the enormous chemical space encoded by the microbiome. Our studies represent a substantial advance toward our understanding of causative rather than correlative relationships between the gut microbiome and human health.

Molecular basis for colibactin-associated colorectal cancers.

(Left) Parallel, complementary approaches of total synthesis and tandem mass spectrometry–guided labeled DNA analysis identified the colibactin metabolite responsible for DNA cross-links. Elements highlighted in red are the two electrophilic cyclopropane motifs that are the site of DNA adduction. (Right) With structural information in hand, we can now assess the molecular pharmacophores responsible for colibactin-associated inflammation and carcinogenesis.


Colibactin is a complex secondary metabolite produced by some genotoxic gut Escherichia coli strains. The presence of colibactin-producing bacteria correlates with the frequency and severity of colorectal cancer in humans. However, because colibactin has not been isolated or structurally characterized, studying the physiological effects of colibactin-producing bacteria in the human gut has been difficult. We used a combination of genetics, isotope labeling, tandem mass spectrometry, and chemical synthesis to deduce the structure of colibactin. Our structural assignment accounts for all known biosynthetic and cell biology data and suggests roles for the final unaccounted enzymes in the colibactin gene cluster.
