Research Article

The human gut bacterial genotoxin colibactin alkylates DNA

Science  15 Feb 2019:
Vol. 363, Issue 6428, eaar7785
DOI: 10.1126/science.aar7785

Bacterial warhead targets DNA

The bacterial toxin colibactin causes double-stranded DNA breaks and is associated with the occurrence of bacterially induced colorectal cancer in humans. However, isolation of colibactin is difficult, and its mode of action is poorly understood. Wilson et al. studied Escherichia coli that contain the biosynthetic gene island called pks, which is associated with colibactin production (see the Perspective by Bleich and Arthur). They identified the DNA adducts that resulted from incubating pks+ E. coli in human cells. To overcome the lack of colibactin for direct analysis, mimics of the pks product were synthesized. From the resulting synthetic adenine-colibactin adducts, it became evident that alkylation via a cyclopropane “warhead” breaks the DNA strands. Similar DNA adducts were then identified in the gut epithelia of mice infected with pks+ E. coli.

Science, this issue p. eaar7785; see also p. 689

Structured Abstract


Members of the human gut microbiota have been implicated in the development and progression of colorectal cancer (CRC). These CRC-associated microorganisms may influence carcinogenesis through a variety of mechanisms, including the production of genotoxins. Colibactin is a genotoxic secondary metabolite made by organisms harboring the pks genomic island, including certain gut commensal Escherichia coli strains (pks+ E. coli). Transient infection of mammalian cells with pks+ E. coli causes cell cycle arrest, DNA double-strand breaks, and senescence. Moreover, colibactin-producing E. coli accelerate tumor progression in multiple mouse models of colitis-associated CRC and are overrepresented in patients with familial adenomatous polyposis and CRC. Despite colibactin’s strong links to cancer, the active genotoxic metabolite has eluded all isolation attempts, limiting our mechanistic understanding of this association.


Over the past decade, multiple complementary approaches have provided indirect information about colibactin’s chemical structure. Interestingly, the isolation and structural characterization of metabolites from mutant strains of pks+ E. coli revealed that colibactin likely contains a cyclopropane ring, a reactive structural motif found in DNA alkylating natural products. This led us and others to hypothesize that colibactin may covalently modify DNA. To obtain information about the active genotoxin’s chemical structure and its mode of action, we sought to identify and structurally characterize colibactin-DNA adducts from human cells infected with pks+ E. coli.


Using untargeted liquid chromatography–mass spectrometry–based DNA adductomics, we compared the DNA adducts present in mammalian cell lines transiently infected with either pks+ E. coli or a mutant strain missing the pks genes. We discovered two adenine adducts that were specific to the cells exposed to pks+ E. coli. These adducts were confirmed to be pks-associated by feeding isotopically labeled versions of known colibactin biosynthetic precursors to the E. coli–mammalian cell system. The pks-dependent adducts were also found in human cells exposed to clinical colibactin-producing E. coli isolates and in the colonic epithelial cells of mice monocolonized with pks+ E. coli. Chemical synthesis and in vitro DNA alkylation reactions enabled the preparation of an authentic standard of the adducts. Structural characterization revealed a mixture of two diastereomeric adducts that both contain a 5-hydroxypyrrolidin-2-one ring system with an attached N3-substituted adenine ring. These DNA adducts are generated from ring opening of a reactive, cyclopropane-containing electrophilic warhead, confirming the importance of this structural feature for colibactin’s in vivo activity. Because these adducts are too small to derive from the final colibactin structure, we hypothesize that they arise from decomposition of a larger, unstable colibactin-DNA interstrand cross-link. Using a CometChip assay, we detected interstrand cross-link formation in cells infected with pks+ E. coli at the same time point at which we identified the characterized DNA adducts, supporting this proposal.


Our results provide direct evidence that the gut bacterial genotoxin colibactin alkylates DNA in vivo, providing mechanistic insights into how colibactin may contribute to CRC. The ability of pks+ E. coli to generate DNA adducts in mammalian cells and in mice strengthens support for the involvement of colibactin in cancer development or progression. Bulky DNA adducts, especially interstrand cross-links, are often cytotoxic and can lead to mutations if not accurately repaired. Colibactin-mediated DNA damage and the ensuing genomic instability could thus potentially be an underlying mediator of colorectal carcinogenesis. The colibactin-derived DNA adducts we identified could serve as a biomarker of pks+ E. coli exposure and will ultimately help to address the question of whether DNA damage inflicted by colibactin-producing gut bacteria contributes to CRC development and progression in humans.

Gut commensal E. coli strains associated with CRC produce a DNA-alkylating genotoxin.

(Top) The cyclopropane ring found in pks-dependent metabolites led us to hypothesize that colibactin alkylates DNA. Me, methyl. (Bottom) Untargeted DNA adductomics revealed colibactin-derived DNA adducts in human cells exposed to colibactin-producing E. coli. These adducts also form in mice colonized with pks+ E. coli, confirming that colibactin alkylates DNA in vivo and strengthening its link to cancer.


Certain Escherichia coli strains residing in the human gut produce colibactin, a small-molecule genotoxin implicated in colorectal cancer pathogenesis. However, colibactin’s chemical structure and the molecular mechanism underlying its genotoxic effects have remained unknown for more than a decade. Here we combine an untargeted DNA adductomics approach with chemical synthesis to identify and characterize a covalent DNA modification from human cell lines treated with colibactin-producing E. coli. Our data establish that colibactin alkylates DNA with an unusual electrophilic cyclopropane. We show that this metabolite is formed in mice colonized by colibactin-producing E. coli and is likely derived from an initially formed, unstable colibactin-DNA adduct. Our findings reveal a potential biomarker for colibactin exposure and provide mechanistic insights into how a gut microbe may contribute to colorectal carcinogenesis.

The human gut harbors trillions of microorganisms capable of producing small molecules that mediate microbe-host interactions (1). For example, certain gut commensal and extraintestinal pathogenic strains of Escherichia coli and other Proteobacteria produce colibactin, a genotoxin of unknown structure implicated in colorectal cancer pathogenesis. These organisms harbor a 54-kb biosynthetic gene cluster that encodes a nonribosomal peptide synthetase–polyketide synthase (NRPS-PKS) assembly line (pks island), which has been implicated in colibactin biosynthesis (Fig. 1A) (2). E. coli containing the pks island (pks+ E. coli) cause DNA double-strand breaks (DSBs) in human cell lines and in animals (3), accelerate colon tumor growth under conditions of host inflammation (4, 5), and are found with increased frequency in inflammatory bowel disease, familial adenomatous polyposis, and colorectal cancer patients (6–8). Despite these intriguing links to human disease, our understanding of colibactin’s chemical structure and biological activity is limited because this natural product has eluded isolation.

Fig. 1 pks+ E. coli synthesize cyclopropane-containing metabolites that may alkylate DNA.

(A) The pks genomic island. Open reading frames encoding nonribosomal peptide synthetase (NRPS, purple), polyketide synthase (PKS, brown), hybrid NRPS-PKS (blue), peptidase (ClbP, green), aminomalonate synthesis and transfer (gray), and other (black) enzymes are highlighted. (B) Selected cyclopropane-containing candidate precolibactins isolated and structurally characterized from pks+ ΔclbP E. coli, which lacks clbP, including lactam, pyridone, and macrocyclic scaffolds. Me, methyl. (C) Illudin S and (+)-duocarmycin A are DNA alkylating metabolites that contain a cyclopropane ring.

Colibactin has been exceptionally challenging to isolate and structurally characterize. For example, colibactin’s genotoxic activity is contact-dependent and not observed when cells are treated with pks+ E. coli culture supernatants or cell lysates (2). It is also currently unknown how colibactin is transported into mammalian cells. Attempts to directly identify colibactin using comparative metabolite analyses have been unsuccessful, indicating that the active genotoxin may be unstable and/or recalcitrant to isolation. To gain information about colibactin’s structure, we and others have isolated and characterized nongenotoxic, pks-associated metabolites (9–15) from mutant strains of pks+ E. coli missing a critical peptidase enzyme (ClbP), which removes an N-myristoyl-d-asparagine “prodrug motif” from a late-stage biosynthetic precursor and is required for genotoxicity (16–18) (Fig. 1B). These metabolites, termed “precolibactins,” are unlikely to be precursors to the mature colibactin because their synthesis requires only a subset of the biosynthetic machinery known to be essential for genotoxic activity. Notably, several precolibactins contain a cyclopropane ring, a structural feature found in DNA alkylating natural products, such as the illudins (19) and duocarmycins (20) (Fig. 1C).

DNA alkylating agents act as electrophiles toward DNA bases, forming covalent modifications known as DNA adducts (21). The discovery of cyclopropane-containing precolibactins has led to the hypothesis that colibactin’s mode of action involves DNA alkylation, but there is limited direct evidence to support this idea (10, 11). Reacting a cyclopropane-containing precolibactin with linearized plasmid DNA revealed small amounts of a putative higher–molecular weight adduct by gel electrophoresis, leading to an initial proposal that colibactin cross-links DNA (10). Recent in vitro work using synthetic “colibactin mimics,” compounds designed based on partial biosynthetic information, showed that the cyclopropane ring in a putative ClbP cleavage product can be attacked by a thiol nucleophile and is necessary for these molecules to shear purified DNA (22). When artificially dimerized, these colibactin mimics appear to cross-link DNA as assessed by gel electrophoresis (22). pks+ E. coli lacking both the nucleotide excision repair protein UvrB (23) and a self-resistance protein encoded in the pks island (ClbS) exhibit severe autotoxicity and impaired growth (24), providing indirect support for DNA alkylation and repair of the resulting lesions in colibactin-producing E. coli strains. ClbS can hydrolyze the cyclopropane ring of a synthetic colibactin mimic, further implicating this functional group in colibactin’s activity (25). However, experimental proof that colibactin itself alkylates DNA remains elusive, because colibactin-DNA adducts have not been structurally characterized or identified in biologically relevant settings.

Untargeted DNA adductomics can identify unknown DNA adducts

Owing to the challenges associated with isolating the active genotoxin from E. coli, we instead sought to identify the in vivo product(s) of colibactin-mediated DNA damage. We hypothesized that detecting and characterizing colibactin-DNA adducts generated in human cells treated with pks+ E. coli would yield direct information about the active genotoxin’s chemical structure and the molecular basis for its DNA damaging activity in a biologically relevant setting. Although targeted liquid chromatography–mass spectrometry (LC-MS)–based methodologies exist to identify previously characterized DNA adducts in cells, detecting unknown DNA adducts represents a considerable challenge because of the low abundance of these modifications and the extraneous false-positive ion signals that derive from the complex matrices of biological samples (26). Indeed, preliminary attempts to identify colibactin-DNA adducts using standard comparative metabolite profiling approaches failed to reveal differences in hydrolyzed DNA samples from HeLa cells treated with either E. coli BW25113 pBeloBAC (pks−) or BACpks (pks+) strains. To overcome this difficulty, we envisioned exploiting a newly developed, untargeted MS-based DNA adductomics approach (26) to identify colibactin-DNA adducts in cells exposed to pks+ E. coli.

LC-MS3 DNA adductomics identifies adducts in hydrolyzed DNA samples using high-resolution accurate-mass data-dependent constant neutral-loss monitoring of 2′-deoxyribose [116.0474 unified atomic mass unit (u)] or one of the four DNA bases (guanine, 151.0494 u; adenine, 135.0545 u; thymine, 126.0429 u; and cytosine, 111.0433 u) (Fig. 2A) (27). Accurate mass measurement of an observed DNA adduct can allow for the determination of its elemental composition, and the triggered MS2 and MS3 fragmentation spectra provide additional structural information about the modified base. We first used this DNA adductomic approach to detect characterized DNA adducts induced by illudin S, a cytotoxic agent that alkylates DNA upon cellular metabolic activation (28). LC-MS3 DNA adductomic analysis of hydrolyzed DNA obtained from HeLa cells exposed to either illudin S or dimethyl sulfoxide (DMSO) identified a known illudin-derived adduct (28) with a mass/charge number ratio (m/z) of 384.2030 [M+H]+ only in the illudin-treated cells, confirming the utility of this approach for adduct detection in our model (fig. S1).
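The neutral-loss criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the instrument method used in the study: the function name, the 5 ppm tolerance, and the assumption of singly charged ions (so that an m/z gap equals a neutral mass loss) are all choices made here for the example. The loss masses are those quoted in the text; the 405.1227 peak is a hypothetical fragment obtained by subtracting the adenine mass from 540.1772.

```python
# Neutral-loss masses (u) as quoted in the text above.
NEUTRAL_LOSSES = {
    "2'-deoxyribose": 116.0474,
    "guanine": 151.0494,
    "adenine": 135.0545,
    "thymine": 126.0429,
    "cytosine": 111.0433,
}


def find_neutral_losses(precursor_mz, fragment_mzs, tol_ppm=5.0):
    """Return (loss name, fragment m/z) pairs whose precursor-fragment gap
    matches one of the monitored neutral losses.

    Assumes singly charged ions; tol_ppm is an assumed tolerance that is
    scaled to the precursor m/z.
    """
    tol = precursor_mz * tol_ppm * 1e-6
    hits = []
    for frag in fragment_mzs:
        gap = precursor_mz - frag
        for name, loss in NEUTRAL_LOSSES.items():
            if abs(gap - loss) <= tol:
                hits.append((name, frag))
    return hits


# Illustrative MS2 peak list. 405.1227 is hypothetical (540.1772 - 135.0545);
# 387.1118 and 229.0970 are fragment ions reported later in the article.
spectrum = [405.1227, 387.1118, 229.0970]
hits = find_neutral_losses(540.1772, spectrum)
print(hits)
```

In a data-dependent acquisition, a hit like this would be the event that triggers the MS3 scan; fragments that do not sit one monitored loss below the precursor are ignored.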

Fig. 2 High-resolution accurate-mass LC-MS3 DNA adductomic analysis identifies DNA adducts in HeLa cells and mice exposed to pks+ E. coli.

(A) Structural features of DNA adducts and detection by neutral-loss monitoring. (B) Full scan extracted ion chromatogram (EIC) of DNA adducts 1 and 2 (m/z 540.1772) in HeLa cells exposed to colibactin-producing E. coli and negative controls (HeLa cells exposed to non-colibactin-producing pBeloBAC E. coli, HeLa cells alone, or when no DNA was present). (C) DNA adductomic analysis of adducts 1 and 2. (1) Full scan EIC of DNA adducts 1 and 2 (m/z 540.1772). (2) Signal corresponding to the data dependent MS2 events [retention time (RT) = 16.87 and 17.45 min]. (3) Signal corresponding to MS3 events (RT = 16.88 and 17.46 min) triggered by the neutral loss of adenine. (4) MS2 mass spectrum resulting from fragmentation of m/z 540.1772, which triggered the MS3 event. (D) Flowchart of the experiment detecting DNA adducts 1 and 2 in mouse colonic epithelial cells. (E) Bacterial load in the feces of mice colonized with pBelo (n = 3) or pks+ E. coli (n = 8) for 2 weeks. CFU, colony forming units. (F) EIC counts of DNA adducts 1 and 2 per μg of DNA in colonic epithelial cells isolated from mice colonized with pBelo (n = 3) or pks+ E. coli (n = 8) for 2 weeks. EIC counts were determined by area-under-the-curve integrations of the most abundant MS2 fragmentation ion (m/z 387.1118 ± 0.0008) of the adducts 1 and 2 precursor ion (m/z 540.1772). (G) Representative MS/MS EIC of DNA adducts 1 and 2 (m/z 387.1118 [M+H-Ade-H2O]+), the most abundant fragment ion of m/z 540.1772. Each symbol in (E) and (G) represents an individual mouse; error bars represent mean ± SEM; ****P < 0.0001 (unpaired Student’s t test).

Discovery of DNA adducts in mammalian cells and mice exposed to pks+ E. coli

We next investigated whether this method could identify putative colibactin-DNA adducts. First, we isolated DNA from HeLa cells transiently infected with either pks− or pks+ E. coli. After DNA hydrolysis, we performed a comparative, untargeted LC-MS3 DNA adductomic screen of these samples. This analysis revealed two putative DNA adducts (1 and 2) that accumulated only in the cells treated with pks+ E. coli (Fig. 2B). We also detected these adducts in a colonic epithelial cell line exposed to pks+ E. coli and in HeLa cells exposed to native colibactin-producing strains (figs. S2 and S3). Adducts 1 and 2 eluted at 16.92 and 17.50 min, respectively, and exhibited a m/z of 540.1765 [M+H]+ (Fig. 2C). Both peaks triggered MS3 fragmentation events upon observation of the neutral loss of adenine (135.0542 u) in the MS2 fragmentation spectra (Fig. 2C), indicating that these compounds were adenine adducts. To confirm adducts 1 and 2 were pks-associated, we repeated the cell infection assays described above but included individual stable isotope–labeled amino acids known to be used by the pks NRPS-PKS assembly line and integrated into pks-associated metabolites (10, 12, 13). These experiments revealed that the expected building blocks l-[2,3-13C2]Ala, l-[1-13C]Met, [1,2-13C2]Gly, and l-[1-13C]Cys were incorporated into the two adducts (figs. S4 to S8). While this work was in revision, Herzon and co-workers identified a putative DNA adduct with the same mass in plasmid DNA exposed to pks+ E. coli in vitro, thus further confirming our findings (29).
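As a rough guide to what incorporation of a labeled building block looks like in the mass spectrum, the sketch below computes the m/z expected when a given number of carbons are replaced by 13C. The function name and the singly charged default are assumptions made for illustration. Only the single-label case for l-[1-13C]Cys is checked against a value given in the article (m/z 541.1805 in the Fig. 3 caption); how many labeled carbons from the other precursors are retained in the adducts is not stated in the main text.

```python
# Mass difference between 13C and 12C, in unified atomic mass units.
C13_MINUS_C12 = 1.0033548


def labeled_mz(unlabeled_mz, n_13c, charge=1):
    """Expected m/z after n_13c carbon atoms are replaced by 13C.

    charge=1 is an assumption matching the [M+H]+ ions discussed in the text.
    """
    return unlabeled_mz + n_13c * C13_MINUS_C12 / charge


# One 13C from l-[1-13C]Cys on the calculated [M+H]+ of adducts 1 and 2.
cys_labeled = labeled_mz(540.1772, 1)
print(f"{cys_labeled:.4f}")
```

The result agrees, within rounding, with the labeled-adduct m/z quoted in the Fig. 3 caption.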

Next, we sought to determine whether adducts 1 and 2 could be detected in mice exposed to colibactin-producing E. coli. Germ-free wild-type C57BL/6J mice were inoculated with either pks− or pks+ E. coli. After 2 weeks, colonic epithelial cells were harvested, DNA was isolated from the cells, and adduct formation was assessed using LC–tandem mass spectrometry (LC-MS/MS) (Fig. 2D). Both strains colonized the mice to a similar extent as assessed by fecal colony counts (Fig. 2E and table S1). We detected adducts 1 and 2 only in the mice colonized with pks+ E. coli (Fig. 2, F and G, fig. S9, and table S2). These results show that the colibactin-mediated DNA damage observed in human cell lines also occurs within a genetically intact host in the absence of exogenous carcinogens or inflammatory mediators. Furthermore, these data suggest that these adducts are biomarkers for pks+ E. coli exposure. Overall, this experiment provides the first direct support for DNA alkylation playing a critical role in colibactin’s genotoxicity in vivo.

Structural characterization of the colibactin-derived DNA adducts

Further analysis of the LC-MS3 data revealed preliminary information about the structure(s) of adducts 1 and 2. The high-resolution accurate-mass measurement of m/z 540.1765 [M+H]+ yielded a molecular formula of C23H25N9O5S (calculated, 540.1772) with 16 degrees of unsaturation. MS2 fragmentation of 1 and 2 in high-resolution mode displayed major fragment ions of m/z 522.1665 [M+H-H2O]+, 387.1118 [M+H-Ade-H2O]+, 344.1060, and 229.0970. The shared fragmentation spectra indicated that these compounds were likely stereoisomeric (fig. S10). Using this information and our MS2 fragmentation data, we proposed potential structures for the in vivo–derived colibactin-DNA adducts that were analogous to a recently characterized, chemically unstable “model colibactin”–thiol adduct but that contained an extra hydroxyl group (fig. S11) (25). However, these adducts’ low abundance in cells precluded further isolation and structural characterization efforts.
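The formula assignment above can be checked with a short calculation. The sketch below recomputes the [M+H]+ m/z and the degrees of unsaturation for C23H25N9O5S and the ppm error of the observed value; the function names are illustrative, and the atomic masses are standard monoisotopic values rounded to seven decimals.

```python
# Monoisotopic atomic masses (u) and the proton mass.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}
PROTON = 1.00727646


def protonated_mz(formula):
    """m/z of the singly protonated ion [M+H]+ for an element-count dict."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in formula.items())
    return neutral + PROTON


def degrees_of_unsaturation(formula):
    """Rings plus pi bonds: C - H/2 + N/2 + 1 (divalent O and S do not count)."""
    return formula.get("C", 0) - formula.get("H", 0) / 2 + formula.get("N", 0) / 2 + 1


def ppm_error(observed, calculated):
    """Signed mass error of an observed m/z in parts per million."""
    return (observed - calculated) / calculated * 1e6


adduct = {"C": 23, "H": 25, "N": 9, "O": 5, "S": 1}
calc_mz = protonated_mz(adduct)
dbe = degrees_of_unsaturation(adduct)
error = ppm_error(540.1765, calc_mz)
print(f"calculated m/z {calc_mz:.4f}, unsaturation {dbe:g}, error {error:.1f} ppm")
```

The calculated value reproduces the 540.1772 and the 16 degrees of unsaturation given in the text, and the observed 540.1765 lies within about 1.3 ppm of it.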

To elucidate the structures of adducts 1 and 2, we accessed authentic standards by chemically synthesizing new colibactin mimics and reacting them with calf-thymus DNA (ctDNA) (Fig. 3A). We prepared carboxylic acid–containing cyclopropane 3 in seven steps using a route developed to access other colibactin mimics (figs. S12 to S28 and S29A) (22). On the basis of the proposed structures of the adducts identified in pks+ E. coli–treated HeLa cells, 3 could react with ctDNA to give 1 and 2 directly. However, cyclopropane 3 generated only trace amounts of detectable adducts when incubated with ctDNA (fig. S30) and minimally sheared DNA at 1 mM concentration (fig. S31A). Hypothesizing that an unfavorable electrostatic interaction between the carboxylate of 3 and the negatively charged phosphate backbone of DNA greatly reduced its reactivity, we masked the carboxylic acid as an ethyl ester (figs. S12 to S19, S29B, and S32 to S35). Cyclopropane 4 was ~100-fold more potent than 3 in a DNA shearing assay and induced both G2/M cell cycle arrest and DNA DSBs in treated HeLa cells (fig. S31B and S36 to S38). LC-MS analysis of a ctDNA reaction with 4 revealed three major products of m/z 552.2125 [M+H]+ (5) and m/z 568.2076 [M+H]+ (6 and 7) (figs. S30 and S39). MS2 fragmentation confirmed that these compounds were adenine adducts that only differed by the presence of one oxygen atom (figs. S40 and S41). The “nonoxidized” adduct 5 (m/z 552.2135) was unstable at room temperature and slowly converted to a pair of “oxidized” adducts (6 and 7) (m/z 568.2081) over the course of 2 days (figs. S42 and S43). Further analysis of the fragmentation data indicated that 6 and 7 contained a hydroxyl group and a fragment ion (m/z 229.0970) identical to that of the in vivo–derived adducts 1 and 2 (fig. S44). Because 6 and 7 were stable and possessed an MS2 fragmentation pattern matching those of the adducts detected in pks+ E. coli–treated HeLa cells, we targeted these compounds for isolation and structural characterization.

Fig. 3 Comparing DNA adducts generated in vivo with a synthetic standard confirms their chemical structures.

(A) In vitro DNA alkylation reaction used to generate a synthetic standard of adducts 1 and 2. PLE, pig liver esterase. (B) Chemical structures of diastereomeric DNA adducts 6 and 7 showing key two-dimensional NMR correlations that support the hemiaminal and N3-adenine assignments. (C) EICs of the l-[1-13C]Cys–labeled in vivo adducts 1 and 2 (m/z 541.1805), co-injection of synthetic standard (m/z 540.1772) with in vivo adducts 1 and 2 (m/z 541.1805 and residual unlabeled m/z 540.1772), and synthetic standard (m/z 540.1772).

We isolated and purified 6 and 7 from multiple small-scale ctDNA alkylation reactions to give approximately 1 mg of pure material. Analysis by one-dimensional (1H) and two-dimensional (gCOSY, gHSQC, gHMBC, and ROESY) nuclear magnetic resonance (NMR) (figs. S45 to S52 and tables S3 and S4), as well as DP4 computational analysis (30) (fig. S53 and tables S5 to S7), revealed a 1:1 diastereomeric mixture of a single adduct that contains a 5-hydroxypyrrolidin-2-one ring system with an attached N3-substituted adenine ring (Fig. 3B). The N3-adenine and hemiaminal substitution assignments were supported by key 1H-13C heteronuclear multiple bond (HMBC) and through-space 1H-1H correlations. Interestingly, the observed preference for N3-adenine alkylation resembles that of other cyclopropane-containing DNA alkylating agents (28, 31).

We hypothesized that the structures of the monoadducts obtained in vitro (6 and 7) and the adducts identified from pks+ E. coli–treated cells (1 and 2) differed only in the presence of an ester versus carboxylic acid functional group on the basis of their shared MS2 fragmentation patterns. To confirm the structure of the adducts generated in vivo (1 and 2), ester-containing adducts 6 and 7 were hydrolyzed using pig liver esterase to give an authentic standard of the corresponding carboxylic acids (fig. S54). LC-MS1 and LC-MS2 analysis revealed that this standard possessed the same m/z, retention time, and MS2 fragmentation pattern as the adducts (1 and 2) identified in pks+ E. coli–treated mammalian cells (Fig. 3C and fig. S55), thereby confirming their chemical structures. On the basis of the observed reactivity of the nonoxidized adduct 5 in vitro and the reactivity of related synthetic compounds (22, 25), we propose that the 5-hydroxypyrrolidin-2-one ring found in adducts 1 and 2 likely arises from oxidation of an initial, chemically unstable enamide-containing adduct (fig. S56).

Successful characterization of monoadducts 1 and 2 confirms that exposure of host cells to pks+ E. coli results in DNA alkylation. This finding provides direct information about colibactin’s structure, resolves questions surrounding the active genotoxin’s electrophilic cyclopropane and mode of activation, and allows us to propose a mechanism for how exposure to colibactin leads to DNA damage. The precolibactins isolated to date possess a variety of cyclopropane-containing scaffolds, including linear structures (13), an unsaturated lactam (10–12), a pyridone (13, 14), and a macrocycle (15). The structures of 1 and 2 strongly suggest that the cyclopropane in colibactin is conjugated to an α,β-unsaturated imine and is not embedded within a linear framework or pyridone. Furthermore, our findings provide in vivo evidence that cleavage of precolibactin by peptidase ClbP generates the active colibactin genotoxin by triggering an intramolecular cyclodehydration to form an α,β-unsaturated imine, enhancing the reactivity of the cyclopropane toward DNA (Fig. 4A) (10, 11, 22).

Fig. 4 The characterized DNA adducts may derive from a colibactin-DNA interstrand cross-link.

(A) Proposed model for colibactin DNA alkylation and formation of DNA adducts 1 and 2. Ade, adenine. (B) Arrayed microwell comets from untreated HeLa cells and HeLa cells treated with pBelo or pks+ E. coli for 1 hour at 37°C [multiplicity of infection (MOI) = 1000]. Scale bar indicates 100 μm. (C) Quantification of the percentage of DNA in tail from untreated HeLa cells and HeLa cells treated with pBelo or pks+ E. coli for 1 hour at 37°C (MOI = 1000). Data points and error bars represent mean ± SEM, respectively, of three independent experiments. Student’s t test was performed to compare each treated dose to the corresponding negative controls (*P < 0.05).

Exposure to pks+ E. coli generates interstrand cross-links in cells

Multiple lines of evidence suggest that adducts 1 and 2 are unlikely to represent the immediate product of DNA alkylation with the mature colibactin genotoxin and instead arise from degradation of a larger monoadduct and/or cross-linked adduct. First, the carboxylic acid–containing model colibactin 3 is a poor DNA alkylating agent. Second, although 3 could be generated using a subset of the colibactin biosynthetic enzymes, the entire NRPS-PKS assembly line is essential for genotoxicity (2). Finally, recent work reported the accumulation of a putative interstrand cross-link in DNA incubated with pks+ E. coli as assessed by gel electrophoresis and found that cell lines deficient in interstrand cross-link repair were more sensitive to colibactin-producing E. coli (32).

To explore the correlation between DNA cross-linking and the formation of 1 and 2, we used a modified alkaline single-cell gel electrophoresis assay (comet assay) (33) to assess the presence of interstrand cross-links in cells exposed to pks+ E. coli. This assay utilizes the high degree of strand breaks induced by γ radiation to measure interstrand cross-link formation. Unlike monoadducts, interstrand cross-links inhibit the denaturation of DNA under alkaline conditions and therefore decrease the level of DNA migration, reducing the ability to detect radiation-induced strand breaks. We first tested whether CometChip, a high-throughput platform and more robust version of the comet assay (34, 35), could detect interstrand cross-links generated by cisplatin, a bifunctional cross-linking agent. Cells were exposed to varying concentrations (0 to 200 μg/ml) of cisplatin and then analyzed for cross-links 6 hours after drug treatment. As expected, cisplatin caused a significant decrease in DNA migration in treated cells, thus confirming the utility of this assay (fig. S57).

Next, we applied CometChip to investigate whether interstrand cross-links are formed in HeLa cells exposed to colibactin at the same time point at which we detected adducts 1 and 2. HeLa cells were infected with pks+ E. coli for 1 hour, and cross-link formation was measured immediately afterward. In pks+ E. coli–treated HeLa cells exposed to γ irradiation [8 gray (Gy)], we detected a significant level of cross-links as indicated by the 32% decrease in DNA tail moment compared to both controls (Fig. 4, B and C). By contrast, we observed minimal strand breaks in non-γ-irradiated pks+ E. coli–treated cells. Thus, these results indicate that interstrand cross-links are present in the pks+ E. coli–treated HeLa cells from which we isolated 1 and 2. However, we could not identify any masses corresponding to putative interstrand cross-links in our untargeted DNA adductomics datasets, suggesting that they may be unstable to our isolation, purification, or MS conditions.
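The cross-link readout in this assay is a relative decrease: the comet tail signal of irradiated, treated cells is compared with that of irradiated control cells. The arithmetic can be sketched as follows. The function name and both input values are hypothetical placeholders chosen to illustrate a 32% decrease; they are not measurements from this study.

```python
def percent_decrease(control_irradiated, treated_irradiated):
    """Percentage drop in comet tail signal of irradiated treated cells
    relative to irradiated control cells.

    A larger drop suggests more interstrand cross-links holding the
    strands together during alkaline denaturation.
    """
    if control_irradiated <= 0:
        raise ValueError("control tail signal must be positive")
    return (1.0 - treated_irradiated / control_irradiated) * 100.0


# Hypothetical tail-signal values (arbitrary units), for illustration only.
decrease = percent_decrease(50.0, 34.0)
print(f"{decrease:.0f}% decrease")
```

A value near zero would indicate that treatment did not restrict DNA migration, which is the outcome expected for monoadducts alone.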

Possible origins of colibactin-derived DNA adducts

Knowledge of colibactin biosynthesis suggests a mechanism for how DNA adducts or cross-links degrade to form 1 and 2. Bioinformatic analyses of the NRPS-PKS assembly line and isolation of precolibactins have indicated that colibactin likely contains a bithiazole (13, 14) and/or a related ring system with an α-aminoketone inserted between two thiazole rings (Fig. 1B) (15). However, searching our untargeted DNA adductomics datasets did not reveal masses corresponding to putative bithiazole or α-aminoketone–containing colibactin-DNA adducts. Because adducts 1 and 2 contain only one thiazole heterocycle, we propose that they derive from oxidative C–C cleavage of a larger, α-aminoketone–containing colibactin-DNA monoadduct or cross-link (Fig. 4A). Whereas bithiazole rings are stable, α-aminoketones undergo oxidative C–C bond cleavage in the presence of reactive oxygen species to give carboxylic acids (36). Therefore, the structures of 1 and 2 strongly suggest that the active genotoxin contains an α-aminoketone, a positively charged functional group that may enhance colibactin’s affinity for DNA and thus increase its potency (22). Further experiments will be needed to clarify the nature of the interstrand cross-link, the origin and timing of the proposed oxidative degradation event, and how the specific lesion(s) generated by colibactin lead to the formation of DSBs.


We have presented direct evidence that the gut bacterial genotoxin colibactin alkylates DNA in vivo. The ability of pks+ E. coli to generate DNA adducts in mammalian cells and in mice strengthens support for the involvement of colibactin in cancer development or progression because misrepaired monoadducts and cross-linked adducts may generate mutations in oncogenes or tumor suppressor genes, contributing to tumorigenesis (37, 38). Our findings will enable efforts to decipher the molecular details of this process. Importantly, this work has also uncovered a candidate metabolite biomarker of colibactin exposure and cancer risk. The ability to directly assess whether exposure to colibactin has occurred in animal models and human patients will help address the critical question of whether pks+ E. coli contribute to colorectal carcinogenesis in patient cohorts (39). Finally, this study showcases the use of untargeted DNA adductomics for the identification and elucidation of unknown gut microbial-derived DNA modifications, highlighting the power of emerging analytical techniques in studying human microbiota metabolites and host-microbiota interactions.

Materials and methods summary

Our methods for the preparation of E. coli strains for cell infection and isotope labeling, bacterial monocolonization of germ-free mice, identification of colibactin-derived DNA adducts by DNA adductomics, synthetic preparation of model colibactins and synthetic standards of the colibactin-derived DNA adduct, structural characterization of compounds by one- and two-dimensional NMR methods and DP4 computational analysis, and assessment of interstrand cross-link formation using the CometChip assay are provided in the supplementary materials. Additional information about our protocols, including references to the supplementary materials, can be found throughout the main text.

Supplementary Materials

Materials and Methods

Figs. S1 to S57

Tables S1 to S7

References (40, 41)

References and Notes

Acknowledgments: We thank G. Heffron and C. Sheehan (Harvard Medical School, Boston, MA) for help with NMR experiments; M. Volpe and N. Braffman (Balskus lab, Harvard University) for assistance with DP4 computational analysis; L. Zhang (Balskus lab, Harvard University) for helping with Sephadex LH-20 chromatography; and L. Zha, V. M. Rekdal, A. Waldman, T. Ng, and N. Koppel (Harvard University) for helpful discussions. We thank W. G. Tilly (MIT) for providing the TK6 cell line and C. Woo (Harvard University) for providing the HT29 cell line. Funding: Financial support was provided by the Packard Fellowship for Science and Engineering (E.P.B.), the Damon Runyon-Rachleff Innovation Award (E.P.B.), and National Institutes of Health grants R01 CA208834 (E.P.B.), R01 CA154426 (W.S.G.), and R01 ES022872 (L.D.S.). This work was also supported by National Institute of Environmental Health Sciences grant R44 ES024698 (B.P.E.) and Center for Environmental Health Sciences grant P30 ES002109 (B.P.E.). Salary support for P.W.V. was provided by the U.S. National Institutes of Health and National Cancer Institute (grant R50-CA211256). Mass spectrometry was carried out in the Analytical Biochemistry Shared Resource of the Masonic Cancer Center, supported in part by the U.S. National Institutes of Health and National Cancer Institute (Cancer Center Support Grant CA-77598). M.R.W. acknowledges support from the American Cancer Society–New England Division Postdoctoral Fellowship PF-16-122-01-CDD. C.A.B. is the Dennis and Marsha Dammerman fellow of the Damon Runyon Cancer Research Foundation (DRG-2205-14). Mention of commercial products does not constitute endorsement. Author contributions: M.R.W., Y.J., S.B., and E.P.B. conceived of the study. M.R.W., Y.J., P.W.V., C.A.B., L.N., B.P.E., W.S.G., S.B., and E.P.B. designed the experiments. M.R.W., Y.J., P.W.V., A.S., P.D.B., A.C., C.A.B., E.C., and L.N. performed the experiments. L.D.S., B.P.E., W.S.G., S.B., and E.P.B. supervised the research. All authors contributed to data analysis and the writing and editing of the manuscript. Competing interests: E.P.B. has consulted for Merck, Novartis, and Kintai Therapeutics and is on the Scientific Advisory Board of Kintai Therapeutics. W.S.G. has consulted for Pfizer Consumer Health, Merck, Janssen, and BiomX and is on the Scientific Advisory Boards of Evelo Biosciences, Kintai Therapeutics, and Leap Therapeutics. B.P.E. is an inventor on a patent describing the CometChip assay (WO 2009/134768 A1). Data and materials availability: The experimental data obtained in this study are available as supplementary materials.
