Metagenome Mining Reveals Polytheonamides as Posttranslationally Modified Ribosomal Peptides


Science  19 Oct 2012:
Vol. 338, Issue 6105, pp. 387-390
DOI: 10.1126/science.1226121


It is held as a paradigm that ribosomally synthesized peptides and proteins contain only l-amino acids. We demonstrate a ribosomal origin of the marine sponge–derived polytheonamides, exceptionally potent, giant natural-product toxins. Isolation of the biosynthetic genes from the sponge metagenome revealed a bacterial gene architecture. Only six candidate enzymes were identified for 48 posttranslational modifications, including 18 epimerizations and 17 methylations of nonactivated carbon centers. Three enzymes were functionally validated, which showed that a radical S-adenosylmethionine enzyme is responsible for the unidirectional epimerization of multiple and different amino acids. Collectively, these complex alterations create toxins that function as unimolecular minimalistic ion channels with near-femtomolar activity. This study broadens the biosynthetic scope of ribosomal systems and creates new opportunities for peptide and protein bioengineering.

The marine sponge Theonella swinhoei, a composite organism containing numerous uncultivated bacterial symbionts, is a rich source of bioactive metabolites (1). Among these, polytheonamides A and B (Fig. 1A) are particularly noteworthy for their structural complexity (2). Of the 19 different amino acids that constitute these unusual 48-residue peptides, 13 are nonproteinogenic. The compounds were therefore assumed to be products of a nonribosomal peptide synthetase (NRPS), a large multifunctional protein complex that can generate peptides with unusual residues (3, 4). However, polytheonamides are larger than other known NRPS-synthesized secondary metabolites, and the size of the NRPS machinery required to assemble 48 residues prompted us to consider whether a ribosomal pathway (5, 6) could be involved instead. Such a pathway would have to introduce multiple d-configured and C-methylated residues (6–8). To test the ribosomal hypothesis, a seminested polymerase chain reaction (PCR) protocol (fig. S1) was used with primers designed on the basis of a hypothetical precursor peptide consisting of proteinogenic l-configured amino acids (Fig. 1A). Sequencing revealed a succession of codons that precisely corresponded to an unprocessed polytheonamide precursor; this supports a ribosomal origin. To identify the surrounding DNA region, 920,000 clones of a library of T. swinhoei total DNA (9) were screened in a pool-dilution strategy (10), yielding a single cosmid, pTSMAC1. The few other clones detected were repeatedly lost during isolation. To expand the upstream sequence, we amplified a 7-kilobase portion directly from the partially enriched pool by long-range PCR, using primers based on sequences of the cosmid vector and the pTSMAC1 insert. The authenticity of the amplified region was subsequently confirmed by repeated PCR and sequencing with metagenomic DNA.

Fig. 1

Polytheonamides: structures, genes, and biosynthetic model. (A) Polytheonamides A and B (shown) differ in the configuration of the sulfoxide moiety in residue 44. The sulfoxide arises from spontaneous oxidation during polytheonamide isolation. Residues are numbered on the basis of the typical notation for polytheonamides (2). The core peptide sequence is indicated by bold letters, with the color red denoting posttranslational epimerization. All other biosynthetic transformations during maturation of the core peptide are colored as follows: orange, C-methylation; purple, N-methylation; blue, hydroxylation; and green, dehydration (C). (B) Map of the polytheonamide (poy) biosynthetic gene cluster. (C) Model for the formation of the N-acyl terminus from threonine.

The assembled DNA region contained 11 additional genes, clustered around the initially identified open reading frame (ORF) (Fig. 1B). Nine ORFs, which we termed poy genes (poyA-I), form an operon, as apparent from the short or often absent intergenic regions. This polycistronic architecture, as well as the presence of Shine-Dalgarno motifs and lack of detectable introns, suggests a bacterial endosymbiont as the origin of the cloned region. Beyond the gene cluster, the presence of an upstream prokaryotic hicAB-type toxin-antitoxin system, numerous genes and gene fragments resembling bacterial transposition elements, and two downstream genes encoding a polyketide synthase of as-yet-unknown function further support this hypothesis (table S1). The 3′ terminus of poyA consists of 48 codons that match a complete polytheonamide precursor. It is noteworthy that the encoded sequence suggests that the three 3-hydroxyvaline units originate from two different residues: residue 16 from threonine (Thr) by C-methylation and residues 23 and 31 from valine (Val) by hydroxylation (Fig. 1A). In addition to the propeptide-encoding core region, we identified an unusually long 5′ leader sequence in poyA that exhibits homology to nitrile hydratases. This region does not resemble any component of characterized ribosomal pathways. However, Haft and co-workers (11) recently discovered similar leaders in several taxonomically diverse bacteria by in silico genome analysis and postulated the existence of a new natural-product family. Polytheonamides are likely the first characterized members of such a family for which we propose the name proteusins (from Proteus, a Greek shape-shifting sea god).

To generate polytheonamides from proteinogenic residues, four hydroxylations, 18 epimerizations, and at least 21 methylations are necessary (Fig. 1A). Considering the large number of posttranslational modifications, astonishingly few enzyme candidates were identified for these steps. These are PoyB, PoyC, and PoyD, homologous to members of the radical S-adenosylmethionine (rSAM) superfamily (12, 13); PoyE, homologous to SAM-dependent methyltransferases; PoyI, homologous to Fe(II)/α-ketoglutarate oxidoreductases; and PoyF, homologous to the dehydratase domain (14) of LanM-type lantibiotic synthetases. Besides these six enzymes, the cluster encodes other proteins likely involved in regulation, transport, and proteolytic removal of the leader region (PoyJGH) on the basis of homologies. No homology was found for PoyK, and it is unclear whether it belongs to the pathway. The limited number of maturation factors suggests that individual enzymes convert positionally and structurally diverse residues. For example, C-methylation occurs on at least five different units, whereas at least four types of residues are epimerized. Conversely, identical residues are processed in different ways, such as Val and asparagine (Asn), which each appear as three structural variants. Thus, the intriguing question arises of how this biosynthetic machinery reconciles substrate promiscuity with regiospecificity.

To obtain initial insights into poy gene functions and to test whether enzymes indeed act iteratively, individual ORFs were expressed in Escherichia coli. Numerous attempts to produce the precursor PoyA resulted in, at best, minute amounts of insoluble protein. However, after codon optimization and coexpression trials using various gene combinations, we found that protein yields and solubility of PoyA dramatically improved in the presence of the rSAM protein PoyD (fig. S2). PoyD exhibits close similarity to only a small number of uncharacterized proteins, mostly from hypothetical proteusin gene clusters. Mass spectrometric (MS) analysis of purified PoyA did not reveal a mass shift or apparent modification of the protein sequence. Further analysis of PoyA involving acid hydrolysis, derivatization, and chromatographic separation of MS-verified amino acids revealed the presence of epimerized asparagines and valines within the PoyA core sequence, confirming that PoyD is capable of epimerizing most, and perhaps all, of the d-amino acids observed in polytheonamides A and B (figs. S3 to S5). The unidirectional l- to d-amino acid epimerization observed with PoyD is in contrast to known nonradical amino acid racemases, which generate an equilibrium mixture of epimers (6). Next, we constructed five triple expression strains by adding either poyB, C, E, F, or I to poyA and poyD. Unlike all other triple expressions, the strain coexpressing poyF produced PoyA with a mass of 18 D less than expected (observed 17,090 D, calculated 17,108 D) (Fig. 2A and figs. S6 to S7), which suggested a loss of water and supported the dehydratase function of PoyF. The modified residue was subsequently identified as Thr97 by liquid chromatography–tandem mass spectrometry (LC-MS/MS) (Fig. 2B and figs. S8 to S9).
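The mass-shift argument can be checked with simple arithmetic. The following is a minimal sketch, not the authors' analysis: it takes the observed and calculated PoyA masses quoted above, uses standard monoisotopic atomic masses (an assumption about which values to plug in, since the reported protein masses are rounded), and asks how many water losses best explain the difference.

```python
# Standard monoisotopic atomic masses in daltons (not taken from the paper).
H = 1.007825
O = 15.994915
WATER = 2 * H + O  # about 18.011

observed = 17090.0    # PoyA coexpressed with poyD and poyF (value from the text)
calculated = 17108.0  # expected mass of unmodified PoyA (value from the text)

shift = observed - calculated          # negative means mass was lost
n_waters = round(-shift / WATER)       # nearest whole number of water losses
residual = -shift - n_waters * WATER   # what one water loss does not explain

print(f"shift = {shift:.0f}, water losses = {n_waters}, residual = {residual:+.3f}")
```

Because the reported masses are rounded to whole daltons, the small residual is not meaningful; the point is only that a single dehydration accounts for the shift.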

Fig. 2

Mass spectrometric identification of poyF-catalyzed dehydration. To identify the modified residue, PoyA purified from coexpression with poyDF was trypsinized and the resulting peptides analyzed by LC-MS/MS. The C-terminal tryptic peptide corresponding to residues 76 to 145 of PoyA was found to contain the dehydration site at residue Thr97. (A) Deconvoluted electrospray ionization (ESI)–MS spectrum of PoyA exhibiting a mass shift of –18 D. (B) ESI-MS/MS (ESI–tandem MS) spectrum of tryptic peptide 76-145 from PoyA [precursor ion [M+6H]6+ at mass/charge ratio (m/z) 1130.560; calculated m/z 1130.562 (monoisotopic)] showing a series of b-type fragment ions.
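The precursor-ion values in the caption can be related to a neutral peptide mass and to a mass accuracy. This is a hedged sketch using the usual relation m/z = (M + z × proton mass) / z; the function names are illustrative, and the proton mass is a standard constant rather than a value from the paper.

```python
PROTON = 1.007276  # mass of a proton in daltons (standard constant)

def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass M for an [M + zH]z+ ion."""
    return charge * (mz - PROTON)

def ppm_error(observed_mz: float, calculated_mz: float) -> float:
    """Relative deviation of observed from calculated m/z, in parts per million."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6

obs_mz, calc_mz, z = 1130.560, 1130.562, 6  # values from the Fig. 2B caption

print(f"neutral mass ~ {neutral_mass(obs_mz, z):.3f}")
print(f"mass error   ~ {ppm_error(obs_mz, calc_mz):.2f} ppm")
```

Under these assumptions the observed and calculated values agree to within about 2 ppm, which is the kind of agreement that supports assigning the ion to the dehydrated tryptic peptide 76-145.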

The identification of the Thr residue as part of the PoyA core peptide sequence is corroborated by the presence of an N-terminally adjacent, highly conserved GG motif that was previously proposed as the cleavage site in homologous precursors (11). Comparison of the polytheonamide structure with the peptide core suggests that the Thr residue is converted to the unusual N-terminal acyl unit by a remarkable biosynthetic sequence involving PoyF-catalyzed dehydration, formal t-butylation, and spontaneous formation of the 2-oxo moiety (15) after hydrolytic cleavage of the enamide (Fig. 1C). We are not aware of precedent for the introduction of a t-butyl group at nonactivated carbon positions in biology or synthetic chemistry. This transformation may be accomplished by four successive methylations catalyzed by one or more of the rSAM candidates PoyB and PoyC, both of which contain a cobalamin-binding motif sometimes found within rSAM methyltransferases (12). Although radical methylation has been previously observed (16), this use for extensive modification of peptide structures is notable. In addition, close homologs of poyB and poyC occur in several other ribosomal peptide gene clusters of unknown function (17, 18), which suggests similar modifications in other natural products.

The activity of a third gene encoded in the polytheonamide gene cluster, poyE, was detected only after codon optimization and coexpression with two copies of the gene, using 3- or 7-day inductions at 16°C. In-depth MS analysis of the coexpressed PoyA revealed a suite of peptides whose masses increased in steps of 14 mass units (0 to 8 modifications), correlating perfectly with all expected positions for asparagine N-methylation and indicating iterative N-methyltransferase activity (figs. S10 to S11). Unlike with NRPS-derived peptides, N-methylation of ribosomal natural products is rare; only N-terminal methylation of the cytotoxin cypemycin has been reported with genetic and biochemical verification (6, 19). N-methylation of a single Asn has also been observed in cyanobacterial phycobiliproteins; however, PoyE bears little sequence homology to these enzymes (20). These data highlight the iterative activities of tailoring enzymes to generate a complex natural-product architecture. Convincing gene candidates for the remaining two transformation types are present in the poy cluster, which suggests that C-methylation and hydroxylation are also iterative. Determining whether the three remaining clustered biosynthetic genes are sufficient to complete the structural maturation of polytheonamides will require further study.
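The 14-unit ladder described above follows from each methylation replacing one hydrogen with a methyl group, a net gain of CH2. The sketch below illustrates the expected series for 0 to 8 methylations; the base mass is a hypothetical placeholder (the unmodified peptide masses are not given in this text), and the atomic masses are standard monoisotopic values.

```python
# Standard monoisotopic atomic masses in daltons (not taken from the paper).
C = 12.0
H = 1.007825
CH2 = C + 2 * H  # net gain per methylation: CH3 added, H removed

def methylation_ladder(base_mass: float, max_methyl: int = 8) -> list[float]:
    """Expected masses for 0..max_methyl methylations of one peptide."""
    return [base_mass + n * CH2 for n in range(max_methyl + 1)]

base = 5000.0  # hypothetical unmodified peptide mass, for illustration only
for n, mass in enumerate(methylation_ladder(base)):
    print(f"{n} methylation(s): {mass:.3f}")
```

A series of peaks spaced by about 14.016 mass units, as opposed to a single shifted peak, is what distinguishes partial, stepwise processing from an all-or-nothing modification.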

Mature polytheonamides form minimalistic unimolecular ion channels in cell membranes (21). This mode of action motivated us to investigate possible antibacterial effects in more detail. We found polytheonamide B to be active against Gram-positive bacteria, with minimal inhibitory concentrations in the microgram-per-milliliter range (table S3). The peptide rapidly depolarized the bacterial cytoplasmic membrane, simultaneously decreasing the membrane potential and intracellular K+ contents, which is consistent with the formation of transmembrane ion channels (fig. S12) (21, 22).

In conclusion, we provide evidence for a bacterial origin of a sponge-derived peptide natural product. This discovery supports the hypothesis that many, if not most, bioactive natural products from sponges are endosymbiont-derived (23) and highlights the value of symbiotic bacteria as a rich source of unusual biochemistry (24–28). Although polytheonamides are currently the only attributed proteusin members, a small number of compounds exhibit structures that suggest a close biosynthetic relation. These are the sponge-isolated yaku’amides (29) and discodermins (30), all of which contain residues with additional C-methyl groups and d-configured α-carbon atoms. The use of ribosomal machinery to generate products containing d-amino acids and other modifications offers promise for the artificial engineering of peptides, peptidomimetics, and proteins with new structural and functional properties.

Supplementary Materials

Materials and Methods

Figs. S1 to S12

Tables S1 to S3

References (31–34)

References and Notes

  1. Acknowledgments: We thank M. Josten for antibiotic activity testing, M. Engeser and C. Sondag for MALDI measurements, A. Schneider for high-performance liquid chromatography measurements, U. Deppenmeier and P. Schweiger for Western blot analysis, and T. Gulder for comments on the manuscript. This work was supported by grants from the German Federal Ministry of Education and Research (BMBF) (GenBioCom: 0315581I) to J.P., the Deutsche Forschungsgemeinschaft to J.P. (PI 430/8-1, PI 430/9-1, FOR 854, SFB 813) and H.-G.S. (FOR 854), the Japan Society for the Promotion of Science to J.P. and S.M., and fellowships by the German Academic Exchange Service (DAAD) to A.R.U., the Humboldt Foundation to B.I.M., and the Human Frontier Science Program to M.F.F. The data reported in this paper are presented in the supplementary materials and archived in GenBank under accession no. JX456532. J.P., C.G., M.F.F., A.R.U., and M.J.H. are inventors on European Patent Application no. 11 180 107.2, filed by Rheinische Friedrich-Wilhelms-Universität Bonn through PROvendis GmbH, titled “Biosynthetic gene cluster for the production of peptide-protein analogs.”