Synthetic Genetic Polymers Capable of Heredity and Evolution


Science  20 Apr 2012:
Vol. 336, Issue 6079, pp. 341-344
DOI: 10.1126/science.1217622


Genetic information storage and processing rely on just two polymers, DNA and RNA, yet whether their role reflects evolutionary history or fundamental functional constraints is currently unknown. With the use of polymerase evolution and design, we show that genetic information can be stored in and recovered from six alternative genetic polymers based on simple nucleic acid architectures not found in nature [xeno-nucleic acids (XNAs)]. We also select XNA aptamers, which bind their targets with high affinity and specificity, demonstrating that beyond heredity, specific XNAs have the capacity for Darwinian evolution and folding into defined structures. Thus, heredity and evolution, two hallmarks of life, are not limited to DNA and RNA but are likely to be emergent properties of polymers capable of information storage.

The nucleic acids DNA and RNA provide the molecular basis for all life through their unique ability to store and propagate information. To better understand these singular properties and discover relevant parameters for the chemical basis of molecular information encoding, nucleic acid structure has been dissected by systematic variation of nucleobase, sugar, and backbone moieties (1–7).

These studies have revealed the profound influence of backbone, sugar, and base chemistry on nucleic acid properties and function. Crucially, only a small subset of chemistries allows information transfer through base pairing with DNA or RNA, a prerequisite for cross-talk with extant biology. However, base pairing alone cannot conclusively determine the capacity of a given chemistry to serve as a genetic system, because hybridization need not preserve information content (8). A more thorough examination of candidate genetic polymers’ potential for information storage, propagation, and evolution requires a system for replication that would allow a systematic exploration of the informational, evolutionary, and functional potential of synthetic genetic polymers and would open up applications ranging from biotechnology to materials science.

In principle, informational polymers can be synthesized and replicated chemically (9), with advances in the nonenzymatic polymerization of mononucleotides (10) and short oligomers (11, 12) enabling model selection experiments (13). Nevertheless, chemical polymerization remains relatively inefficient. On the other hand, enzymatic polymerization has been hindered by the stringent substrate selectivity of polymerases. Despite progress in understanding the determinants of polymerase substrate specificity and in engineering polymerases with expanded substrate spectra (7), most unnatural nucleotide analogs are poor polymerase substrates at full substitution, as both nucleotides for polymer synthesis and templates for reverse transcription. Notable exceptions are 2'OMe-DNA and α-l-threofuranosyl nucleic acid (TNA). 2'OMe-DNA is present in eukaryotic ribosomal RNAs, is well tolerated by natural reverse transcriptases (RTs), and has been shown to support heredity and evolution at near full substitution (14). TNA allowed polymer synthesis and evolution in a three-letter system (15) but only limited reverse transcription (16).

Here, we describe a general strategy to enable enzymatic replication and evolution of a broad range of synthetic genetic polymers based on: (i) a chemical framework [generically termed xeno-nucleic acid (XNA)] capable of specific base pairing with DNA, (ii) the engineering of polymerases that can synthesize XNA from a DNA template, and (iii) the engineering of polymerases that can reverse transcribe XNA back into DNA. We chose six different XNAs in which the canonical ribofuranose ring of DNA and RNA is replaced by five- or six-membered congeners comprising 1,5-anhydrohexitol nucleic acids (HNAs), cyclohexenyl nucleic acids (CeNAs), 2'-O,4'-C-methylene-β-d-ribonucleic acids [locked nucleic acids (LNAs)], arabinonucleic acids (ANAs), 2'-fluoro-arabinonucleic acids (FANAs), and TNAs (4–6, 17, 18).

To enable discovery of polymerases capable of processive XNA synthesis, we developed a selection strategy called compartmentalized self-tagging (CST) (fig. S1). CST selections were performed on libraries of TgoT, a variant of the replicative polymerase of Thermococcus gorgonarius comprising mutations to the uracil-stalling [Val93→Gln93 (V93Q)] (19, 20) and 3′-5′ exonuclease (D141A, E143A) functions, as well as the “Therminator” mutation (A485L) (21). TgoT libraries were created from both random and phylogenetic diversity targeted to 22 short sequence motifs within a 10 Å shell of the nascent strand (fig. S2).

CST selections with HNA and CeNA nucleotide triphosphates (hNTPs/ceNTPs) yielded rapid adaptation toward HNA and CeNA polymerase activity. One polymerase, Pol6G12 (TgoT: V589A, E609K, I610M, K659Q, E664Q, Q665P, R668K, D669Q, K671H, K674R, T676R, A681S, L704P, E730G) (Fig. 1A), displayed general DNA-templated HNA polymerase activity dependent on the presence of all four hNTPs (fig. S4) and enabled the synthesis of HNAs long enough to encode meaningful genetic information such as tRNA genes. HNA synthesis was further investigated by mass spectrometry (MS), confirming the expected molecular mass, composition, and sequence of HNA polymers (Fig. 2C and fig. S6).

Fig. 1

Engineering XNA polymerases. (A) Sequence alignments showing mutations from Tgo consensus in polymerases Pol6G12 (red), PolC7 (green), and PolD4K (blue). (B) Mutations are mapped on the structure of Pfu (Protein Data Bank identification code: 4AIL). Yellow, template; dark blue, primer; orange, mutations present in the parent polymerase TgoT.

Fig. 2

HNA synthesis, MS analysis, and reverse transcription. (A) Structure of 1,5-anhydrohexitol (HNA) nucleic acids (B, nucleobase). (B) Pol6G12 extends the primer (p) incorporating 72 hNTPs against template T1 (table S3) to generate a full-length hybrid molecule with a 37,215-dalton expected molecular mass (27). MW, ILS 600 molecular weight marker. P, primer-only reactions. (C) Matrix-assisted laser desorption/ionization–time-of-flight spectrum of a full-length HNA molecule showing a measured HNA mass of 37,190 ± 15 daltons (n = 3 measurements). a.u., arbitrary units; m/z, mass-to-charge ratio. (D) HNA reverse transcription (DNA synthesis from an HNA template). Polymerase-synthesized HNA (from template YtHNA4) (table S3) is used as template by RT521 for HNA-RT (-* denotes a no HNA synthesis control to rule out template contamination).

Having established HNA synthesis, we sought to discover a reverse transcriptase for HNA (HNA-RT), capable of synthesizing complementary DNA from an HNA template, to retrieve the genetic information encoded in HNA and enable both analysis and evolution. As no available polymerase displayed this activity, we engineered an HNA-RT de novo. Because HNA adopts RNA-like A-form helical conformations (5), we hypothesized that an HNA-RT might be found in the structural neighborhood of an RNA-RT. Starting from TgoT, we used statistical correlation analysis (SCA) (22) of the polB family (fig. S7) to uncover potential allosteric interaction networks involved in template recognition. Random mutagenesis and screening by a polymerase activity assay (fig. S3) of four SCA “hits” (F405, Y520, I521, L575) in the vicinity of L408 [a residue implicated in RNA-RT activity in the related Pfu DNA polymerase (23)] identified a mutant, TgoT: E429G, I521L, K726R (RT521), as a proficient HNA-RT (Fig. 2D). Together with Pol6G12, the evolved HNA polymerase, RT521 enables the transfer of genetic information from DNA to HNA and its retrieval back into DNA (fig. S11).

Next, we explored if other polymerases derived by CST and SCA might enable synthesis and reverse transcription of other synthetic genetic polymers. Screening identified PolC7 (TgoT: E654Q, E658Q, K659Q, V661A, E664Q, Q665P, D669A, K671Q, T676K, R709K) and PolD4K (TgoT: L403P, P657T, E658Q, K659H, Y663H, E664K, D669A, K671N, T676I) (Fig. 1) as efficient synthetases for CeNA (C7), LNA (C7), ANA (D4K), and FANA (D4K) (Fig. 3, A to C, E, and F). Therminator (9°N exo-: A485L) polymerase has previously been shown to support TNA synthesis (16), but TNA-RTs were lacking. RT521 proved capable of both efficient TNA synthesis and reverse transcription (Fig. 3D). In addition, RT521 is an efficient RT for both ANA and FANA (Fig. 3, B and C). Another polymerase variant, RT521K (RT521: A385V, F445L, E664K), was found to enhance CeNA-RT activity and enable reverse transcription of LNA (Fig. 3, A and E, and fig. S8). Together, these engineered polymerases support the synthesis and reverse transcription of six synthetic genetic polymers and thus enable replication of the information encoded therein (Fig. 3G).

Fig. 3

XNA genetic polymers. Structures, polyacrylamide gel electrophoresis (PAGE) of synthesis (+72 xnt), and reverse transcription (+93 nt) of (A) CeNA, (B) ANA, (C) FANA, and (D) TNA. (E) PAGE of LNA synthesis [primer (41 nt) + 72 lnt] and LNA-RT (red) resolved by alkali agarose gel electrophoresis (AAGE). LNA synthesis (green) migrates at its expected size (113 nt) and comigrates with reverse transcribed DNA (red) synthesized from primer PRT2 (20 nt) (fig. S8 and table S3). (F) AAGE of XNA and DNA polymers of identical sequence. MW, ILS 600 molecular weight markers. Equivalent PAGE is shown in fig. S5. (G) XNA RT–polymerase chain reaction (MW, New England Biolabs low molecular weight marker; NT, no template control). Amplification products of expected size (133 base pairs) are obtained only with both XNA forward synthesis and RT (RT521 or RT521K) (fig. S12).

Mutations enabling DNA-templated XNA synthesis were found to cluster at the periphery of the primer-template interaction interface in the polymerase thumb subdomain, >20 Å from the active site (Fig. 1B), and, in one case, allowed direct XNA-templated XNA replication (FANA, fig. S9). In contrast, broad XNA-RT activity was mostly effected by a mutation (I521L) in proximity to a catalytic aspartate (D542) and the polymerase active site. Its identification by SCA points to potential allosteric interaction networks involved in template recognition.

As previously observed for TNA (16), noncognate polymer synthesis can come at a cost of reduced fidelity as polymerase structures are poorly adapted to detect mismatches or aberrant geometry in the noncanonical XNA•DNA (or DNA•XNA) duplexes. We determined aggregate fidelities (as the probability of errors per position) of a full DNA → XNA → DNA replication cycle ranging from 4.3 × 10⁻³ (CeNA) to 5.3 × 10⁻² (LNA), with HNA, CeNA, ANA, and FANA superior to LNA and TNA (figs. S11 and S12 and table S8).
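To give a feel for what these per-position error rates imply, the following Python sketch estimates the fraction of molecules that would pass through one full DNA → XNA → DNA cycle without any error. It is an illustration only and not part of the reported analysis: it assumes that errors occur independently at each position with the same probability, and the 72-position length is borrowed from the synthesis experiments shown in the figures. The function and variable names are hypothetical.

```python
# Illustrative sketch: fraction of error-free copies after one replication cycle.
# Assumption (not from the paper): errors are independent and equally likely
# at every position, so P(no error) = (1 - e) ** n.

# Aggregate per-position error rates quoted in the text for a full
# DNA -> XNA -> DNA cycle (lowest and highest reported values).
ERROR_RATES = {"CeNA": 4.3e-3, "LNA": 5.3e-2}


def error_free_fraction(error_rate: float, length: int) -> float:
    """Return the probability that all `length` positions are copied correctly."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be a probability between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return (1.0 - error_rate) ** length


if __name__ == "__main__":
    length = 72  # assumed length, matching the +72 extension in the figures
    for polymer, rate in ERROR_RATES.items():
        fraction = error_free_fraction(rate, length)
        print(f"{polymer}: {fraction:.3f} of {length}-position copies error-free")
```

Under this simple model the order-of-magnitude difference in per-position error rate translates into a much larger difference in the share of intact full-length copies, which is why fidelity matters for longer encoded sequences.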

Synthesis and reverse transcription establish heredity (defined as the ability to encode and pass on genetic information) in all six XNAs. We next sought to explore the capacity of such genetic polymers for Darwinian evolution. As a stringent test for evolution and for acquisition of higher-order functions such as folding and specific ligand binding, we initiated aptamer selections directly from diverse HNA sequence repertoires. We used a modification of the standard aptamer selection protocol comprising magnetic beads for capture and isolation of all-HNA aptamers against two targets that had previously been used to generate both DNA and RNA aptamers (24, 25): the HIV trans-activating response RNA (TAR) and hen egg lysozyme (HEL).

After eight rounds (R8) of selection with a biotinylated [27-nucleotide (nt)] version of the TAR RNA motif (sTAR) used as bait, clear consensus motifs emerged (fig. S13) from which we identified an HNA aptamer (T5–S8-7) that bound specifically to sTAR with a dissociation constant (KD) between 28 and 67 nM, as determined by surface plasmon resonance (SPR), bio-layer interferometry (BLI), and enzyme-linked oligonucleotide assay (ELONA) titration (Fig. 4C, fig. S14, and table S6). Other anti-TAR HNA aptamers from the same selection experiment displayed similar affinities but distinctive fine specificities with regard to binding TAR loop or bulge regions (Fig. 4A and fig. S14). We initiated selection against HEL from an N40 random sequence repertoire and again observed the emergence of consensus motifs after R8 (fig. S15). We identified specific HEL binders with KD of 107 to 141 nM, as determined by SPR, BLI, and fluorescence polarization (Fig. 4C, fig. S16, and table S7). Anti-HEL HNA aptamers cross-reacted with human lysozyme and, to a minor degree (<10%), with the highly positively charged cytochrome C (isoelectric point = 9.6), but did not show binding to unrelated proteins such as bovine serum albumin and streptavidin (Fig. 4B). Fluorescently labeled HNA aptamers allowed direct detection of surface HEL expression by flow cytometry [fluorescence-activated cell sorting (FACS)] in a transfected cell line, demonstrating specificity in a complex biological environment (Fig. 4D).
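The reported dissociation constants can be related to expected target occupancy with a standard single-site binding isotherm. The sketch below is a hedged illustration, not the fitting procedure used for the SPR, BLI, or ELONA data: it assumes simple 1:1 binding at equilibrium with the free ligand concentration approximately equal to the total, and the function name and the 100 nM example concentration are hypothetical.

```python
# Illustrative sketch: fraction of binding sites occupied for 1:1 binding.
# Assumption (not from the paper): equilibrium, single-site binding with
# free ligand concentration ~ total ligand concentration:
#     fraction_bound = [L] / (KD + [L])

# KD ranges quoted in the text, in nM.
KD_RANGES_NM = {
    "anti-TAR (T5-S8-7)": (28.0, 67.0),
    "anti-HEL": (107.0, 141.0),
}


def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Return the equilibrium occupancy for a ligand concentration and a KD."""
    if ligand_nM < 0:
        raise ValueError("ligand concentration must be non-negative")
    if kd_nM <= 0:
        raise ValueError("KD must be positive")
    return ligand_nM / (kd_nM + ligand_nM)


if __name__ == "__main__":
    ligand_nM = 100.0  # hypothetical concentration chosen for illustration
    for aptamer, (kd_low, kd_high) in KD_RANGES_NM.items():
        # A lower KD means tighter binding, hence higher occupancy.
        high = fraction_bound(ligand_nM, kd_low)
        low = fraction_bound(ligand_nM, kd_high)
        print(f"{aptamer}: {low:.2f} to {high:.2f} bound at {ligand_nM:.0f} nM")
```

By construction, occupancy is one half when the ligand concentration equals KD, which offers a quick sanity check when comparing the anti-TAR and anti-HEL affinities.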

Fig. 4

Characterization of HNA aptamers. Anti-TAR aptamer T5-S8-7 (HNA: 6’-AGGTAGTGCTGTTCGTTCATCTCAAATCTAGTTCGCTATCCAGTTGGC-4’) and anti-HEL aptamer LYS-S8-19 (HNA: 6’-AGGTAGTGCTGTTCGTTTAAATGTGTGTCGTCGTTCGCTATCCAGTTGGC-4’) were characterized by ELONA (27). (A and B) Aptamer binding specificity against TAR variants (red, sequence randomized but with base-pairing patterns maintained) and different protein antigens (human lysozyme, HuL; cytochrome C, CytC; streptavidin, sAV; biotinylated-HEL bound to streptavidin, sAV-bHEL). OD, optical density. (C) Affinity measurements of aptamer binding by SPR. RU, response units. (D) FACS analysis of fluorescein isothiocyanate (FITC)–labeled aptamers binding to plasmacytoma line J558L with and without expression of membrane-bound HEL (mHEL) (27). wt, wild type.

Our work establishes strategies for the replication and evolution of synthetic genetic polymers not found in nature, providing a route to novel sequence space. The capacity of synthetic polymers for both heredity and evolution also shows that DNA and RNA are not functionally unique as genetic materials. The methodologies developed herein are readily applied to other nucleic acid architectures and have the potential to enable the replication of genetic polymers of increasingly divergent chemistry, structural motifs, and physicochemical properties, as shown here by the acid resistance of HNA aptamers (fig. S17). Thus, aspects of the correlations between chemical structure, evolvability, and phenotypic diversity may become amenable to systematic study. Such “synthetic genetics” (26)—that is, the exploration of the informational, structural, and catalytic potential of synthetic genetic polymers—should advance our understanding of the parameters of chemical information encoding and provide a source of ligands, catalysts, and nanostructures with tailor-made chemistries for applications in biotechnology and medicine.

Supplementary Materials

Materials and Methods

Figs. S1 to S17

Tables S1 to S7

References (28–64)

References and Notes

  1. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.
  2. Materials and methods are available as supplementary materials on Science Online.
  3. Acknowledgments: This work was supported by the MRC (U105178804) (P. Holliger, V.B.P., C.C.) and by grants from the European Union Framework [FP6-STREP-029092 NEST (P. Holliger, V.B.P., M.A., M.R., P. Herdewijn)], the European Science Foundation and the Biotechnology and Biological Sciences Research Council (BBSRC) UK (09-EuroSYNBIO-OP-013) (A.I.T.), the European Research Council (ERC-2010-AdG_20100317) (J.W.), and Katholieke Universiteit Leuven (GOA/IDO programs) (P. Herdewijn). MRC has filed a patent continuation in part (U.S. 2010/018407 A1) and a patent application (WO 2011/135280 A2) on the CST selection system and the polymerases for XNA synthesis and reverse transcription. Polymerases are available for noncommercial purposes from P. Holliger on request subject to a material transfer agreement.