Genomically Recoded Organisms Expand Biological Functions

Science  18 Oct 2013:
Vol. 342, Issue 6156, pp. 357-360
DOI: 10.1126/science.1241459

Changing the Code

Easily and efficiently expanding the genetic code could provide tools to genome engineers with broad applications in medicine, energy, agriculture, and environmental safety. Lajoie et al. (p. 357) replaced all known UAG stop codons with synonymous UAA stop codons in Escherichia coli MG1655 and deleted release factor 1 (RF1; terminates translation at UAG and UAA), thereby eliminating natural UAG translation function without impairing fitness. This made it possible to reassign UAG as a dedicated codon to genetically encode nonstandard amino acids while avoiding deleterious incorporation at native UAG positions. The engineered E. coli incorporated nonstandard amino acids into its proteins and showed enhanced resistance to bacteriophage T7. In a second paper, Lajoie et al. (p. 361) demonstrated the recoding of 13 codons in 42 highly expressed essential genes in E. coli. Codon usage was malleable, but synonymous codons occasionally were nonequivalent in unpredictable ways.


We describe the construction and characterization of a genomically recoded organism (GRO). We replaced all known UAG stop codons in Escherichia coli MG1655 with synonymous UAA codons, which permitted the deletion of release factor 1 and reassignment of UAG translation function. This GRO exhibited improved properties for incorporation of nonstandard amino acids that expand the chemical diversity of proteins in vivo. The GRO also exhibited increased resistance to T7 bacteriophage, demonstrating that new genetic codes could enable increased viral resistance.

The conservation of the genetic code permits organisms to share beneficial traits through horizontal gene transfer (1) and enables the accurate expression of heterologous genes in nonnative organisms (2). However, the common genetic code also allows viruses to hijack host translation machinery (3) and compromise cell viability. Additionally, genetically modified organisms (GMOs) can release functional DNA into the environment (4). Virus resistance (5) and biosafety (6) are among today’s major unsolved problems in biotechnology, and no general strategy exists to create genetically isolated or virus-resistant organisms. Furthermore, biotechnology has been limited by the 20 amino acids of the canonical genetic code, which use all 64 possible triplet codons, limiting efforts to expand the chemical properties of proteins by means of nonstandard amino acids (NSAAs) (7, 8).

Changing the genetic code could solve these challenges and reveal new principles that explain how genetic information is conserved, encoded, and exchanged (fig. S1). We propose that genomically recoded organisms (GROs, whose codons have been reassigned to create an alternate genetic code) would be genetically isolated from natural organisms and viruses, as horizontally transferred genes would be mistranslated, producing nonfunctional proteins. Furthermore, GROs could provide dedicated codons to improve the purity and yield of NSAA-containing proteins, enabling robust and sustained incorporation of more than 20 amino acids as part of the genetic code.

We constructed a GRO in which all instances of the UAG codon have been removed, permitting the deletion of release factor 1 (RF1; terminates translation at UAG and UAA) and, hence, eliminating translational termination at UAG codons. This GRO allows us to reintroduce UAG codons, along with orthogonal translation machinery [i.e., aminoacyl–tRNA synthetases (aaRSs) and tRNAs] (7, 9), to permit efficient and site-specific incorporation of NSAAs into proteins (Fig. 1). That is, UAG has been transformed from a nonsense codon (terminates translation) to a sense codon (incorporates amino acid of choice), provided the appropriate translation machinery is present. We selected UAG as our first target for genome-wide codon reassignment because UAG is the rarest codon in Escherichia coli MG1655 (321 known instances), prior studies (7, 10) demonstrated the feasibility of amino acid incorporation at UAG, and a rich collection of translation machinery capable of incorporating NSAAs has been developed for UAG (7).

Fig. 1 Engineering a GRO with a reassigned UAG codon.

Wild-type E. coli MG1655 has 321 known UAG codons that are decoded as translation stops by RF1 (for UAG and UAA). (1) Remove codons: converted all known UAG codons to UAA, relieving dependence on RF1 for termination. (2) Eliminate natural codon function: abolished UAG translational termination by deleting RF1, creating a blank codon. (3) Expand the genetic code: introduced an orthogonal aminoacyl–tRNA synthetase (aaRS) and tRNA to reassign UAG as a dedicated sense codon capable of incorporating nonstandard amino acids (NSAAs) with new chemical properties.
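The logic of the three steps in Fig. 1 can be illustrated with a minimal Python sketch. It is not the authors' method (recoding was performed in vivo with MAGE and CAGE); the function names recode_tag_stop and translate, the placeholder residue "X" for an NSAA, and the toy sequences are assumptions made for illustration. DNA-level TAG and TAA stand in for UAG and UAA, and the ribosome stalling expected at a blank codon (step 2) is not modeled.

```python
from itertools import product
from typing import Optional

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def recode_tag_stop(cds: str) -> str:
    """Step 1: swap a terminal TAG stop for the synonymous TAA stop."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return cds[:-3] + "TAA" if cds.endswith("TAG") else cds


def translate(cds: str, tag_sense: Optional[str] = None) -> str:
    """Translate a CDS with the standard code.

    If tag_sense is given, TAG is read as that residue (step 3: reassigned
    sense codon) instead of as a stop.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3].upper()
        if codon == "TAG" and tag_sense is not None:
            protein.append(tag_sense)
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":  # TAA, TGA, or an unreassigned TAG ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy sequences, not taken from the E. coli genome.
native_gene = "ATGGCTTAG"      # ends in a TAG stop
reporter = "ATGTAGGCTTAA"      # in-frame TAG followed by a TAA stop

print(recode_tag_stop(native_gene))        # ATGGCTTAA
print(translate(reporter))                 # M   (TAG read as stop)
print(translate(reporter, tag_sense="X"))  # MXA (TAG read as NSAA "X")
```

In this toy model the recoded native gene still terminates at TAA, so reassigning TAG changes only the reporter that carries a deliberately introduced TAG.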

We used an in vivo genome-editing approach (11), which is more efficient than de novo genome synthesis at exploring new genotypic landscapes and overcoming genome design flaws. Although a single lethal mutation can prevent transplantation of a synthetic genome (12), our approach allowed us to harness genetic diversity and evolution to overcome any potential deleterious mutations at a cost considerably less than de novo genome synthesis (supplementary text section B, “Time and cost”). In prior work, we used multiplex automated genome engineering [MAGE (13)] to remove all known UAG codons in groups of 10 across 32 E. coli strains (11), and conjugative assembly genome engineering [CAGE (11)] to consolidate these codon changes in groups of ~80 across four strains. In this work, we overcome technical hurdles (supplementary text) to complete the assembly of the GRO and describe the biological properties derived from its altered genetic code.

The GRO [C321.ΔA, named for 321 UAG→UAA conversions and deletion of prfA (encodes RF1, Table 1)] and its RF1+ precursor (C321) exhibit normal prototrophy and morphology (fig. S2), with 60% increased doubling time compared with E. coli MG1655 (table S1). Genome sequencing [GenBank accession CP006698] confirmed that all 321 known UAGs were removed from its genome and that 355 additional mutations were acquired during construction (10⁻⁸ mutations per base pair per doubling over ~7340 doublings; fig. S3 and tables S2 to S4). Although maintaining the E. coli MG1655 genotype was not a primary goal of this work, future applications requiring increased genome stability could exploit reversible switching of mutS function (14) to reduce off-target mutagenesis. CAGE improved the fitness of several strains in the C321 lineage (fig. S3), implicating off-target mutations in the reduced fitness.
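The quoted mutation rate can be checked with a short calculation. The sketch below assumes a genome length of roughly 4.64 Mb for E. coli MG1655, a value that is not stated in the text; the mutation count and number of doublings are taken from the paragraph above.

```python
# Assumption: approximate MG1655 genome length in base pairs (not given in the text).
GENOME_BP = 4_640_000

off_target_mutations = 355   # from the text
doublings = 7340             # from the text (approximate)

# Mutations per base pair per doubling.
rate = off_target_mutations / (GENOME_BP * doublings)
print(f"{rate:.2e}")  # about 1e-8
```

Under that assumption the result is on the order of 10⁻⁸ mutations per base pair per doubling, in line with the figure reported above.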

Table 1 Recoded strains and their genotypes.

C321.ΔA exhibited improved performance compared with previous strategies for UAG codon reassignment (15, 16), permitting the complete reassignment of UAG from a stop codon to a sense codon capable of incorporating NSAAs into proteins. One previous strategy used a variant of release factor 2 (RF2) that exhibits enhanced UAA termination (16) and weak UAG termination (17). The second strategy substituted a UAA stop codon in each of the seven essential genes naturally terminating with UAG (table S5) and reduced ribosome toxicity by efficiently incorporating amino acids at the remaining 314 UAGs (15). For comparative purposes, we used MAGE to create strains C0.B*.ΔA::S [expresses enhanced RF2 variant (16)], C7.ΔA::S (UAG changed to UAA in seven essential genes), and C13.ΔA::S [UAG changed to UAA in seven essential genes plus six nonessential genes (table S5)] (Table 1). C refers to the number of codon changes, while A and B refer to prfA (RF1) and prfB (RF2) manipulations, respectively. In contrast to previous work (15), we deleted RF1 in these strains without introducing a UAG suppressor, perhaps because near-cognate suppression is increased in E. coli MG1655 (18). Nevertheless, these strains exhibited a strong selective pressure to acquire UAG suppressor mutations (see below).

To assess the fitness effects of RF1 removal and UAG reassignment, we measured the doubling time and maximum cell density of each strain (table S1 and fig. S4). We found that C321 was the only strain for which RF1 removal and UAG reassignment were not deleterious (Fig. 2). Because we did not modify RF2 to enhance UAA termination (16), this confirms that RF1 is essential only for UAG translational termination and not for UAA termination or other essential cellular functions. By contrast, RF1 removal significantly impaired fitness for C0.B*.ΔA::S, and codon reassignment exacerbated this effect (Fig. 2 and fig. S5), probably because NSAA incorporation outcompeted the weak UAG termination activity (17) exerted by the RF2 variant (16). C7.ΔA::S and C13.ΔA::S also exhibited strongly impaired fitness, likely due to more than 300 nonessential UAG codons stalling translation in the absence of RF1-mediated translational termination at UAG codons (15); accordingly, p-acetylphenylalanine (pAcF) incorporation partially alleviated this effect (Fig. 2). However, not all NSAAs improved fitness in partially recoded strains; phosphoserine (Sep) impairs fitness in similar strains (19), perhaps by causing proteome-scale misfolding. Together, these results indicate that only the complete removal of all instances of the UAG codon overcomes these deleterious effects; therefore, it may be the only scalable strategy for sustained NSAA translation and for complete reassignment of additional codons.

Fig. 2 Effects of UAG reassignment at natural UAG codons.

Ratios of maximum cell densities (horizontal axis) and doubling times (vertical axis) were determined for RF1+ strains versus their corresponding RF1− strains (n = 3) in the presence or absence of UAG suppression. Symbol color specifies genotype: UAA is the number of UAG→UAA mutations, and RF2 is “WT” (wild type) or “sup” [RF2 variant that can compensate for RF1 deletion (16)]. Symbol shape specifies NSAA expression: aaRS (aminoacyl–tRNA synthetase) is “none” (genes for UAG reassignment were absent), “–” [pEVOL-pAcF (9) is present but not induced, so only the constitutive aaRS and tRNA are expressed], or “+” (pEVOL-pAcF is fully induced using L-arabinose), and pAcF is “–” (excluded) or “+” (supplemented). Strains that do not rely on RF1 are expected to have an RF1+/RF1− ratio at (1,1). RF1− strains exhibiting slower growth are below the horizontal gray line, and RF1− strains exhibiting lower maximum cell density are to the right of the vertical gray line. The doubling-time error bars are too small to visualize.
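How a point lands on the Fig. 2 axes can be sketched in a few lines of Python. The Growth class, the function names, the 5% tolerance, and the numeric values are all hypothetical and chosen only to illustrate the ratio convention (RF1+ value divided by the value for the corresponding strain lacking RF1); none of them come from the paper's data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Growth:
    doubling_time_min: float  # doubling time in minutes
    max_density: float        # maximum cell density (e.g., optical density)


def rf1_ratios(with_rf1: Growth, without_rf1: Growth) -> Tuple[float, float]:
    """Return (x, y): density ratio and doubling-time ratio, RF1+ over RF1-deleted."""
    x = with_rf1.max_density / without_rf1.max_density
    y = with_rf1.doubling_time_min / without_rf1.doubling_time_min
    return x, y


def describe(point: Tuple[float, float], tol: float = 0.05) -> List[str]:
    """Interpret a point relative to the gray lines at x = 1 and y = 1."""
    x, y = point
    notes = []
    if y < 1 - tol:   # deletion strain doubles more slowly -> below horizontal line
        notes.append("slower growth without RF1")
    if x > 1 + tol:   # deletion strain reaches lower density -> right of vertical line
        notes.append("lower maximum density without RF1")
    return notes or ["no detectable dependence on RF1"]


# Hypothetical measurements for illustration only.
impaired = rf1_ratios(Growth(60.0, 1.0), Growth(90.0, 0.8))
unaffected = rf1_ratios(Growth(60.0, 1.0), Growth(60.0, 1.0))
print(describe(impaired))
print(describe(unaffected))
```

A strain whose point sits at (1,1) within tolerance behaves as described for strains that do not rely on RF1.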

We tested the capacity of our recoded strains to efficiently incorporate NSAAs [pAcF, p-azidophenylalanine (pAzF), or 2-naphthalalanine (NapA)] into green fluorescent protein (GFP) variants containing zero, one, or three UAG codons (Fig. 3 and fig. S6). In the presence of NSAAs, the RF1+ strains efficiently read through variants containing three UAGs, demonstrating that the episomal pEVOL translation system, which expresses an aaRS and tRNA that incorporate an NSAA at UAG codons (9), is extremely active and strongly outcompetes RF1. In the absence of NSAAs, the RF1− strains exhibited detectable amounts of near-cognate suppression (18) of a single UAG. C321.ΔA::S exhibited strong expression of UAG-containing GFP variants only in the presence of the correct NSAA, whereas C7.ΔA::S and C13.ΔA::S displayed read-through of all three UAG codons even in the absence of NSAAs, suggesting efficient incorporation of natural amino acids at native UAGs (17). Mass spectrometry indicated that C13.ΔA::S incorporated Gln, Lys, and Tyr at UAG codons. DNA sequencing in C7.ΔA::S and C13.ΔA::S revealed UAG suppressor mutations in glnV, providing direct genetic evidence of Gln suppression observed by Western blot (Fig. 3A) and mass spectrometry (table S13). C0.B*.ΔA::S displayed truncated GFP variants corresponding with UAG termination in the absence of RF1 (17) (Fig. 3A).

Fig. 3 NSAA incorporation in GROs.

(A) Western blots demonstrate that C0.B*.ΔA::S terminates at UAG in the absence of RF1 and that C7.ΔA::S and C13.ΔA::S have acquired natural suppressors that allow strong NSAA-independent read-through of three UAG codons. When pAcF was omitted, one UAG reduced the production of full-length GFP, and three UAGs reduced production to undetectable levels for all strains except C7.ΔA::S and C13.ΔA::S, demonstrating that undesired near-cognate suppression (18) is weak for most strains even when RF1 is inactivated. However, all strains show efficient translation through three UAG codons when pAcF is incorporated. Western blots were probed with an antibody to GFP that recognizes an N-terminal epitope. UAA is the number of UAG→UAA mutations; RF2 is “WT” (wild type) or “sup” [RF2 variant that can compensate for RF1 deletion (16)]; RF1 is “WT” (wild type) or “S” (ΔprfA::specR). “GFP” is full-length GFP; “trunc” is truncated GFP from UAG termination and is enriched in the insoluble fraction; “ns” indicates a nonspecific band. (B) Venn diagram representing NSAA-containing peptides detected by mass spectrometry in C0.B*.ΔA::S when UAG was reassigned to incorporate p-acetylphenylalanine (pAcF, red) or phosphoserine (Sep, blue). No NSAA-containing peptides were identified in C321.ΔA::S. Asterisk (*) indicates coding DNA sequence possessing two tandem UAG codons. (C) Extracted ion chromatograms are shown for UAG suppression of the SpeG peptide to investigate Sep incorporation in natural proteins. Peptides containing Sep were only observed in C0.B*.ΔA::S, C7.ΔA::S, and C13.ΔA::S, as Sep incorporation was below the detection limit in EcNR2 (RF1+), and speG was recoded in C321.ΔA::S.

We directly investigated the impact of pAcF and Sep incorporation on the proteomes (Fig. 3B) (20) of our panel of strains (Table 1) using mass spectrometry (tables S6 to S12). No Sep-containing peptides were observed for EcNR2, illustrating that RF1 removal is necessary for NSAA incorporation by the episomal phosphoserine system (21), which is an inefficient orthogonal translation machinery (19) (Fig. 3C and table S10). By contrast, we observed NSAA-containing peptides in unrecoded (C0.B*.ΔA::S) and partially recoded (C13.ΔA::S) strains, and not the GRO (C321.ΔA::S), which lacks UAGs in its genome (Fig. 3, B and C, fig. S7, and tables S6 to S12). Such undesired incorporation of NSAAs (or natural amino acids) likely underlies the fitness impairments observed for C0.B*.ΔA::S, C7.ΔA::S, and C13.ΔA::S. In contrast to the other RF1− strains, C321.ΔA::S demonstrated equivalent fitness to its RF1+ precursor (Fig. 2) and efficiently expressed all GFP variants without incorporating NSAAs at unintended sites (Figs. 2 and 3 and fig. S6). Therefore, complete UAG removal is the only strategy that provides a devoted codon for plug-and-play NSAA incorporation without impairing fitness (Figs. 2 and 3).

To determine whether this GRO can obstruct viral infection, we challenged RF1− strains with bacteriophages T4 and T7. Viruses rely on their host to express proteins necessary for propagation. Because hosts with altered genetic codes would mistranslate viral proteins (3), recoding may provide a general mechanism for resistance to all natural viruses. Given that UAG codons occur rarely and only at the end of genes, we did not expect UAG reassignment to result in broad phage resistance. Although the absence of RF1 did not appear to affect T4 (19 of 277 stop codons are UAG), it significantly enhanced resistance to T7 (6 of 60 stop codons are UAG) (Fig. 4).
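For reference, the stop-codon counts quoted above correspond to the following UAG fractions. This is a quick Python check that uses only the numbers in the text; the variable names are illustrative.

```python
# (UAG stop codons, total stop codons) per phage, as quoted in the text.
stop_counts = {"T4": (19, 277), "T7": (6, 60)}

# Fraction of each phage's stop codons that are UAG.
uag_fraction = {phage: uag / total for phage, (uag, total) in stop_counts.items()}

for phage, frac in uag_fraction.items():
    print(f"{phage}: {frac:.1%} of stop codons are UAG")
```

This gives roughly 6.9% for T4 and 10.0% for T7. The fractions alone do not establish why the two phages respond differently to the loss of RF1.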

Fig. 4 Bacteriophage T7 infection is attenuated in GROs lacking RF1.

RF1 (prfA) status is denoted by symbol shape: (■) wt prfA (WT); (★) ΔprfA::specR (ΔA::S); () ΔprfA::tolC (ΔA::T); and () a clean deletion of prfA (ΔA). (A) RF1 status affects plaque area (Kruskal-Wallis one-way analysis of variance, P < 0.001), but strain doubling time does not (Pearson correlation, P = 0.49). Plaque areas (mm²) were calculated with ImageJ, and means ± 95% confidence intervals are reported (n > 12 for each strain). In the absence of RF1, all strains except C0.B*.ΔA::S yielded significantly smaller plaques, indicating that the RF2 variant (16) can terminate UAG adequately to maintain T7 fitness. A statistical summary can be found in table S14. (B) T7 fitness (doublings/hour) (22) is impaired (P = 0.002) and mean lysis time (min) is increased (P < 0.0001) in C321.ΔA compared to C321. Significance was assessed for each metric by using an unpaired t test with Welch’s correction.

RF1− hosts produced significantly smaller T7 plaques independent of host doubling time (Fig. 4A and fig. S8). The only exception was C0.B*.ΔA::S, which produced statistically equivalent plaque sizes regardless of whether RF1 was present (Fig. 4A and table S14). Consistent with the observation that the modified RF2 variant could weakly terminate UAG [(17) and herein], our results suggest that C0.B*.ΔA::S terminates UAG codons well enough to support normal T7 infection.

Given that plaque area and phage fitness (doublings per hour) do not always correlate, we confirmed that T7 infection is inhibited in RF1− hosts by comparing T7 fitness and lysis time in C321 versus C321.ΔA (Fig. 4B). Phage fitness (doublings per hour) is perhaps the most relevant measure for assessing phage resistance because it indicates how quickly a log-phase phage infection expands (22). We found that T7 fitness was significantly impaired in strains lacking RF1 (P = 0.002), and kinetic lysis curves (fig. S9) confirmed that lysis was significantly delayed in the absence of RF1 (P < 0.0001, Fig. 4B). Meanwhile, one-step growth curves (fig. S10) indicated that burst size (average number of phages produced per lysed cell) in RF1− hosts was also reduced by 59% (±9%), and phage packaging was delayed by 30% (±2%) (table S15). We hypothesize that ribosome stalling at the gene 6 (T7 exonuclease) UAG explains the T7 fitness defect in RF1− hosts, whereas T4 may not possess a UAG-terminating essential gene with a similar sensitivity (supplementary text). Abolishing the function of additional codons could block the translation of viral proteins and prevent infections entirely.
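A fitness value in doublings per hour can be illustrated with a small Python sketch. The formula below, log2 of the fold increase in phage titer divided by elapsed time, is an assumed standard form that matches the stated units; the text does not spell out the exact procedure of reference 22, and the function name and titers are hypothetical.

```python
import math


def phage_doublings_per_hour(titer_start: float, titer_end: float, hours: float) -> float:
    """Assumed definition: log2 fold change in phage titer per hour of infection."""
    if titer_start <= 0 or titer_end <= 0 or hours <= 0:
        raise ValueError("titers and elapsed time must be positive")
    return math.log2(titer_end / titer_start) / hours


# Hypothetical titers (plaque-forming units per mL), for illustration only:
# a 2**20-fold increase over half an hour.
fitness = phage_doublings_per_hour(1e4, 1e4 * 2**20, 0.5)
print(fitness)  # 40.0
```

On this scale, a phage that expands more slowly in a host lacking RF1 would score a lower value over the same interval.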

Using multiplex genome editing, we removed all instances of the UAG codon and reassigned its function in the genome of a living cell. The resulting GRO possesses a devoted UAG sense codon for robust NSAA incorporation that is suitable for industrial protein production. GROs also establish the basis for genetic isolation and virus resistance, and additional recoding will help fully realize these goals—additional triplets could be reassigned, unnatural nucleotides could be used to produce new codons (23), and individual triplet codons could be split into several unique quadruplets (8, 24) that each encode their own NSAA. In an accompanying study (25), we show that 12 additional triplet codons may be amenable to removal and eventual reassignment in E. coli. However, codon usage rules are not fully understood, and recoded genome designs are likely to contain unknown lethal elements. Thus, it will be necessary to sample vast genetic landscapes, efficiently assess phenotypes arising from individual changes and their combinations, and rapidly iterate designs to change the genetic code at the genome level.

Supplementary Materials

Materials and Methods

Figs. S1 to S22

Tables S1 to S37

References (26–70)

References and Notes

  1. Acknowledgments: We dedicate this paper to the memory of our friend, colleague, and gifted scientist, Tara Gianoulis. We thank R. Kolter for JC411, D. Reich for help with sequencing libraries, and C. and J. Seidman for Covaris E210; J. Aach, S. Kosuri, and U. Laserson for bioinformatics; T. Young, F. Peters, and W. Barnes for NSAA incorporation advice; I. Molineux and S. Kosuri for phage advice; S. Vassallo and P. Mali for experimental support; and D. Söll, A. Forster, T. Wu, K. Oye, C. Gregg, M. Napolitano, U. Laserson, A. Briggs, D. Mandell, and R. Chari for helpful comments. Funding was from the U.S. Department of Energy (DE-FG02-02ER63445), NSF (SA5283-11210), NIH (NIDDK-K01DK089006 to J.R.), Defense Advanced Research Projects Agency (N66001-12-C-4040, N66001-12-C-4020, N66001-12-C-4211), Arnold and Mabel Beckman Foundation (F.J.I.), U.S. Department of Defense National Defense Science and Engineering Graduate Fellowship (M.J.L.), NIH-MSTP-TG-T32GM07205 (A.D.H.), NSF graduate fellowships (H.H.W. and D.B.G.), NIH Director’s Early Independence Award (1DP5OD009172-01 to H.H.W), and the Assistant Secretary of Defense for Research and Engineering (Air Force Contract no. FA8721-05-C-0002 to P.A.C.). Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the U.S. government.