Research Article

Genomically encoded analog memory with precise in vivo DNA writing in living cell populations

Science  14 Nov 2014:
Vol. 346, Issue 6211, 1256272
DOI: 10.1126/science.1256272

Structured Abstract

Introduction

The conversion of transient information into long-lasting responses is a common aspect of many biological processes and is crucial for the design of sophisticated synthetic circuits. Genomic DNA provides a rich medium for the storage of information in living cells. However, current cellular memory technologies are limited in their storage capacity and scalability.

SCRIBE enables distributed genomically encoded memory. In the presence of an input, ssDNAs (orange curved lines) are produced from a plasmid-borne cassette (gray circles) and recombined into specific genomic loci (orange circles) that are targeted on the basis of sequence homology. This results in the accumulation of precise mutations (stars in green cells) as a function of the magnitude and duration of exposure to the input.

Rationale

We converted genomic DNA into a “tape recorder” for memorizing information in living cell populations. This was achieved via SCRIBE (Synthetic Cellular Recorders Integrating Biological Events), a programmable and modular architecture for generating single-stranded DNA (ssDNA) inside of living cells in response to gene regulatory signals. When coexpressed with a recombinase, these ssDNAs address specific target loci on the basis of sequence homology and introduce precise mutations into genomic DNA, thus converting transient cellular signals into genomically encoded memory. This distributed biological memory leverages the large number of cells in bacterial cultures and encodes information into their collective genomic DNA in the form of the fraction of cells that carry specific mutations.

Results

We show that SCRIBE enables the recording of arbitrary transcriptional inputs into DNA storage registers in living cells by translating regulatory signals into ssDNAs. In Escherichia coli, we expressed ssDNAs from engineered retrons that use a reverse transcriptase protein to produce hybrid RNA-ssDNA molecules. These intracellularly expressed ssDNAs are targeted into specific genomic loci where they are recombined and converted into permanent memory. We show that genomically stored information can be readily reprogrammed by changing the ssDNA template and controlled via both chemical and light inputs. We demonstrate that genomically encoded memory can be read with a variety of techniques, including reporter genes, functional assays, and high-throughput DNA sequencing.

SCRIBE enables the recording of analog information such as the magnitude and time span of exposure to an input. This convenient feature is facilitated by the intermediate recombination rate of our current system (~10⁻⁴ recombination events per generation), which we validated via a mathematical model and computer simulations. For example, we stored the overall exposure time to chemical inducers in the DNA memory of bacterial populations for 12 days (~120 generations), independently of the induction pattern. The frequency of mutants in these populations was linearly related to the total exposure time.

Furthermore, we demonstrate that SCRIBE-induced mutations can be written and erased and can be used to record multiple inputs across the distributed genomic DNA of bacterial populations. Finally, we show that SCRIBE memory can be decomposed into independent “input,” “write,” and “read” operations and used to create genetic “logic-and-memory” circuits, as well as “sample-and-hold” circuits.

Conclusion

We describe a scalable platform that uses genomic DNA for analog, rewritable, and flexible memory distributed across living cell populations. We anticipate that SCRIBE will enable long-term cellular recorders for environmental and biomedical applications. Future optimization of recombination efficiencies achievable by SCRIBE could lead to more efficient single-cell digital memories and enhanced genome engineering technologies. Furthermore, the ability to regulate the generation of arbitrary targeted mutations with other gene-editing technologies should enable genomically encoded memory in additional organisms.

Abstract

Cellular memory is crucial to many natural biological processes and sophisticated synthetic biology applications. Existing cellular memories rely on epigenetic switches or recombinases, which are limited in scalability and recording capacity. In this work, we use the DNA of living cell populations as genomic “tape recorders” for the analog and distributed recording of long-term event histories. We describe a platform for generating single-stranded DNA (ssDNA) in vivo in response to arbitrary transcriptional signals. When coexpressed with a recombinase, these intracellularly expressed ssDNAs target specific genomic DNA addresses, resulting in precise mutations that accumulate in cell populations as a function of the magnitude and duration of the inputs. This platform could enable long-term cellular recorders for environmental and biomedical applications, biological state machines, and enhanced genome engineering strategies.

Record your memories with a DNA recorder

DNA-based memory devices are not optimal for recording analog information, such as the magnitude of inputs over time. Farzadfard and Lu converted genomic DNA into a “tape recorder” within living bacterial populations (see the Perspective by Ausländer and Fussenegger). Specific DNAs were used to introduce precise mutations into genomic DNA. The stored information could be read out via reporter genes, functional assays, and DNA sequencing. This approach allowed the memorization of multiple inputs at the population level. The record could also be erased when required.

Science, this issue 10.1126/science.1256272; see also p. 813

Due to its high storage capacity, durability, ease of duplication, and high-fidelity maintenance of information, DNA has garnered much interest as an artificial storage medium (1, 2). However, existing technologies for in vivo autonomous recording of information in cellular memory are limited in their storage capacity and scalability (3). Epigenetic memory devices such as bistable toggle switches (4–7) and positive-feedback loops (8) require orthogonal transcription factors and can lose their digital state due to environmental fluctuations or cell death. Recombinase-based memory devices enable the writing and storage of digital information in the DNA of living cells (9–12), where binary bits of information are stored in the orientation of large stretches of DNA. However, these devices do not efficiently exploit the full capacity of DNA for information storage: Recording a single bit of information with these devices often requires at least a few hundred base pairs of DNA, overexpression of a recombinase protein to invert the target DNA, and engineering recombinase recognition sites into target loci in advance. The scalability of this type of memory is further limited by the number of orthogonal recombinases that can be used in a single cell. Finally, epigenetic and recombinase-based memory devices described to date store digital information, and their recording capacity is exhausted within a few hours of induction. Thus, these devices have not been adapted to record analog information, such as magnitude and time course of inputs over extended periods of time (i.e., multiple days or more).

Here we introduce SCRIBE (Synthetic Cellular Recorders Integrating Biological Events), a compact, modular strategy for producing single-stranded DNA (ssDNA) inside of living cells in response to a range of regulatory signals, such as small chemical inducers and light. These ssDNAs address specific target loci on the basis of sequence homology and introduce precise mutations into genomic DNA. The memory device can be easily reprogrammed to target different genomic locations by changing the ssDNA template. SCRIBE memory does not just record the absence or presence of arbitrary inputs (digital signals represented as binary “0s” or “1s”). Instead, by encoding information into the collective genomic DNA of cell populations, SCRIBE can track the magnitude and long-term temporal behavior of inputs, which are analog signals that can vary over a wide range of continuous values. This analog memory architecture leverages the large number of cells in bacterial cultures for distributed information storage and archives event histories in the fraction of cells in a population that carry specific mutations.

Single-stranded DNA expression in living cells

Previously, it was shown that synthetic oligonucleotides delivered by electroporation into cells that overexpress Beta recombinase (from bacteriophage λ) in Escherichia coli are specifically and efficiently recombined into homologous genomic sites (13–16). Thus, oligonucleotide-mediated recombination offers a powerful way to introduce targeted mutations in a bacterial genome (17, 18). However, this technique requires the exogenous delivery of ssDNAs and cannot be used to couple arbitrary signals into genetic memory. To overcome these limitations, we developed a genome editing platform based on expressing ssDNAs inside of living cells by taking advantage of a widespread class of bacterial reverse transcriptases (RTs) called retrons (19, 20).

The wild-type (WT) retron cassette encodes three components in a single transcript: a RT protein and two RNA moieties, msr and msd, which act as the primer and the template for the reverse transcriptase, respectively (Fig. 1A, left). The msr-msd sequence in the retron cassette is flanked by two inverted repeats. Once transcribed, the msr-msd RNA folds into a secondary structure guided by the base pairing of the inverted repeats and the msr-msd sequence. The RT recognizes this secondary structure and uses a conserved guanosine residue in the msr as a priming site to reverse transcribe the msd sequence and produce a hybrid RNA-ssDNA molecule called msDNA (i.e., multicopy single-stranded DNA) (20, 21). To couple the expression of ssDNA to an external input, the WT Ec86 retron cassette from E. coli BL21 (21) was placed under the control of an isopropyl β-d-1-thiogalactopyranoside (IPTG)–inducible promoter (PlacO) in E. coli DH5αPRO cells (22), which express high levels of the LacI and TetR repressors (Fig. 1A). The WT retron ssDNA [ssDNA(wt)] was readily detected in IPTG-induced cells, whereas no ssDNA was detected in noninduced cells (Fig. 1B). The identity of the detected ssDNA band was further confirmed by DNA sequencing (fig. S1). To verify that ssDNA expression depended on RT activity, point mutations [Asp197→Ala197 (D197A) and D198A] were introduced to the active site of the RT to make a catalytically dead RT (dRT) (23). This modification completely abolished ssDNA production (Fig. 1B).

Fig. 1 SCRIBE system for recording inputs in the distributed genomic DNA of bacterial populations.

(A) Synthetic ssDNA (red lines) generation inside of living cells by retrons. (B) Visualization of retron-mediated ssDNAs produced in living bacteria. The amount of ssDNA in each sample (shown in brackets) was calculated by densitometry. (C) A kanR reversion assay was used to measure the efficiency of DNA writing within living cells, where the msd(kanR)ON cassette and the bet gene were inducible by IPTG and aTc, respectively. (D) Demonstration of analog memory achieved via SCRIBE to record the magnitude of an input into genomic DNA. The green line is a linear regression fit. The red dashed brackets marked with asterisks connect the closest data points that are statistically significant with respect to each other (P < 0.05 based on one-tailed Welch’s t test). Error bars indicate SEM for three independent biological replicates.

To engineer the msd template to express synthetic ssDNAs of interest, we initially tried to replace the whole msd sequence with a desired template. However, no ssDNA was detected, suggesting that some features of msd are required for ssDNA expression, as was previously noted for another retron (24). A variant in which the flanking regions of the msd stem remained intact (Fig. 1A, right) produced detectable amounts of ssDNA when induced by IPTG (Fig. 1B, PlacO_msd(kanR)ON + IPTG). The correct identity of the detected ssDNA band was further confirmed by DNA sequencing (fig. S1). Thus, the lower part of the msd stem is essential for reverse transcription, whereas the upper part of the stem and the loop are dispensable and can be replaced with desired templates to produce ssDNAs of interest in vivo.

Regulated genome editing with in vivo ssDNAs

To demonstrate that intracellularly expressed ssDNAs can be recombined into target genomic loci by concomitant expression of Beta, we developed a selectable marker reversion assay (Fig. 1C). The kanR gene, which encodes neomycin phosphotransferase II and confers resistance to kanamycin (Kan), was integrated into the galK locus. Two stop codons were then introduced into the genomic kanR to make a Kan-sensitive kanROFF reporter strain (DH5αPRO galK::kanRW28TAA, A29TAG). These premature stop codons could be reverted back to the WT sequence via recombination with engineered ssDNA(kanR)ON, thus conferring kanamycin resistance (Fig. 1C). Specifically, ssDNA(kanR)ON contains 74 base pairs of homology to the regions of the kanROFF locus flanking the premature stop codons and replaces the stop codons with the WT kanR gene sequence (Fig. 1C).

We cloned the Beta gene (bet) into a plasmid under the control of the anhydrotetracycline (aTc)–inducible PtetO promoter and introduced it along with the IPTG-inducible msd(kanR)ON construct into the kanROFF strain (Fig. 1C). Induction of cultures harboring these two plasmids with either IPTG (1 mM) or aTc (100 ng/ml) resulted in a slight increase in the frequency of Kan-resistant cells within the population (Fig. 1C). However, coexpression of both ssDNA(kanR)ON and Beta with IPTG and aTc resulted in a >10⁴-fold increase in the recombinant frequency relative to the noninduced cells. This corresponded to a >10³-fold increase relative to cells induced with IPTG only and a 60-fold increase relative to cells induced with aTc only. This increase in the recombinant frequency was dependent on the RT activity, as it was largely abolished with dRT. The genotypes of randomly selected Kan-resistant colonies were further confirmed by DNA sequencing to contain precise reversions of the two codons to the WT sequence (fig. S1). No Kan-resistant colonies were detected when a nonspecific ssDNA [ssDNA(wt)] was coexpressed with Beta in the kanROFF reporter cells, confirming that Kan-resistant cells were not produced due to spontaneous mutations. In additional experiments, we used high-throughput sequencing (Illumina HiSeq) on the bacterial populations to analyze the genomically encoded memory (see supplementary materials and fig. S2). Comparable recombinant frequencies were obtained from both the plating assay and sequencing, confirming that genomically encoded memory can be read without the need for functional assays and reporters.

Recording input magnitudes into genomic memory

We reasoned that the rate of recombination between engineered ssDNAs and genomic DNA could be effectively modulated by changing expression levels of the engineered retron cassette and Beta. This feature would enable the recording of analog information, such as the magnitude of an input signal, in the proportion of cells in a population with a specific mutation in genomic DNA. To demonstrate this, both the ssDNA(kanR)ON expression cassette and bet were placed into a single synthetic operon [hereafter referred to as the SCRIBE(kanR)ON cassette] under the control of PlacO (Fig. 1D). The kanROFF reporter cells harboring this synthetic operon were induced with different concentrations of IPTG. The fraction of Kan-resistant recombinants increased linearly with the input inducer concentration on a log-log plot over a range of ~10⁻⁷ to ~10⁻⁵ (Fig. 1D). Statistical tests showed that at least four different concentrations of the inducer (including 0 mM IPTG) could be resolved in this experiment. Thus, the efficiency of genome writing in a population can be quantitatively tuned with external inputs.
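A straight line on a log-log plot corresponds to a power-law relation between recombinant fraction and inducer concentration. The following is a minimal sketch of how such a fit could be computed by ordinary least squares on log-transformed values. The function name `fit_power_law` and the data points are illustrative assumptions: the points are generated from an arbitrary power law and are not the measurements shown in Fig. 1D. Note that a 0 mM condition cannot be placed on a logarithmic axis and would have to be handled separately.

```python
import math

def fit_power_law(concentrations, fractions):
    """Least-squares line through (log10 c, log10 f).

    Returns (slope, intercept), so that f is approximately
    10**intercept * c**slope.
    """
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(f) for f in fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic, illustrative points from f = 1e-5 * c**0.5 (NOT measured data).
concentrations_mM = [0.001, 0.01, 0.1, 1.0]
fractions = [1e-5 * c ** 0.5 for c in concentrations_mM]

slope, intercept = fit_power_law(concentrations_mM, fractions)
```

With real plating or sequencing data, the fitted slope would summarize how steeply the recorded fraction responds to the inducer over the resolvable range.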

Writing and rewriting genomic memory

We next created a complementary set of SCRIBE cassettes to write and erase (rewrite) information in the genomic galK locus using two different chemical inducers. Cells expressing galK can metabolize and grow on galactose as the sole carbon source. However, these galK-positive (galKON) cells cannot metabolize 2-deoxy-galactose (2DOG) and cannot grow on plates containing glycerol (carbon source) + 2DOG. On the other hand, galK-negative (galKOFF) cells cannot grow on galactose as the sole carbon source but can grow on glycerol + 2DOG plates (25). We transformed DH5αPRO galKON cells with plasmids encoding IPTG-inducible SCRIBE(galK)OFF and aTc-inducible SCRIBE(galK)ON cassettes (Fig. 2A). Induction of SCRIBE(galK)OFF by IPTG resulted in the writing of two stop codons into galKON, leading to galKOFF cells that could grow on glycerol + 2DOG plates (Fig. 2B). Induction of SCRIBE(galK)ON in these galKOFF cells with aTc reversed the IPTG-induced modification, leading to galKON cells that could grow on galactose plates (Fig. 2C). These results show that writing on genomic DNA with SCRIBE is reversible and that distinct information can be written and rewritten into the same locus.
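The write and rewrite logic described above can be summarized as a small state table. This is an idealized, single-cell sketch: in the experiments only a fraction of the population is converted by each induction, and the names `GROWTH` and `apply_inducer` are hypothetical labels introduced here for illustration.

```python
# Growth phenotypes of each galK genotype on the two selective conditions,
# as described in the text.
GROWTH = {
    "galK_ON": {"galactose": True, "glycerol+2DOG": False},
    "galK_OFF": {"galactose": False, "glycerol+2DOG": True},
}

def apply_inducer(genotype: str, inducer: str) -> str:
    """Return the genotype of a cell in which the induced write succeeded.

    IPTG induces SCRIBE(galK)OFF (writes two stop codons);
    aTc induces SCRIBE(galK)ON (restores the functional sequence).
    """
    if genotype not in GROWTH:
        raise ValueError(f"unknown genotype: {genotype}")
    if inducer == "IPTG":
        return "galK_OFF"
    if inducer == "aTc":
        return "galK_ON"
    raise ValueError(f"unknown inducer: {inducer}")

# Write, then erase, starting from the galK_ON strain.
written = apply_inducer("galK_ON", "IPTG")
erased = apply_inducer(written, "aTc")
```

Here `written` is selectable on glycerol + 2DOG plates and `erased` on galactose plates, mirroring the two selections used to score each step.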

Fig. 2 SCRIBE can write multiple different DNA mutations into a common target locus (galK).

(A) Schematic of the procedure (see text for details). (B) galKON cells harboring the circuits shown in (A) were induced with either IPTG (1 mM) or aTc (100 ng/ml) for 24 hours, and the galKOFF frequencies in the population were determined by plating the cells on appropriate selective conditions. (C) galKOFF cells [obtained from the experiment described in (B)] were induced with IPTG (1 mM) or aTc (100 ng/ml) for 24 hours, and the galKON frequencies in the population were determined by plating the cells on appropriate selective conditions. Error bars indicate SEM for three independent biological replicates.

Writing multiple mutations into independent loci

Scaling the capacity of previous memory devices is challenging because each additional bit of information requires additional orthogonal proteins (e.g., recombinases or transcription factors). In contrast, orthogonal SCRIBE memory devices are potentially easier to scale because they can be built by simply changing the ssDNA template (msd). To demonstrate this, we used SCRIBE to record multiple independent inputs into different genomic loci of a bacterial population. We integrated the kanROFF reporter gene into the bioA locus of DH5αPRO to create a kanROFF galKON strain. These cells were then transformed with plasmids encoding IPTG-inducible SCRIBE(kanR)ON and aTc-inducible SCRIBE(galK)OFF cassettes (Fig. 3A). Induction of these cells with IPTG or aTc resulted in the production of cells with phenotypes corresponding to kanRON galKON or kanROFF galKOFF genotypes, respectively (Fig. 3, B and C). Comparable numbers of kanRON galKON and kanROFF galKOFF cells (~2 × 10⁻⁴ and ~3 × 10⁻⁴ recombinant/viable cells, respectively) were produced when the cultures were induced with both aTc and IPTG (Fig. 3C, left panel). Furthermore, very few individual colonies (~3 × 10⁻⁷ recombinant/viable cells) containing both writing events (kanRON galKOFF) were obtained in the cultures that were induced with both aTc and IPTG (Fig. 3C, right panel). These data suggest that although multiplexed writing at the single-cell level is rare with SCRIBE’s current level of recombination efficiency, multiple independent inputs can be successfully recorded into the distributed genomic DNA of bacterial subpopulations.
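As a rough check on why double recombinants are rare, the frequency expected if the two writing events occurred independently in the same cell is simply the product of the two single-event frequencies. The sketch below uses the approximate values quoted in the text; the function name `expected_double_fraction` is a hypothetical label, and the independence assumption is ours rather than a result reported here.

```python
def expected_double_fraction(f_first: float, f_second: float) -> float:
    """Double-recombinant frequency expected if the two writes are independent."""
    return f_first * f_second

# Approximate single-event frequencies quoted in the text.
kan_on_fraction = 2e-4    # kanR_ON galK_ON recombinants per viable cell
galk_off_fraction = 3e-4  # kanR_OFF galK_OFF recombinants per viable cell
observed_double = 3e-7    # kanR_ON galK_OFF recombinants per viable cell

expected_double = expected_double_fraction(kan_on_fraction, galk_off_fraction)
ratio_observed_to_expected = observed_double / expected_double
```

The product is about 6 × 10⁻⁸, so the observed value of ~3 × 10⁻⁷ lies within an order of magnitude of the independence estimate. Because the quoted frequencies are approximate, this ratio should be read as an order-of-magnitude comparison only.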

Fig. 3 Writing multiple mutations into independent target loci within a population.

(A) Constructs used to target genomic kanROFF and galKON loci with IPTG-inducible and aTc-inducible SCRIBE cassettes, respectively. (B) Induction of kanROFF galKON cells with IPTG or aTc generates cells with the kanRON galKON or kanROFF galKOFF genotypes, respectively. Induction of kanROFF galKON cells with both IPTG and aTc generates cells with the kanRON galKOFF genotype. (C) kanROFF galKON reporter cells containing the circuits in (A) were induced with different combinations of IPTG (1 mM) and aTc (100 ng/ml) for 24 hours at 30°C, and the fraction of cells with the various genotypes were determined by plating the cells on appropriate selective media. Error bars indicate SEM for three independent biological replicates.

Optogenetic genome editing for light-to-DNA memory

In SCRIBE, the expression of each individual ssDNA can be triggered by any endogenous or exogenous signal that can be coupled into transcriptional regulation, thus recording these inputs into long-lasting DNA storage. In addition to small-molecule chemicals, we showed that light can be used to trigger specific genome editing for genomically encoded memory. We placed the SCRIBE(kanR)ON cassette under the control of a previously described light-inducible promoter (PDawn) (26) within kanROFF cells (Fig. 4A). These cultures were then grown for 4 days in the presence of light or in the dark (Fig. 4A). As Beta-mediated recombination is reportedly replication-dependent (27–29), dilutions of these cultures were made into fresh media at the end of each day to maintain active replication in the cultures. At the end of each day, samples were taken to determine the number of Kan-resistant and viable cells (Fig. 4A). Cultures grown in the dark yielded undetectable levels of Kan-resistant cells (Fig. 4A). In contrast, the frequency of Kan-resistant cells increased steadily over time in the cultures that were grown in the presence of light, indicating the successful recording of light input into long-lasting DNA memory. The analog memory faithfully stored the total time of light exposure, rather than just the digital presence or absence of light.

Fig. 4 Optogenetic genome editing and analog memory for long-term recording of input signal exposure times in the genomic DNA of living cell populations.

(A) We coupled expression of SCRIBE(kanR)ON to an optogenetic system (PDawn). The yf1/fixJ synthetic operon was expressed from a constitutive promoter: In dark conditions, YF1 interacts with and phosphorylates FixJ. Phosphorylated FixJ activates the PfixK2 promoter, which drives λ repressor (cI) expression, which subsequently represses the SCRIBE(kanR)ON cassette. Light inhibits the interaction between YF1 and FixJ, leading to the generation of ssDNA(kanR)ON and Beta expression and, thus, the conversion of kanROFF to kanRON. Cells harboring this circuit were grown overnight at 37°C in the dark, diluted 1:1000, and then incubated for 24 hours at 30°C in the dark (no shading) or in the presence of light (yellow shading). Subsequently, cells were diluted by 1:1000 and grown for another 24 hours at 30°C in the dark or in the presence of light. The dilution-regrowth cycle was performed for four consecutive days. The kanR allele frequencies in the populations were determined by sampling the cultures after each 24-hour period. (B) SCRIBE analog memory records the total time exposure to a given input, regardless of the underlying induction pattern. Cells harboring the circuit shown in Fig. 1C were grown in four different patterns (I to IV) over a 12-day period, where induction by IPTG (1 mM) and aTc (100 ng/mL) is represented by dark gray shading. At the end of each 24-hour incubation period, cells were diluted by 1:1000 into fresh media. The frequency of Kan-resistant cells in the cultures was determined at the end of each day. Dashed lines represent the recombinant allele frequencies predicted by the model (see supplementary materials). Error bars indicate SEM for three independent biological replicates.

Recording the exposure time of inputs

The linear increase in the frequency of Kan-resistant colonies over time due to exposure to light indicates that the duration of inputs can be recorded into population-wide DNA memory using SCRIBE. To further explore population-wide genomically encoded memory whose state is a function of input exposure time, we used the kanROFF strain harboring the constructs shown in Fig. 1C, where expression of ssDNA(kanR)ON and Beta are controlled by IPTG and aTc, respectively. These cells were subjected to four different patterns of the inputs for 12 successive days (patterns I to IV, Fig. 4B). Kan-resistant cells did not accumulate in the negative control (pattern I), which was never exposed to the inducers. The fraction of Kan-resistant cells in the three other patterns (II, III, and IV) increased linearly over their respective induction periods and remained relatively constant when the inputs were removed. These data indicate that the genomically encoded memory was stable in the absence of the inputs over the course of the experiment. The recombinant frequencies in patterns III and IV, which were induced for the same total amount of time but with different temporal patterns, reached comparable levels at the end of the experiment. These data demonstrate that the genomic memory integrates over the total induction time and is independent of the input pattern, and therefore can be used to stably record event histories over many days.
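The pattern independence described above can be illustrated with a short calculation. A 1:1000 daily dilution implies roughly log2(1000) ≈ 10 generations of regrowth per day, or about 120 generations over 12 days. The sketch below accumulates recombinants only on induced days, assuming a writing rate of ~10⁻⁴ events per generation, no writing on uninduced days, and equal fitness of edited and unedited cells. The two induction schedules are hypothetical examples with the same total induction time and are not the actual patterns III and IV of Fig. 4B.

```python
import math

DILUTION_FACTOR = 1000      # 1:1000 daily dilution, from the text
RATE_PER_GENERATION = 1e-4  # ~10^-4 events per generation, from the text

def generations_per_day(dilution_factor: float = DILUTION_FACTOR) -> float:
    """Doublings needed to regrow to saturation after one dilution."""
    return math.log2(dilution_factor)

def recombinant_fraction(pattern, rate: float = RATE_PER_GENERATION):
    """Return the recombinant fraction at the end of each day.

    pattern: iterable of booleans, True for a day with inducers present.
    Simplifying assumptions: no writing on uninduced days, no reversion,
    and equal fitness of edited and unedited cells.
    """
    g = generations_per_day()
    # Probability that an unedited lineage is edited during one induced day.
    p_day = 1.0 - (1.0 - rate) ** g
    fraction = 0.0
    history = []
    for induced in pattern:
        if induced:
            fraction += p_day * (1.0 - fraction)
        history.append(fraction)
    return history

# Hypothetical schedules: 6 induced days out of 12, with different timing.
contiguous = [True] * 6 + [False] * 6
alternating = [True, False] * 6

final_contiguous = recombinant_fraction(contiguous)[-1]
final_alternating = recombinant_fraction(alternating)[-1]
```

Under these assumptions both schedules end at the same recombinant fraction, because the memory state depends only on the number of induced generations and not on their order.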

The linear increase in the fraction of recombinants in the induced cell populations over time was consistent with a deterministic model (dashed lines in Fig. 4B, also see supplementary materials). Specifically, when triggered by inputs, SCRIBE can substantially increase the rate of recombination events at a specific target site above the WT rate [which is reportedly <10⁻¹⁰ events per generation in recA background (30)]. When recombination rates are ~10⁻⁴ events per generation, which is consistent with the recombination rate estimated for SCRIBE from data in Fig. 4B, a simple deterministic model and a detailed stochastic simulation both predict a linear increase in the frequency of recombinant alleles in a population over time, as long as this frequency is less than a few percent and cells in the population are equally fit over the time scale of interest (see supplementary materials and figs. S3 and S4). These models enable one to determine the ideal range of recombination rates for a given application, which depends on parameters such as the frequency of dilution, the sensitivity of the method used for reading the memory, the desired input duration to be recorded, and so forth. For example, recombination rates that are too low would be challenging to quantify and could result in loss of memory if the cultures were diluted. Moreover, higher recombination rates lead to more rapid saturation of memory capacity in which the system is unable to provide a straightforward linear relation between the input exposure time and the state of the memory (fig. S3). Thus, intermediate levels of recombination rates are desirable for population-level analog memory units that can record the time span of exposure to inputs (see supplementary materials).
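To make the linearity and saturation argument concrete, one plausible form of a deterministic model is the recurrence f(n+1) = f(n) + r·(1 − f(n)), where f is the recombinant fraction, n counts generations under induction, and r is the per-generation recombination rate. This recurrence is an assumption made for illustration (irreversible writing, equal fitness) and is not necessarily the model given in the supplementary materials. Its closed form is f(n) = 1 − (1 − r)^n, which is close to the straight line r·n while f remains small.

```python
def exact_fraction(rate: float, generations: int) -> float:
    """Closed form of f(n+1) = f(n) + rate * (1 - f(n)) with f(0) = 0."""
    return 1.0 - (1.0 - rate) ** generations

def linear_fraction(rate: float, generations: int) -> float:
    """Linear approximation, valid while the recombinant fraction is small."""
    return rate * generations

def relative_error(rate: float, generations: int) -> float:
    """Fractional overestimate made by the linear approximation."""
    exact = exact_fraction(rate, generations)
    return (linear_fraction(rate, generations) - exact) / exact

GENERATIONS = 120  # about 12 days at ~10 generations per day

# Intermediate rate quoted in the text: the response stays nearly linear.
error_intermediate = relative_error(1e-4, GENERATIONS)

# Hypothetical 100-fold higher rate: the memory saturates within the run.
error_high = relative_error(1e-2, GENERATIONS)
```

At r = 10⁻⁴ the recombinant fraction after 120 generations is about 1.2% and the linear approximation is off by less than 1%, whereas at a hypothetical r = 10⁻² most of the population is already edited and the linear relation no longer holds.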

Decoupling memory operations

SCRIBE memory can be used to create more complex synthetic memory circuits. To demonstrate this, we first built a synthetic gene circuit that can record different input magnitudes into DNA memory. The memory state can then be read out later (after the initial input is removed) upon addition of a secondary signal. Specifically, we built an IPTG-inducible lacZOFF (lacZA35TAA, S36TAG) reporter construct in DH5αPRO cells (Fig. 5A). Expression of this reporter is normally repressed except when IPTG (“read” signal, Fig. 5A) is added, thus enabling a convenient and switchable population-level readout of the memory based on total LacZ activity (Fig. 5B). The lacZOFF reporter cells were transformed with a plasmid encoding an aTc-inducible SCRIBE(lacZ)ON cassette (Fig. 5A). Overnight cultures were diluted and induced with various amounts of aTc to write the genomic memory (Fig. 5B). These cells were grown up to saturation and then diluted into fresh media in the presence or absence of IPTG to read the genomic memory (Fig. 5B). In the absence of IPTG, the total LacZ activity remained low, regardless of the aTc concentration. In the presence of IPTG, cultures that had been exposed to higher aTc concentrations had greater total LacZ activity. These results show that population-level reading of genomically encoded memory can be decoupled from writing and controlled externally. Furthermore, this circuit enables the magnitude of the inducer (aTc) to be stably recorded in the distributed genomic memory of a cellular population. Independent control over the read memory operation as shown in this experiment could help to minimize fitness costs associated with the expression of reporter genes until needed.

Fig. 5 SCRIBE memory operations can be decoupled into independent “input,” “write,” and “read” operations, thus facilitating greater control over addressable memory registers in genomic tape recorders and the creation of sample-and-hold circuits.

(A) We built a circuit where information about the first inducer (aTc) is recorded in the population, which can then be read later upon addition of a second inducer (IPTG) that triggers a read operation. We created an IPTG-inducible lacZOFF locus in the DH5αPRO background, which contains the full-length lacZ gene with two premature stop codons inside the open-reading frame. Expression of ssDNA(lacZ)ON from the aTc-inducible SCRIBE(lacZ)ON cassette results in the reversion of the stop codons inside lacZOFF to yield the lacZON genotype. (B) Cells harboring the circuit shown in (A) were grown in the presence of different levels of aTc for 24 hours at 30°C to enable recording into genomic DNA. Subsequently, cell populations were diluted into fresh media without or with IPTG (1 mM) and incubated at 37°C for 8 hours. (C) Total LacZ activity in these cultures was measured using a fluorogenic lacZ substrate (FDG) assay. The red dashed brackets marked with asterisks connect the closest data points of IPTG-induced samples that are statistically significant (P < 0.05 based on one-tailed Welch’s t test). A.U., arbitrary units. (D) We extended the circuit in (A) to create a sample-and-hold circuit where input, write, and read operations are independently controlled. This feature enables the creation of addressable read/write memory registers in the genomic DNA tape. Induction of cells with the input signal (AHL) produces ssDNA(lacZ)ON, which targets the genomic lacZOFF locus for reversion to the WT sequence. In the presence of the write signal (aTc), which expresses Beta, ssDNA(lacZ)ON is recombined into the lacZOFF locus and produces the lacZON genotype. Thus, the write signal enables the input signal to be sampled and held in memory. The total LacZ activity in the cell populations is retrieved by adding the read signal (IPTG). 
(E) Cells harboring the circuit shown in (D) were induced with different combinations of aTc (100 ng/ml) and AHL (50 ng/ml) for 24 hours, after which the cultures were diluted in fresh media with or without IPTG (1 mM). These cultures were then incubated at 37°C for 8 hours and assayed for total LacZ activity with the FDG assay. (F) Cell populations that received both the input and write signals followed by the read signal exhibited enhanced levels of total LacZ activity. Error bars indicate SEM for three independent biological replicates.

We have shown that (i) both ssDNA expression and Beta are required for writing into genomic memory (Fig. 1C), (ii) multiple ssDNAs can be used to independently address different memory units (Fig. 3), and (iii) genomic memory is stably recorded into DNA and can be used to modify functional genes whose expression can be controlled by external inducers (Figs. 1 to 4). Thus, SCRIBE memory units can be conceptually decomposed into separate “input,” “write,” and “read” operations to facilitate greater control and the integration of logic with memory. The separation of these signals could enable master control over the writing of multiple independent inputs into genomic memory. To achieve this, we placed the msd(lacZ)ON cassette under the control of an acyl homoserine lactone (AHL)–inducible promoter (PluxR) (31) and cotransformed this plasmid with an aTc-inducible Beta-expressing plasmid into the lacZOFF reporter strain (Fig. 5D). Using this design, information on the input (ssDNA expression via addition of AHL) can be written into DNA memory only in the presence of the write signal (Beta expression via addition of aTc). The information recorded in the memory register (i.e., the state of lacZ across the population) can be retrieved by adding the read signal (IPTG).

To demonstrate this, overnight lacZOFF cultures harboring the circuit shown in Fig. 5D were diluted and then grown to saturation in the presence of all four possible combinations of AHL and aTc (Fig. 5E). The saturated cultures were then diluted into fresh media in the absence or presence of IPTG. As shown in Fig. 5F, only cultures that had been exposed to both the input and write signals simultaneously showed substantial LacZ activity, and only when they were induced with the read signal. These results indicate that short stretches of DNA in living organisms can be used as addressable read/write memory registers to record transcriptional inputs. Furthermore, SCRIBE memory can be combined with logic, such as the AND function between the input and write signals shown here. The logic in Fig. 5D enables this circuit to act as a “sample-and-hold” system in which information about an input can be recorded in the presence of another signal and read out at will. Additional inputs in the form of orthogonal ssDNAs under the control of other inducible promoters (e.g., Fig. 3) could be written into genomic memory only when the write signal (Beta expression) is present. Thus, SCRIBE memory units can be readily reprogrammed, integrated with logic circuits, and decomposed into independent input, write, and read operations. We anticipate that more complex logic circuits could be combined with SCRIBE-based memory to create analog memory and computation systems capable of storing the results of multi-input calculations (32, 33).
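
The sample-and-hold behavior described above can be summarized as a Boolean idealization. The following is a minimal sketch, assuming each signal is simply present or absent; the function names are hypothetical, and the real circuit produces graded, population-level LacZ activity rather than a binary output.

```python
from itertools import product


def write_occurs(input_ahl: bool, write_atc: bool) -> bool:
    """Memory is written only when ssDNA (AHL input) AND Beta (aTc write) are both expressed."""
    return input_ahl and write_atc


def lacz_readout(input_ahl: bool, write_atc: bool, read_iptg: bool) -> bool:
    """Substantial LacZ activity needs a prior write and the read signal (IPTG)."""
    return write_occurs(input_ahl, write_atc) and read_iptg


# Enumerate all eight combinations of (AHL, aTc, IPTG); idealized, not measured data.
truth_table = {
    combo: lacz_readout(*combo) for combo in product((False, True), repeat=3)
}

for (ahl, atc, iptg), active in truth_table.items():
    print(f"AHL={ahl!s:5} aTc={atc!s:5} IPTG={iptg!s:5} -> LacZ active: {active}")
```

In this idealization only one of the eight combinations yields a readout, which mirrors the qualitative pattern reported for Fig. 5F.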

Discussion

We described a scalable platform that uses genomic DNA for analog, rewritable, and flexible memory distributed across living cell populations. One current limitation is the number of orthogonal inducible promoters that can be used as inputs, but this could be addressed by the development of additional inducible transcriptional regulatory devices (34). Additionally, ssDNA expression can be coupled to endogenous promoters to sense and record native cellular events. Although we primarily targeted mutations into functional genes to facilitate convenient functional and reporter assays, natural or synthetic noncoding DNA segments could also be used to record memory within genomic DNA. The recorded memory could then be read by high-throughput sequencing (fig. S2). A potential benefit of using synthetic DNA segments as memory registers is the ability to introduce mutations for memory storage that are neutral in terms of fitness costs.

SCRIBE enables conditional increases in the recombination rate at specific loci beyond background levels. The maximum observed recombination rate of the current SCRIBE platform (~10^−4 recombination events per generation) is suitable for long-term recording of analog memory distributed across the collective genomes of cellular populations (fig. S3). However, it is not high enough to allow recording of digital information and efficient genome editing at the single-cell level. In principle, population-level analog memory could be achieved by other types of DNA memory switches, such as site-specific recombinases, if they were tuned to achieve intermediate recombination rates. Further investigation is required to determine the exact mechanisms involved in processing retron-based ssDNAs for recombination into genomic DNA and the effects of different growth conditions on SCRIBE memory. Because Beta-mediated recombination is replication-dependent (27–29) and ssDNA is believed to be recombined into the genome during passage of the replication fork (27), we speculate that only actively dividing cells are likely to participate in the described population-level memory. Future optimization of SCRIBE [e.g., by modulating the mismatch repair system (14) and cellular exonucleases (35)] could lead to more efficient single-cell digital memories. This could enable other useful applications, including recording extracellular and intracellular events at the single-cell level for biological studies, dynamic engineering of cellular phenotypes, experimental evolution and population dynamics studies, single-cell computation and memory, the construction of complex cellular state machines and biological Turing machines, and enhanced genome engineering techniques.
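
To give a feel for why a rate near 10^−4 per generation suits population-level analog recording, the sketch below computes the expected recombinant fraction after a number of generations. It is an illustrative simplification, not the analysis behind fig. S3: it assumes a constant per-generation conversion probability, no reversion, and no fitness difference between recombinant and nonrecombinant cells. The function names are hypothetical.

```python
import math


def recombinant_fraction(rate_per_generation: float, generations: float) -> float:
    """Expected fraction of recombinant cells after `generations` of induction.

    Assumes each unconverted cell converts with fixed probability
    `rate_per_generation` per generation and never reverts.
    """
    if not 0.0 <= rate_per_generation < 1.0:
        raise ValueError("rate_per_generation must be in [0, 1)")
    return 1.0 - (1.0 - rate_per_generation) ** generations


def generations_to_reach(rate_per_generation: float, target_fraction: float) -> float:
    """Generations of induction needed to reach `target_fraction` under the same assumptions."""
    if not 0.0 < rate_per_generation < 1.0 or not 0.0 <= target_fraction < 1.0:
        raise ValueError("rate must be in (0, 1) and target_fraction in [0, 1)")
    return math.log(1.0 - target_fraction) / math.log(1.0 - rate_per_generation)


if __name__ == "__main__":
    rate = 1e-4  # order of magnitude quoted in the text
    for n in (1, 10, 100, 1000):
        print(f"{n:5d} generations -> fraction {recombinant_fraction(rate, n):.3e}")
```

Under these assumptions the fraction grows almost linearly with exposure time while it remains small, so the population stays far from saturation over many generations. That is the regime in which the recombinant frequency can act as a graded record of the duration of exposure.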

Additionally, because retrons have been found in a diverse range of microorganisms (20), in vivo ssDNA expression could potentially be extended to hard-to-transform organisms in which SCRIBE plasmids could be introduced by conjugation or transduction. Because retrons have also been shown to be functional in eukaryotes (24, 36, 37), they could potentially be used with other genome editing tools for memory. Moreover, by using error-prone RNA polymerases (38) and reverse transcriptases (39, 40), we anticipate that mutagenized ssDNA libraries could be generated inside cells for in vivo continuous evolution (41) and cellular barcoding applications. Finally, in vivo ssDNA generation could potentially be used to create DNA nanosystems (42–48) and ssDNA-protein hybrid nanomachines in living cells (49) or could be optimized and scaled up to create an economical source of ssDNAs for DNA nanotechnology (50). In summary, we envision that in vivo ssDNA production and SCRIBE platforms will open up a broad range of new capabilities for engineering biology.

Materials and methods

Strains and plasmids

Conventional cloning methods were used to construct the plasmids. Lists of strains and plasmids used in this study and the construction procedures are provided in tables S1 and S2, respectively. The sequences for the synthetic parts and primers are provided in tables S3 and S4.

Cells and antibiotics

Chemically competent E. coli DH5α was used for cloning. Unless otherwise noted, antibiotics were used at the following concentrations: carbenicillin (50 μg/ml), kanamycin (20 μg/ml), chloramphenicol (30 μg/ml), and spectinomycin (100 μg/ml). In the experiment shown in Fig. 2, kanamycin (15 μg/ml) and chloramphenicol (15 μg/ml) were used.

Detection of single-stranded DNA

Cultures harboring IPTG-inducible plasmids encoding msd(wt), msd(wt) with deactivated RT [msd(wt)_dRT], or msd(kanR)ON were grown overnight with or without IPTG (1 mM). Total RNA samples were prepared from noninduced or induced cultures using TRIzol reagent (Invitrogen) according to the manufacturer’s protocol. Total RNA (10 μg) from each sample was treated with ribonuclease A (37°C, 2 hours) to remove RNA species and the msr moiety. The samples were then resolved on a 10% tris-borate EDTA–urea denaturing gel and visualized with SYBR-Gold. A polyacrylamide gel electrophoresis–purified synthetic oligo (FF_oligo347, 50 pmol) with the same sequence as ssDNA(wt) was used as a molecular size marker. The band intensities were measured with Fiji software (51). The intensities were normalized to the intensity of the marker oligo, and the normalized intensities were used to calculate the amount of ssDNA in each sample.
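
The normalization step can be illustrated as follows. This is a minimal sketch, assuming band intensity scales linearly with the amount of ssDNA loaded and that the 50 pmol marker lane serves as the single calibration point; the function name and the intensity values are hypothetical.

```python
MARKER_PMOL = 50.0  # amount of marker oligo loaded, from the protocol above


def ssdna_pmol(band_intensity: float, marker_intensity: float,
               marker_pmol: float = MARKER_PMOL) -> float:
    """Estimate ssDNA amount from a band intensity normalized to the marker band.

    Assumes a linear relationship between intensity and amount (single-point calibration).
    """
    if marker_intensity <= 0:
        raise ValueError("marker_intensity must be positive")
    normalized = band_intensity / marker_intensity
    return normalized * marker_pmol


if __name__ == "__main__":
    # Hypothetical integrated intensities in arbitrary units.
    print(ssdna_pmol(band_intensity=1200.0, marker_intensity=6000.0))
```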

Induction of cells and plating assays

For each experiment, three transformants were separately inoculated in Luria broth (LB) media plus appropriate antibiotics and grown overnight [37°C, 700 revolutions per minute (RPM)] to obtain seed cultures. Unless otherwise noted, inductions were performed by diluting the seed cultures (1:1000) in 2 ml of prewarmed LB plus appropriate antibiotics with or without inducers, followed by 24 hours of incubation (30°C, 700 RPM). Aliquots of the samples were then serially diluted, and appropriate dilutions were plated on selective media to determine the number of recombinants and viable cells in each culture. For each sample, the recombinant frequency was reported as the mean of the ratio of recombinants to viable cells for three independent replicates.
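
The recombinant frequency calculation can be sketched as below. The colony counts, dilution factors, and plated volumes are hypothetical, and the data layout is an assumption made for illustration. The one point taken from the text is that the ratio is computed per replicate first and then averaged.

```python
from statistics import mean


def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Back-calculate colony-forming units per ml from a plate count."""
    return colonies * dilution_factor / plated_volume_ml


def recombinant_frequency(replicates: list[dict]) -> float:
    """Mean over replicates of (recombinant CFU/ml) / (viable CFU/ml).

    Each replicate holds a (colonies, dilution_factor, plated_volume_ml) tuple for
    the selective plate and for the viable-count plate.
    """
    ratios = []
    for rep in replicates:
        recombinants = cfu_per_ml(*rep["selective"])
        viable = cfu_per_ml(*rep["viable"])
        ratios.append(recombinants / viable)
    return mean(ratios)


if __name__ == "__main__":
    # Hypothetical plate counts for three independent replicates.
    example = [
        {"selective": (50, 1e2, 0.1), "viable": (100, 1e6, 0.1)},
        {"selective": (100, 1e2, 0.1), "viable": (100, 1e6, 0.1)},
        {"selective": (150, 1e2, 0.1), "viable": (100, 1e6, 0.1)},
    ]
    print(f"recombinant frequency: {recombinant_frequency(example):.2e}")
```

Averaging the per-replicate ratios, rather than dividing pooled counts, keeps each biological replicate equally weighted regardless of how many colonies its plates happened to yield.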

In all experiments, the number of viable cells was determined by plating aliquots of cultures on LB-plus-spectinomycin plates. LB-plus-kanamycin plates were used to determine the number of recombinants in the kanR reversion assay. For the galK reversion assay (Fig. 2), the numbers of galKON recombinants were determined by plating the cells on MOPS EZ rich–defined media (Teknova) plus galactose (0.2%). The numbers of galKOFF recombinants were determined by plating the cells on MOPS EZ rich–defined media plus glycerol (0.2%) plus 2-DOG (2%). For the experiment shown in Fig. 3, the numbers of kanRON galKON and kanROFF galKOFF cells were determined by using LB-plus-kanamycin plates and MOPS EZ rich–defined media plus glycerol (0.2%), 2-DOG (2%), and d-biotin (0.01%), respectively. The numbers of kanRON galKOFF cells were determined by plating the cells on MOPS EZ rich–defined media plus glycerol (0.2%), 2-DOG (2%), kanamycin, and d-biotin (0.01%).

For the light-inducible SCRIBE experiment (Fig. 4A), induction was performed with white light (using the built-in fluorescent lamp in a VWR 1585 shaker incubator). The “dark” condition was achieved by wrapping aluminum foil around the tubes. Growth of and sampling from these cultures were performed as described earlier.

LacZ assay

Overnight seed cultures were diluted (1:1000) in prewarmed LB plus appropriate antibiotics and inducers [with different concentrations of aTc or without aTc (Fig. 5, A to C) and with all four possible combinations of aTc (100 ng/ml) and AHL (50 ng/ml) (Fig. 5, D to F)] and incubated for 24 hours (30°C, 700 RPM). These cultures were then diluted (1:50) in prewarmed LB plus appropriate antibiotics with or without IPTG (1 mM) and incubated for 8 hours (37°C, 700 RPM). To measure LacZ activity, 60 μl of each culture was mixed with 60 μl of B-PER II reagent (Pierce Biotechnology) and fluorescein di-β-d-galactopyranoside (FDG) (0.05 mg/ml final concentration). The fluorescence signal (excitation/emission: 485/515 nm) was monitored in a plate reader with continuous shaking for 2 hours. The LacZ activity was calculated by normalizing the rate of FDG hydrolysis (obtained from fluorescence signal) to the initial optical density. For each sample, LacZ activity was reported as the mean of three independent biological replicates.
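
The LacZ activity calculation can be sketched as follows. The text specifies only that the rate of FDG hydrolysis is normalized to the initial optical density. Estimating that rate as the least-squares slope of fluorescence over the whole time course is an assumption made here for illustration, and the function names and readings are hypothetical.

```python
from statistics import mean


def least_squares_slope(times: list[float], values: list[float]) -> float:
    """Slope of the ordinary least-squares line through (time, value) points."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times) / len(times)
    mean_v = sum(values) / len(values)
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def lacz_activity(times_min: list[float], fluorescence: list[float],
                  initial_od: float) -> float:
    """Rate of fluorescence increase (A.U. per minute) divided by the initial OD."""
    if initial_od <= 0:
        raise ValueError("initial_od must be positive")
    return least_squares_slope(times_min, fluorescence) / initial_od


if __name__ == "__main__":
    # Hypothetical plate-reader trace over the 2-hour window, one replicate.
    times = [0.0, 30.0, 60.0, 90.0, 120.0]
    signal = [100.0, 160.0, 220.0, 280.0, 340.0]
    replicate_activities = [lacz_activity(times, signal, od) for od in (0.5, 0.5, 0.5)]
    print(mean(replicate_activities))  # mean of three replicates, as in the protocol
```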

Supplementary Materials

www.sciencemag.org/content/346/6211/1256272/suppl/DC1

Supplementary Text

Figs. S1 to S6

Tables S1 to S4

References (52–61)

References and Notes

  1. Acknowledgments: This work was supported by the NIH New Innovator Award (1DP2OD008435), NIH National Centers for Systems Biology (1P50GM098792), the U.S. Office of Naval Research (N000141310424), and the Defense Advanced Research Projects Agency. Sequencing data have been deposited in GenBank with accession numbers KM923743 to KM923754.