Essays on Science and Society
Cell and Molecular Biology

Creating the protein version of DNA base pairing

Science  22 Nov 2019:
Vol. 366, Issue 6468, p. 965
DOI: 10.1126/science.aaz7777

Specific interactions between molecules give rise to life. Nucleic acids are a good example: The pairing of adenine (A) with thymine (T) and cytosine (C) with guanine (G) ensures accurate passage of genetic material across generations. Thanks to the programmable and modular nature of A-T/C-G base pairing, DNA has been repurposed to build nanoscopic smiley faces (1) and robots that sort molecular cargoes (2), giving rise to the field of DNA nanotechnology.

As an undergraduate student, I became intrigued by the possibility of engineering protein-protein interactions (PPIs) in a similarly modular manner. However, I found that there was no straightforward way to artificially encode specificity in PPIs, which are often mediated by multiple side chains in a structurally nonmodular fashion. [Two features contribute to the modular specificity of DNA base pairing: buried hydrogen bonds between the bases (specificity), and the repetitive double helical backbone (modularity).]

As a first-year graduate student working in the labs of David Baker and Frank DiMaio at the University of Washington, I contributed to a project that aimed to design specific hydrogen bonds (mediated by polar amino acid side chains) at the binding interfaces between de novo designed protein α-helical bundles. We computationally designed and experimentally validated homodimers, trimers, and tetramers with such structural features (3). This work showcased the engineering of a new interaction modality—the protein equivalent of DNA base pairing. I went on to pursue three projects in parallel. Each project (described below) represented a key aspect of programmable protein interactions.

Programmable Design of Orthogonal Protein Heterodimers

To push the limit of designed specificity in proteins, I decided to design a large set of mutually orthogonal protein heterodimers (see the figure). Heterodimers are more difficult to create, as the designer needs not only to achieve structurally accurate heterodimerization but also to disfavor each monomer's tendency to homodimerize. I began this project with the hypothesis that two monomers could heterodimerize only if their hydrogen-bonding patterns matched exactly at the binding interface.

Using large-scale computational sampling, I demonstrated that protein heterodimeric interaction specificity can be achieved using extensive hydrogen-bond networks (see the figure). The specificity and modularity of these networks recapitulate those of Watson-Crick base pairing, but in the context of proteins. Furthermore, I mixed 14 pairs of heterodimers in a single test tube and found that only the designed pairs associated, with almost no spurious cross-talk, a property that is taken for granted with DNA but had been missing in designed proteins (4).
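The matching hypothesis can be made concrete with a toy model. The sketch below is only an illustration, not the design method used in this work: it assumes each binding interface can be reduced to a string of hydrogen-bond donor ("D") and acceptor ("A") sites, and that two monomers bind only when every site pairs a donor with an acceptor. The function names and the five-site interface are hypothetical choices made for the example.

```python
from itertools import product

# Assumption: each interface site is either a donor "D" or an acceptor "A".
COMPLEMENT = {"D": "A", "A": "D"}

def binds(a: str, b: str) -> bool:
    """Toy rule: bind only if every site pairs a donor with an acceptor."""
    return len(a) == len(b) and all(COMPLEMENT[x] == y for x, y in zip(a, b))

def make_pairs(n_sites: int, n_pairs: int):
    """Enumerate patterns and pair each unused pattern with its complement."""
    pairs, seen = [], set()
    for sites in product("DA", repeat=n_sites):
        p = "".join(sites)
        q = "".join(COMPLEMENT[x] for x in p)
        if p in seen or q in seen:
            continue
        seen.update({p, q})
        pairs.append((p, q))
        if len(pairs) == n_pairs:
            break
    return pairs

def crosstalk(pairs):
    """List every binding event (including homodimers) that was not designed."""
    monomers = [m for pair in pairs for m in pair]
    designed = {frozenset(pair) for pair in pairs}
    return [
        (a, b)
        for i, a in enumerate(monomers)
        for b in monomers[i:]  # b == a checks homodimerization
        if binds(a, b) and frozenset((a, b)) not in designed
    ]

pairs = make_pairs(n_sites=5, n_pairs=14)
print(len(pairs), "designed pairs; unintended interactions:", crosstalk(pairs))
```

In this simplified picture a pattern can never be complementary to itself, so homodimers are excluded by construction, and each monomer has exactly one partner. Real interfaces involve three-dimensional geometry and energetics that a string model does not capture.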

De Novo Design of Protein Logic Gates

Until now, the lack of available orthogonal protein pairs has made it difficult to control cell signaling at the posttranslational level (5, 6) and has limited the field of synthetic biology to mainly transcriptional regulation (7–9). Once we had a large set of mutually orthogonal proteins at hand, as Star Trek's Spock would no doubt agree, “it was only logical” to design protein-based logic circuits in cells.

By modularly combining the set of orthogonal protein heterodimers, I designed two-input AND, OR, NAND, NOR, XNOR, and NOT logic gates that could regulate the association of arbitrary protein units ranging from split enzymes to transcriptional machinery both in vitro and in vivo. I further demonstrated the modularity of this approach by extending it to various three-input logic gates.
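One way to picture how orthogonal pairs compose into gates is a connectivity model. The sketch below is a hypothetical simplification, not the published gate architecture: it assumes each protein is a chain of domains, that a domain such as "1" binds only its designed partner "1'", and that the gate output is whether two reporter units (here called "N" and "C") end up physically linked. All domain labels and input constructs are invented for the example.

```python
from itertools import product

def partner(domain: str) -> str:
    """Designed partner of a domain: "1" <-> "1'" (an assumed naming scheme)."""
    return domain[:-1] if domain.endswith("'") else domain + "'"

def associated(proteins: dict, start: str, goal: str) -> bool:
    """True if start and goal are linked through a chain of designed pairs."""
    reached, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for name, domains in proteins.items():
            if name in reached:
                continue
            if any(partner(d) in domains for d in proteins[current]):
                reached.add(name)
                frontier.append(name)
    return goal in reached

# Two reporter units (for example, halves of a split enzyme), each fused to one domain.
REPORTERS = {"N": ("1",), "C": ("2",)}
# AND: the two inputs must bind each other (3 with 3') to bridge N and C.
AND_INPUTS = {"in1": ("1'", "3"), "in2": ("3'", "2'")}
# OR: either input alone bridges N and C.
OR_INPUTS = {"in1": ("1'", "2'"), "in2": ("1'", "2'")}

def gate(inputs: dict, a: bool, b: bool) -> bool:
    """Evaluate a gate for the presence or absence of its two input proteins."""
    present = dict(REPORTERS)
    if a:
        present["in1"] = inputs["in1"]
    if b:
        present["in2"] = inputs["in2"]
    return associated(present, "N", "C")

for a, b in product([False, True], repeat=2):
    print(a, b, "AND:", gate(AND_INPUTS, a, b), "OR:", gate(OR_INPUTS, a, b))
```

This model ignores concentrations and competition between binders, so it cannot express the negating gates (NOT, NOR, NAND, XNOR), which would require an input that displaces or blocks an existing interaction.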

While experimentally characterizing these logic gates, I discovered that they operate in a cooperative manner: One protein does not bind another without a third protein that opens up a binding interface. Binding interaction cooperativity makes the gates largely insensitive to stoichiometric imbalance in the inputs, making them ideal for cellular applications. The modularity and cooperativity of these control elements, coupled with the ability to de novo design an essentially unlimited number of protein components, should enable the creation of sophisticated posttranslational control logic over a wide range of biological functions.

Figure. (A) An illustration of the protein version of DNA base pairing. Interactions among color-coded protein monomers are mediated by hydrogen bonds much like those found in DNA. Two monomers bind only if there is perfect pairing of hydrogen bonding patterns at the binding interface. (B) A heterodimer design model with monomers colored green and purple, superimposed on its crystal structure (white); colored cross-sections of backbones (red and green) indicate the locations of designed hydrogen-bond networks.


Self-Assembling 2D Materials with De Novo Protein Building Blocks

Creating two-dimensional (2D) materials is straightforward with DNA but has been difficult to achieve using proteins, owing to the lack of modular specificity in their interactions. I developed a general computational algorithm to design 2D materials using de novo designed protein building blocks.

By optimizing the sequences at binding interfaces, I showed that the same protein building block can be reconfigured into two different array geometries. The single-layered 2D arrays assembled to the micrometer scale, as observed by electron microscopy and atomic force microscopy, and displayed perfect agreement with the designed lattice geometry (10).

This work, describing how a de novo designed protein is used as a building block to construct 2D materials, opens the door to the creation of programmable protein-based materials.

Toward a Fully Synthetic Future

More than 60 years have passed since the discovery of Watson-Crick base pairing, and we are beginning to engineer such features into proteins. Now that we have proteins that behave in a manner similar to that of DNA molecules, the timing could not be better to start exploring synthetic biology using de novo designed proteins.

As a start, we designed proteins that are sensitive to pH changes and capable of endosomal escape (11); proteins that insert into cell membranes and can potentially become synthetic ion channels (12); and proteins that change conformation to send out a customizable signal upon binding to small peptides (13). I'm tremendously excited to see what comes next.

Zibo Chen

Zibo Chen received his undergraduate degree from the National University of Singapore and his Ph.D. in biochemistry from the University of Washington. He is currently a postdoctoral scholar at the California Institute of Technology, where he is programming mammalian cells using proteins designed from scratch.

References and Notes

Acknowledgments: I am grateful to my Ph.D. advisers D. Baker and F. DiMaio. I also thank S. Boyken for his mentorship.