Research Article

Global analysis of protein folding using massively parallel design, synthesis, and testing

Science  14 Jul 2017:
Vol. 357, Issue 6347, pp. 168-175
DOI: 10.1126/science.aan0693

Exploring structure space to understand stability

Understanding the determinants of protein stability is challenging because native proteins have conformations that are optimized for function. Proteins designed without functional bias could give insight into how structure determines stability, but this requires a large sample size. Rocklin et al. report a high-throughput protein design and characterization method that allows them to measure thousands of miniproteins (see the Perspective by Woolfson et al.). Iterative rounds of design and characterization increased the design success rate from 6% to 47% and provided insight into the balance of forces that determine protein stability.

Science, this issue p. 168; see also p. 133


Proteins fold into unique native structures stabilized by thousands of weak interactions that collectively overcome the entropic cost of folding. Although these forces are “encoded” in the thousands of known protein structures, “decoding” them is challenging because of the complexity of natural proteins that have evolved for function, not stability. We combined computational protein design, next-generation gene synthesis, and a high-throughput protease susceptibility assay to measure folding and stability for more than 15,000 de novo designed miniproteins, 1000 natural proteins, 10,000 point mutants, and 30,000 negative control sequences. This analysis identified more than 2500 stable designed proteins in four basic folds—a number sufficient to enable us to systematically examine how sequence determines folding and stability in uncharted protein space. Iteration between design and experiment increased the design success rate from 6% to 47%, produced stable proteins unlike those found in nature for topologies where design was initially unsuccessful, and revealed subtle contributions to stability as designs became increasingly optimized. Our approach achieves the long-standing goal of a tight feedback cycle between computation and experiment and has the potential to transform computational protein design into a data-driven science.
