Computational design of water-soluble α-helical barrels


Science  24 Oct 2014:
Vol. 346, Issue 6208, pp. 485-488
DOI: 10.1126/science.1257452

Building with α-helical coiled coils

Understanding how proteins fold into well-defined three-dimensional structures has been a longstanding challenge. Increased understanding has led to increased success at designing proteins that mimic existing protein folds. This raises the possibility of custom design of proteins with structures not seen in nature. Thomson et al. describe the design of channel-containing α-helical barrels, and Huang et al. designed hyperstable helical bundles. Both groups used rational and computational design to make new protein structures based on α-helical coiled coils but took different routes to reach different target structures.

Science, this issue p. 485, p. 481


The design of protein sequences that fold into prescribed de novo structures is challenging. General solutions to this problem require geometric descriptions of protein folds and methods to fit sequences to these. The α-helical coiled coils present a promising class of protein for this and offer considerable scope for exploring hitherto unseen structures. For α-helical barrels, which have more than four helices and accessible central channels, many of the possible structures remain unobserved. Here, we combine geometrical considerations, knowledge-based scoring, and atomistic modeling to facilitate the design of new channel-containing α-helical barrels. X-ray crystal structures of the resulting designs match predicted in silico models. Furthermore, the observed channels are chemically defined and have diameters related to oligomer state, which present routes to design protein function.

Defining protein sequences that fold into specified three-dimensional structures is called the “inverse protein-folding problem” (1). Mostly, this has been applied to mimic existing folds (2). However, the design of structures not yet seen in nature can also be considered (3). Repeat proteins are of interest here, as extrapolation from known structures can provide geometric parameters and sequence-to-structure relations to guide design (4, 5), and proteins with cyclic symmetry offer possibilities for systematic variation of repeating elements to produce families of proteins (6). One example is the α-helical coiled coil (7, 8). Classical coiled coils comprise bundles of two to four α helices, account for >98% of known coiled-coil structures (9, 10), and have well-understood sequence-to-structure relations (7, 8). Unusually for proteins, the conformations of coiled-coil backbones are well described by a small number of parameters (11-14). Consequently, a relatively large number of successful coiled-coil structures have been designed (8), although, with a few exceptions (8, 13), these have largely mimicked natural precedents.

The α-helical barrels present an intriguing subset of coiled coils to move beyond known structures (15). These have more-complex helical packing (15, 16), which results in the assembly of five or more α helices into cylindrical bundles with central channels or pores (15). The few current examples include natural parallel 5- and 10-helix structures, and antiparallel 10- and 12-helix bundles (17-20); a de novo parallel hexamer, achieved partly serendipitously (21); and a mutant leucine-zipper peptide that forms an unusual staggered parallel 7-helix arrangement (22). For these, there is a near-linear relation between lumen size and oligomer state (Fig. 1A), which opens possibilities for designing channel proteins. However, because of the scarcity of α-helical barrels and because these are usually parts of larger membrane-spanning proteins, it is difficult to derive rules to design new examples. To overcome this, we describe a geometrical and computational framework for designing α-helical barrels from first principles and apply this to deliver discrete, water-soluble assemblies with five to seven parallel and identical helices.

Fig. 1 Interfaces, packing and scoring in coiled-coil design.

(A) Relation between oligomer state and pore diameter for existing α-helical barrels (red) and the de novo structures described herein (blue); data are given as means ± SD; the dotted line shows linear regression (R² = 0.86). (B) Helical-wheel diagram representation of a type N heterodimer. (C) Section through a coiled-coil heterodimer crystal structure (PDB ID: 1fos). (D) Complete helical-wheel diagram for a type II α-helical barrel (pentamer shown with M = 1), illustrating the heterotypic interfaces between cdga and deab. Compared with classical parallel dimers (B), which make interhelical gade contacts g→e′, a→a′, d→d′, and e→g′, the approximately equivalent primary contacts in helical α-barrels are c→b′, d→e′, g→a′, and a→d′, respectively. In addition, three of the four geometric parameters required to describe α-helical barrels are shown: coiled-coil radius (r), oligomer state (N = M + 4), and helical offset (ω1). Superhelical pitch, P, is not represented. (E) Section through a coiled-coil pentamer crystal structure (PDB ID: 1mz9). (F) Helical wheel for an isolated heterodimer-like interface in α-helical barrels. (G) Heterodimer-like interfaces in a single sequence with a type II repeat, showing the shared a and d positions. (H) Calculation of fitness score. (I) Designed sequence classes following filtering. Numbers of interaction pairs and/or sequences are shown in brackets at each step for (F) to (I). PyMol was used to create panels (C) and (E).

First, we required a means to map structural and sequence relations between the new targets and the plentiful examples of classical coiled coils. The latter have heptad sequence repeats, (hxxhxxx)n, where h and x are hydrophobic and polar residues, respectively; often labeled abcdefg, this places h-type residues at a and d. A resulting hydrophobic seam mediates helix association and packing, with the interface often buttressed by polar interactions between e and g (Fig. 1, B and C). These are type-N interfaces, and the residues at the “gade” positions determine oligomer state and partner selection (8, 23).

In an α-helical barrel, each helix interacts with two neighbors via independent hydrophobic seams (Fig. 1, D and E). There are three ways to achieve this within a heptad repeat: The two seams can share one residue (type I interfaces); be adjacent (type II) (Fig. 1D); or be separated by an intervening residue (type III) (15, 16). We hypothesized that type II and III interfaces can lead to α-helical barrels, with the oligomer state determined by the angular offset between the two interfaces. Because differences in this contact angle between helices become smaller with increasing oligomeric state (fig. S1A), we anticipated that controlling barrel size would be more tractable for smaller assemblies. Therefore, we concentrated on type II interfaces and “hhxxhhx” repeats (read g→f), which should define oligomers with five to seven helices.

Also with increasing oligomer state, the helix-helix interfaces in α-helical barrels become more like those of classical coiled-coil dimers, specifically heterodimeric interfaces (fig. S1C and Fig. 1, B and D). Therefore, we devised a scoring system to select sequences encoding two heterotypic hydrophobic seams. This treated each seam as one-half of a heterodimer, with the seams sharing two residues (Fig. 1, F and G). In terms of traditional heptads, these seams comprise residues at deab and cdga (each equivalent to the gade positions in classical parallel dimers) and combine to give a gabcde repeat. We used the bZIP scoring function (24) to assess many deab plus cdga interfaces and to identify potential heterotypic pairs. We considered all combinations of A, E, I, K, L, N, Q, R, S, and V residues, which are commonly found in parallel dimers (10). Because of the shared residues, the initial screen had 10⁶ sequences (ten residue types at each of six positions). For each deab plus cdga pair, a “fitness score” was calculated by subtracting the highest homo-paired score from that for the hetero pair (Fig. 1H). The identified pairs were filtered further by the raw bZIP pairing score, to give 7578 hits.
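The fitness calculation can be illustrated with a minimal Python sketch. It is not the published pipeline: `toy_pair_score` is a toy stand-in for the bZIP scoring function (24), and the function names, the charge-based scoring rule, and the residue ordering within each seam are assumptions made only for illustration.

```python
RESIDUES = "AEIKLNQRSV"  # the ten residue types considered in the screen


def count_repeats(n_positions=6):
    """Size of the search space: ten residue types at each gabcde position."""
    return len(RESIDUES) ** n_positions


def seams_from_repeat(repeat):
    """Split a six-letter repeat (g-a-b-c-d-e order) into its two seams.

    The deab and cdga seams share the residues at a and d.
    """
    g, a, b, c, d, e = repeat
    return d + e + a + b, c + d + g + a


def toy_pair_score(seam_1, seam_2):
    """Toy pairing score (NOT the bZIP function; assumption for illustration).

    Opposite charges at facing positions add 1; like charges subtract 1.
    """
    charge = {"E": -1, "K": 1, "R": 1}
    return -sum(charge.get(x, 0) * charge.get(y, 0)
                for x, y in zip(seam_1, seam_2))


def fitness(deab, cdga, pair_score=toy_pair_score):
    """Hetero-pair score minus the highest of the two homo-pair scores."""
    hetero = pair_score(deab, cdga)
    best_homo = max(pair_score(deab, deab), pair_score(cdga, cdga))
    return hetero - best_homo
```

Any real scoring function with the same two-argument signature could be passed as `pair_score`; a positive fitness then indicates that the hetero pairing is preferred over either homo pairing.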

Further screening removed sequences with anticipated destabilizing polar residues at a and d positions; selected those with potential b→c′, b→g′, e→c′, or e→g′ salt-bridging interactions; and excluded hydrophobic residues at peripheral b and c positions. This reduced the set to 370 sequences. Many of these resembled classical tetramers, with Leu at a, Ile at d, and polar residues at e and g. Thus, to select larger assemblies, we added a requirement for a hydrophobic residue at one or both of e and g, which reduced the set to 188 sequences. There was some redundancy between Arg and Lys, so we retained only the more readily synthesized Lys-containing sequences. Of the initial 10⁶ sequences, 76 met the full selection criteria. Of these, 22 repeats representing sequence diversity in the full set were chosen for further study (Fig. 1I). These were named after their gabcde repeat and synthesized as four-heptad peptides (table S1).
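The positional filters described above can be sketched as a simple predicate on a gabcde repeat. This is an illustrative sketch only: the hydrophobic/polar partition (`HYDROPHOBIC`), the function names, and the test repeat `ELKKIE` are assumptions, and the salt-bridge selection step is deliberately omitted because its exact criteria are not given here.

```python
HYDROPHOBIC = set("AILV")  # assumed hydrophobic subset of the ten residue types


def heptad_positions(repeat):
    """Map a six-letter repeat, written in g-a-b-c-d-e order, to heptad labels."""
    if len(repeat) != 6:
        raise ValueError("expected a six-residue gabcde repeat")
    return dict(zip("gabcde", repeat))


def passes_filters(repeat):
    """Apply the positional filters (salt-bridge selection not modelled)."""
    pos = heptad_positions(repeat)
    # No polar residues at the core a and d positions.
    core_ok = pos["a"] in HYDROPHOBIC and pos["d"] in HYDROPHOBIC
    # No hydrophobic residues at the peripheral b and c positions.
    periphery_ok = pos["b"] not in HYDROPHOBIC and pos["c"] not in HYDROPHOBIC
    # A hydrophobic residue at one or both of e and g.
    flank_ok = pos["e"] in HYDROPHOBIC or pos["g"] in HYDROPHOBIC
    # Drop Arg-containing repeats, keeping the Lys-containing equivalents.
    lys_only = "R" not in repeat
    return core_ok and periphery_ok and flank_ok and lys_only
```

For example, the repeats ILQKIE, SLKEIA, and ALKEIA pass under these assumptions, whereas a tetramer-like repeat with polar residues at both e and g (such as the hypothetical ELKKIE) does not.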

Next, we developed a software tool, CCBuilder (25), to construct in silico models of α-helical coiled coils and barrels. This uses Crick’s equations to build coiled-coil backbones (11, 12), adds side chains using SCWRL (26) and PyRosetta (27), and assesses interhelix packing through implementations of the BUDE force field (25, 28) and SOCKET (29). For the 22 sequences, models were generated for each oligomer state with four to eight all-parallel helices. We used a genetic algorithm to search and optimize structural space defined by three independent parameters: radius, pitch, and the rotational offset between helices (Fig. 1D and table S2). The number of residues per α-helical turn also varies, because it is related to coiled-coil pitch; however, it was constrained within limits known for proteins (3.65 ± 0.07) (25). We predicted the preferred association state for each sequence, produced an atomistic model for this, and estimated an energetic difference between it and alternative states.
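To make the backbone-building step concrete, the following is a minimal sketch of a Crick-style parameterization that places Cα atoms for N identical, parallel helices arranged with cyclic symmetry. It is not CCBuilder's implementation: the function name, the default Cα helix radius (2.26 Å), rise per residue (1.51 Å), residues per turn, and the handedness sign convention are assumptions chosen for illustration, and side chains, scoring, and the genetic-algorithm search are not included.

```python
import math


def crick_backbone(n_helices, n_residues, radius, pitch,
                   helix_radius=2.26, rise=1.51, residues_per_turn=3.6,
                   phase_deg=0.0, left_handed=True):
    """Return one list of (x, y, z) C-alpha positions per helix (in angstroms).

    radius and pitch describe the supercoil; phase_deg is the rotation of
    each helix about its own axis. Default values are illustrative assumptions.
    """
    sign = -1.0 if left_handed else 1.0
    # Pitch angle between each helix axis and the supercoil (z) axis.
    alpha = sign * math.atan2(2.0 * math.pi * radius, pitch)
    # Rotation about the supercoil axis per residue (radians).
    w0 = rise * math.sin(alpha) / radius
    # Rotation about the helix axis per residue (radians).
    w1 = 2.0 * math.pi / residues_per_turn
    phi1 = math.radians(phase_deg)

    chains = []
    for k in range(n_helices):
        phi0 = 2.0 * math.pi * k / n_helices  # cyclic placement of helix k
        coords = []
        for t in range(n_residues):
            a0 = w0 * t + phi0
            a1 = w1 * t + phi1
            x = (radius * math.cos(a0)
                 + helix_radius * math.cos(a0) * math.cos(a1)
                 - helix_radius * math.cos(alpha) * math.sin(a0) * math.sin(a1))
            y = (radius * math.sin(a0)
                 + helix_radius * math.sin(a0) * math.cos(a1)
                 + helix_radius * math.cos(alpha) * math.cos(a0) * math.sin(a1))
            z = (rise * math.cos(alpha) * t
                 - helix_radius * math.sin(alpha) * math.sin(a1))
            coords.append((x, y, z))
        chains.append(coords)
    return chains
```

A call such as `crick_backbone(5, 28, radius=8.6, pitch=180.0)` (hypothetical parameter values) yields five symmetry-related chains of 28 residues each. A search over radius, pitch, and `phase_deg` would then score each candidate backbone after side chains are added.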

The most commonly predicted oligomeric state was pentameric (12 sequences), consistent with the angular offset between the two hydrophobic seams in a type II interface (103°) most closely matching the internal angle of a regular pentagon (108°) (fig. S1A). No sequences were predicted to form tetramers, although seven were predicted to form hexamers, two to form heptamers, and one was predicted to form an octamer (table S3).
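The geometric argument can be checked with a short calculation: the internal angle of a regular N-sided polygon is 180°(N − 2)/N, and the sketch below finds which oligomer state from four to eight has the internal angle closest to a given seam offset. The function names are hypothetical, and treating the barrel cross-section as a regular polygon is a simplifying assumption.

```python
def interior_angle(n_sides):
    """Internal angle (degrees) of a regular polygon with n_sides sides."""
    return 180.0 * (n_sides - 2) / n_sides


def closest_oligomer(seam_offset_deg, states=range(4, 9)):
    """Oligomer state whose polygon internal angle best matches the offset."""
    return min(states, key=lambda n: abs(interior_angle(n) - seam_offset_deg))
```

For the 103° offset of a type II interface, the candidate angles are 90° (tetramer), 108° (pentamer), 120° (hexamer), about 128.6° (heptamer), and 135° (octamer), so the pentamer is the nearest match. The gaps between successive angles shrink as N grows, which illustrates why discriminating between larger oligomer states is expected to be harder.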

We synthesized peptides for the 22 targets (table S1 and fig. S2). As judged by circular dichroism (CD) spectroscopy, all of these formed highly α-helical and thermally stable assemblies (Fig. 2, A and B, and fig. S3). Two of the peptides showed low solubility and could not be characterized further. The oligomeric states of the remaining 20 soluble examples were assessed by analytical ultracentrifugation (AUC) (Fig. 2, C and D, fig. S4, and table S3), which indicated assemblies with four to seven peptide chains consistent with the design rationale; specifically, there were two tetramers, four pentamers, six hexamers, and a single heptamer (table S3). In seven cases, a single state could not be identified by AUC. Where determined, the experimental oligomer states matched those predicted using CCBuilder in 8 out of 13 cases (table S3).

Fig. 2 Solution-phase biophysical data for the designed peptides.

(A) CD spectra at 5°C for ILQKIE (red), SLKEIA (green), SIKEIA (green dashes), and ALKEIA (blue). MRE, mean residue ellipticity. (B) Thermal denaturation profiles monitored by the change in CD signal at 222 nm. (C) Sedimentation velocity c(s) distribution fits at 20°C for ILQKIE, SLKEIA, and ALKEIA. (D) Representative sedimentation-equilibrium AUC data (dots) and fitted single-ideal species model curves recorded at 20°C, 280 nm, and 24,000 rpm for ILQKIE, SLKEIA, and ALKEIA, respectively. AU, arbitrary units. Color key: (B) to (D) same as for (A). Conditions: peptide concentrations of 10 μM for (A) and (B), 150 μM for (C), and 70 μM for (D), in phosphate-buffered saline (PBS), pH 7.4, except for ALKEIA in (C) and (D), which was in tris-buffered saline, pH 7.5.

High-resolution x-ray crystal structures were determined for four of the peptides (Fig. 3 and fig. S5 to S7, Table 1, and table S4). Structures were solved by molecular replacement with either part of, or intact, CCBuilder predictions (backbone and Cβ atoms only) as initial search models. The structures revealed parallel, blunt-ended α-helical barrels (Fig. 3, A and B), with knobs-into-holes packing confirmed by SOCKET (29) and type II interfaces, as designed (Fig. 3C). The experimental and in silico models were closely correlated with low root-mean-square deviations (RMSDs) for backbone and side-chain atoms (Table 1 and fig. S5). The pore sizes of the assemblies are consistent with those expected for each oligomer state (Fig. 1A). To accord with our other de novo coiled coils (30), we renamed the sequences CC-Pent, CC-Hex2, CC-Hex3, and CC-Hept (Table 1).

Fig. 3 X-ray crystal structures for three de novo α-helical barrels.

(A) and (B) From left to right, orthogonal views of ILQKIE (CC-Pent, red, PDB ID 4pn8); SLKEIA (CC-Hex2, green, PDB ID 4pn9); and ALKEIA (CC-Hept, blue, PDB ID 4pna). (C) Conserved packing of the a Leu (red) and d Ile (green) residues and variation of the steric bulk of the e and g residues.

Table 1 Modeled and experimental oligomer-state and structural parameters for selected assemblies.

Oligomeric state [number of monomers (n)] and structural parameters in angstroms (Å): pitch, radius, and pore diameter. Oligomeric state in the second column was taken from AUC sedimentation-equilibrium experiments. Pore diameters were measured with PoreWalker (32) and are given as the average pore diameter through the assembly. The coordinates of the experimental and model structures were fitted and the RMSD calculated using the McLachlan algorithm implemented by ProFit (33). RMSDs are for all nonhydrogen atoms and are shown in parentheses for Cα coordinates only.


The crystal structure of CC-Pent is the first for a de novo designed pentameric α-helical coiled coil and is one of four pentameric barrels from this study. CC-Hex2, CC-Hex3, and four other sequences are further examples of de novo hexameric coiled-coil barrels (21). CC-Hept is the first parallel, blunt-ended heptamer. The repeats for CC-Hex2 and CC-Hex3 are point mutants of each other; i.e., SLKEIAx and SIKEIAx, respectively. The former has substantially more favorable raw bZIP and fitness scores. The less-favorable score for CC-Hex3 likely reflects the β-branched residues at the a positions (fig. S6), which are analogous here to d positions in type-N, dimeric interfaces where Leu is favored (10, 23). Consistent with this, CC-Hex3 has the beginning of a sharp unfolding transition at high temperatures, whereas CC-Hex2 is highly thermally stable (Fig. 2B). The sequence repeat of CC-Hept, ALKEIAx, has small Ala residues at both g and e, as seen in the staggered heptamer (22). These small residues appear critical to dictate the fold, and a heptamer may be the highest oligomeric state possible on the basis of type II sequences. As mentioned above, one of our designs was predicted to form an octamer, and this also has e = g = Ala (table S3). However, AUC measurements were ambiguous, and the peptide did not crystallize. We posit that extending oligomer states past heptamer reliably will require type-III repeats. However, this may prove difficult to achieve because the energetic difference between oligomer states is predicted to diminish with increasing number of helices (fig. S1A), and therefore, single-chain constructs may be needed to direct specific topologies.

In summary, we have developed a geometrical and computational framework for designing α-helical barrels based on one type of complex coiled-coil packing, namely type-II interfaces (15). This has produced de novo parallel pentameric, hexameric, and heptameric structures, none of which are commonly observed in nature, with a design success rate of ~36% (8/22). These expand the set of rationally designed coiled-coil assemblies from dimer through to heptamer (21, 30). The CC-Pent, CC-Hex2, and CC-Hept structures are based on a similar sequence framework of a = Leu plus d = Ile. Although this constellation appears to be a good general solution in terms of stability, it is not sufficient on its own to confer oligomer-state specificity. That requires contributions from the remaining sites of the helical interfaces; i.e., the g and e sites, which complete the type II pattern of hhxxhhx and must have progressively smaller side chains as oligomer state increases. These structures have accessible and chemically defined channels, opening tantalizing prospects for the rational design of channel- and pore-containing protein assemblies with defined internal chemistries and properties (15, 21, 31).

Supplementary Materials

Materials and Methods

Figs. S1 to S7

Tables S1 to S4

References (34-41)

References and Notes

  1. ACKNOWLEDGMENTS: C.W.W. and A.J.B. are supported by the U.K. Biotechnology and Biological Sciences Research Council (BBSRC) South West Doctoral Training Partnership and the Engineering and Physical Sciences Research Council (EPSRC) Bristol Chemical Synthesis Centre for Doctoral Training, respectively. The work was funded by grants from the EPSRC (EP/J001430/1; D.N.W.), BBSRC (BB/J008990/1; D.N.W. and R.L.B.) and the European Research Council (340764; D.N.W.). D.N.W. holds a Royal Society Wolfson Research Merit Award. We thank the Diamond Light Source for access to beamlines I0-3 and I0-4 (award MX8922). We thank N. Zaccai for advice on protein x-ray crystallography, and members of the Brady and Woolfson groups for general discussions. A.R.T. and D.N.W. conceived the research program and approach; A.R.T. designed the peptide sequences; A.R.T., C.W.W., and A.J.B. synthesized the peptides and performed solution-phase characterizations; C.W.W., A.J.B., and R.L.B. solved the x-ray crystal structures; A.R.T., C.W.W., G.J.B., and R.B.S. wrote the computational design and modeling algorithms; A.R.T. and D.N.W. wrote the paper. All authors analyzed the data, and reviewed and contributed to the manuscript. Coordinates and structure factors have been deposited in the Protein Data Bank with the accession codes: CC-Pent, 4pn8; CC-Hex2, 4pn9; CC-Hex3, 4pnb; CC-Hept, 4pna; CC-Pent-variant, 4pnd.