Structural principles that enable oligomeric small heat-shock protein paralogs to evolve distinct functions


Science  23 Feb 2018:
Vol. 359, Issue 6378, pp. 930-935
DOI: 10.1126/science.aam7229

Putting distance between protein relatives

Many proteins form complexes to function. When the gene for a self-assembling protein duplicates, it might be expected that the related proteins (paralogs) would retain interfaces that would allow coassembly. Hochberg et al. show that the majority of paralogs that oligomerize in fact self-assemble. These paralogs have more diverse functions than those that coassemble, implying that maintaining coassembly would constrain evolution of new function. The authors experimentally investigated how two oligomeric small heat-shock protein paralogs avoid coassembly and found that flexibility at regions outside of the interaction interfaces played a key role.

Science, this issue p. 930


Oligomeric proteins assemble with exceptional selectivity, even in the presence of closely related proteins, to perform their cellular roles. We show that most proteins related by gene duplication of an oligomeric ancestor have evolved to avoid hetero-oligomerization and that this correlates with their acquisition of distinct functions. We report how coassembly is avoided by two oligomeric small heat-shock protein paralogs. A hierarchy of assembly, involving intermediates that are populated only fleetingly at equilibrium, ensures selective oligomerization. Conformational flexibility at noninterfacial regions in the monomers prevents coassembly, allowing interfaces to remain largely conserved. Homomeric oligomers must overcome the entropic benefit of coassembly and, accordingly, homomeric paralogs comprise fewer subunits than homomers that have no paralogs.

Many proteins associate into selective homo- or heteromers in order to function (1). New assemblies are most often created by gene duplication of a preexisting homomer (2). The resulting oligomeric paralogs initially coassemble because both have the same sequence (and hence structure and interfaces) as their ancestor (Fig. 1A) (3). This coassembly can easily become entrenched if evolution of the two resulting duplicates is functionally constrained to maintain the interaction (4, 5), implying that heteromerization should be the most likely fate of oligomeric paralogs. However, when we analyzed the human, Arabidopsis, yeast, and Escherichia coli interactomes (supplementary materials and data file S1), we found that most oligomeric paralogs do not form heteromers (i.e., do not coassemble) (Fig. 1B), despite overlapping localization and expression profiles (fig. S1, A and B). Moreover, we found that those paralogs that cannot coassemble share lower sequence identity and fewer common functions than paralogs that can (Fig. 1, C and D). This suggests that heteromerization acts as a constraint on the functional divergence of oligomeric paralogs (6). Relieving this constraint is therefore a key step in the evolutionary trajectories of oligomeric proteins toward evolving new functions.

Fig. 1 Self-selective assembly allows oligomeric paralogs to evolve distinct functions.

(A) After gene duplication, oligomeric paralogs coassemble into and predominantly populate heteromers, constraining their functions to be compatible with coassembly. If they subsequently evolve the ability to assemble self-selectively into homomers, their functions are free to diverge. (B) Percentage of pairs of oligomeric paralogs that either coassemble into heteromers (purple) or only self-assemble into homomers (gray) in E. coli (73 pairs in data set), Saccharomyces cerevisiae (215 pairs), Arabidopsis thaliana (742 pairs), and Homo sapiens (1086 pairs). (C) Pairwise sequence identity is higher between coassembling paralogs (purple) than between self-assembling paralogs (gray). Horizontal lines denote medians. *P < 0.05, **P < 0.01, ****P << 0.0005, Mann-Whitney rank-sum test. (D) Pairwise functional similarity of coassembling (purple) and self-assembling (gray) pairs of paralogs as measured by the intersection over the union of their Gene Ontology annotations. Horizontal lines denote medians. ****P << 0.0005, Mann-Whitney rank-sum test. (E) Maximum-likelihood phylogeny of select clades of plant sHSPs. Scale bar indicates average number of substitutions per site. Mya, millions of years ago. (F) Schematic of the three different interfaces used by sHSPs to assemble into oligomers. (G) Mass spectrum of WT-1 and WT-2 after prolonged incubation plotted in the mass-to-charge (m/z) dimension. WT-1 (blue) and WT-2 (orange) 12-mers are observed, with varying numbers of charges. No peaks corresponding to heteromers are detected (upper). Hetero-12-mers are formed via exchange of dimers if WT-2 is mixed with N1α1C2, resulting in additional peaks for each charge state (lower). One charge state is labeled for each 12-mer. (H) When mixed before incubation with pea-leaf lysate at 42°C, WT-1 and WT-2 partition into aggregates at different rates (****P << 0.0005). When WT-2 is incubated with N1α1C2, subunits from both proteins partition at the same, intermediate rate (inset). Heteromers thus function differently from segregated WT oligomers. Error bars in the raw data are standard deviations from three independent experiments; error bars in the inset are standard deviations calculated from 1000 bootstrap replicates of the fit.
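The functional-similarity measure in panel (D), the intersection over the union of two annotation sets, is the Jaccard index. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name and the assignment of Gene Ontology identifiers to the two paralogs are hypothetical and serve only as an illustration.

```python
def go_similarity(terms_a, terms_b):
    """Intersection over union (Jaccard index) of two sets of GO terms."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        # Two unannotated proteins: define the similarity as 0 (an assumption).
        return 0.0
    return len(a & b) / len(union)


# Hypothetical annotation sets, for illustration only.
paralog_1 = {"GO:0006457", "GO:0009408", "GO:0051082"}
paralog_2 = {"GO:0009408", "GO:0051082", "GO:0042542"}

# Two shared terms out of four distinct terms in total.
print(go_similarity(paralog_1, paralog_2))  # 0.5
```

A value of 1 means identical annotation sets and 0 means no shared annotations, so lower medians for self-assembling pairs correspond to fewer common functions.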

To investigate how this occurs, we examined the selective assembly of two paralogous small heat-shock proteins (sHSPs), molecular chaperones found across the tree of life that are key to the cell’s ability to respond to stress (7, 8). A duplication event led to land plants having two classes of cytosolic sHSPs (class 1 and 2; Fig. 1E and fig. S2) that both assemble as dodecamers but cannot form heteromers between classes (9). Both are required for thermotolerance in vivo (10) and have different mechanisms of action (11, 12). We chose one paralog of each class from Pisum sativum: HSP18.1 and HSP17.7 (hereafter WT-1 and WT-2, respectively). Both proteins comprise an N-terminal region, an α-crystallin domain, and a C-terminal tail, and both form homo-12-mers (12) using three independent interfaces: The α-crystallin domain mediates the formation of an isologous α·α dimer; these dimers assemble into oligomers through heterologous contacts between the α-crystallin domain and the C-terminal tails from neighboring dimers (α·C), and interactions between the N-terminal regions (N·N) (Fig. 1F) (13). Their complex, multi-interface architecture makes these proteins an ideal system to investigate how evolution acts to regulate the biophysical properties of oligomers to develop a set of selective interfaces that allows them to diverge functionally.

Small-angle x-ray scattering experiments indicated that both proteins form tetrahedral oligomers (fig. S3), implying that there are no major differences in quaternary structure that prevent coassembly. Nonetheless, when we obtained native mass spectra of a mixture of WT-1 and WT-2 after prolonged incubation (Fig. 1G, upper) or initiating reassembly from their subunits (fig. S4A), we could not detect any hetero-12-mers in either case. However, both homo-12-mers underwent continual dissociation and reassociation, although WT-1 did so >10 times faster than WT-2 (fig. S4). These facile quaternary dynamics show that heteromers are in principle kinetically accessible and so, despite the similarity in quaternary architectures of WT-1 and WT-2, must be thermodynamically unfavorable.

To identify the sequence determinants of selective assembly, we aligned class 1 and 2 sHSPs and noted conserved differences in their C-terminal tails (fig. S5). We then engineered a chimera with the class 1 N-terminal region and α-crystallin domain linked to the class 2 C-terminal tail (N1α1C2; see table S1) and incubated it with WT-2. This small change in sequence produced a series of hetero-12-mers formed between WT-2 and N1α1C2 (Fig. 1G, lower). These represent a proxy for class 1 and 2 coassembly and allowed us to interrogate the functional consequences of heteromerization. We incubated purified sHSPs with pea leaf lysate under heat-shock conditions to form reversible aggregates (14), mimicking their action in vivo (10, 11). WT-2 partitioned significantly faster into the insoluble fraction than WT-1 (Fig. 1H and fig. S6). The rate measured for the heteromers of N1α1C2 and WT-2, however, was intermediate between that of WT-1 and WT-2 homomers. The functional differentiation of the two proteins therefore depends on their selective homomerization, demonstrating the operational necessity of avoiding coassembly.

The hetero-12-mers formed by swapping C-terminal tails comprised only even numbers of each type of subunit (Fig. 1G, lower), implying that either the α·α or the N·N interface must also be selective. To determine which, we engineered an N-terminal chimera, N2α1C1, and incubated it with WT-1. This produced a series of hetero-12-mers comprising odd and even numbers of each subunit (fig. S7A). Although N·N contacts therefore are not thermodynamically selective (and hence the α·α interface must be), we noticed that dissociation of N1α1C2 oligomers was as fast as that of WT-1 (fig. S7B), whereas dissociation of N2α1C1 was slower (fig. S7A). This means that the promiscuous N·N contacts, not the thermodynamically selective α·C and α·α interfaces, control the kinetic stability of the 12-mers.

Our subunit-exchange data indicate that, over the functional temperature range, hetero-12-mers formed via N·N contacts during assembly would decompose into homomers on the time scale of minutes to hours (fig. S4E). Yet, we had observed no long-lived heteromers in our assembly experiment, even at low temperatures (fig. S4A). To resolve this apparent conflict, we generated constructs of WT-1 and WT-2 lacking the N-terminal region and measured their stoichiometries using native ion mobility mass spectrometry (IM-MS). Both were polydisperse, spanning dimers to 12-mers (Fig. 2A and fig. S8A). Constructs instead lacking the C terminus only formed monomers and dimers (Fig. 2B and fig. S8B). α·C contacts therefore likely form early and ensure rapid self-selective oligomerization, whereas N·N contacts subsequently stabilize the 12-meric fraction (fig. S8C and supplementary text). This hierarchy obviates the need for kinetically stable N·N contacts to be selective and avoids long-lived heteromers that would compromise the rapid stress response of sHSPs in the cell.

Fig. 2 Oligomeric interfaces form in a hierarchical order.

(A) IM-MS spectra of truncated constructs of WT-1 (upper) and WT-2 (lower) lacking the N-terminal region. The two dimensions of separation (m/z and arrival time, which depends on collision cross section) separate charge-state series corresponding to a series of stoichiometries (colored individually). Both truncated proteins assemble into polydisperse ensembles. MBP, maltose-binding protein. (B) IM-MS spectra of truncated constructs of WT-1 (upper) and WT-2 (lower) lacking the C-terminal tail. Neither protein assembles beyond dimers. Truncations in the exposed N-terminal region result in several charge series for monomers and dimers that are separated in the arrival-time dimension (see fig. S8 for detailed assignments). (C) Distribution of stoichiometries populated by truncated constructs extracted from spectra in (A), (B), and fig. S6. The C-terminal tail is required for assembly beyond dimers, whereas the N terminus is required for monodisperse 12-mers. The α2C1 construct (fig. S8E) does not oligomerize, indicating an unfavorable α·C interaction.

To understand the thermodynamic basis of selectivity at the α·C interface, we examined chimeric versions of the N-terminal truncations. α1C2 formed polydisperse oligomers, but α2C1 did not assemble beyond a dimer (Fig. 2C and fig. S8, D to F). Selectivity in the α·C interface is therefore directional, arising from an unfavorable association between the WT-1 C-terminal tail and WT-2. We quantified this effect directly by excising the core domains of both proteins (α1 and α2, table S1) and measuring their affinity for each other’s C-terminal tails. Whereas α1 bound peptides mimicking each tail equally well, α2 had a much lower affinity for a WT-1 peptide than for a WT-2 peptide (ΔΔG >6 kJ mol⁻¹, fig. S9).
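To put the reported ΔΔG in more familiar terms, a free-energy difference can be converted into a ratio of binding constants with the standard relation K_ratio = exp(ΔΔG / RT). The sketch below is a worked calculation under stated assumptions: the temperature of 25°C (298.15 K) is assumed here rather than taken from the text, and the function name is illustrative.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ mol^-1 K^-1


def affinity_ratio(ddg_kj_per_mol, temperature_k=298.15):
    """Fold difference in binding constant implied by a free-energy gap.

    Uses K_ratio = exp(ddG / RT). The default temperature (25 degrees C)
    is an assumption, not a value stated in the text.
    """
    return math.exp(ddg_kj_per_mol / (R_KJ * temperature_k))


# A gap of 6 kJ/mol corresponds to roughly an 11-fold difference at 25 degrees C.
print(affinity_ratio(6.0))
```

Because the text reports ΔΔG as greater than 6 kJ mol⁻¹, the value computed this way would be a lower bound on the fold difference under the assumed temperature.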

We next turned our attention to the α·α interface, which is selective (fig. S10A) despite high sequence conservation (fig. S5B). Crystal structures revealed α1 and α2 to be extremely alike (Fig. 3, A and B, and table S2). The dimer interface is formed in both homodimers by salt bridges centered on the β8-β9 loop (L8/9) that are fully conserved between the two proteins, and by reciprocal strand-exchange between β6 and β2. The latter involves only one obvious class-specific contact: between the π systems of a histidine on β6 and a tryptophan on β2 in WT-1 that is absent in WT-2 (Fig. 3, C and D). In 2-μs molecular dynamics (MD) simulations, both homodimers and a modeled heterodimer were stable. The interfaces of the heterodimer featured equivalent overall numbers of interacting side chains, hydrogen bonds, and level of structural flexibility compared to both homodimers (figs. S10, B to E, and S11). The α-crystallin domain is therefore selective, with only minimal differences in the number or type of contacts at its interface.

Fig. 3 Selectivity in the structurally conserved α-crystallin domain.

(A and B) α1 and α2 dimers have an identical fold [backbone root mean square deviation (RMSD) = 1.2 Å] in which two highly similar interfaces (labeled L8/9 and β6·β2) connect monomers. (C) The L8/9 interface is centered on the loop between β8 and β9 (black outline) and is indistinguishable in the two proteins. Interchain hydrogen bonds are shown as dashed lines. (D) The two β6·β2 interfaces in the dimer are formed by exchange between the β6 and β2 strands. Side chains that differ between α1 and α2 at homologous positions are outlined in black. The π-stacking interaction specific to α1 is shown as a dotted red line. (E) Constructs were designed by swapping the β-sandwich, loop, and β6 strand (left). These were used to assess the strength of the β6·β2 interface and deconvolve the contribution from the loop and β6 strand (right). (F) Global thermodynamic model of dimerization based on experimentally determined ΔGα·α values in fig. S12G. The combined loop and β6 from α1 interact less favorably with β2 from α2 than all other combinations (left). α2 and α1 partition contributions to ΔGα·α differently (shaded). Error bars are standard deviations from 1000 bootstrap replicates of the model fit. (G) In a simulated heterodimer, the free-energy barrier is significantly reduced for the αα1 pair (yellow), but indistinguishable from the homodimers in the case of αα2 (green) when the β6·β2 interface is disrupted along a reaction coordinate that separates them. Shaded area corresponds to the standard error of the mean. (H and I) Median monomeric conformations determined by principal-component analysis colored according to structural difference. This is calculated at each residue from the Cα RMSD between α1 and α2 monomers, minus the RMSD between repeats for each monomer. Positive ΔRMSD values indicate conformational differences between proteins that cannot be explained by the variations intrinsic to each protein, and only those with P < 0.05 (after Bonferroni correction, permutation test) are colored. Differences are apparent in the loop surrounding β6 and in β2. In α1 the loop curls up, whereas in α2 the β2 strand detaches readily from the remainder of the β-sandwich.

To investigate the origin of this selectivity, we performed calorimetric measurements and found that there are differences in the relative contributions from entropy and enthalpy to the favorable free energy of dimerization in α1 and α2 (fig. S12, A to C). This suggests subtle differences in their association mechanisms that may impart selectivity. To quantify which parts of the dimer are responsible for selectivity, we divided the core domain into three segments (Fig. 3E and table S1): the β-sandwich (S), which includes the L8/9 interface and β2 from the β6·β2 interface; β6 (B); and the loop (L) connecting β6 to the β-sandwich. We shuffled these segments between α1 and α2 (Fig. 3E) and, for the 36 pairwise combinations of chimeric and wild-type constructs, determined the corresponding free energy of dimerization, ΔGα·α, by performing quantitative IM-MS titration experiments (fig. S12, D to G). From the overall data set, we identified statistically significant intermolecular interactions between β6 and the β-sandwich (B·S) and between the loop and the β-sandwich (L·S). Summed (B+L·S, Fig. 3F), these interactions contribute ≈11 kJ mol⁻¹ to the stability of the dimer, except when S2 encounters B1L1, which unilaterally destabilizes the dimer by ≈7 kJ mol⁻¹ (Fig. 3F, left). The L·S and B·S components contribute nearly equally to dimer stability (Fig. 3F, middle and right), a surprising observation considering that the loop is not part of the interface.

Because the α1 and α2 dimer structures did not reveal differences that account for our experimental thermodynamic data, we performed steered MD simulations in which we gradually detached β6 from β2 and estimated the resulting free-energy profile (Fig. 3G). As predicted by our thermodynamic data, we found that the heteromeric B+LS2 interface was significantly easier to break than the other combinations. We also noticed that in unconstrained simulations of the α1 monomer (performed in triplicate), the β-sandwich remained rigid (Fig. 3H and fig. S13, A, C, and D), whereas the loop distorted and formed intramolecular contacts (fig. S13D). In the α2 monomer, the loop more closely retained its conformation from the dimer (Fig. 3I and fig. S13, B to D), but β2 detached from the β-sandwich and became highly flexible (fig. S13, C to E).

Our data imply that the loop in α1, and β2 in α2, have a propensity to sample conformations in the monomers that are limited upon formation of a dimer interface (fig. S13D). In both homodimers, only one side of each B+L·S interface is restrained in this way, whereas in the heterodimer, both sides of the B+LS2 interface are restrained (Fig. 4), making it easier to break apart. Conversely, to dimerize, dynamic regions must undergo a structural transition from their monomeric conformations. In homodimers, only one side of each interface would have to do this, with the other being preordered for dimerization. In a heterodimer, this conformational complementarity would be absent for the B+LS2 interface, also leading to a slow association rate. These effects would therefore combine to discourage the formation of heterodimers and instead ensure self-selection.

Fig. 4 Selective interfaces overcome unfavorable entropy of homomerization.

(A) Selective homomerization is entropically unfavorable and requires an energetic penalty upon forming heteromeric contacts to suppress heteromerization. Shown is the theoretical magnitude of this penalty per subunit (ΔGDemix) required to populate heteromers at only 2% of all oligomers. It increases logarithmically with the size of the oligomer, making it more challenging for larger oligomers to be selective. (B) Empirically derived stabilities of all possible heteromers along the assembly pathway compared to homomers of the same size (ΔΔG = ΔGheteromer – ΔGhomomer). The upper and lower tiles of each column correspond to homomers of WT-1 and WT-2, respectively. Those in between represent heteromers, with increasing numbers of WT-2 subunits (downward). The ΔΔG values are positive for all heteromers, meaning that the energetic penalty to coassembly that we quantified in selective interactions is larger than the positive entropy of heteromerization. (C) The equilibrium population of homo- and hetero-12-mers calculated based on the values in (B) results in mole fractions of hetero-12-mers just below detectable levels. More than 96% of subunits partition into homomers, compared to only 0.05% based on the binomial distribution of hetero-oligomers that would arise in the absence of selective interfaces. (D) The oligomeric stoichiometries populated by selective oligomeric paralogs (gray fill) are smaller, with a particular excess of dimers, than those of a control set of oligomers that have no paralogs (purple). **P < 0.005, Mann-Whitney rank-sum test. Error bars represent 90% Clopper-Pearson confidence intervals; n denotes sample size. Applying a scaling according to ΔGDemix to the control set reproduces closely the observed selective distribution (purple outline, P = 0.0005, Akaike information criterion).

If this mechanism is correct, with the loop making a large contribution to the instability of the heterodimer (Fig. 3E), it should be a major regulator of the monomeric structure. Indeed, the conformations of simulated chimeric monomers lie between the extremes occupied by α1 and α2, and the segment that shifts the structure the most is the loop, not the interfacial segments (fig. S13F). Similarly, chimeric dimers incorporating segments that do not change conformations in our simulations (S1, B2, and L2; Fig. 3E) should be more stable than both α1 and α2. This prediction is borne out in their experimental melting temperature being ≈5°C higher (fig. S13G).

We mined our MD trajectories for specific contacts that were more abundant in one class over the other and identified 11 and 3 that involved residues that displayed class-specific evolutionary conservation in α1 and α2, respectively. Notably, we found that most of these are outside of the dimer interface: In α1, 7 out of 11 conserved sites either attach β2 to the sandwich or promote curling of the loop, whereas in α2, one maintains an extended loop conformation (fig. S14), and another makes β2 prone to detach in the monomer. Thus, noninterfacial regions, and their effects on the structure of dissociated monomers, determine selectivity in the α-crystallin domain of class 1 and 2 sHSPs across land plants. This is consistent with the observation that noninterfacial residues can affect interface stabilities (15).

To homomerize, paralogs must overcome a substantial entropic benefit of coassembly arising from the number of ways distinguishable subunits can be arranged. This mixing entropy increases with the number of subunits in the oligomer such that the energetic cost of homomerization rises logarithmically (Fig. 4A) (supplementary text). Combining this contribution with the strength of interactions that we quantified experimentally allowed us to generate a model predicting the stability of all possible combinations of the two sHSPs and their chimeras, dependent only on their stoichiometry and constituent α·C and α·α interfaces (supplementary text and fig. S15). We used this model to calculate the difference in stability between every possible heteromer and the corresponding homomers along the assembly pathway (Fig. 4B). The selective interactions in the α·C and α·α interfaces narrowly overcome the entropic benefit of coassembly for all stoichiometries (Fig. 4C), resulting in a predicted population of hetero-12-mers at equilibrium that is just below detectable levels (Fig. 4C, right).
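The size of the entropic benefit of coassembly can be illustrated with a simple baseline: if two paralogs were mixed with no selective interfaces at all, subunits would combine at random and oligomer compositions would follow a binomial distribution. The sketch below computes that baseline. It is not the authors' model (which is described in the supplementary text); it assumes an equimolar mixture by default, independent placement of each subunit, and no geometric constraints, and the function names are illustrative.

```python
from math import comb


def composition_distribution(n_subunits, mole_fraction_a=0.5):
    """Probability of each composition (k subunits of paralog A, k = 0..n)
    for randomly assembled n-mers, assuming no selective interfaces."""
    p = mole_fraction_a
    return [
        comb(n_subunits, k) * p**k * (1 - p) ** (n_subunits - k)
        for k in range(n_subunits + 1)
    ]


def homomer_fraction(n_subunits, mole_fraction_a=0.5):
    """Fraction of randomly assembled n-mers that contain only one paralog."""
    dist = composition_distribution(n_subunits, mole_fraction_a)
    return dist[0] + dist[-1]


# The homomeric fraction falls steeply as the oligomer gets larger.
for n in (2, 4, 6, 12, 24):
    print(n, homomer_fraction(n))
```

Under these assumptions, only 2 of the 4096 equally likely subunit arrangements of a 12-mer are homomeric, which is about 0.05%, in line with the binomial figure quoted in the caption of Fig. 4C. For a dimer the same baseline gives 50%, which illustrates why selectivity is entropically cheaper for smaller oligomers.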

Homomers are therefore only marginally more stable than heteromers, even though the paralogs have diverged for >400 million years (16). The number and type of selective interactions that we found are the minimum required for a tetrahedron (17), with half of the oligomeric interfaces (N·N and those involving C2) remaining promiscuous. These observations imply that selectivity is difficult to evolve, perhaps because most substitutions that disfavor coassembly also disfavor self-assembly (18).

Our model predicts that this would be more problematic for oligomers with more subunits, for which the entropic barrier to self-assembly is higher (Fig. 4A). Using a data set of oligomeric architectures based on curated crystal structures (17) and combining it with our list of paralogs (Fig. 1B and data file S2), we found that self-selective paralogs comprise fewer subunits than homomers that have no paralogs (Fig. 4D). The data are well explained by the probability that selectivity evolves after duplication being inversely proportional to the mixing entropy (supplementary text). Applying this relationship to scale the stoichiometry distribution of oligomers without paralogs renders it indistinguishable from the self-selective set (Fig. 4D). This indicates that this fundamental thermodynamic bias acts as an evolutionary constraint across oligomeric proteins. The mechanisms for selectivity that we have uncovered for the sHSPs studied here are some of possibly many ways in which proteins have evolved to escape coassembly.

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S15

Tables S1 and S2

References (19–77)

Data Files S1 and S2

References and Notes

Acknowledgments: We thank C. Robinson, J. Schnell, P. Kukura, D. Staunton (all at University of Oxford), and B. Metzger (University of Chicago) for helpful discussions. We acknowledge access to B21 and help from M. Tully and J. Doutch at the Diamond Synchrotron (J.L.P.B. for SM9384-2); and the ARCUS cluster at Advanced Research Computing, Oxford. We thank the following funding sources: Engineering and Physical Sciences Research Council (G.K.A.H. for a studentship, J.L.P.B. for EP/J01835X/1); Carl Trygger’s Foundation (E.G.M.); Swiss National Science Foundation (M.T.D. for P2ELP3_155339) and Biotechnology and Biological Sciences Research Council (A.J.B. for BB/J014346/1, J.L.P.B. for BB/K004247/1 and BB/J018082/1); National Institutes of Health (E.V. for RO1 GM42761); Massachusetts Life Sciences Center (E.V. for a New Faculty Research Award); and the Royal Society (J.L.P.B. for a University Research Fellowship). All data necessary to support the conclusions are available in the manuscript or supplementary materials and are deposited with DOI 10.5287/bodleian:54jBVeAzw.
