Research Article

Recognition Dynamics Up to Microseconds Revealed from an RDC-Derived Ubiquitin Ensemble in Solution


Science  13 Jun 2008:
Vol. 320, Issue 5882, pp. 1471-1475
DOI: 10.1126/science.1157092

Abstract

Protein dynamics are essential for protein function, and yet it has been challenging to access the underlying atomic motions in solution on nanosecond-to-microsecond time scales. We present a structural ensemble of ubiquitin, refined against residual dipolar couplings (RDCs), comprising solution dynamics up to microseconds. The ensemble covers the complete structural heterogeneity observed in 46 ubiquitin crystal structures, most of which are complexes with other proteins. Conformational selection, rather than induced-fit motion, thus suffices to explain the molecular recognition dynamics of ubiquitin. Marked correlations are seen between the flexibility of the ensemble and contacts formed in ubiquitin complexes. A large part of the solution dynamics is concentrated in one concerted mode, which accounts for most of ubiquitin's molecular recognition heterogeneity and ensures a low entropic complex formation cost.

Protein function relies on structural protein dynamics, with time scales ranging from picoseconds to beyond seconds. For molecular recognition, for example, proteins adapt their structure to different binding partners, often exhibiting large structural heterogeneity. In the past 30 years, atomic information on many dynamical processes has been accumulated from a broad variety of techniques (1, 2). Nuclear magnetic resonance (NMR) relaxation has been used to quantitatively probe protein dynamics at the fast end (picoseconds to nanoseconds) as well as in a much slower range (microseconds to milliseconds) of this broad spectrum of time scales (3–6). Relaxation of nuclear magnetization is caused by fluctuations of magnetic interactions between nuclei resulting from the nanosecond rotational tumbling of the molecule and internal dynamics. The amplitudes of these motions are expressed as so-called Lipari-Szabo order parameters S²_LS (7). Internal dynamics slower than the rotational tumbling time τc have no impact on the overall fluctuation of the magnetic interactions. Therefore, S²_LS order parameters reflect only sub-τc motions, at the fast end of time scales.

The slow range of time scales is accessible by relaxation dispersion measurements, based on the stochastic fluctuations of isotropic chemical shifts, which are independent of rotational tumbling (3, 5). Conformational heterogeneity slower than 10 ms can be directly observed as peak splitting in NMR spectra. For backbone amides, motions faster than 50 μs do not result in sufficient line broadening to be detectable by relaxation dispersion measurements. These measurements therefore probe motions slower than about 50 μs up to about 10 ms and have been used to characterize major structural changes and enzymatic reactions (6, 8). Except for certain favorable cases (9), it is, however, difficult to translate these fluctuations into ensembles of structures. Therefore, relaxation-based ensembles of solution structures take only motions faster than τc into account: They are limited to sub-τc dynamics (10, 11). These sub-τc motions are typically much smaller than the structural changes involved in molecular recognition and are likely to contribute mainly to the entropy of proteins (12–14). As a consequence, the structural heterogeneity observed in protein complexes has frequently been assumed to be inaccessible to equilibrium fluctuations in solution, thus favoring induced-fit models (15, 16).

RDCs probe supra-τc dynamics. RDCs are sensitive to motion from picoseconds to milliseconds, a range that includes the previously invisible time window between τc and 50 μs, which we will call supra-τc. Indeed, RDCs recorded for ubiquitin, as well as for the B1 domain of protein G, hint at substantial dynamics between nanoseconds and microseconds (17–25). Here, we present a structural ensemble of ubiquitin based on an extensive RDC data set (Fig. 1). Ubiquitin is a key to many cellular signaling networks (26, 27) (as in protein degradation, for example) and is recognized by a broad variety of proteins with high specificity (28). Accordingly, ubiquitin crystal structures of 46 different complexes show a particularly pronounced structural heterogeneity (Fig. 2), which cannot be explained from the available sub-τc ensembles refined against NMR relaxation data (10, 11) (Fig. 2, C and E).

Fig. 1.

Structure ensemble of ubiquitin. (A) Backbone trace of 40 randomly chosen structures from the EROS ensemble. Residues are colored by the amount of additional (supra-τc) mobility as compared with the Lipari-Szabo order parameters S²_LS (Fig. 3C). (B) For each x-ray structure (for numbering on the x axis, see table S3), the backbone RMSDs of residues 1 to 70 are shown for superpositions with each EROS structure (red dots) and each x-ray structure (black dots). The minimal RMSD for EROS structures (red line) and the maximal RMSD for x-ray structures (black line) are highlighted to guide the eye. (C) Cα root mean square fluctuations (RMSF) of EROS structures (red line) and of 46 known ubiquitin x-ray structures (black line).

Fig. 2.

Comparison of supra-τc and sub-τc solution ensembles (colors) with the collection of 46 x-ray structures (black) of ubiquitin by PCA: EROS (A and B), 1xqq (C and D), and 2nr2 (E and F). The PCA was carried out over the merged two ensembles that are displayed (in each case, the x-ray ensemble and one NMR ensemble: EROS, 1xqq, and 2nr2). Panels (A), (C), and (E) show projections onto the principal modes 1 and 2, whereas panels (B), (D), and (F) show projections onto modes 3 and 4. Systematic deviations are observed along the principal modes for both sub-τc ensembles but not for the supra-τc EROS ensemble.

RDCs are observed in an anisotropic solution, induced (for example) by a highly diluted liquid crystalline medium (29) or a polyacrylamide gel. In such an anisotropic solution, the protein does not adopt all orientations with the same probability. Therefore, the rotational tumbling no longer averages the dipolar coupling to zero but to a measurable RDC. The anisotropic orientation distribution is represented by an alignment tensor, which is fixed to the molecular frame. For directly bonded nuclei, the RDC D depends only on the direction (θ,φ) of the internuclear vector in the alignment frame

$$D(\theta,\varphi) = D_a\left[\left(3\cos^2\theta - 1\right) + \tfrac{3}{2}\,R\,\sin^2\theta\,\cos 2\varphi\right] \qquad (1)$$

where Da is the axial component of the alignment tensor and R describes its rhombicity (17, 29). Internal dynamics lead to orientational fluctuations of the internuclear vector (θ,φ) in the alignment frame (and therefore also in the molecular frame) and affect the size of the RDC according to Eq. 1. This variation of the RDC is usually in the range of less than 10 Hz, and therefore the RDC D is averaged to the measured 〈D〉 for motions faster than the upper limit of relaxation dispersion (10 ms), thus sampling the previously inaccessible supra-τc time window.
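As a minimal sketch of how Eq. 1 is averaged by internal motion, the snippet below assumes the equation takes the standard form D = Da[(3cos²θ − 1) + (3/2) R sin²θ cos 2φ]. The function names, the equal weighting of orientations, and the numerical values of Da and R are illustrative assumptions, not values from the study.

```python
import math

def rdc(theta, phi, d_a, rhombicity):
    """RDC (Hz) for a vector at polar angles (theta, phi) in the alignment frame."""
    axial = 3.0 * math.cos(theta) ** 2 - 1.0
    rhombic = 1.5 * rhombicity * math.sin(theta) ** 2 * math.cos(2.0 * phi)
    return d_a * (axial + rhombic)

def averaged_rdc(orientations, d_a, rhombicity):
    """Equal-weight average <D> over a list of (theta, phi) orientations."""
    values = [rdc(t, p, d_a, rhombicity) for t, p in orientations]
    return sum(values) / len(values)

# Hypothetical alignment: Da = 10 Hz, R = 0.3.
static = rdc(0.0, 0.0, 10.0, 0.3)

# The same vector hopping between the z axis and two orientations tilted by 20 degrees.
tilt = math.radians(20.0)
wobbling = averaged_rdc([(0.0, 0.0), (tilt, 0.0), (tilt, math.pi)], 10.0, 0.3)

print(static, wobbling)
```

In this toy case the motion reduces the coupling of a vector that lies along the principal axis, which is the kind of amplitude information the measured 〈D〉 carries.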

Because the alignment tensor includes five parameters, the extraction of these fluctuations requires the measurement of RDCs in at least five independent alignment media. To assess the supra-τc time scale for ubiquitin, we measured RDCs for the backbone amide NH couplings in 18 different alignment conditions, as well as backbone HNC′ (amide proton to carbonyl carbon in the same peptide bond) and NC′ (amide nitrogen to carbonyl carbon in the same peptide bond) RDCs from 4 different alignment media. Together with data from the literature (30–32), 36 NH RDC data sets and 6 HNC′ and NC′ RDC data sets were available. To probe side-chain dynamics as well, we included side-chain methyl group RDCs measured for 11 alignment media in the analysis (33).
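The count of five parameters can be made explicit in the general (non-principal-axis) form of the dipolar coupling. The expression below is the textbook Saupe order-matrix form, given here as an aid to the reader rather than as a formula taken from the article; the symbols D_max (the static coupling constant of the nuclear pair) and α_i (the angles between the internuclear vector and the molecular frame axes) are notation introduced for this sketch, and sign and scaling conventions differ between authors.

```latex
D = D_{\max} \sum_{i,j \in \{x,y,z\}} S_{ij}\,\cos\alpha_i\,\cos\alpha_j,
\qquad S_{ij} = S_{ji},
\qquad S_{xx} + S_{yy} + S_{zz} = 0
```

A symmetric, traceless 3 × 3 matrix has five independent elements. In the principal axis frame these correspond to Da, R, and the three angles that orient the frame relative to the molecule.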

Supra-τc ubiquitin ensemble reveals conformational selection. To extract a structural ensemble from these data, we carried out cross-validated ensemble refinement from unfolded structures in explicit solvent subjected simultaneously to restraints from NMR nuclear Overhauser enhancement (NOE) and RDC data (henceforth referred to as EROS for ensemble refinement with orientational restraints). The unperturbed protein exhibits considerable flexibility, with a substantial fraction (color coded, Fig. 1A) attributed to supra-τc. Slower motions, at the microsecond-to-millisecond time scale, have previously been observed for only a very limited number of residues (34), thus confining the additional motion to the time range between the correlation time and about 50 μs. As a cross-validation, the ensemble was also calculated without NOEs. The resulting ensemble was found to be virtually unchanged [(33), EROS4], indicating that the ensemble is predominantly defined by the RDC data.

Unexpectedly, this supra-τc ensemble comprises the complete range of crystallographically observed structural changes during interface engagement (Figs. 1B and 2A), in contrast to the known fast dynamics (Fig. 2, C and E) (10, 11). Indeed, each of the x-ray structures is similar to members of the solution ensemble within less than 0.8 Å backbone root mean square deviation (RMSD) (Fig. 1B), although no crystallographic data have been used during refinement. Conformational selection, rather than induced fit, thus suffices to explain all known structural adaptations that the ubiquitin backbone undergoes upon complex formation with different binding partners. Remaining induced-fit motions are restricted to rotameric side-chain rearrangements and minor backbone changes.

As an independent validation of our ensemble, we have also applied a self-consistent RDC-based model-free (SCRM) analysis (33) to the set of 36 NH RDC experiments. This method is an enhanced implementation of the previously published model-free method (21, 24, 25) that largely alleviates structural bias (33). The SCRM analysis quantifies dynamics as the degree of orientational restriction of the amide NH bond in the molecular frame in terms of a generalized order parameter S²_rdc(NH), which is zero for complete isotropic disorder and one for a fixed orientation of the respective NH bond. For comparison, generalized order parameters S²_EROS(NH) were also computed from the EROS ensemble. A correlation coefficient r = 0.74 between S²_EROS(NH) and S²_rdc(NH) is found (Fig. 3A). This agreement between two independent approaches shows that the dynamics observed in the EROS ensemble are indeed strongly determined by the experimental RDC data. This conclusion is supported by rigorous cross-validation implemented in EROS by systematically leaving out all RDCs between backbone amide N and carbonyl C, as well as all scalar couplings, from refinement. The ensemble-averaged free RDC R-factor of 18.5% is considerably lower than for other solution ensembles (>24%; table S2). Combining all x-ray structures into an “ensemble” (35), we obtained a similarly low R-factor of 18.3%. As compared with the R-factor of 25 ± 4% for individual x-ray conformers, this result confirms that the conformational heterogeneity (as found in the EROS ensemble and in the x-ray data) considerably improves the description of the experimental solution NMR data. In addition, the correlation between order parameters derived from the x-ray “ensemble,” particularly when relaxed in short (10-ps) molecular dynamics simulations at 300 K [Fig. 3B; (33)], and the RDC-derived order parameters S²_EROS(NH) and S²_rdc(NH) suggests that the interconversion between the different ubiquitin conformations in the x-ray ensemble strongly contributes to the solution dynamics.
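One common way to compute a generalized order parameter from a structural ensemble is sketched below. It assumes that all ensemble members have already been superimposed in a common molecular frame and are equally weighted, and it uses the widely used expression S² = (3/2) Σ_ij 〈μ_i μ_j〉² − 1/2 for the unit bond vector μ. Whether the EROS order parameters were computed in exactly this way is not stated in the text, and the function name is hypothetical.

```python
import math

def order_parameter(vectors):
    """Generalized order parameter S^2 from bond vectors in a common molecular frame."""
    # Normalize each bond vector to unit length.
    units = []
    for x, y, z in vectors:
        norm = math.sqrt(x * x + y * y + z * z)
        units.append((x / norm, y / norm, z / norm))
    n = len(units)
    # Sum the squared ensemble averages of all nine products mu_i * mu_j.
    total = 0.0
    for i in range(3):
        for j in range(3):
            mean_ij = sum(u[i] * u[j] for u in units) / n
            total += mean_ij ** 2
    return 1.5 * total - 0.5

# A bond that never moves, and one spread evenly over the six axis directions.
rigid = order_parameter([(0.0, 0.0, 1.02)] * 5)
isotropic = order_parameter([(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                             (0, -1, 0), (0, 0, 1), (0, 0, -1)])
print(rigid, isotropic)
```

The two toy cases reproduce the limits quoted in the text: one for a fixed orientation and zero for complete isotropic disorder.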

Fig. 3.

Comparison of NH order parameters of ubiquitin. (A and D) The order parameters of the presented EROS ensemble (red) are compared with SCRM order parameters (blue) derived from the NH part of the RDC data used for EROS. The SCRM order parameters shown in dark blue reflect the most probable overall scaling with respect to the Lipari-Szabo–derived order parameters S²_LS. The most conservative scaling of SCRM order parameters to S²_LS is shown in light blue. (B and E) Order parameters intrinsic to the ensemble of 46 crystallographic structures (black). The dashed curve is obtained when the 46 structures are relaxed at 300 K by short molecular dynamics simulations of 10 ps. (C and F) Generalized order parameters obtained from NMR relaxation data (green) for the sub-τc dynamics of ubiquitin via Lipari-Szabo model-free analysis (36). Green circles mark the data points taken from the most recent and accurate measurement (36), whereas remaining data points are taken from previously published data (46). The latter (46) were rescaled such that they align with the newer results (36). The EROS order parameters were scaled by 0.93 to account for limited ensemble size and underestimation of the librational contribution (SOM text S4). Error bars (1σ) for the EROS ensemble (light red) comprise intrinsic sampling and force-field errors as well as propagated experimental errors. The uncertainty in the libration correction was estimated as ±4% and is represented in gray. A solid line is shown for residues where sufficient RDC data were available to determine a robust value with SCRM analysis; for the other positions, EROS order parameters are shown as a dashed line. [(D) to (F)] Scatterplots for a direct comparison of the two sets of order parameters shown to the left of the respective plot.

To assess how much of the solution dynamics is slower than τc, we compare the EROS-derived S²_EROS and the SCRM-derived S²_rdc with order parameters derived from NMR relaxation measurements. The picosecond-to-nanosecond time scale dynamics of the ubiquitin backbone were probed previously by NMR relaxation techniques, yielding a set of S²_LS order parameters as derived from a Lipari-Szabo analysis (7, 36). Figure 3C compares order parameters S²_EROS from the ensemble presented in Fig. 1A with S²_LS order parameters. For most residues, additional mobility is seen, thus quantifying the supra-τc motion in the EROS ensemble, shown as color code in Fig. 1A. For EROS, absolute order parameters were derived from the RDC-refined ensemble and corrected for limited ensemble size and libration effects. For SCRM analysis, the absolute scale was determined relative to S²_LS order parameters, with S²_LS as an upper bound for S²_rdc, within the error bars [see supporting online material (SOM) text S1, section 1.2, and SOM text S4 for details]. Although the RDCs do not provide the absolute amplitude of the dynamics, the overall scale of the independently determined S²_EROS and S²_rdc is nearly identical.

Solution fluctuations allow for interface contact formation. As noted above, the supra-τc motion accesses all the conformations that are observed in complex structures. To rationalize this unexpected result, we overlaid all interface contacts (gray spheres) of the different binding partners found in the x-ray structures with a single structure of ubiquitin whose coloring represents the solution dynamics as given by the RDC-derived order parameters (Fig. 4A). Notably, helix α1, for which no contacts are observed, shows only little motion in solution (blue), whereas high flexibility (orange-red) is observed in regions that form many different protein-protein interfaces. A quantitative analysis of the number of interface contacts per residue (Fig. 4C) shows an unexpectedly high similarity to the conceptually unrelated RDC-derived order parameters, which corroborates this initial observation.

Fig. 4.

Solution dynamics correlate with molecular recognition sites. (A and B) The apo structure of ubiquitin (1UBI) is colored by backbone flexibility in solution as given by the RDC-derived order parameters. (A) Positions of contacting atoms of complexing proteins (<5 Å distance) are shown as gray spheres. (B) View toward the surface at the most prominent recognition site around residues I44/H68. H68 (sticks) lies within a rigid crevice that connects via F45 to the other known recognition site centered at D58. The walls of this crevice are formed by regions with high flexibility. Around H68, rigidity is provided by packing of core residues L67 and L69 (not shown) against the central helix; at D58, packing of L55 and a long-range hydrogen bond from Y59 to E51 provide stability. (C) Number (nr) of ubiquitin-binding protein contacts per residue (blue line) and the flexibility in solution for the sub-τc time regime (green line) and the supra-τc time range, as extracted from the EROS ensemble (red line). A marked correlation between contacts and solution fluctuations is observed, particularly for the EROS ensemble. Exceptions from the observed correlation are found for known molecular recognition hotspots (marked with “x” symbols: I44/H68, D58), which may act as rigid anchors, allowing flexibility for neighboring residues. Lysines responsible for polyubiquitination are marked with circles (K48, K63).

Two prominent exceptions from the observed high flexibility in the binding regions are residues Ile44 and His68 [I44 and H68 (37)] (two of the three “x” symbols in Fig. 4C). Both are known from mutation studies to be central hotspot (38) residues of a binding motif (Fig. 4B) that is involved in recognition of many different binding partners (26, 39). Recently, the first crystal structure with a new recognition motif centered at hotspot D58 (one of the three “x” symbols in Fig. 4C) has been found (40). Our results show that, in solution, this residue is as rigid as I44/F45 and H68.

At first sight, the observed fluctuations appear incompatible with the proposed conformational selection scenario. In particular, it seems combinatorially highly unlikely to find all involved residues simultaneously in the proper configuration required for binding, thus imposing a high entropic barrier. Only concerted fluctuations, implying reduced entropic cost, would explain the observed high physiological on-rates and affinities (39).

Collective molecular recognition dynamics. To check whether such concerted fluctuations are actually observed in the ubiquitin ensemble, we have carried out a principal component analysis (PCA). The conformational changes observed in x-ray structures are well described within the first five principal components. Although the number of degrees of freedom is reduced from 1839 to only 5, all x-ray structures can be described up to a backbone RMSD of 0.45 ± 0.04 Å. From linear combinations of these five principal components, we found a single collective mode that corresponds to a pincer-like motion of predominantly those residues that are frequently involved in interfaces and accounts for 25% (RMSD) of all backbone fluctuations in the solution ensemble (Fig. 5B).
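The principal component analysis described above can be illustrated on toy data. The sketch below treats each structure as a flattened coordinate vector that has already been superimposed, and finds only the leading principal component by power iteration on the covariance matrix. Power iteration is a simplification chosen to keep the example dependency-free; it stands in for the full diagonalization that a PCA of a real ensemble would use, and all names and data here are hypothetical.

```python
import math

def first_principal_component(samples, iterations=200):
    """Return (mean, unit vector) for the direction of largest variance."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(dim)]
    centered = [[s[k] - mean[k] for k in range(dim)] for s in samples]
    # Deterministic start vector (an arbitrary choice for this sketch).
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # Apply the covariance matrix implicitly: C v = X^T (X v) / n.
        scores = [sum(row[k] * vec[k] for k in range(dim)) for row in centered]
        new = [sum(scores[i] * centered[i][k] for i in range(n)) / n
               for k in range(dim)]
        norm = math.sqrt(sum(c * c for c in new))
        if norm == 0.0:
            break
        vec = [c / norm for c in new]
    return mean, vec

def project(sample, mean, vec):
    """Coordinate of one structure along the principal mode."""
    return sum((sample[k] - mean[k]) * vec[k] for k in range(len(vec)))

# Toy "ensemble": five structures varying mainly along the direction (1, 2, 0).
toy = [(-2.0, -4.0, 0.1), (-1.0, -2.0, -0.1), (0.0, 0.0, 0.1),
       (1.0, 2.0, -0.1), (2.0, 4.0, 0.1)]
mean, mode = first_principal_component(toy)
projections = [project(s, mean, mode) for s in toy]
print(mode, projections)
```

Projections of this kind are what the scatter plots in Fig. 2 display for the first four modes of the merged x-ray and NMR ensembles.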

Fig. 5.

Equilibrium supra-τc dynamics are dominated by conformational selection dynamics. A large amplitude collective solution mode entails a pincer-like motion of loop β1-β2 and loop α1-β3 including the C-terminal tip of helix α1. For each of altogether 41 binding partners, this collective solution mode was systematically varied to find a predicted position that maximized contacts. (A) The position on the mode of the thus predicted selected structures is plotted on the y axis, whereas the projected position onto this mode for the actual crystal structures is plotted on the x axis. (B) In order to illustrate the conformational selection along the collective solution mode, two of the selected snapshots (dark blue and red) are shown together with relevant parts of their respective binding partners: the zinc finger ubiquitin-binding domain of isopeptidase T (2G45, yellow) and HRS (2D3G, cyan). Contacts affected by the motion along the collective mode are shown as spheres. The crystal structure of 1UBI is shown at relevant regions as a gray cartoon. The full protein is shown as a semitransparent surface.

Whether this mode indeed describes the molecular recognition dynamics can be tested stringently by predicting the bound ubiquitin conformations with the use of information only from the binding partner. To this end, we systematically varied the ubiquitin structure along this mode for each of altogether 41 interfaces, until the highest number of contacting interface atoms (i.e., atoms within 3 to 8 Å of the binding partner) was reached. A correlation of 0.94 between the projection of the thus predicted and the actual x-ray structure was found for the pincer-like mode (Fig. 5A). Analogously, correlations of 0.90 and 0.84 were obtained for the linearly combined first three principal components and for the third principal component, respectively. These consistently high correlations for collective modes indicate that the interface adaptation dynamics of ubiquitin are indeed well described within a few collective degrees of freedom that dominate the solution ensemble. Moreover, this analysis indicates that the ability to optimize contacts with binding partners via backbone interface adaptation is important for ubiquitin to reach sufficient affinity with many different binding partners. As illustrated in Fig. 5B, for the ubiquitin interfaces with hepatocyte growth factor–regulated tyrosine kinase substrate (HRS) and the zinc finger ubiquitin-binding domain of isopeptidase T [Protein Data Bank (PDB) accession codes 2D3G and 2G45], the collective solution mode allows molecular recognition by enabling ubiquitin to adapt to different protein interfaces.
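The scan along the collective mode can be sketched as follows. The code displaces a reference structure along a per-atom mode vector, counts contacts with a fixed partner, and keeps the amplitude with the most contacts. Interpreting "within 3 to 8 Å" as a bound on the distance to the nearest partner atom is an assumption, as are the function names, the tie-breaking rule, and the toy coordinates.

```python
import math

def count_contacts(coords, partner, lower=3.0, upper=8.0):
    """Count atoms whose nearest partner atom lies between lower and upper (in Å)."""
    count = 0
    for atom in coords:
        nearest = min(math.dist(atom, p) for p in partner)
        if lower <= nearest <= upper:
            count += 1
    return count

def displace(reference, mode, amplitude):
    """Move every atom of the reference along its mode vector."""
    return [tuple(r[k] + amplitude * m[k] for k in range(3))
            for r, m in zip(reference, mode)]

def best_amplitude(reference, mode, partner, amplitudes):
    """Amplitude giving the most contacts; the first one wins ties."""
    return max(amplitudes,
               key=lambda a: count_contacts(displace(reference, mode, a), partner))

def pearson(xs, ys):
    """Pearson correlation between predicted and observed mode positions."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy system: one mobile atom approaching a single partner atom, one distant atom.
reference = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
mode = [(0.0, 0.0, 1.0), (0.0, 0.0, 0.0)]
partner = [(0.0, 0.0, 10.0)]
predicted = best_amplitude(reference, mode, partner, [0.0, 1.0, 2.0, 3.0, 4.0])
print(predicted)
```

Repeating such a scan for each binding partner and comparing the predicted amplitudes with the projections of the crystal structures, for example with `pearson`, gives a correlation of the kind reported in Fig. 5A.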

The slow supra-τc time scale of ubiquitin's interface adaptation dynamics is corroborated by the observation that collective solution modes obtained from the first five principal components of nanosecond ensembles 1xqq and 2nr2 (10, 11) were less adept at describing the interface adaptation. For these modes, the correlation between predicted and crystallized position dropped from 0.94 to 0.68 and to 0.55, respectively. The supra-τc time scale has previously been speculated to be important in the context of signal propagation of the immunoglobulin-binding domain of protein G (20) as well as for aggregation dynamics (41).

Summary. Taken together, we have determined a solution ensemble of a globular protein from experimental data that comprises all solution dynamics up to the microsecond time scale at atomic resolution. A large part of this solution dynamics is concentrated in a collective pincer-like motional mode that strongly contributes to the interface adaptation dynamics during molecular recognition events. All available crystallographic structures of ubiquitin complexed to different binding proteins were shown to be accessible in solution. Conformational selection, rather than induced fit, is thus the main contributor to the observed interface adaptations. The observed conformational selection dynamics lower entropic barriers, thereby explaining physiologically observed high affinity and fast on-rates which otherwise would need to be explained by induced-fit motions.

These findings suggest how ubiquitin recognizes many different partner proteins with a high degree of specificity and sufficient affinity. In order to reach sufficient affinity, a certain degree of structural plasticity is required that is thermally accessible in solution. In order to maintain high specificity despite the inherent flexibility, the binding interfaces are centered around the rigid hotspot (38) residues H68/I44 and D58. The rigidity of these mutational hotspots (26, 39, 40) might prevent promiscuous binding, because only precisely aligned partner interfaces benefit from the high hotspot energy contribution. Structurally, the observed rigidity is maintained for H68 by packing with its neighbors L67 and L69 tightly into the protein core, whose rigidity is reinforced by helix 1. Similarly, I44 is anchored via F45 and decoupled from the adjacent flexible loop via an alanine-glycine linker (A46/G47). At D58, packing of L55 and a long-range hydrogen bond from Y59 to E51 provide stability. Because the solution dynamics are dominated by the collective pincer-like interface adaptation, it seems that only functionally essential flexibility is present. Apparently, ubiquitin has evolved to be as rigid as possible while remaining as flexible as necessary to engage in different interfaces.

Our finding that conformational selection is responsible for protein-protein binding of ubiquitin is in line with recent findings of conformational selection occurring for antibodies and enzymes (42–44). For the latter, relaxation dispersion experiments that are sensitive to microsecond-to-millisecond time scales (i.e., 1000 times slower than the processes we described here) show conformational selection for all steps in enzymatic reactions of dihydrofolate reductase (9). It should be noted that our findings differ from the stepwise model proposed for the binding of unfolded proteins to folded ones (45) and thus open up a whole range of possible molecular recognition mechanisms.

Supporting Online Material

www.sciencemag.org/cgi/content/full/320/5882/1471/DC1

SOM Text S1 to S7

Figs. S1 to S9

Tables S1 to S8
