Long-Range Interactions Within a Nonnative Protein


Science  01 Mar 2002:
Vol. 295, Issue 5560, pp. 1719-1722
DOI: 10.1126/science.1067680


Protein folding and unfolding are coupled to a range of biological phenomena, from the regulation of cellular activity to the onset of neurodegenerative diseases. Defining the nature of the conformations sampled in nonnative proteins is crucial for understanding the origins of such phenomena. We have used a combination of nuclear magnetic resonance (NMR) spectroscopy and site-directed mutagenesis to study unfolded states of the protein lysozyme. Extensive clusters of hydrophobic structure exist within the wild-type protein even under strongly denaturing conditions. These clusters involve distinct regions of the sequence but are all disrupted by a single point mutation that replaces Trp62, a residue located at the interface of the two major structural domains in the native state, with Gly. Thus, nativelike structure in the denatured protein is stabilized by the involvement of Trp62 in nonnative and long-range interactions.

Incompletely folded states of proteins are coupled to cellular processes such as protein synthesis, translocation across membranes, and signal transduction [reviewed in (1, 2)]. In addition, intrinsically unstructured proteins have been predicted to be common within the genomes of all organisms (3). Unstructured and partially folded conformations of proteins are, however, prone to aggregate and have been implicated in a wide range of diseases (4). The structural and dynamic characterization of nonnative states of proteins is therefore crucial for understanding these processes in addition to being fundamental to an understanding of protein folding itself.

Nonnative states of proteins are ensembles of conformers, the individual members of which may differ substantially in their structural and dynamic properties. Conformational sampling of denatured proteins can be significantly restricted, and the existence of “compact states” has been postulated to occur (5–9). In some cases, specific experimental structural information has been obtained although in general this information is either indirect or highly localized. An important question relating to all nonnative states is the extent to which long-range interactions are important in the stabilization of nonrandom interactions. We use site-directed mutagenesis and NMR spectroscopy to show that long-range nonnative interactions stabilize nativelike hydrophobic clusters in lysozyme.

A wide range of approaches has been developed to characterize nonnative states of proteins in atomic detail by NMR spectroscopy (10), and evidence for the presence of residual structure even under strongly denaturing conditions has been presented [see, e.g., (11–17)]. Residual structure appears to reside predominantly in hydrophobic clusters, in which tryptophan or histidine residues are surrounded by other hydrophobic side chains (18–20). It has been postulated that hydrophobic clusters are stabilized by long-range interactions and may influence the folding of the protein, for example by acting as nucleation sites around which structure can be formed (16–18, 21). Hydrophobic clusters have also been identified in nonnative states of hen lysozyme, in both the oxidized and the reduced form in 8 M urea at pH 2 [in the reduced protein the free sulfhydryl groups are blocked by methylation (16)].

Of the measured NMR parameters, chemical shift values of HN and Hα protons and transverse (R 2) relaxation rates are the most direct indicators of residual structure. Here, we use such parameters to examine the reduced state of hen lysozyme in the absence of urea and then to investigate the structural changes resulting from the replacement of residue Trp62 by Gly (W62G). HN and Hα chemical shifts measured for reduced and methylated wild-type lysozyme (WT-SME) in water are shown in Fig. 1A along with data for WT-SME in 8 M urea (16). In WT-SME, significant deviations in chemical shifts of the HN resonances from random coil values (22, 23) can be seen for Gly22, Trp63, and Cys64, and of the Hα resonances for residues 19 to 32, 58 to 64, 119 to 124, and 106 to 113. The largest differences are observed at positions 106 to 116, a result indicative of an increase in helical character for this region of the sequence for the protein in water compared with that in urea; consistent with this conclusion is the observation from circular dichroism (CD) measurements of an 8% increase in helicity.

Figure 1

(A) Comparison of perturbations of chemical shift values measured for reduced hen lysozyme (WT-SME) in water and in 8 M urea from chemical shifts measured in short unstructured peptides (Δδ = δexpt − δrandom coil) (22, 23). (Top) ΔδHN for WT-SME in water (red line) and in 8 M urea (blue line). (Bottom) ΔδHα for WT-SME in water (red line) and in 8 M urea (blue line). (B) Comparison of perturbations of chemical shift values measured for reduced hen lysozyme (WT-SME) and W62G. (Top) ΔδHN for WT-SME in water (red line) and for W62G-SME in water (blue line). (Bottom) ΔδHα for WT-SME in water (red line) and for W62G-SME in water (blue line).

Heteronuclear 15N R 2 relaxation rates of backbone amide groups of WT-SME in water are shown in Fig. 2B. For comparison, 15N R 2 rates of this protein in 8 M urea (pH 2) are reproduced from (16) in Fig. 2A. The relaxation rates for the protein in urea reach a plateau in the central region of the protein sequence, with the terminal regions having lower relaxation rates. Simple models of a polypeptide chain, in which the physical properties of the chain are dominated by unrestrained segmental motion, predict this pattern of behavior (16). Thus the 15N relaxation properties of a given amide group are in general not markedly influenced by the identity of its neighbors, but predominantly reflect the motional properties of the polypeptide main chain. The model allows a two-parameter fit of experimental relaxation rates when using an intrinsic relaxation rate Rintrinsic of an amide 15N nucleus and a persistence length (λ0) of the polypeptide chain [Web eq. 1 in supplementary material (24)]. The relaxation rates fitted by the model are indicated by the bold line in Fig. 2A. The best fit for the protein in water (Fig. 2B) corresponds to Rintrinsic of 0.20 s−1 and λ0 of 7 (in units of number of residues); in urea, the values were found to be 0.25 s−1 and 7, respectively. Thus, WT-SME in water has the same chain stiffness as WT-SME denatured in urea (λ0 = 7). The lower value for Rintrinsic of WT-SME in water compared with that in urea can be attributed to the differences in viscosity of the two solvent systems (25).
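The segmental-motion model described above can be sketched numerically. Web eq. 1 is not reproduced in this text, so the functional form used here, in which the rate at each residue is a sum of exponentially decaying contributions from every residue in the chain, is an assumption based on the description of the model rather than the authors' stated equation. The function name `segmental_r2` is illustrative; the chain length of 129 residues and the values of Rintrinsic and λ0 are those quoted in the text.

```python
import math

def segmental_r2(n_residues, r_intrinsic, persistence_length):
    """Predicted 15N R2 (s^-1) per residue for unrestrained segmental motion.

    ASSUMED form (not taken from the source):
        R2(i) = R_intrinsic * sum_j exp(-|i - j| / lambda0)
    """
    rates = []
    for i in range(1, n_residues + 1):
        total = sum(math.exp(-abs(i - j) / persistence_length)
                    for j in range(1, n_residues + 1))
        rates.append(r_intrinsic * total)
    return rates

# Parameters quoted in the text for WT-SME in water and in 8 M urea.
water = segmental_r2(129, 0.20, 7.0)
urea = segmental_r2(129, 0.25, 7.0)

# Index 0 is residue 1 (terminus); index 64 is residue 65 (mid-chain).
print(f"water: terminus {water[0]:.2f}, center {water[64]:.2f} s^-1")
print(f"urea:  terminus {urea[0]:.2f}, center {urea[64]:.2f} s^-1")
```

Under this assumed form the predicted rates plateau in the center of the chain and fall off toward both termini, which is the pattern described for the protein in urea. Because λ0 is the same in both solvents, the two profiles differ only by the ratio of the Rintrinsic values.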

Figure 2

Comparison of 15N transverse relaxation rates R 2 in hen lysozyme. (A) WT-SME in 8 M urea (pH 2) [data taken from (16)]. (B) WT-SME in water (pH 2). (C) W62G-SME in water (pH 2). (D) WT in 8 M urea (pH 2) [data taken from (16)]. (E) W62G in 8 M urea (pH 2). The experimental relaxation rates are shown as a scatter plot, and the rates fitted by a model of segmental motion alone, in the absence or presence of slow dynamics around the four disulfide bonds, are shown as bold lines. The locations of the native disulfide bridges are indicated. Six clusters of residual structure were identified for HEWL-SME in water (B). Values obtained by statistical modeling are given in (24).

Several regions of the polypeptide chain show larger 15N R 2 relaxation rates than those anticipated from the model. These differences indicate the presence of nonrandom structure in the chain, attributable to the presence of clusters of hydrophobic side chains (26). Overall, the distribution of relaxation rates as a function of sequence could be described by a sum of the segmental motion and additional deviations centered around six clusters, which can be modeled as a Gaussian distribution [Web eq. 2 in (24)]. Although the locations of the regions showing such deviations are similar for the protein in the two solvents, their intensity and length are very different. In general, the deviations are larger for the protein in water than in urea. In particular, cluster 2 is much more pronounced and defined in water than in urea. This is consistent with the anticipated weakening of hydrophobic interactions in the presence of denaturant.
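The description above, a segmental-motion baseline plus Gaussian deviations centered on clusters, can be illustrated with a short sketch. Web eq. 2 is not reproduced in this text, so both the exponential baseline and the Gaussian parametrization (center, amplitude, width) are assumptions. The cluster parameters below are hypothetical placeholders chosen only to show the shape of the profile; they are not the fitted values, which are given in (24).

```python
import math

def baseline_r2(i, n_residues, r_intrinsic, lambda0):
    """Segmental-motion term for residue i (ASSUMED exponential form)."""
    return r_intrinsic * sum(math.exp(-abs(i - j) / lambda0)
                             for j in range(1, n_residues + 1))

def r2_with_clusters(n_residues, r_intrinsic, lambda0, clusters):
    """Baseline plus one Gaussian term per cluster.

    clusters: list of (center_residue, amplitude in s^-1, width in residues).
    """
    profile = []
    for i in range(1, n_residues + 1):
        extra = sum(amp * math.exp(-((i - center) ** 2) / (2.0 * width ** 2))
                    for center, amp, width in clusters)
        profile.append(baseline_r2(i, n_residues, r_intrinsic, lambda0) + extra)
    return profile

# HYPOTHETICAL cluster parameters for illustration only (not fitted values).
toy_clusters = [(28, 2.0, 4.0), (62, 1.5, 3.0)]

flat = r2_with_clusters(129, 0.20, 7.0, [])
bumpy = r2_with_clusters(129, 0.20, 7.0, toy_clusters)

# Excess over the random coil baseline, residue by residue.
excess = [b - f for b, f in zip(bumpy, flat)]
peak_residue = excess.index(max(excess)) + 1
print("largest excess R2 at residue", peak_residue)
```

Separating the profile into a baseline and an excess term mirrors how the deviations are discussed in the text: the baseline carries the chain stiffness, and the excess marks where nonrandom structure is inferred.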

Lysozyme contains two structural domains, the α domain, involving residues 1 to 35 and 85 to 129, and the β domain, which comprises residues 36 to 84 (27, 28). The clusters are located in sequentially distinct regions along the polypeptide chain; the positions are highlighted in Fig. 3. The largest clusters of residues from the relaxation analysis correlate with the location of hydrophobic residues in the sequence; tryptophan residues are involved in four of the six clusters: Trp28 (cluster 2), Trp62 and Trp63 (cluster 3), Trp108 and Trp111 (cluster 5), and Trp123 (cluster 6).

Figure 3

Mapping of the deviations of the relaxation behavior from those predicted by a random coil model (black line in Fig. 2, A and B) onto the native state structure (Protein Data Bank access code 193L) (39). (A) 15N relaxation rates measured for WT-SME in 8 M urea (Fig. 2A). (B) 15N relaxation rates measured for WT-SME in water (Fig. 2B). Single-letter amino acid abbreviations: A, Ala; L, Leu; V, Val; and W, Trp.

Elevated values of R 2 relaxation rates are observed toward the NH2-terminus of the protein in water, notably for Ala9 to Ala11 (cluster 1); these residues are located in the region of the sequence that forms the A helix in the native structure. High values of R 2 are also found for Leu25 to Glu35 residues located in the B helix in the native structure (cluster 2) (Fig. 3B). Side chains from the A and B helices interact with each other in the native structure. Specifically, Ala9 is close to Leu25 and Val29; both of the latter residues are located on one face of helix B. On the other face of the B helix, there is a common interface with helix D (cluster 5) and the residues of the weak and extended cluster 4. Contacts involve residues Asn27 to Trp111, Trp28 to Trp108, Phe34 to Ala110, Phe34 to Arg114, Glu35 to Trp108, and Glu35 to Ala110. This interface brings into proximity residues that are linked in the native state by the formation of a disulfide bond between Cys30 and Cys115 in the oxidized protein. Trp123 is close to Ala9 (cluster 1) and Val29 (cluster 2). Thus, a core of residual structure appears to be formed, even in the absence of disulfide bridges in WT-SME, by clusters 1, 2, 5, and 6 in what is to become the central region of the α domain (Fig. 3). A second core of residual structure is formed by cluster 3 located at the interface between the α and β domains in the native state.

Kinetic refolding studies indicate that both domains achieve their nativelike structure in folding intermediates formed before the development of the extensive tertiary interactions that span the two domains (29, 30). In the hydrogen exchange measurements, on which this conclusion is based, protection of amide hydrogens from exchange with solvent water occurs significantly faster for residues in the α domain than for those in the β domain. An exception is residue Trp63, whose amide hydrogen becomes protected as rapidly as most of the residues in the α domain, despite its location in the β domain of the native structure. This result suggests that the folding of Trp63 must be associated in some manner with the folding of the α domain. Rothwarf et al. showed in addition that Trp62 and Trp108 are involved in nonnative tertiary interactions in intermediates populated during the refolding (31); replacement of tryptophan residues by tyrosines at either of these two positions results in an increase in the rate of refolding relative to that of the wild-type sequence. Further evidence that Trp62 plays an important role in folding arises from the observation that chemical modification of this residue increases the propensity of the protein to misfold (32). It has also been shown that Trp62 is critical for the correct formation of disulfide bonds in peptide fragments lacking residues 1 to 59 and 105 to 129 (33). Furthermore, in a peptide fragment (36 to 105) with native disulfide bonds, tryptophan fluorescence spectra indicate that Trp62 and Trp63 become exposed upon disruption of helical structure at positions 88 to 98 through acetylation of Lys96 and Lys97, suggesting that Trp62 and Trp63 interact with residues 88 to 98 (34).

In order to explore the tertiary interactions involving Trp62, 15N R 2 relaxation rates were measured for a mutant protein with a single amino acid replacement, W62G. 15N R 2 relaxation rates for W62G-SME in solution in the absence of urea are shown in Fig. 2C. Remarkably, the marked deviations (Fig. 2B) from random coil behavior observed in the WT protein are virtually absent, although the parameters defining the underlying random coil behavior are unchanged (bold line in Fig. 2, B and C). This result indicates that all the clusters of residual structure are substantially disrupted by the single amino acid replacement at the domain interface. It also reveals that Trp62 must be involved in extensive long-range tertiary interactions in the denatured state of the WT, in order for the effect of its substitution by glycine to be so widespread. As WT-SME does not contain disulfide bonds, these long-range interactions do not require the cross-linking of different regions of the sequence. In addition, the CD spectra of W62G in water and in 8 M urea at pH 2 do not indicate substantive residual helical character under either condition.

The presence of disulfide bonds does, however, affect the properties of denatured lysozyme. In the oxidized WT protein (Fig. 2D), the deviations of the rates from random coil values are larger than in the reduced protein with R 2 values of up to 20 s−1; for comparison, the maximum value of R 2 is 7 s−1 in WT-SME (16). These increased relaxation rates have been attributed to the intrinsic effect of cross-linking on the conformational dynamics of the polypeptide chain in the vicinity of the disulfide bonds; the simulated effects of such motions are shown in Fig. 2D. This conclusion is, however, challenged by the observation in the present study of the dramatic effects of the W62G mutation on the dynamics of the polypeptide chain in the presence of the disulfide bonds. The variation in R 2 values is substantially diminished as a result of this mutation, although the overall perturbations of the relaxation rates from random coil values still correlate with the location of the disulfide bonds. The relaxation rates of the oxidized W62G mutant are therefore unlikely to reflect significantly the direct contribution of the presence of additional covalent restraints resulting from disulfide cross-linking to the dynamics of the polypeptide chain. Rather, the strikingly larger deviations from random coil behavior observed in oxidized WT compared with WT-SME can be attributed to the effects of the additional restraints imposed by the presence of disulfide bonds on the hydrophobic clusters present in the denatured protein.

The changes in relaxation rates throughout the protein sequence as a result of the W62G mutation indicate that the deviations from random coil behavior in the WT-SME protein must be coupled together by long-range tertiary interactions. In order to probe these effects in more detail, analysis of the chemical shifts of the different residues in WT-SME and W62G-SME was carried out and is shown in Fig. 1B. There is an excellent correlation (r 2 = 0.995) between the HN chemical shifts in the two proteins when residue 62 is excluded from the analysis (Fig. 1B, top). A similar situation pertains to the Hα chemical shifts (Fig. 1B, bottom), although here the correlation was somewhat lower than for HN chemical shifts (r 2 = 0.935). That the mutation does not alter significantly the chemical shift values of residues other than Trp62 and its immediate neighbors indicates that the local structures of the clusters are on average largely unperturbed. We conclude that the observed changes in relaxation rates result primarily from changes in the dynamic rather than structural properties of the various clusters.
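The comparison of chemical shifts between the two proteins, with the mutated residue left out, amounts to a squared Pearson correlation over the shared residues. The sketch below shows one way such a calculation could be set up; the helper names are illustrative, and the shift values are synthetic numbers invented for the example, not measured data from this study.

```python
import math

def r_squared(x, y):
    """Square of the Pearson correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def r_squared_excluding(shifts_a, shifts_b, excluded):
    """r^2 over residues present in both dicts, skipping those in `excluded`."""
    shared = sorted(k for k in shifts_a
                    if k in shifts_b and k not in excluded)
    return r_squared([shifts_a[k] for k in shared],
                     [shifts_b[k] for k in shared])

# SYNTHETIC shifts (ppm) keyed by residue number; NOT measured data.
wt = {60: 8.10, 61: 8.35, 62: 7.90, 63: 8.05, 64: 8.40, 65: 8.22}
mutant = {60: 8.11, 61: 8.33, 62: 8.45, 63: 8.07, 64: 8.41, 65: 8.21}

r2_all = r_squared_excluding(wt, mutant, set())
r2_without_62 = r_squared_excluding(wt, mutant, {62})
print("all residues:", round(r2_all, 3))
print("without 62:  ", round(r2_without_62, 3))
```

In this toy data set the substituted position is the only large outlier, so removing it raises the correlation sharply. That is the logic behind excluding residue 62 before comparing the two proteins.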

In the native structure, Trp62 is highly exposed to solvent; the crystal structure shows that its side chain is substantially disordered (27), and NMR measurements in solution reveal dynamic behavior (35). By contrast, in the denatured states (36), and particularly during the early stages of folding (37), NMR experiments indicate that this residue, like the other tryptophan residues, is largely inaccessible to solvent. Taken together with data from the hydrogen-exchange protection experiments, these observations lead us to conclude that in these nonnative states the β-domain residues Trp62 and Trp63 associate with a nativelike hydrophobic cluster in the α domain involving Trp108 and Trp111 that is itself strongly linked to the other regions of nonrandom structure. Thus, nonnative interactions stabilize a nativelike core. Presumably, the replacement of the large hydrophobic tryptophan residue at position 62 by glycine results in the destabilization of this core. The resulting increase in dynamic flexibility could reflect an increased population of more extended structures in the ensemble of interconverting conformers. A polypeptide chain in such structures will undergo conformational averaging much more rapidly than in compact denatured states, where significant energetic barriers are known to exist.

Our results suggest that, within the ensemble of conformations representing the denatured states of lysozyme, there are long-range interactions that link clusters of residues that are not close together in sequence. These results are consistent with the hypothesis that steps that involve the reorganization of the species formed initially during the refolding of lysozyme are likely to be key determinants of the kinetics of the folding process (29, 30). Although the folding of small proteins is dominated by the search for nativelike contacts, in the case of larger proteins, including those with multiple domains, species with at least some nonnative interactions can be important determinants of the folding process (8). Such interactions appear to be located primarily at the interface between the two structural domains, the region associated with the slowest step in the folding of lysozyme (4).

The avoidance of misfolding and potential aggregation of nonnative species is a key aspect of the folding and long-term stability of proteins. For example, single point mutations in human lysozyme are responsible for the occurrence of systemic disease in which large quantities of amyloid fibrils are deposited in a variety of internal organs (38). That a single amino acid replacement can perturb the nonnative state of the protein is of particular interest, as the aggregation of partially or completely unfolded species is the essential step in the development of the amyloid structures. Thus, although a residue such as tryptophan may be exposed in the native state for functional reasons, it could be buried in the early stages of folding to reduce the tendency of these transiently populated species to aggregate. Such a conclusion leads to the possibility that the sequence of a protein codes for structural characteristics other than those of the native fold.

  • * Present address: Institute for Software Research International, Carnegie Mellon University, Wean Hall 4604, Pittsburgh, PA 15213, USA.

  • Present address: Johann Wolfgang Goethe-University, Center for Biological Magnetic Resonance, Institute for Organic Chemistry, Marie-Curie-Strasse 11, D-60439 Frankfurt am Main, Germany.

  • Present address: Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.

  • § To whom correspondence should be addressed. E-mail: schwalbe{at}

