Single-Molecule Fluorescence Experiments Determine Protein Folding Transition Path Times


Science  24 Feb 2012:
Vol. 335, Issue 6071, pp. 981-984
DOI: 10.1126/science.1215768


The transition path is the tiny fraction of an equilibrium molecular trajectory when a transition occurs as the free-energy barrier between two states is crossed. It is a single-molecule property that contains all the mechanistic information on how a process occurs. As a step toward observing transition paths in protein folding, we determined the average transition-path time for a fast- and a slow-folding protein from a photon-by-photon analysis of fluorescence trajectories in single-molecule Förster resonance energy transfer experiments. Whereas the folding rate coefficients differ by a factor of 10,000, the transition-path times differ by a factor of less than 5, which shows that a fast- and a slow-folding protein take almost the same time to fold when folding actually happens. A very simple model based on energy landscape theory can explain this result.

Theory predicts that folding mechanisms are heterogeneous, so that an individual unfolded molecule can self-assemble to form its biologically active, folded structure by means of many different sequences of conformational changes (1). The distribution of these folding pathways can now be calculated from atomistic molecular dynamics simulations (2–6). Information on pathway distributions from experiments must come from measurements on single molecules, because bulk experiments on large ensembles of molecules yield only average properties. A single-molecule, equilibrium protein folding-unfolding trajectory is illustrated in Fig. 1, as monitored by Förster resonance energy transfer (FRET) spectroscopy, and its relation to the free-energy barrier as it crosses between the folded and unfolded states is shown. The most interesting part of the trajectory is contained in what appears to be an instantaneous jump between the two states, called the transition path, which contains all of the information on the mechanism of folding and unfolding. The first step toward observing transition paths in protein folding, which we report here, is the determination of their average duration (transition-path time) for a fast-folding, all-β protein [39-residue formin-binding protein (FBP) WW domain] shown to be two-state in ensemble studies (7, 8), as well as a markedly reduced upper bound compared with our previous study for the 56-residue, α/β protein GB1 (the B1 immunoglobulin-binding domain of protein G from Streptococcus) (9). In contrast to a rate coefficient, which measures the frequency of a transition, the transition-path time is the duration of a successful barrier-crossing event (Fig. 1).

Fig. 1

Schematic of a folding transition path for a two-state protein. (A) The kinetics of protein folding is described by energy landscape theory as diffusion on a one-dimensional free-energy surface with an order parameter (x) as a reaction coordinate (1, 15, 25, 26). The unfolded molecule spends the vast majority of time visiting a large number of conformations in the free-energy well of the unfolded state. A transition path is the part of the trajectory that crosses the reaction coordinate x at x0 and reaches x1 on the other side of the barrier without recrossing x0 (12). The duration of this part is the transition-path time. (B) FRET efficiency trajectory. In the typical experiment, the donor and acceptor FRET fluorophores are attached to cysteine residues, which are closer on average in the folded state (higher FRET efficiency) than in the unfolded state (lower FRET efficiency). The duration of the jump in the FRET efficiency trajectory is the transition-path time. The FRET efficiency monitors reconfiguration of the polypeptide backbone to form the native fold but is most probably blind to the annealing of side chains. Consequently, the transition path measured by FRET is expected to be shorter than the transition path monitored by side-chain contacts, for example, in a molecular dynamics simulation (6).

The strategy used in this study is to illuminate dye-labeled protein molecules at very high intensities to increase the number of detected photons per transition path, to discard the majority of photons from the less-interesting segments of the trajectories between transitions, and to analyze the transition region with a maximum likelihood method by using simple models for the transition path.

Photon trajectories were measured for immobilized WW domain and protein GB1 molecules with donor and acceptor fluorophores attached to cysteines incorporated into the proteins (Fig. 2). In these trajectories, two properties of each photon were recorded—the color, either donor green or acceptor red, and the absolute time of arrival to within ~0.5 ns. As shown in Fig. 3, A and B, transitions between states are clearly resolved in the binned fluorescence and photon trajectories, and the FRET efficiency distributions (Fig. 3, C and D) are bimodal, which indicates the presence of two states. The photon trajectories were extracted from the region near the transitions and analyzed using the Gopich-Szabo maximum likelihood method (10).
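As a rough illustration of how a photon trajectory (arrival time plus color) becomes a binned FRET efficiency trajectory, the sketch below computes the apparent efficiency of each bin as the fraction of acceptor photons. It is a minimal sketch and not the analysis used in the paper: the function name, the photon list, and the 50-μs bin width in the example call are illustrative assumptions, and no background, crosstalk, or detection-efficiency corrections are applied.

```python
from collections import Counter


def binned_fret_efficiency(arrival_times_us, colors, bin_width_us):
    """Return {bin_index: apparent FRET efficiency} for every non-empty bin.

    colors holds "A" (acceptor, red) or "D" (donor, green) for each photon.
    Apparent efficiency = acceptor photons / all photons in the bin
    (assumption: no corrections of any kind).
    """
    acceptor = Counter()
    total = Counter()
    for t, color in zip(arrival_times_us, colors):
        b = int(t // bin_width_us)
        total[b] += 1
        if color == "A":
            acceptor[b] += 1
    return {b: acceptor[b] / total[b] for b in sorted(total)}


# Hypothetical photon trajectory (times in microseconds), not measured data.
times = [3.0, 12.5, 20.1, 41.0, 55.2, 61.7, 70.3, 88.8, 93.4]
colors = ["D", "D", "A", "D", "A", "A", "A", "D", "A"]
efficiency = binned_fret_efficiency(times, colors, bin_width_us=50.0)
print(efficiency)
```

In this made-up trajectory the first bin is donor-rich and the second acceptor-rich, which is the kind of jump in binned efficiency that marks a transition between states.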

Fig. 2

Schematic of immobilized folded proteins showing donor (green-emitting) and acceptor (red-emitting) fluorophores. The proteins are attached to a polyethyleneglycol (PEG)–coated glass surface via a biotin-streptavidin-biotin linkage (11).

Fig. 3

Representative fluorescence and photon trajectories and FRET efficiency histograms of WW domain and protein GB1. (A and B) For the fluorescence trajectories of donor (green) and acceptor (red), photons were collected in 50-μs bins for the WW domain and in 100-μs bins for protein GB1. Measurements were made at 293 K at high illumination intensity (A) in 3 M GdmCl in 50% glycerol for the WW domain (20 kW/cm², ~650 photons/ms) and (B) in 4 M urea for protein GB1 (10 kW/cm², ~350 photons/ms). Strings of arrival times and colors of donor and acceptor photons (photon trajectories) in the transition region (the 80-μs yellow-shaded regions) are displayed below the binned fluorescence trajectories. Dashed vertical lines in the photon trajectories indicate the most probable transition interval found by the Viterbi algorithm (11). The absolute times refer to the start of data collection, ~100 ms before the laser was turned on. (C and D) FRET efficiency histograms. The mean FRET efficiencies for the WW domain were calculated (C) for each of the 50-μs bins for the trajectories with the mean photon count rate >400 ms⁻¹ and (D) for folded and unfolded segments of protein GB1 containing ~2500 photons.

For a given model, the Gopich-Szabo method calculates the parameters of the model that can most accurately reproduce the photon trajectories (Fig. 3). We adopt a one-step model for the transition path, which may be viewed as the simplest discrete representation of how the FRET efficiency changes along the path. This picture can be represented in a kinetic model for a two-state system with a finite transition path by introducing a third virtual state, S, for which the FRET efficiency is midway between the folded and unfolded states [ES = (EF + EU)/2]. In this model, the lifetime of S (τS) corresponds to the average transition-path time, 〈tTP〉 (Fig. 4A). S has the property of a transition state, because the rate coefficients from S to F and from S to U (kS) are the same, and therefore the probability of folding from S is pfold = ½.
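The two properties of the virtual state S stated above, a lifetime of τS = 1/(2kS) and pfold = ½, follow from having two competing first-order exits with the same rate coefficient. The short stochastic sketch below checks this numerically; the function name, the number of simulated events, and the seed are illustrative assumptions, and the value τS = 16 μs is used only as an example input.

```python
import random


def simulate_exits_from_s(k_s, n_events, seed=0):
    """Simulate n_events visits to S; return (mean dwell time, fraction exiting to F).

    S is left either toward F or toward U, each with rate coefficient k_s,
    so the dwell time is exponential with total rate 2 * k_s.
    """
    rng = random.Random(seed)
    total_dwell = 0.0
    exits_to_folded = 0
    for _ in range(n_events):
        total_dwell += rng.expovariate(2.0 * k_s)
        # Each exit channel is chosen with probability k_s / (k_s + k_s) = 1/2.
        if rng.random() < 0.5:
            exits_to_folded += 1
    return total_dwell / n_events, exits_to_folded / n_events


tau_s_us = 16.0                    # example lifetime of S in microseconds
k_s = 1.0 / (2.0 * tau_s_us)       # from tau_S = 1 / (2 k_S)
mean_dwell_us, p_fold = simulate_exits_from_s(k_s, n_events=20000)
print(mean_dwell_us, p_fold)
```

The mean dwell time should come out close to the chosen τS and the folded fraction close to one half, up to sampling noise.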

Fig. 4

Determination of average transition-path times in a kinetic model. (A) Schematic of a FRET efficiency trajectory using a one-step model to describe the transition path from unfolded (U) to folded (F) states for a protein exhibiting two-state kinetics and thermodynamics. The average transition-path time, 〈tTP〉, is equal to the lifetime of a virtual intermediate state S [τS = (2kS)⁻¹]. (B and C) The difference of the log likelihood, Δln L = ln L(τS) – ln L(0), between the two-state model with a finite transition-path time and a two-state model with an instantaneous transition-path time is plotted as a function of τS (B) for the WW domain in 3 M GdmCl in 50% glycerol and (C) for protein GB1 in 4 M urea. The horizontal dashed line at Δln L = +3 represents the 95% confidence limit for the significance of the peak in (B), and the intersection of the likelihood function with the horizontal dashed line at Δln L = –3 in (C) yields the 95% confidence limit for the upper bound of τS.

The likelihood function for the jth photon trajectory is (10):

$$L_j = \mathbf{v}_{\mathrm{fin}}^{T} \prod_{i=2}^{N} \left\{ \mathbf{n}\,\mathbf{F}(c_i) \exp\left[(\mathbf{K} - \mathbf{n})\,\tau_i\right] \right\} \mathbf{n}\,\mathbf{F}(c_1)\,\mathbf{v}_{\mathrm{ini}} \qquad (1)$$

Here, K is the rate matrix [equation S6 (11)] containing the three rate coefficients (kF′, kU′, and kS), N is the number of photons in the jth trajectory, ci is the color of the ith photon (donor or acceptor), and τi is the time interval between the ith and (i – 1)th photons as shown in fig. S4B (11). The photon color matrix F depends on the color of a photon as F(acceptor) = E and F(donor) = I – E, where E is a diagonal matrix with elements that are FRET efficiencies of the three states (F, S, and U), and I is the unit matrix. n is a diagonal matrix with elements that are photon count rates of the three states. vini and vfin are vectors that describe the state (folded or unfolded) at the beginning and the end of the trajectory. In practice, log-likelihood functions were calculated, and the total log-likelihood function of all trajectories was obtained by summing the log-likelihood functions (ln L = Σj ln Lj) of individual trajectories that contain a transition between folded and unfolded states. In the likelihood function L, τS is the only variable parameter (11).
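To make the structure of Eq. 1 concrete, the sketch below evaluates the log likelihood of one short photon trajectory for a three-state (F, S, U) model: the first photon applies n F(c1) to vini, each later photon applies exp[(K – n)τi] followed by n F(ci), and the result is projected onto vfin. This is a hedged sketch, not the authors' code. The form of the rate matrix K is an assumption (equation S6 is not reproduced in this text), and the rate coefficients, FRET efficiencies, and photon sequence are hypothetical. Only the count rate of 0.65 photons/μs echoes the ~650 photons/ms quoted in Fig. 3. The matrix exponential is a small pure-Python helper written for this example.

```python
import math


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_vec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


def expm(a, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[x / 2.0 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def log_likelihood(colors, intervals, k_f, k_u, k_s, eff, rates, v_ini, v_fin):
    """ln L_j of Eq. 1; colors has N entries ("A"/"D"), intervals has N - 1 gaps."""
    # Assumed rate matrix, state order (F, S, U), acting on column vectors.
    K = [[-k_u, k_s, 0.0],
         [k_u, -2.0 * k_s, k_f],
         [0.0, k_s, -k_f]]
    k_minus_n = [[K[i][j] - (rates[i] if i == j else 0.0) for j in range(3)]
                 for i in range(3)]

    def emit(v, color):
        # Apply the diagonal matrix n F(c): F = E for acceptor, I - E for donor.
        return [rates[i] * (eff[i] if color == "A" else 1.0 - eff[i]) * v[i]
                for i in range(3)]

    v = emit(v_ini, colors[0])
    log_l = 0.0
    for color, tau in zip(colors[1:], intervals):
        propagator = expm([[x * tau for x in row] for row in k_minus_n])
        v = emit(mat_vec(propagator, v), color)
        s = sum(v)                      # rescale to avoid numerical underflow
        log_l += math.log(s)
        v = [x / s for x in v]
    return log_l + math.log(sum(a * b for a, b in zip(v_fin, v)))


# Hypothetical inputs (time unit: microseconds).
colors = list("AAAAAADDDDDD")           # acceptor-rich first, donor-rich later
intervals = [1.5] * (len(colors) - 1)
params = dict(k_f=1e-3, k_u=1e-3, k_s=1.0 / (2.0 * 16.0),
              eff=[0.8, 0.65, 0.5], rates=[0.65, 0.65, 0.65])
ll_fu = log_likelihood(colors, intervals, v_ini=[1, 0, 0], v_fin=[0, 0, 1], **params)
ll_uf = log_likelihood(colors, intervals, v_ini=[0, 0, 1], v_fin=[1, 0, 0], **params)
print(ll_fu, ll_uf)
```

For this made-up sequence, starting folded and ending unfolded gives the larger log likelihood, as expected for a trajectory that begins acceptor-rich. Summing such values over all trajectories and scanning τS (through k_s) would give a curve of the kind plotted in Fig. 4, under the stated assumptions.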

The difference of the log-likelihood functions, Δln L = ln L(τS) – ln L(0), as a function of τS, is plotted in Fig. 4B for the WW domain. This function was calculated from 527 transitions between the folded and unfolded states. In this plot, the likelihood at τS = 0, L(0) = lim(τS→0) L(τS), is the value for a two-state model where every transition between folded and unfolded states is instantaneous, i.e., it occurs faster than the shortest photon interval. Therefore, the plot displays how much better (or worse) a two-state model with a finite transition-path time describes the photon trajectories than a two-state model with an instantaneous transition. There is a highly significant peak in the likelihood function in Fig. 4B at 16 (±3) μs. (The error is the standard deviation obtained from the curvature of the peak.) Simulations of photon trajectories show that, if Δln L at the peak is higher than a certain confidence level, the value of τS at the peak corresponds to the assumed τS and does not arise from statistical fluctuations (fig. S6) (11). We used a confidence level that satisfies the condition L(τS)/[L(τS) + L(0)] = 0.95, which assures 95% confidence in the significance of the maximum and corresponds to Δln L ≈ 3 (the dashed horizontal lines in Fig. 4). The value of 16 μs at Δln L = 7.8 is therefore a well-determined quantity and corresponds, in our model (Fig. 4A), to the average transition-path time 〈tTP〉. That 〈tTP〉 is the same for folding and unfolding transitions is shown in fig. S5 (11), which is consistent with the requirement of microscopic reversibility that 〈tTP〉 for a barrier crossing be the same in both directions (12).
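The link between the 95% condition and Δln L ≈ 3 is simple arithmetic: L(τS)/[L(τS) + L(0)] = 0.95 means L(τS)/L(0) = 19, and ln 19 ≈ 2.94. The sketch below works through that step and also inverts it for the observed peak height of 7.8. The function names are illustrative, not taken from the paper.

```python
import math


def delta_lnl_threshold(confidence):
    """Δln L at which L(tau_S) / [L(tau_S) + L(0)] equals the given confidence."""
    return math.log(confidence / (1.0 - confidence))


def confidence_from_delta_lnl(delta_lnl):
    """Inverse relation: L(tau_S) / [L(tau_S) + L(0)] for a given Δln L."""
    return 1.0 / (1.0 + math.exp(-delta_lnl))


threshold = delta_lnl_threshold(0.95)           # ln 19
peak_confidence = confidence_from_delta_lnl(7.8)
print(round(threshold, 2), peak_confidence)
```

On this measure, a peak at Δln L = 7.8 lies well above the 95% threshold.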

To extrapolate the value of 〈tTP〉 to the viscosity in the absence of glycerol, we determined the rate coefficients at different viscosities (table S1) (11). Using a linear free-energy relation to account for the change in stability resulting from the addition of glycerol and guanidinium chloride (GdmCl), we find that the rate coefficients for folding and unfolding depend inversely on the first power of the viscosity (11), so 〈tTP〉 should scale the same way (see Eqs. 2 and 3 below). Because the viscosity of 3 M GdmCl in 50% glycerol solution is found to be 10 times that of 2 M GdmCl (11), our best estimate of 〈tTP〉 in the absence of a viscogen at 293 K is ~2 μs.
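The extrapolation can be written out as a one-line calculation, assuming that 〈tTP〉 is directly proportional to solvent viscosity, as the inverse viscosity dependence of the rate coefficients suggests. The symbols η (viscosity of 3 M GdmCl in 50% glycerol) and η0 (viscosity without glycerol) are introduced here for illustration only.

```latex
\langle t_{TP} \rangle_{\eta_0}
  \approx \langle t_{TP} \rangle_{\eta} \, \frac{\eta_0}{\eta}
  = \frac{16\ \mu\mathrm{s}}{10}
  = 1.6\ \mu\mathrm{s}
  \approx 2\ \mu\mathrm{s}
```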

We have used the simplest possible model for determining 〈tTP〉. However, more realistic models that depict a more gradual change in the FRET efficiency along a transition path—with two and three steps in the FRET efficiency in the transition path between states instead of just one (Fig. 4A)—yield very similar values for 〈tTP〉 (fig. S9) (11). We also found that the value of 〈tTP〉 is not sensitive to the choice of the FRET efficiency for S, as long as the value is between the two FRET efficiencies of the folded and unfolded states (0.6 ≤ ES ≤ 0.7) (fig. S7) (11).

For proteins with very low free-energy barriers, it may be possible to estimate 〈tTP〉 from ensemble measurements. Gruebele and co-workers have studied the kinetics of the ultrafast-folding, 33-residue FiP35 WW domain, which has a very similar fold to that of our WW domain (FBP28) and ~30% sequence identity (13). Prior to the ~10-μs folding-unfolding relaxation at the melting temperature of ~350 K, a ~1.5-μs relaxation was observed, which was called a “molecular phase” and attributed to a change in the small population of molecules at the top of a low free-energy barrier in response to the temperature jump. No molecular phase was observed for the FBP WW domain (7), presumably because it is a slower folder owing to a higher barrier, and there is therefore no detectable amplitude from the change in the barrier top population. In this interpretation, Gruebele’s ~1.5-μs relaxation corresponds to the lifetime, τS, of our kinetic model for the transition path (Fig. 4).

Shaw and co-workers have simulated equilibrium trajectories of the FiP35 WW domain using all-atom molecular dynamics calculations (4). They found 〈tTP〉 to be 0.5 (±0.1) μs at 360 K using the TIP3P explicit water model (6). After rescaling for the difference in viscosity compared with real water, the simulated 〈tTP〉 becomes ~1.5 μs (14). Although the sequences for the two WW domains are different, the finding of similar values for 〈tTP〉 from the simulations and both ensemble and single-molecule experiments provides support for the accuracy of the simulations, for Gruebele’s interpretation of the molecular phase, and for our interpretation of the single-molecule photon trajectories.

The folding time of protein GB1 in 4 M urea is ~1 s. This time is far too long to observe folding transitions in trajectories simulated by atomistic equilibrium molecular dynamics, which makes even an upper bound for the transition-path time an interesting quantity. In previous work (9), we were able to determine an upper bound of ~200 μs, based on an analysis of individual trajectories. The photon count rate in those experiments was only 50 ms⁻¹, and the average time before photobleaching was ~100 ms. In the present experiments, the much higher count rate of 350 ms⁻¹ from the increased illumination intensity, together with the collective analysis using the maximum likelihood method, has allowed us to determine a much more accurate upper bound. The penalty for the higher photon count rate is that the lifetime of the trajectories is shortened to ~10 ms by the more intense illumination, and transitions, albeit clearly resolved (Fig. 3B), are only observed in a very small fraction of the trajectories. Measurement at 4 M urea (with no added glycerol) of trajectories for ~47,000 molecules yielded just 114 transitions.

These 114 transitions were analyzed with the same model as for the WW domain. No peak is observed in the Δln L versus τS plot (Fig. 4C), so 〈tTP〉 is too short to measure. Nevertheless, the analysis permits a determination of an upper bound for 〈tTP〉. By analogy to the significance of the peak for the WW domain, we can set a confidence level for the answer to the question: How long can 〈tTP〉 be before it becomes inconsistent with the data? A two-state model with a finite transition path of duration τS is less consistent with the photon trajectories than a two-state model with an instantaneous transition path, at the 95% confidence level, once Δln L falls to ≈ –3. In other words, 〈tTP〉 cannot be longer than the value of τS at Δln L = –3, which is therefore an upper bound on 〈tTP〉. As shown in Fig. 4C, this upper bound is ~10 μs.
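Reading an upper bound off a sampled Δln L curve amounts to finding where the curve first drops through the threshold of –3. The sketch below does this by linear interpolation between neighboring samples. The function name and the sampled curve are hypothetical; the numbers are invented for illustration and are not the data of Fig. 4C.

```python
def upper_bound_tau(taus, delta_lnl, threshold=-3.0):
    """Return the tau at which delta_lnl first falls below threshold.

    Uses linear interpolation between the two samples that bracket the
    crossing; returns None if the curve never drops below the threshold.
    """
    points = list(zip(taus, delta_lnl))
    for (t0, d0), (t1, d1) in zip(points, points[1:]):
        if d0 >= threshold > d1:
            return t0 + (threshold - d0) * (t1 - t0) / (d1 - d0)
    return None


# Hypothetical, monotonically decreasing curve (tau in microseconds).
taus_us = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
delta_lnl = [0.0, -0.3, -0.8, -1.5, -2.4, -3.4, -4.6]
bound_us = upper_bound_tau(taus_us, delta_lnl)
print(bound_us)
```

A finer τS grid, or a root finder applied directly to the likelihood function, would tighten the estimate. The interpolation is only meant to show the logic of the bound.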

The major result of our experiments is that, whereas the folding rate coefficients for the WW domain and protein GB1 differ by four orders of magnitude, 10⁴ s⁻¹ and 1 s⁻¹, the transition-path times differ by less than fivefold (~2 μs and <10 μs), which shows that a fast- and a slow-folding protein take almost the same time to fold when folding actually happens.

It is interesting that a simple model by A. Szabo, based on describing the kinetics of folding for a two-state system as diffusion over a barrier on a one-dimensional free-energy surface as in the energy landscape theory of Wolynes, Onuchic, and co-workers (1, 15), can explain this result. According to Kramers’ theory for such a barrier crossing (Fig. 1A), the folding time (τF = 1/kF) is given by:

$$\tau_F = \frac{2\pi}{D^{*}\beta\,\omega\,\omega^{*}} \exp\left(\beta \Delta G_F^{*}\right) \equiv \tau_0 \exp\left(\beta \Delta G_F^{*}\right) \qquad (2)$$

where D* is the diffusion coefficient at the barrier top, ω² is the curvature of the unfolded well (near x0 in Fig. 1A), –(ω*)² is the curvature at the barrier top, β = 1/kBT (where kB is Boltzmann’s constant and T is temperature), and ΔGF* is the height of the folding free-energy barrier (16–20). For ω = ω*, 〈tTP〉 is approximately given by (9, 12):

$$\langle t_{TP} \rangle \approx \frac{\tau_0}{2\pi} \ln\left(2 e^{\gamma} \beta \Delta G_F^{*}\right) \qquad (3)$$

where γ is Euler's constant (≈0.577).


The model predicts that 〈tTP〉 is insensitive to the barrier height and that fast- and slow-folding proteins will have similar transition-path times as long as there are only small differences in the curvatures and the diffusion coefficients (i.e., small difference in τ0). The diffusion coefficient depends on the roughness of the underlying energy landscape and could therefore differ substantially among proteins (21–23). The best current estimate for τ0 of fast-folding proteins is ~1 μs (24), which predicts a ratio of 〈tTP〉 for protein GB1 and the WW domain of 1.4, compared with the experimental ratio of <5, if we assume the same τ0 for the two proteins. This ratio varies from 1.3 to 1.8 for τ0 between 0.1 and 10 μs.
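The quoted ratios can be checked numerically. The sketch below assumes the commonly used approximation 〈tTP〉 ≈ (τ0/2π) ln(2e^γ βΔGF*), with γ Euler's constant, and obtains the barrier height from the Kramers relation τF = τ0 exp(βΔGF*). Folding times of 10⁻⁴ s for the WW domain and 1 s for protein GB1 are taken from the rate coefficients quoted in the text. The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329


def barrier_height_kt(tau_fold_s, tau0_s):
    """beta * ΔG_F* from tau_F = tau_0 * exp(beta * ΔG_F*)."""
    return math.log(tau_fold_s / tau0_s)


def transition_path_time(tau_fold_s, tau0_s):
    """Assumed approximation: <t_TP> = (tau_0 / 2 pi) * ln(2 e^gamma beta ΔG_F*)."""
    beta_dg = barrier_height_kt(tau_fold_s, tau0_s)
    return tau0_s / (2.0 * math.pi) * math.log(2.0 * math.exp(EULER_GAMMA) * beta_dg)


def gb1_to_ww_ratio(tau0_s, tau_fold_gb1_s=1.0, tau_fold_ww_s=1e-4):
    """Ratio of <t_TP> for GB1 and the WW domain, assuming the same tau_0 for both."""
    return (transition_path_time(tau_fold_gb1_s, tau0_s)
            / transition_path_time(tau_fold_ww_s, tau0_s))


for tau0 in (1e-7, 1e-6, 1e-5):   # 0.1, 1, and 10 microseconds
    print(tau0, round(gb1_to_ww_ratio(tau0), 1))
```

Because the barrier height enters only through a logarithm, a 10,000-fold difference in folding time changes 〈tTP〉 by well under a factor of 2 across this range of τ0.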

Our determination of an average transition-path time is a first step toward the goal of obtaining information on the distribution of folding pathways from measurements of interdye distance versus time trajectories during transition paths. However, the result of this first step by itself has turned out to be extremely interesting. Folding involves a complex and intricate rearrangement of a polypeptide chain to form a unique structure, yet the time for this nontrivial self-assembly process is almost the same for two proteins with different topologies and vastly different folding rates.

Supporting Online Material

Materials and Methods

Figs. S1 to S9

Table S1

References (27–37)

References and Notes

  1. Materials and methods are available as supporting online material on Science Online.
  2. In the case of protein GB1, there is the possibility of a sparsely populated intermediate between the folded and unfolded states (17–20). In this study, we have implicitly defined the transition-path time for both the WW domain and protein GB1 in terms of just the two deep minima of the folded and unfolded states.
  3. Clarke and co-workers (22) have found, for example, domains with similar structures and stability that have folding rates that differ by ~3000-fold. The slower-folding domains show very little dependence on solvent viscosity, which suggests a large internal friction and, therefore, a much smaller D* (22, 23).
  4. Acknowledgments: We thank I. Gopich, A. Szabo, and G. Hummer for numerous helpful discussions and A. Aniana for technical assistance in the expression and purification of proteins. This work was supported by the Intramural Research Program of the National Institute of Diabetes and Digestive and Kidney Diseases, NIH.