## How biomolecules fold

In order to fold, biomolecules must search a conformational energy landscape to find low-energy states. There are peaks in the landscape where the molecules must occupy unstable conformations for a short time. Neupane *et al.* used optical tweezers to observe these transition paths directly for single nucleic acid and protein molecules (see the Perspective by Wolynes). They measured a distribution of times taken to cross the transition path and found that the shape of the distribution agrees well with theory that assumes one-dimensional diffusion over the landscape.

## Abstract

Transition paths, the fleeting trajectories through the transition states that dominate the dynamics of biomolecular folding reactions, encapsulate the critical information about how structure forms. Owing to their brief duration, however, they have not previously been observed directly. We measured transition paths for both nucleic acid and protein folding, using optical tweezers to observe the microscopic diffusive motion of single molecules traversing energy barriers. The average transit times and the shapes of the transit-time distributions agreed well with theoretical expectations for motion over the one-dimensional energy landscapes reconstructed for the same molecules, validating the physical theory of folding reactions. These measurements provide a first look at the critical microscopic events that occur during folding, opening exciting new avenues for investigating folding phenomena.

Biomolecular folding is famously complex, involving a diffusive search over a multidimensional conformational energy landscape for the lowest-energy structure (*1*). The most critical parts of the folding pathway, dominating the dynamics, are the transition states, the unstable intermediates through which a molecule must pass when changing conformation (*2*). A key goal in folding studies has been to observe molecules as they traverse a particular path through the transition states, providing a direct view of the behavior in the transition states. Such transition paths (Fig. 1A) are very short-lived, however, and are moreover inherently a single-molecule phenomenon, features that make them very challenging to observe experimentally. As a result, although transition paths have been studied in computational simulations (*3*, *4*), folding experiments have primarily used indirect means of characterizing transition states (*2*). The inability to observe transition paths has also limited direct experimental tests at the microscopic level of the fundamentally diffusive nature of folding.

Recently, advances in single-molecule methods have enabled the first glimpses of the transition paths in folding reactions. By analyzing photon statistics from high–time-resolution single-molecule fluorescence spectroscopy measurements, the average transit time across transition paths, τ_{tp}, was found for two small proteins, α_{3}D and a WW domain (*5*, *6*), and upper bounds were determined for the protein GB1 (*5*, *7*) and a DNA hairpin (*8*). Average transition path times were also found indirectly for both protein and nucleic acid folding from force spectroscopy measurements, in which single molecules were unfolded and refolded under tension applied by optical tweezers (*9*), using diffusive theories to determine the value for τ_{tp} expected from the measured energy landscape for folding (*10*–*12*). Statistical features of the transition paths, such as the conditional probability of being on a transition path as a function of the reaction coordinate (*13*), were also investigated, revealing that the folding was well described by one-dimensional (1D) diffusion (*14*). Despite these advances, however, it has not been possible to discern directly the properties of individual transition paths.

Here we describe measurements observing transition paths directly in the folding of single molecules, using high-resolution force spectroscopy. Force spectroscopy is an especially powerful tool for studying transition paths, because many transitions can be measured from a single molecule, yielding robust statistics. We first studied the two-state DNA hairpin 30R50/T4 (Fig. 1B, inset), which has been characterized extensively in previous work (*10*, *14*, *15*), especially through measurements of the energy landscape underlying its folding properties (*16*–*19*). Folding and unfolding transitions of single hairpins connected via DNA handles to beads held in two stiff optical traps (Fig. 1B) were measured in equilibrium at constant trap separation, at a load close to the force required to populate the folded and unfolded states equally, *F*_{½} (*20*). The time resolution of the measurement was improved more than fivefold from previous studies of τ_{tp} (*10*, *11*), to about 6 to 11 μs (fig. S1).

From equilibrium trajectories of the extension of the molecule (Fig. 1C), individual transitions (Fig. 1D, red) were identified as those crossing between boundaries *x*_{1} and *x*_{2} (Fig. 1D, cyan) defining the barrier region separating the folded and unfolded states (Fig. 1D, orange). Examining individual transition paths for unfolding (Fig. 2A) and refolding (Fig. 2B), their duration was found to vary widely, from less than 10 μs to over 100 μs. Moreover, diverse shapes were observed: The speed often varied greatly along the paths, and noticeable pauses occurred at one or more points in the transition. These pauses occurred in the barrier region, providing the first direct visualization of a host of transient, high-energy transition intermediates. In many transitions, the hairpin shuttled back and forth between different extensions, directly demonstrating, at the microscopic level, the diffusive nature of folding.

To test quantitatively the physical picture of folding as a diffusive search, we studied the duration of the transition paths. The transit time for barrier crossing in each transition, *t*_{tp}, was measured directly from the extension trajectory as the time required to cross from one boundary to the other (Fig. 1D). For consistency, the boundaries were chosen to define the barrier region as the middle half of the total extension change between the folded and unfolded states, Δ*x*_{UF} (*20*). Measuring transit times individually for 24,591 unfolding transitions and 24,600 refolding transitions, we found the average value, τ_{tp}, to be 27 ± 2 μs for unfolding and 28 ± 2 μs for refolding (errors represent SEM). These average times were longer than the upper bound for τ_{tp} estimated previously for a much shorter DNA hairpin (*8*), but similar to the value for an engineered protein (*6*). Notably, τ_{tp} was roughly 1000 times shorter than the lifetimes of the unfolded and folded states. As expected from symmetry under time reversal (*21*), τ_{tp} was the same for both directions.
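The boundary-crossing criterion described above can be illustrated with a minimal Python sketch. It assumes a uniformly sampled extension record and takes each transit time as the interval between the last sample at or beyond one boundary and the first sample at or beyond the other. The function name `transit_times` and the toy trajectory are hypothetical, and the analysis actually used (*20*) may differ, for example in filtering or interpolation between samples.

```python
import numpy as np

def transit_times(x, dt, x_folded, x_unfolded):
    """Return (unfolding, refolding) transit times from an extension record.

    Assumption: the barrier region is the middle half of the extension
    change between the folded and unfolded states, bounded by x1 and x2.
    """
    span = x_unfolded - x_folded
    x1 = x_folded + 0.25 * span  # boundary on the folded side
    x2 = x_folded + 0.75 * span  # boundary on the unfolded side

    unfolding, refolding = [], []
    last_side = None  # "F" or "U": side of the most recent sample outside the barrier region
    last_idx = None   # index of that sample

    for i, xi in enumerate(x):
        if xi <= x1:
            side = "F"
        elif xi >= x2:
            side = "U"
        else:
            continue  # sample lies inside the barrier region
        if last_side is not None and side != last_side:
            # The molecule left one boundary at last_idx and reached the other at i.
            duration = (i - last_idx) * dt
            (unfolding if side == "U" else refolding).append(duration)
        # Failed attempts (return to the same side) simply reset the start index.
        last_side, last_idx = side, i

    return np.array(unfolding), np.array(refolding)

# Toy trajectory (arbitrary units), sampled every 1 microsecond: one failed
# attempt, one unfolding transition, and one refolding transition.
toy_x = np.array([0, 1, 7, 4, 8, 12, 16, 18, 13, 9, 3, 0], dtype=float)
toy_unfold, toy_refold = transit_times(toy_x, dt=1e-6, x_folded=0.0, x_unfolded=20.0)
print(toy_unfold, toy_refold)
```

Resetting the start index whenever the molecule returns to the side it came from means that excursions into the barrier region that do not reach the opposite boundary are not counted as transition paths.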

These results allowed us to test the theory of transition paths experimentally. For example, assuming 1D diffusive motion over a harmonic barrier in the high-barrier limit (Δ*G*^{‡} > 2 *k*_{B}*T*), τ_{tp} is related to the diffusion coefficient, *D*, by

$$\tau_{tp} \approx \frac{\ln\left(2e^{\gamma}\beta\Delta G^{\ddagger}\right)}{\beta D\kappa_{b}} \qquad (1)$$

where Δ*G*^{‡} is the barrier height, κ_{b} is the stiffness of the barrier, γ is Euler’s constant, and β = 1/*k*_{B}*T* is the inverse thermal energy (*7*, *21*). Previously, the measured landscape profile (*16*) and folding/unfolding rates (*15*) for hairpin 30R50/T4 were used to calculate *D* from Kramers’ equation for diffusive barrier crossing (*22*), in turn enabling calculation of τ_{tp} from Eq. 1. These earlier results, τ_{tp} = 30 ± 6 μs for unfolding and 33 ± 8 μs for refolding (*10*), agree very well with those from the direct measurements, validating Eq. 1. Because τ_{tp} is in principle a more robust measure than approaches that estimate *D* from rates using Kramers’ theory (*6*, *23*), we used Eq. 1 to refine the previous estimate of *D*. Using the barrier parameters from the reconstructed landscape for this hairpin (*16*), Δ*G*^{‡} = 9.1 ± 0.1 *k*_{B}*T* and κ_{b} = 0.29 ± 0.02 *k*_{B}*T*/nm^{2}, we found *D* = 4.4 ± 0.4 × 10^{5} nm^{2}/s, which is very close to the value of 4.6 ± 0.5 × 10^{5} nm^{2}/s estimated previously (*10*).
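As a worked check of the numbers quoted above, the following sketch inverts the high-barrier harmonic expression, assumed here to take the form τ_{tp} ≈ ln(2e^{γ}βΔ*G*^{‡})/(β*D*κ_{b}), to obtain *D* from a measured τ_{tp}. The function names are hypothetical. Expressing the barrier height in units of *k*_{B}*T* and the stiffness in *k*_{B}*T*/nm^{2} removes β from the arithmetic.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def tau_tp_harmonic(diffusion, barrier_kT, kappa_kT_per_nm2):
    """Average transit time (s) from D (nm^2/s), barrier height (kT), stiffness (kT/nm^2)."""
    log_term = math.log(2.0 * math.exp(EULER_GAMMA) * barrier_kT)
    return log_term / (diffusion * kappa_kT_per_nm2)

def diffusion_from_tau(tau_tp, barrier_kT, kappa_kT_per_nm2):
    """Invert the same expression to estimate D (nm^2/s) from tau_tp (s)."""
    log_term = math.log(2.0 * math.exp(EULER_GAMMA) * barrier_kT)
    return log_term / (tau_tp * kappa_kT_per_nm2)

# Hairpin values quoted in the text: tau_tp ~ 27 us, barrier 9.1 kT, stiffness 0.29 kT/nm^2.
D_hairpin = diffusion_from_tau(27e-6, barrier_kT=9.1, kappa_kT_per_nm2=0.29)
print(f"D = {D_hairpin:.2e} nm^2/s")  # roughly 4.4e5 nm^2/s, consistent with the quoted value
```

Because the barrier height enters only through a logarithm, the estimate of *D* is far more sensitive to τ_{tp} and κ_{b} than to Δ*G*^{‡}.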

We next tested a proposed relationship between τ_{tp} and kinetic rates (*21*, *24*),

$$\tau_{tp} = \frac{p(\mathrm{TP})}{2k_{F}P_{U}} = \frac{p(\mathrm{TP})}{2k_{U}P_{F}} \qquad (2)$$

where *k*_{F} and *k*_{U} are the rates for folding and unfolding, respectively; *P*_{F} and *P*_{U} are the equilibrium probabilities to be in the folded or unfolded states; and *p*(TP) is the fraction of time spent on transition paths. Each of these quantities could be measured directly from the extension trajectories. Comparing the measured τ_{tp} values with the estimates from Eq. 2 based on folding and unfolding rates (Fig. 3A), we found excellent agreement for both folding and unfolding across a range of forces, from the hairpin being mostly folded (*P*_{U} ~ 0.03) to mostly unfolded (*P*_{U} ~ 0.8). This agreement validates Eq. 2; furthermore, it emphasizes that the time-reversal symmetry of τ_{tp} holds across the full range of state occupancies.
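The bookkeeping behind this relationship can be sketched in a few lines of Python, assuming it has the form τ_{tp} = *p*(TP)/(2*k*_{F}*P*_{U}) = *p*(TP)/(2*k*_{U}*P*_{F}). All numbers below are hypothetical and chosen only for illustration; they are not measured values, and the function name is likewise an assumption.

```python
def tau_tp_from_rates(p_tp, rate_out, p_source):
    """tau_tp = p(TP) / (2 * k * P), with k the rate out of a state and P its occupancy."""
    return p_tp / (2.0 * rate_out * p_source)

# Hypothetical equilibrium record (illustrative numbers only).
total_time = 10.0                      # s
n_fold = n_unfold = 180                # transitions counted in each direction
mean_transit = 27e-6                   # s, assumed average transit time
time_on_paths = (n_fold + n_unfold) * mean_transit
time_unfolded = 5.0                    # s spent in the unfolded state
time_folded = total_time - time_unfolded - time_on_paths

p_tp = time_on_paths / total_time      # fraction of time on transition paths
P_U = time_unfolded / total_time       # occupancy of the unfolded state
P_F = time_folded / total_time         # occupancy of the folded state
k_F = n_fold / time_unfolded           # folding rate (leaves the unfolded state)
k_U = n_unfold / time_folded           # unfolding rate (leaves the folded state)

tau_from_folding = tau_tp_from_rates(p_tp, k_F, P_U)
tau_from_unfolding = tau_tp_from_rates(p_tp, k_U, P_F)
print(tau_from_folding, tau_from_unfolding)
```

In this construction the equilibrium flux *k*_{F}*P*_{U} = *k*_{U}*P*_{F} is the number of transitions per unit time in one direction, so the expression reduces to the total time on transition paths divided by the total number of transitions, which is why both forms recover the assumed average transit time.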

Even more interesting than the average transit time is the distribution of the individual transit times, because the variability in transit times reflects the fundamentally statistical nature of the folding process (*1*). The distribution of transit times, *P*_{TP}(*t*), for unfolding transitions (Fig. 3B, black) had the same shape as that for refolding (Fig. 3B, green), as expected from the time-reversal symmetry of the problem (*21*). The transit times were broadly distributed, with a peak around 10 μs and a long exponential tail (Fig. 3B, inset). This behavior is similar to that expected for transit across harmonic barriers in the high-barrier limit in the Kramers’ regime: *P*_{TP}(*t*) is predicted to have the form

$$P_{TP}(t) \approx \frac{\omega_{K}\sqrt{\beta\Delta G^{\ddagger}}}{1-\mathrm{erf}\left(\sqrt{\beta\Delta G^{\ddagger}}\right)}\,\frac{\exp\left[-\beta\Delta G^{\ddagger}\coth\left(\omega_{K}t/2\right)\right]}{\sinh\left(\omega_{K}t/2\right)\sqrt{2\pi\sinh\left(\omega_{K}t\right)}} \qquad (3)$$

where ω_{K} = β*D*κ_{b} sets the time scale for the decay of the exponential tail (*21*). The exponential tail of *P*_{TP}(*t*) on its own can also be approximated (*21*) by

$$P_{TP}(t) \approx 2\omega_{K}\beta\Delta G^{\ddagger}\exp\left(-\omega_{K}t\right) \qquad (4)$$

Fits of the two distributions to Eq. 3 (Fig. 3B, solid lines) and Eq. 4 (Fig. 3B, dashed lines) were barely distinguishable, yielding ω_{K} = 6 ± 3 × 10^{4} s^{−1} for both folding and unfolding. This result yields *D* = 2 ± 1 × 10^{5} nm^{2}/s, which is close to the values calculated from the measured τ_{tp} using Eq. 1 and estimated from rates via Kramers’ theory. However, the barrier height returned by the fit, Δ*G*^{‡} ~ 0.4 *k*_{B}*T*, was much lower than that measured directly from landscape reconstructions (*16*–*19*), reflecting the fact that there were more fast transitions than would be expected from the theory for 1D harmonic barriers.
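The harmonic-barrier distribution and its exponential tail can be evaluated numerically with the sketch below. It assumes the forms *P*_{TP}(*t*) ≈ ω_{K}√(βΔ*G*^{‡}) exp[−βΔ*G*^{‡} coth(ω_{K}*t*/2)] / {[1 − erf(√(βΔ*G*^{‡}))] sinh(ω_{K}*t*/2) √(2π sinh(ω_{K}*t*))} for the full distribution and 2ω_{K}βΔ*G*^{‡} exp(−ω_{K}*t*) for the tail. The function names and the integration grid are hypothetical choices, and the barrier height is again given in units of *k*_{B}*T*.

```python
import math
import numpy as np

def p_tp(t, omega_k, barrier_kT):
    """Transit-time distribution for a 1D harmonic barrier (assumed form, see lead-in)."""
    t = np.asarray(t, dtype=float)
    root = math.sqrt(barrier_kT)
    prefactor = omega_k * root / math.erfc(root)  # erfc(x) = 1 - erf(x)
    half = 0.5 * omega_k * t
    numerator = np.exp(-barrier_kT / np.tanh(half))  # coth = 1 / tanh
    denominator = np.sinh(half) * np.sqrt(2.0 * np.pi * np.sinh(omega_k * t))
    return prefactor * numerator / denominator

def p_tp_tail(t, omega_k, barrier_kT):
    """Exponential tail approximation to the same distribution."""
    return 2.0 * omega_k * barrier_kT * np.exp(-omega_k * np.asarray(t, dtype=float))

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

omega_k = 6e4       # 1/s, fitted decay rate quoted in the text
barrier_kT = 9.1    # landscape barrier height quoted earlier, used here for illustration
kappa = 0.29        # kT/nm^2, barrier stiffness quoted earlier

t_grid = np.linspace(1e-3 / omega_k, 40.0 / omega_k, 200_001)
pdf = p_tp(t_grid, omega_k, barrier_kT)

norm = trapezoid(pdf, t_grid)               # should be close to 1
mean_t = trapezoid(t_grid * pdf, t_grid)    # numerical mean transit time
t_late = 10.0 / omega_k
tail_ratio = float(p_tp(t_late, omega_k, barrier_kT) / p_tp_tail(t_late, omega_k, barrier_kT))
D_from_fit = omega_k / kappa                # D = omega_K / (beta * kappa_b), in nm^2/s

print(norm, mean_t, tail_ratio, D_from_fit)
```

With these inputs the conversion *D* = ω_{K}/(βκ_{b}) gives roughly 2 × 10^{5} nm^{2}/s, matching the value quoted above. For a high barrier the numerical mean of the distribution should also lie close to the logarithmic high-barrier expression for τ_{tp}, which ties the distribution to the average transit time.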

DNA hairpins represent a powerful model system for exploring the physical basis of folding phenomena, but their folding is simpler than that of proteins because they have only secondary structure. We therefore sought to apply the same approach to protein folding. To this end, we focused on a specific structural transition in the prion protein PrP that occurs during the formation of non-native structure in PrP dimers (*20*). This transition undergoes unusually slow conformational diffusion (*25*), making it much easier to measure transit times. The transit times were measured directly from extension trajectories, using the same criterion as for the hairpins. Here, however, the transit times were much longer, up to the millisecond scale (Fig. 4A). Once again, the transit times for both unfolding and refolding were broadly distributed with an exponential tail, and the same distribution was observed for both unfolding (Fig. 4B, black) and refolding (Fig. 4B, green).

As expected, τ_{tp} was again the same for unfolding and refolding: 0.5 ± 0.1 ms from 1766 transitions. These values also matched (within error) the value expected from Eq. 1, τ_{tp} = 1 × 10^{0 ± 0.3} ms, which was calculated (*25*) based on the properties of the energy landscape reconstructed for this transition (Δ*G*^{‡} = 4 ± 1 *k*_{B}*T* and κ_{b} = 2 ± 0.5 *k*_{B}*T*/nm^{2} at *F*_{½}) and the diffusion coefficient estimated from the rates using Kramers’ theory (*D* = 1 × 10^{3 ± 0.3} nm^{2}/s). Fitting the transit time distributions to Eq. 3 (Fig. 4B, solid lines) and Eq. 4 (Fig. 4B, dashed lines), ω_{K} was again similar for both folding and unfolding. The result, ω_{K} = 3 ± 1 × 10^{3} s^{−1}, implied *D* = 1.3 ± 0.6 × 10^{3} nm^{2}/s, in excellent agreement with the result found from the rates and landscape reconstruction using Kramers’ theory (*25*). However, the barrier height returned by the fit, Δ*G*^{‡} = 0.5 ± 0.3 *k*_{B}*T*, was once again lower than the measured value, because (as for the hairpin) more short transit times were observed than would be expected from the 1D harmonic theory.

For both molecules, the transit time distributions thus agreed reasonably well with the expectations from 1D harmonic approximations to the previously measured landscapes. The primary discrepancy was a slight bias in the transit time distributions toward shorter times, which caused the fitted barrier height to be lower than the measured height. This bias might arise from a breakdown in the approximations used to derive Eq. 3 (*21*), such as anharmonicity in the barriers or the need to include higher dimensionality in the landscape, or it could reflect the influence of the dynamics of the beads and handles to which the molecules are tethered (*23*, *26*–*28*), which were not included in the analysis because they are difficult to treat (*28*). The fact that values of the diffusion coefficient (a fundamental descriptor of the dynamics) obtained from different experimental variables using 1D theories are similar suggests that these 1D descriptions of folding (*8*, *11*, *14*, *19*, *29*) can hold even at the microscopic level, despite their many simplifying assumptions.

The ability to observe and characterize transition paths opens up many exciting avenues to explore in folding studies by allowing more direct investigation of transition states and the microscopic thermally driven motions that underlie the conformational search. Previously invisible microstates along the transition paths may now be detectable, permitting their properties to be characterized directly. Moreover, it may be possible to distinguish different classes of transition paths having different properties, such as barrier heights, intermediates, or roughness. The potential for deeper integration of experiment and simulation through direct comparisons of the transition path properties found experimentally to the results of atomistic simulations is also exciting (*4*). Because the transit time is so sensitive to the diffusion coefficient *D* (*4*, *6*, *23*), such measurements also hold great promise for investigating the effects of solvent viscosity and internal friction (*4*, *6*, *30*).

## References and Notes

- Materials and methods are available as supplementary materials on *Science* Online.

**Acknowledgments:** We thank D. Makarov and A. Szabo for helpful discussions. This work was supported by the Alberta Prion Research Institute, Alberta Innovates (AI) Technology Futures, AI Health Solutions, the Natural Sciences and Engineering Research Council, and National Research Council Canada. M.T.W., K.N., and D.R.D. designed the research; F.W. provided new reagents; K.N., D.A.N.F., and H.Y. performed measurements; K.N., D.R.D., and M.T.W. analyzed the data; and M.T.W., K.N., D.R.D., D.A.N.F., and H.Y. wrote the paper.