Watching helical membrane proteins fold reveals a common N-to-C-terminal folding pathway

Science  29 Nov 2019:
Vol. 366, Issue 6469, pp. 1150-1156
DOI: 10.1126/science.aaw8208

A pathway for helical membrane proteins

Membrane proteins are inserted into cell membranes while they are being translated and may fold concurrently into their secondary and tertiary structures. Choi et al. describe a single-molecule force microscopy technique that allowed them to monitor folding of helical membrane proteins in vesicles and bicelles. Two helical membrane proteins, the Escherichia coli rhomboid protease GlpG and the human β2-adrenergic receptor, both folded from the N to the C terminus, with structures forming in units of helical hairpins. In the cell, this would allow these proteins to begin folding while being translated.

Science, this issue p. 1150


To understand membrane protein biogenesis, we need to explore folding within a bilayer context. Here, we describe a single-molecule force microscopy technique that monitors the folding of helical membrane proteins in vesicle and bicelle environments. After completely unfolding the protein at high force, we lower the force to initiate folding while transmembrane helices are aligned in a zigzag manner within the bilayer, thereby imposing minimal constraints on folding. We used the approach to characterize the folding pathways of the Escherichia coli rhomboid protease GlpG and the human β2-adrenergic receptor. Despite their evolutionary distance, both proteins fold in a strict N-to-C-terminal fashion, accruing structures in units of helical hairpins. These common features suggest that integral helical membrane proteins have evolved to maximize their fitness with cotranslational folding.

Tens of thousands of mutations associated with diseases are thought to affect membrane protein folding and trafficking (1). The biogenesis of most helix-bundle membrane proteins has been divided conceptually into two stages (2, 3). First, cotranslational insertion of the hydrophobic protein into the membrane occurs through the Sec translocon pathway (4, 5), thereby establishing much of the transmembrane helical structure and initial topology. Second, the protein completes folding to its final tertiary structure. The two stages, however, are not necessarily cleanly separable (6–9). Studying folding mechanisms of membrane proteins by single-molecule force spectroscopy has been challenging and limited mostly to observing unfolding (10–13), because folding intermediates are usually invisible at the lower forces where folding occurs on a practical time scale. Here, we use physicochemical conditions that strongly favor folding, thereby enabling the observation of folding at forces high enough to achieve 1-nm resolution.

To develop an experimental method that can be generally applied to the observation of the folding pathways of polytopic membrane proteins, we built on a single-molecule approach that we have developed using magnetic tweezers (MT) (Fig. 1A) (14, 15). We linked DNA handles to the N- and C-termini of the protein using a SpyTag-SpyCatcher attachment system (16). The handles are in turn attached to a magnetic bead and a polymer-coated glass surface, respectively. The target membrane protein is embedded in bicelles that provide a lipid bilayer–like environment (Fig. 1A). While applying pN- to tens-of-pN–scale force to the magnetic bead, we record the vertical position of the bead relative to a reference bead stuck on the surface (referred to as the extension value) (fig. S1).

Fig. 1 Physicochemical search for refolding conditions of polytopic helical membrane proteins.

(A) Schematic of single-molecule MT folding experiment for a single GlpG protein reconstituted in a bicelle. N, north; S, south; Dig–anti-dig, digoxigenin–anti-digoxigenin; CHAPSO, 3-([3-cholamidopropyl]dimethylammonio)-2-hydroxy-1-propanesulfonate; DMPC, 1,2-dimyristoyl-sn-glycero-3-phosphocholine; H1 to H6, TM helices 1 to 6; bp, base pairs; PEG, polyethylene glycol. (B) FEC of single GlpG proteins averaged over 28 cycles of mechanical stretching and relaxation (black heat map). To show individual unfolding events, representative raw traces are overlaid above 20 pN tension (blue traces). The yellow trace shows the mean extension value in the relaxation phase. Fh,start and Fh,end indicate the force levels at which the coil-to-helix transition starts and ends. Theoretical FECs for the N, Uc, and Uh states are shown as red, light blue, and pink dashed lines, respectively. The upper inset shows a close-up view of the unfolding events. The lower inset shows a close-up view of the FEC between 4- and 8-pN force. PG, DMPG. (C) Average Fh,start and Fh,end values determined under different refolding conditions. n = 34, 42, and 58 FECs for the 0, 10, and 30 mol % DMPG cases. n = 16 for the no bicelle case. n = 6 and 10 for the A206G and L155A single-point GlpG mutant cases. All error bars represent mean ± SD. (D) Refolding probability determined using a simple force-jump experiment (see fig. S3) at different applied force levels (n = 125, 147, and 111 force-jump experiments for the 0, 10, and 30 mol % PG cases, respectively). The inset shows the refolding probability normalized to the 0 mol % PG case. (E) Allan deviation of the magnetic-bead fluctuation at different force levels. The inset is a representative trace showing the Brownian fluctuation of a magnetic bead at different force levels (black trace, raw data at 1.2-kHz sampling; yellow trace, median-filtered data with a 5-Hz window).

We applied this method to study the folding of Escherichia coli rhomboid protease GlpG. Figure 1B shows force-extension curves (FECs) averaged over multiple cycles of mechanical stretching and relaxation for single GlpG proteins. At high force levels above 20 pN, the single GlpGs show cooperative unfolding of six transmembrane (TM) helices to an unstructured polypeptide (referred to as the unfolded coil or Uc state) while exhibiting two unfolding intermediates, as we described previously (Fig. 1B, upper inset) (14, 15). When relaxing the mechanical tension, we detected a gradual transition in the FEC from the theoretical curve for Uc to a more compact state that is dependent on the presence of bicelles (Fig. 1, B and C, and fig. S2). We designate the new state as unfolded helical (Uh) because it fits the FEC expected for a state in which all α-helical structures are restored for the TM helices and linkers but the protein remains fully stretched along the pulling direction (Fig. 1B). Below the Uc-to-Uh transition, the FEC shifts to a yet more compact state (Fig. 1B, bottom inset), which we call the Uz state (unfolded zigzag state). As discussed below, the Uz state appears to consist of bilayer-inserted, but weakly interacting, TM helices arranged in a zigzag-like fashion. Finally, at low forces, GlpG finds its folded, native conformation (referred to as the N state) (Fig. 1B). Because formation of the native state could only be achieved at low tension below 2 pN, Brownian motions of the magnetic bead preclude observation of any detailed intermediates during the refolding process under these conditions.
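
A theoretical FEC for an unstructured polypeptide such as the Uc state is commonly approximated with a worm-like chain. The sketch below is not taken from the methods of this work; it uses the Marko-Siggia interpolation formula, and the persistence length (0.4 nm), contour length per residue (0.36 nm), residue count (200), and thermal energy (4.11 pN nm) are illustrative assumptions only.

```python
KBT_PN_NM = 4.11  # assumed thermal energy near room temperature, in pN*nm


def wlc_force(extension_nm, contour_nm, persistence_nm, kbt=KBT_PN_NM):
    """Marko-Siggia interpolation: force (pN) holding a worm-like chain at a given extension."""
    r = extension_nm / contour_nm
    if not 0.0 <= r < 1.0:
        raise ValueError("extension must lie in [0, contour length)")
    return (kbt / persistence_nm) * (0.25 / (1.0 - r) ** 2 - 0.25 + r)


def wlc_extension(force_pn, contour_nm, persistence_nm, kbt=KBT_PN_NM):
    """Invert wlc_force by bisection; force rises monotonically with extension."""
    lo, hi = 0.0, contour_nm * (1.0 - 1e-9)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, contour_nm, persistence_nm, kbt) < force_pn:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical unfolded chain: 200 residues at 0.36 nm per residue (assumed values).
N_RESIDUES = 200
contour = N_RESIDUES * 0.36
# (force in pN, extension in nm) pairs tracing a model force-extension curve.
fec = [(f, wlc_extension(f, contour, 0.4)) for f in (2.0, 5.0, 10.0, 20.0)]
```

Comparing a measured extension against such model curves for the coil and helical states is one way to judge which state a trace follows at a given force.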

In hopes of seeing folding at higher forces where extension measurements can be more precise, we screened for more favorable folding conditions using a simple force-jump experiment (fig. S3) (10, 17, 18). We found that addition of 1,2-dimyristoyl-sn-glycero-3-phosphorylglycerol (DMPG) is effective in enhancing GlpG refolding. When we add 30 mole % (mol %) DMPG lipids in the bicelle phase, the refolding probability after a waiting time of 200 s at 5-pN tension increases by a factor of seven (Fig. 1D). The FECs obtained with and without 30 mol % DMPG lipids almost exactly overlap with one another, preserving the coil-to-helix transition as well as the formation of the Uz state (figs. S2 and S4). These observations suggest that the addition of negatively charged lipids does not fundamentally alter the folding pathway but selectively enhances refolding commencing from the Uz state.

With the ability to observe folding at higher forces, we tested the potential for achieving high resolution by examining the Brownian motion of magnetic beads in the bicelle phase. With high-speed tracking at 1.2 kHz and the force above 5 pN, we obtained an Allan deviation (i.e., uncertainty in our tracking) of less than 1 nm when median filtered at 5 Hz (200 ms), corresponding to a resolution of a few amino acids (Fig. 1E). Also, we observed folding with a reasonable probability up to 8-pN tension, the force at which the Uz state starts to form (Fig. 1D).
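
As a minimal sketch of how an Allan deviation can be computed from a bead-position trace, the block below uses the simple non-overlapping estimator. The synthetic trace (white noise with an assumed 5-nm standard deviation) stands in for real tracking data and is not from the paper; only the 1.2-kHz sampling rate and the 200-ms averaging window come from the text.

```python
import math
import random


def allan_deviation(trace, m):
    """Non-overlapping Allan deviation of `trace` for averaging bins of m samples."""
    n_bins = len(trace) // m
    if n_bins < 2:
        raise ValueError("need at least two full bins")
    # Mean position in each consecutive bin of m samples.
    means = [sum(trace[i * m:(i + 1) * m]) / m for i in range(n_bins)]
    # Half the mean squared difference between neighboring bin means.
    sq = [(means[i + 1] - means[i]) ** 2 for i in range(n_bins - 1)]
    return math.sqrt(0.5 * sum(sq) / len(sq))


SAMPLING_HZ = 1200                     # tracking rate quoted in the text
WINDOW_S = 0.2                         # 200-ms window, i.e. 5 Hz
m_200ms = int(SAMPLING_HZ * WINDOW_S)  # 240 samples per bin

# Synthetic stand-in for bead fluctuations: 20 s of white noise, 5-nm SD (assumed).
rng = random.Random(0)
trace = [rng.gauss(0.0, 5.0) for _ in range(20 * SAMPLING_HZ)]

adev_raw = allan_deviation(trace, 1)
adev_200ms = allan_deviation(trace, m_200ms)
```

For uncorrelated noise the deviation falls roughly as the square root of the number of samples averaged, which is why averaging over 200 ms can bring a noisy raw trace below 1 nm; real bead motion is correlated at short times, so the actual gain depends on the force-dependent relaxation time of the bead.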

On the basis of these observations, we developed a force-application protocol to monitor the folding process of single GlpG proteins (Fig. 2). We first induced full unraveling of GlpG to the Uc state by applying a high mechanical tension above 20 pN and then made a force jump to a low-force level between 5 and 8 pN (Fig. 2A). The force jump takes a finite time of ~300 ms, during which single GlpGs relax to the Uz state (Fig. 2B, right). We experimentally confirmed that the force jump indeed reaches the same extension state as that reached through slow gradual force relaxation at −1 pN s⁻¹ (Fig. 2B, left). When maintained at the low-force level, the magnetic bead begins to show complex up-and-down movements, finally culminating in a compact N state (Fig. 2A, 5-pN phases). Achievement of the N state is verified by observing the extension expected for the N state after jumping the tension back to ~20 pN. By repeating our designed mechanical cycle, we can observe the folding of single GlpG molecules multiple times.

Fig. 2 Direct observation of single GlpG folding.

(A) Designed mechanical cycle for inducing refolding of single GlpG proteins. The gray and black traces are 1.2-kHz raw data and 5-Hz median-filtered data, respectively. (B) Representative time-resolved traces comparing the extensions following slow force relaxation and force jump. The lower inset shows the extension difference (ΔzUz) at the indicated force levels. The right inset shows a close-up view of the trace of the force jump that takes ~300 ms. (C) Representative folding traces for wild-type (WT), A206G, and L155A GlpG under 6-pN tension. The insets to the right show close-up views of the traces exhibiting reversible transitions among the Uz, I1, and I2 states. (D) BIC values for the indicated number of states. n = 20, 21, and 11 low-force folding-unfolding traces for the WT, A206G, and L155A cases, respectively. (E) Positions of the intermediate states in the normalized extension space [at 6 pN, n is same as in (D); at 5 pN, n = 21, 24, and 14 low-force folding-unfolding traces for WT, A206G, and L155A, respectively]. Error bars represent SEM. (F) Transition kinetics between the neighboring states at the indicated force levels. In the N-to-I2 transition, both slow (inset, black) and fast (red) rates are displayed. Error bars represent SEM.

With the reduced Brownian motion of the magnetic bead in the 5- to 8-pN force window, we can see distinct conformational changes during refolding when the extension traces are median filtered down to 5 Hz (Fig. 2C, black traces). Application of hidden Markov modeling (HMM) and Bayesian information criteria (BIC) to the time-resolved extension traces indicates that the data are best fit by a total of four states: two intermediate states (referred to as I1 and I2) in addition to the Uz and N states (Fig. 2, C to E, and fig. S5) (19).
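
The model-selection step can be illustrated with the standard BIC formula, BIC = k ln(n) − 2 ln(L), where the number of states with the lowest BIC is preferred. The sketch below assumes Gaussian emissions for the parameter count, and the log-likelihood values are invented placeholders, not results from this work; in practice they would come from fitting an HMM with each candidate number of states to the extension trace.

```python
import math


def hmm_param_count(n_states):
    """Free parameters of an HMM with Gaussian emissions (an assumed model form)."""
    transitions = n_states * (n_states - 1)  # each row of the transition matrix sums to 1
    initial = n_states - 1                   # initial-state probabilities sum to 1
    emissions = 2 * n_states                 # one mean and one variance per state
    return transitions + initial + emissions


def bic(log_likelihood, n_params, n_points):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_points) - 2.0 * log_likelihood


# Placeholder log-likelihoods for 2 to 6 states (illustrative numbers only).
loglik = {2: -5200.0, 3: -5050.0, 4: -4980.0, 5: -4975.0, 6: -4972.0}
n_points = 1000  # hypothetical number of samples in the filtered trace

scores = {k: bic(ll, hmm_param_count(k), n_points) for k, ll in loglik.items()}
best_n_states = min(scores, key=scores.get)
```

With these placeholder values the likelihood gain beyond four states is too small to offset the penalty for the extra parameters, so four states are selected.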

The magnetic beads show many upward (i.e., local unfolding) and downward movements (i.e., local folding) before reaching the native N state extension. Thus, the resultant time-resolved traces report reversible intermediate folding-unfolding events; these are equilibrium processes from which we can directly reconstruct the folding energy landscape. The local folding and unfolding processes pass through the same I1 and I2 intermediate states, justifying a one-dimensional representation of the energy landscape (20, 21). Our HMM analysis indicates that although rates connecting non-neighboring states are negligible, the transitions connecting the neighboring states are well described by single rates falling in a narrow region between 0.1 and 10 s⁻¹ (Fig. 2F and fig. S6). An exception is the N-to-I2 transition that shows two different rates. One N-to-I2 transition is relatively fast, with an average dwell time in N of only ~5 s, indicating incomplete refolding (Fig. 2F, red symbols, and fig. S6). The other subset of the N states has higher stability, requiring higher forces (~8 pN) to show unfolding within our observation time, and presumably corresponds to a correctly folded state (Fig. 2F, inset, and fig. S7).
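
To make the link between dwell times and rates concrete, the following sketch extracts completed dwell times from a sampled state path and converts them to an exit rate, assuming single-exponential dwell statistics (for which the maximum-likelihood rate is the inverse of the mean dwell time). The state path is a toy example, and the function names are hypothetical; the 0.2-s sample interval corresponds to the 5-Hz filtered data.

```python
def dwell_times(path, state, dt_s):
    """Durations of completed visits to `state` in a uniformly sampled state path."""
    dwells, run = [], 0
    for s in path:
        if s == state:
            run += 1
        elif run:
            dwells.append(run * dt_s)
            run = 0
    # A visit still in progress at the end of the path is censored and dropped.
    return dwells


def rate_from_dwells(dwells_s):
    """Maximum-likelihood exit rate for exponentially distributed dwell times."""
    if not dwells_s:
        raise ValueError("no completed dwells")
    return len(dwells_s) / sum(dwells_s)


# Toy state path (0 and 1 label two neighboring states), sampled every 0.2 s.
toy_path = [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
toy_rate = rate_from_dwells(dwell_times(toy_path, 1, 0.2))

# An average dwell of ~5 s, as quoted for the less stable N state, gives ~0.2 per second.
k_fast_n_exit = 1.0 / 5.0
```

This simple estimator pools all exits from a state; separating exits by destination, or correcting for missed short events, would require the full transition matrix from the HMM fit.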

We next examined whether we could extend our MT folding experiment to true bilayers by reconstituting GlpG in vesicles produced through slow detergent removal (Fig. 3A and fig. S8). For the vesicle-reconstituted GlpGs, we observed cooperative unraveling of six TM helices to the Uc state, similar to what is observed for the bicelle-reconstituted GlpGs (Fig. 3B, first stretching cycle). We also observed the expected coil-to-helix transition during the relaxation phase (Fig. 3C) as well as complete refolding, albeit with low probability (Fig. 3B, sixth stretching cycle). Because there are no free vesicles, these observations suggest that single GlpGs remain bound to the vesicle membranes after their unraveling to unstructured polypeptides. Nevertheless, many of the vesicle-reconstituted GlpGs fail to refold (e.g., Fig. 3B, second and fifth stretching cycles). The refolding probability at 1 pN is only ~15% with a 200-s wait time, in contrast with a refolding probability approaching 100% seen for the bicelle-reconstituted GlpGs. We also found that the FECs of vesicle-reconstituted GlpGs persistently follow the Uh curve and fail to form the loosely stretched Uz state (Fig. 3C, lower inset).

Fig. 3 Characterization of folding properties for vesicle- and bicelle-reconstituted single GlpGs.

(A) Schematic of single-molecule MT folding experiments for GlpGs reconstituted in vesicle membranes. (B) Representative FECs showing successive stretching cycles applied to a single GlpG protein in a vesicle membrane. (C) FEC of vesicle-reconstituted single GlpG proteins averaged over five cycles of mechanical stretching and relaxation (black heat map). To show individual unfolding events, representative raw traces during stretching are overlaid (blue traces). Other definitions are the same as in Fig. 1B. The upper inset shows the average Fh,start and Fh,end values. The lower inset shows a close-up view of the FEC between 2- and 5-pN force. (D) Refolding probability determined under different membrane conditions. n is the number of trials; Δt is the waiting time at 1-pN force. (E) Representative traces of the folding protocol that directly induces the Uz state. The inset shows the extension changes (ΔzUnfolding) during unfolding at 8 pN under indicated membrane conditions. The pink dashed line is the expected extension change when reaching the Uh state. (F) Normalized transition rates determined for the A206G and L155A single-point mutants relative to the WT case (dashed line). The illustrations to the right show an anticipated conformational status of GlpG in each indicated state. Error bars represent SEM. (G) Representative force-jump experiments applied for the intermediate states. Each inset shows the distribution of extension values recorded during high-force unfolding. Estimated extensions for individual states are shown as the dashed lines. (H) Folding energy landscapes of a single GlpG protein along the molecular extension reconstructed on the basis of the Bell-Zhurkov (black trace) and the Dudko-Hummer-Szabo models (green trace). The inset shows detailed structural segments in the folding pathways of GlpG. Each structural segment is indicated by a different color. Black-colored amino acids correspond to the boundaries of the intermediate states.
Faint colors around the boundaries represent the measurement errors (SD). Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.

We hypothesized that polypeptide insertion into bilayers may be more difficult in vesicles compared with bicelles, which would explain the barrier to folding and the block to formation of the Uz state if the Uz state was indeed membrane inserted. To explore this possibility, we tested whether decreasing vesicle size would enhance folding, because increased bilayer curvature may allow more facile insertion of the TM helices into the membrane (22). Indeed, when we decreased the diameter of reconstitution vesicles to 100 nm by extrusion, the refolding probability at 1 pN increased to ~60% (Fig. 3D and fig. S8).

We next attempted to avoid membrane extraction and directly access the Uz state by applying a moderate force of 8 pN, a tension at which the Uz state was seen to form in FEC (referred to as the direct-Uz protocol). We tested the feasibility of this protocol first with the bicelle-reconstituted GlpGs and found that application of 8 pN indeed directly induces the Uz state (Fig. 3E, left). Subsequent lowering of the force leads to complete refolding in bicelles (fig. S9). When we applied 8 pN to the vesicle-reconstituted GlpGs, the resultant unfolding step was almost identical to that expected for the Uz state (Fig. 3E, right and inset). When we subsequently induced refolding by lowering the force to 1 pN, the refolding probability in vesicles increased relative to refolding from the Uc state (~50 versus ~15%) (Fig. 3, D and E), consistent with the possibility that this 8-pN unfolding selectively disrupts the tertiary structure while decreasing exposure of the TM helices to the outside of the lipid bilayer, thereby reducing the need to reinsert TM helices during refolding.

Our analysis of the end-to-end distance of the Uz state further suggests that the penetration depths of TM helices in the Uz state might not be enough to completely reach the other side of the lipid bilayer (Fig. 3F, right, and fig. S10). If so, tertiary structure formation would be intimately coupled with the membrane insertion (Fig. 3F, right; compare I2 and N at 6 pN). To determine whether membrane insertion or tertiary structure formation dominates the energy barriers, we examined the local folding-unfolding kinetics of the two single-point GlpG mutants in the bicelle phase (23). The N-terminal [L155A (Leu155→Ala) on TM helix 2] and C-terminal [A206G (Ala206→Gly) on TM helix 4] mutants selectively slow down the I1-to-I2 and the I2-to-N transitions, respectively (Fig. 3F, left, and fig. S11). These observations suggest that the tertiary structure formation makes a major contribution to the observed energy barriers. Moreover, these mutant data are consistent with GlpG folding occurring in a unidirectional manner from the N to C terminus.

To map the partially folded structures in I1 and I2, we made force jumps to ~20 pN while the protein sampled either I1 or I2 in the course of refolding (Fig. 3G). Surprisingly, the extensions after the force jump exactly coincide with the two intermediates of the high-force unfolding (Figs. 3G and 1B, upper inset), which indicates that the low-force folding-unfolding and the high-force unfolding intermediates share the same partially folded structures, albeit with different levels of stretching in the unfolded regions (Fig. 3F, right; compare I2 at 6 pN with I2 at 22 pN). We therefore used the extension difference between I2 and N at 22 pN to estimate that I2 is positioned at the C terminus of TM helix 4 (Fig. 3H, inset). Likewise, we used the extension difference between Uc and I1 at 22 pN to estimate that I1 is positioned after TM helix 2. Combined with the mutant data above, we conclude that GlpG folds in an N-to-C-terminal direction, largely in units of helical hairpins.

On the basis of the structural assignments made above, we examined one more GlpG mutant in which two hydrophobic residues in the long linker region between TM helices 1 and 2 are mutated to negatively charged residues [L121E/F133E (Leu121→Glu/Phe133→Glu)] (Fig. 3H, inset, and fig. S12B). Although such mutations reportedly increase the energy barrier for membrane insertion and flip-flop (24, 25), we did not detect any sign of slowing down in the transition between Uz and I1 (fig. S12). These data again support our conclusion that the TM helices have made their initial membrane integration as the Uz state forms. In particular, because of the many polar and charged residues in the long linker, we suspect that TM helices 1 and 2 of GlpG are inserted more deeply than other TM helices in the zigzag-aligned Uz state.

Finally, we constructed one-dimensional energy landscapes for the reversible folding-unfolding process of single GlpG proteins using the Bell (26) and the Dudko-Hummer-Szabo models (27). Both models indicate a free-energy difference (ΔG0) of 15.2 kBT (where kB is the Boltzmann constant and T is temperature) between the native N state and the zigzag Uz state, albeit with slightly different intermediate positions (Fig. 3H). As expected, this ΔG0 value is slightly larger than the estimates from previous ensemble measurements under less favorable folding conditions (7.1 to 13.9 kBT) (23, 28, 29). When we apply the Crooks fluctuation theorem to the FECs as shown in Fig. 1B, we obtain an estimate of ~115 kBT for a free-energy difference between the N and the Uc states (fig. S13 and table S1), almost eightfold larger than 15.2 kBT estimated between the N and Uz states. We attribute this larger free-energy difference to additional processes imposed on the high-force unfolding, such as pulling TM helices out of the membrane and disruption of secondary structures. These observations attest to the fundamental difference between the energy barriers seen during high-force unfolding and the low-force folding-unfolding processes. At lower forces, we can explore rearrangements of intact TM helices that occur largely within the lipid bilayer, more closely reflecting the process expected for second-stage folding (2).
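
The Bell model underlying one of these landscapes relates a transition rate to the applied force through k(F) = k0 exp(F Δx / kBT), where Δx is the signed distance to the barrier along the pulling coordinate. The sketch below shows this relation for a single folding-unfolding step and how a rate ratio translates into a free-energy difference in units of kBT. All rate constants and distances are hypothetical, and kBT = 4.11 pN nm is an assumed room-temperature value; only the 115 and 15.2 kBT figures are taken from the text.

```python
import math

KBT_PN_NM = 4.11  # assumed thermal energy near room temperature, in pN*nm


def bell_rate(k0_per_s, force_pn, dx_nm, kbt=KBT_PN_NM):
    """Bell model rate under force.

    dx_nm > 0 for a transition that lengthens the molecule (unfolding),
    dx_nm < 0 for one that shortens it against the load (folding).
    """
    return k0_per_s * math.exp(force_pn * dx_nm / kbt)


def stability_kbt(k_fold_per_s, k_unfold_per_s):
    """Free energy (in kBT) by which the more folded state is favored, from the rate ratio."""
    return math.log(k_fold_per_s / k_unfold_per_s)


# Hypothetical zero-force rates and barrier distances for one step.
K0_FOLD, DX_FOLD = 50.0, -3.0     # per second, nm
K0_UNFOLD, DX_UNFOLD = 0.01, 2.0  # per second, nm

dg_zero_force = stability_kbt(K0_FOLD, K0_UNFOLD)
dg_at_6pn = stability_kbt(bell_rate(K0_FOLD, 6.0, DX_FOLD),
                          bell_rate(K0_UNFOLD, 6.0, DX_UNFOLD))

# Ratio of the two free-energy differences quoted in the text (N-Uc versus N-Uz).
ratio_uc_to_uz = 115.0 / 15.2
```

In this picture the applied force tilts the landscape by F times the total extension change of the step, so rates measured at 5 to 8 pN must be extrapolated to zero force before per-step free energies are summed along the pathway.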

Using the experimental methods established with GlpG, we next sought to observe the folding process of a single human β2-adrenergic receptor (β2AR), which belongs to the G protein–coupled receptor family (Fig. 4A). We first examined the FEC and again observed a large mechanical hysteresis in the unfolding and refolding of β2AR (Fig. 4B). Because β2AR has an odd number of TM helices, the DNA handles are pulling on opposite sides of the bilayer, and we note the possibility that after the cooperative unraveling, some part of β2AR may reside within the lipid bilayer (most likely the first TM helix; see fig. S14). During the relaxation phase, we observed the coil-to-helix transition in nearly the same force range as that observed for GlpG. Moreover, below 8 pN, the FEC of β2AR became shorter than the Uh extension, consistent with the formation of a zigzag-aligned Uz state (Fig. 4B and fig. S15).

Fig. 4 Direct observation of the complete folding pathway of human β2AR.

(A) Schematic diagram of the single-molecule MT folding experiment for β2AR. (B) FEC of single β2AR proteins averaged over six cycles of mechanical stretching and relaxation (black heat map). To show individual unfolding events, representative raw traces during stretching are overlaid (blue traces). Other definitions are the same as in Fig. 1B. The inset shows a close-up view of the FEC between 4- and 6-pN force. (C) Designed mechanical cycle for inducing refolding of human β2AR at low-force levels. (D) Representative time-resolved traces for β2AR folding under 5-pN tension (with 2 mM TCEP). Right insets show close-up views of trajectories. Red traces show the transitions between four intermediates identified by the HMM. (E) BIC values for the indicated numbers of states (n = 18 low-force folding-unfolding traces). (F) Transition kinetics between the neighboring states at 5 pN. For the N-to-If4 transition, both slow (inset, black) and fast (red) rates are displayed. Error bars represent SEM. (G and H) Representative traces for the force-jump experiments applied to individual folding intermediates (G) and the native state (H). Each inset shows an extension distribution during high-force unfolding. (I and J) Extension distribution during high-force unfolding initiated from the native N state (n = 29 and 13 low-force folding-unfolding traces for the cases with 2 mM TCEP and without TCEP, respectively). The peaks indicate the fit centers of multiple Gaussian functions (colored for each function). The upper insets show structural diagrams of β2AR to guide mapping onto the structure. (K) Representative β2AR folding trace at 5 pN with no TCEP (n = 10 low-force folding-unfolding traces). HMM analysis finds three intermediate states (If1′, If2′, and If3′). (L) Normalized extensions for β2AR folding intermediates in the absence of TCEP. Dashed lines are anticipated extensions for the intermediates in the presence of 2 mM TCEP. Error bars indicate SD. 
(M) Representative folding traces for β2AR at 5 pN in the presence of 2.5 μM carazolol (with 2 mM TCEP). The right inset shows the structure of carazolol-bound human β2AR. The dashed yellow circle indicates interaction regions between carazolol and β2AR. (N) Normalized rates determined for carazolol-bound β2AR relative to the apo β2AR case (dashed line). (O) Representative FECs of β2AR showing high-force cooperative unfolding in the presence (orange) and absence (black) of 2.5 μM carazolol (with 2 mM TCEP). N and Uc are defined the same way as in (B). The inset shows distributions of unfolding forces with (orange, n = 26 FECs) and without (black, n = 35 FECs) carazolol. Error bars indicate SD. (P) Detailed structural segments in the folding pathways of β2AR. Each structural segment is indicated by a different color. Other notations are the same as those for the inset in Fig. 3H.

To observe the folding process of human β2AR, we used the original folding protocol starting from the Uc state (Fig. 4C). We first induced mechanical unraveling of a single β2AR to the Uc state by applying 25-pN tension and then induced the Uz state through force quenching to 5 pN. We reconfirmed that the force quenching within 300 ms yielded the same Uz state as that obtained through slow force relaxation (fig. S16). With the mechanical tension kept at 5 pN, the magnetic bead showed complex up-and-down movements, ending in a compact N state (confirmed to be the native state through reapplication of high force) (Fig. 4C).

By applying the HMM and BIC analyses to the time-resolved extension traces of the magnetic beads, we identified six major states (thus four intermediates) in the folding process of human β2AR (Fig. 4, D and E). Both local folding and unfolding processes share these four intermediate states (referred to as If1, If2, If3, and If4), indicative of the one-dimensionality of the folding energy landscape. The transition rates connecting the neighboring states fall between 10⁻¹ and 10 s⁻¹, whereas all other rates are negligibly small (Fig. 4F and fig. S17). As was the case for GlpG, we observed two groups of the N states: one with a lower stability (Fig. 4F, red symbol) and the other reflecting a correctly folded structure (Fig. 4F, inset).

To measure the number of amino acids unfolded in the structures of the four intermediates, we applied the force-jump technique to each intermediate observed during the low-force folding-unfolding processes. We found that all four intermediates correspond to distinct extension states at 25 pN, reflecting a direct connection between the low-force folding-unfolding and the high-force unfolding intermediates (Fig. 4G). We thus sought to use the extension states of the high-force unfolding intermediates to infer the partially folded structures in individual low-force intermediates. The distribution of intermediate extension states during high-force unfolding clearly revealed a total of nine peaks (Fig. 4, H and I).

To map the unfolded structures on the high-force unfolding intermediates, we took advantage of the fact that in the native structure of human β2AR, there is one conserved disulfide bond formed between Cys106 (C106) and Cys191 (C191), which locks TM helices 3 and 4 and the extracellular linker 2 (ECL2) into one structural unit (Fig. 4, I and J, upper insets). We reasoned that with removal of the reducing agent tris(2-carboxyethyl)phosphine (TCEP), the high-force unfolding intermediates related to the region linked by the disulfide bond might disappear from the extension distribution. Indeed, in the absence of TCEP, the first five peaks (peaks 1 to 5) and the last two peaks (peaks 9 and Uc) are essentially preserved, but the three peaks in the middle (peaks 6 to 8) selectively disappear (Fig. 4, I and J). We further note that in the absence of TCEP, the extension-change spanning the first five peaks is distinctly larger than that spanning the last two peaks by a factor of 1.98. This value closely matches the 2.02 ratio of the number of amino acids placed C-terminal to the disulfide bond [144 amino acid residues (aa)] to that placed N-terminal to the disulfide bond (71 aa) (Fig. 4I). Thus, our observations point to a hypothesis that the first five peaks correspond to unfolding from the C terminus to TM helix 5. In the absence of TCEP, unfolding from ECL2 to TM helix 3 is prevented by the disulfide bond, so that the last two peaks correspond to unfolding of the two N-terminal helices and ECL1. By aligning the unfolding traces in Fig. 4, G and H, with one another, we find that folding intermediates If1, If2, If3, and If4 correspond to the high-force unfolding intermediate peaks 9, 8, 5, and 2, respectively (Fig. 4I and fig. S18). Together, our data suggest that the human β2AR shows unidirectional folding from the N to the C terminus.
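
The ratio argument above can be checked with a few lines of arithmetic. The sketch assumes that, at a fixed force, the extension change scales linearly with the number of residues released, so that the ratio of residue counts predicts the ratio of extension changes; the variable names are illustrative.

```python
# Residue counts quoted in the text for the segments flanking the disulfide bond.
RESIDUES_C_TERMINAL = 144
RESIDUES_N_TERMINAL = 71

# Measured ratio of extension changes (first five peaks versus last two peaks).
OBSERVED_EXTENSION_RATIO = 1.98

# Under the linear-scaling assumption, the expected ratio is the ratio of residue counts.
expected_ratio = RESIDUES_C_TERMINAL / RESIDUES_N_TERMINAL
relative_difference = abs(OBSERVED_EXTENSION_RATIO - expected_ratio) / expected_ratio
```

The residue counts give an expected ratio of about 2.03, and the measured 1.98 lies within roughly 3% of it, which is the basis for assigning the first five peaks to the C-terminal side of the disulfide bond.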

To test the validity of our structural assignment, we monitored the folding process at 5 pN in the absence of TCEP. The HMM and BIC analyses indicate a reduction in the number of folding intermediates to three (Fig. 4K and fig. S19). The positions of these three folding intermediates (If1′, If2′, and If3′) matched well with those expected when If2 and If3 are merged (Fig. 4L), reaffirming that the transition from If2 to If3 corresponds to folding of TM helices 3 and 4 and ECL2.

We also examined the effect of carazolol, a partial inverse agonist of human β2ARs, on the 5-pN folding process. Although the presence of 2.5 μM carazolol did not change the positions of the four intermediates in the extension space, it markedly inhibited any transition beyond If3 (Fig. 4M and fig. S19). This inhibition was highly selective because the transition Uz to If3 remained minimally affected (Fig. 4N), suggesting that single human β2ARs fold normally up to ECL2 but fail to fold TM helices 5 and 6 onto the growing structure in the presence of carazolol. When we examined unfolding by force ramping, carazolol increased the forces at which unfolding occurred by 4.5 pN on average (Fig. 4O, inset), indicating that additional work of more than 50 kBT is required to induce the unraveling of single β2ARs in the presence of carazolol (Fig. 4O, shaded area). Thus, our observations suggest distinct effects of carazolol on human β2AR folding and unfolding. Carazolol inhibits the addition of TM helices 5 and 6 during folding, perhaps being loosely located in the incomplete ligand binding pocket formed by TM helices 1 to 4 and sterically interfering with incoming TM helices 5 and 6. In the presence of excess carazolol, it is also possible that carazolol is already bound to TM helices 5 and 6, because carazolol makes an extended aromatic network with the residues on these helices. However, once the receptor is folded, carazolol binding dramatically stabilizes the tertiary structure, as expected (30, 31).

The identified folding pathway of the human β2AR reveals several interesting features (Fig. 4P). The first intermediate If1 corresponds to an association between the first TM helix and the following linker helix, completing insertion of this nascent structure with respect to the residing membrane structure. The second TM helix folds onto this structure to form the first helical hairpin, completing intermediate If2. The next folding step involves the addition of TM helices 3 and 4 as well as ECL2 (forming If3). We note that the positions of If2 and If3 closely map to the cysteine residues of C106 and C191, thereby potentially consolidating the formed tertiary structure by means of disulfide bonding. The transition from If3 to If4 involves formation of the helical hairpin consisting of the TM helices 5 and 6. This folding step is found to be markedly inhibited in the presence of carazolol. The last step from If4 to N involves addition of the TM helix 7 and the C-terminal membrane-associated helix onto the structure, completing the known structure of human β2AR (30, 31). Although our experimental data consistently support the folding pathway delineated above, we cannot rule out the possibility that an alternative folding pathway exists in the physiological milieu. We also note that the folding pathway presented here is a coarse-grained one down to a 5-Hz sampling rate. Enhancing the bandwidth of our methods would reveal a more complex and dynamic nature of the polytopic membrane protein folding (12). Finally, we note the possibility that the strategy of using disulfide bonds to map the four folding intermediates of β2AR can be extended to other membrane proteins.

Although E. coli GlpG and human β2AR are at an enormous evolutionary distance, both integral membrane proteins accrue structure largely in units of helical hairpins, with a unidirectional N-to-C-terminal folding as a single, predominant pathway out of a countless number of permutations in the possible folding pathways. Unidirectional N-to-C-terminal folding is consistent with several prior studies (28, 32–36) and would permit the nascent N-terminal chain to commence folding without needing to wait for the more C-terminal TM helices to be translated, thereby reducing the risk of generating misfolded structures. Thus, the folding processes of integral membrane proteins may be evolutionarily selected and tailored to fit with cotranslational folding.

Supplementary Materials

Materials and Methods

Figs. S1 to S19

Tables S1 to S3

References (38–66)

References and Notes

Acknowledgments: Funding: This work was supported by the National Creative Research Initiative Program (Center for Single-Molecule Systems Biology to T.-Y.Y.; NRF-2011-0018352), the Bio Medical Technology Development Program (NRF-2018M3A9E2023523 to T.-Y.Y.), the Basic Science Research Program (NRF-2016R1A6A3A03007871 to D.M.), and an NRF grant (NRF-2016R1A2B4013488 to H.-J.C.), all funded by the National Research Foundation of South Korea. This work was also supported by a National Institutes of Health grant (R01GM063919 to J.U.B.). Author contributions: T.-Y.Y. conceived of the project. H.-K.C., H.-J.C., J.U.B., and T.-Y.Y. designed the experiments. H.-K.C. and H.C.K. performed magnetic tweezers experiments. H.-K.C., S.-H.R., H.J., and T.-Y.Y. analyzed the magnetic tweezers data. M.J.S. built the high-speed magnetic tweezers setup. D.M. and J.U.B. expressed and purified GlpG proteins. H.K. and H.-J.C. expressed and purified β2AR proteins. H.-K.C., D.M., J.U.B., and T.-Y.Y. wrote the manuscript with input from other authors. Competing interests: The authors declare no competing interests. Data and materials availability: All data that support the findings of our study are available in the manuscript or supplementary materials. A program written in LabVIEW has been deposited on GitHub and is available at Zenodo (37). It controls the entire magnetic tweezers setup, which includes a high-speed complementary metal-oxide semiconductor (CMOS) camera, translation and rotation motors, and a piezo lens positioner, and it allows tracking of the three-dimensional positions of magnetic beads at a frequency of up to 1.2 kHz.
