Surface Sites for Engineering Allosteric Control in Proteins

Science  17 Oct 2008:
Vol. 322, Issue 5900, pp. 438-442
DOI: 10.1126/science.1159052


Statistical analyses of protein families reveal networks of coevolving amino acids that functionally link distantly positioned functional surfaces. Such linkages suggest a concept for engineering allosteric control into proteins: The intramolecular networks of two proteins could be joined across their surface sites such that the activity of one protein might control the activity of the other. We tested this idea by creating PAS-DHFR, a designed chimeric protein that connects a light-sensing signaling domain from a plant member of the Per/Arnt/Sim (PAS) family of proteins with Escherichia coli dihydrofolate reductase (DHFR). With no optimization, PAS-DHFR exhibited light-dependent catalytic activity that depended on the site of connection and on known signaling mechanisms in both proteins. PAS-DHFR serves as a proof of concept for engineering regulatory activities into proteins through interface design at conserved allosteric sites.

Proteins typically adopt well-packed three-dimensional structures in which amino acids are engaged in a dense network of contacts (1, 2). This emphasizes the energetic importance of local interactions, but protein function also depends on nonlocal, long-range communication between amino acids. For example, information transmission between distant functional surfaces on signaling proteins (3), the distributed dynamics of amino acids involved in enzyme catalysis (4–6), and allosteric regulation in various proteins (7) all represent manifestations of nonlocal interactions between residues. To the extent that these features contribute to defining biological properties of protein lineages, we expect that the underlying mechanisms represent conserved rather than idiosyncratic features in protein families.

On the basis of this conjecture, methods such as statistical coupling analysis (SCA) quantitatively examine the long-term correlated evolution of amino acids in a protein family, which is the statistical signature of functional constraints arising from conserved communication between positions (8, 9). This approach has identified sparse but physically connected networks of coevolving amino acids in the core of proteins (8–12). The connectivity of these networks is remarkable, given that only a small fraction of total residues are involved and that no tertiary structural information is used in their identification. Empirical observation in several protein families shows that these networks connect the main functional site with distantly positioned secondary sites, enabling predictions of allosteric surfaces at which binding of regulatory molecules (or covalent modifications) might control protein function. Both literature studies and forward experimentation in specific model systems confirm these predictions (8–12). Thus, techniques such as SCA may provide a general tool for computational prediction of conserved allosteric surfaces.
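
The idea of detecting correlated evolution between alignment positions can be illustrated with a minimal sketch. The code below is not SCA itself: it uses plain mutual information between two alignment columns as a simpler stand-in for a coevolution score, and the four-sequence alignment and the function name `mutual_information` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences. This is a
    simplified proxy for coevolution, NOT the SCA statistic used in the paper.
    """
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)             # residue counts at column i
    count_j = Counter(seq[j] for seq in alignment)             # residue counts at column j
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)  # joint counts
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (count_i[a] * count_j[b]))
    return mi

# Hypothetical toy alignment: columns 0 and 1 covary perfectly,
# column 2 is invariant, column 3 varies independently of column 0.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]

print(mutual_information(toy_alignment, 0, 1))  # covarying pair
print(mutual_information(toy_alignment, 0, 3))  # independent pair
```

In this toy case the covarying pair scores 1 bit and the independent pair scores 0. Real analyses work on alignments of hundreds to thousands of sequences and must also account for conservation and phylogenetic bias, which this sketch ignores.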

The finding that certain surface sites might be statistical “hotspots” for functional interaction with active sites suggests an idea for engineering new regulatory mechanisms into proteins. What if two proteins were joined at surface sites such that their statistically correlated networks were juxtaposed and could form functional interactions (Fig. 1A)? If the connection sites are functionally linked to their respective active sites through allosteric mechanisms intrinsic to each protein, it should be possible to couple the activity of one protein to that of the other.

Fig. 1.

A design concept for allosteric communication. (A) The SCA computational method identifies networks of statistically interacting amino acids in proteins (blue arrows). These networks often link primary functional sites (yellow) with distant surface positions (red) through the core (blue). This motivates the idea of functionally coupling the two proteins (denoted “input” and “output” modules) through linkage at the predicted allosteric sites. (B) A slice through the structure of a complex between the PDZ domain of the cell polarity protein Par6 (cyan surface) with its allosteric regulator, the Cdc42 G protein (white surface) shows an example of SCA network linkage in a natural protein-protein interaction. SCA residues [in CPK representation and colored as in (A)] constitute a contiguous network linking the site of nucleotide exchange on Cdc42 with the ligand-binding pocket of the PDZ domain (both in yellow) through specific residues at the allosteric interface (red).

Evidence from natural systems supports this design concept (Fig. 1B). Ligand binding in a PDZ (PSD95/Dlg1/ZO1) protein interaction module in the human Par6 protein is allosterically regulated by interaction with the guanine nucleotide–binding protein (G protein) Cdc42 at a distant surface site (13, 14). SCA for the PDZ and G protein families reveals a contiguous network of amino acids that connects the nucleotide-binding pocket of Cdc42 with the ligand-binding pocket of the PDZ domain through specific interactions across an allosteric interaction surface (Fig. 1B and figs. S1 to S3). Mutagenesis studies confirm that these networks in PDZ and G proteins contribute to allosteric signaling (11, 14). In both proteins, the allosteric site is uniquely identified as the surface-exposed residues of the SCA network that are nonetheless distant from the active site (fig. S3).

Can statistically correlated allosteric networks be joined to engineer functional communication between proteins? To test this idea, we selected two protein modules as components for creation of a non-natural allosteric two-domain protein in which a signal originating in one domain (the “input” module) is transmitted to influence the activity of the second domain (the “output” module) (Fig. 1A). For the input module, we chose a light-sensing domain from plant phototropin [Avena sativa LOV2 (15)], a member of the Per/Arnt/Sim (PAS) family of signaling modules (16). PAS domains mediate biological responses to a diverse set of stimuli, including aromatic hydrocarbons, gases, redox potential, and light, and share a conserved core structure comprising a five-stranded antiparallel β sheet with two flanking α helices (17) (fig. S4A). A deep ligand-binding pocket opens to one surface of the domain. In the phototropin LOV2 domain, signaling is initiated by light absorption by a flavin mononucleotide (FMN) chromophore bound within the binding pocket, which then transmits the signal through the structure to cause two large conformational changes at the opposite surface: destabilization and unbinding of helical extensions at both the C terminus (the Jα helix) and the N terminus of the core PAS domain (18, 19) (Fig. 2A and fig. S4A). The structural details of the N- and C-terminal extensions vary among members of the PAS domain family, but conformational change at these regions appears to be a conserved feature of allosteric signaling in this protein family (20, 21).

Fig. 2.

Design principle of the PAS-DHFR chimera. (A and B) Surface-exposed SCA sites shown on the LOV2 PAS domain [(A), PDB code 2VOU] and E. coli DHFR [(B), PDB code 1RX2] shown in four successive rotations of each molecule. As in Fig. 1A, SCA network positions are colored yellow (within 5 Å of substrate), red (surface-exposed), or blue (buried). A residue is considered buried if its fractional solvent-accessible surface area is <0.1. In the core PAS domain, this analysis reveals two surface-exposed nonsubstrate proximal SCA sites: (i) the N- and C-terminal regions that, in LOV2, mediate light-dependent interaction with the Jα and N-terminal helical extensions, and (ii) the α3/β4-β5 region (see text and fig. S4). In DHFR, SCA reveals a network (fig. S5) that relates the enzyme active site to a specific distant surface loop (βF-βG, site A, in red). The experiment is to insert the LOV2 domain into several DHFR positions at site A and at a control surface (site B, αC-βE loop).
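
The coloring rule in this caption can be written as a small classifier. The following is a sketch under stated assumptions: the 5 Å substrate cutoff and the 0.1 fractional-accessibility threshold come from the caption, whereas the precedence of substrate proximity over burial, the inclusive treatment of the 5 Å boundary, the `Residue` type, and the example values are all hypothetical.

```python
from dataclasses import dataclass

SUBSTRATE_CUTOFF_A = 5.0  # caption: "within 5 Å of substrate"
BURIED_FRACTION = 0.1     # caption: buried if fractional accessible area < 0.1

@dataclass
class Residue:
    name: str                     # hypothetical label, not a real position
    min_dist_to_substrate: float  # Å, closest approach to substrate
    fractional_sasa: float        # solvent-accessible fraction, 0 to 1

def classify(res: Residue) -> str:
    """Assign an SCA network position to a color class.

    Assumption: substrate proximity is checked first, so that "red"
    means surface-exposed AND not substrate-proximal.
    """
    if res.min_dist_to_substrate <= SUBSTRATE_CUTOFF_A:
        return "yellow"  # substrate-proximal
    if res.fractional_sasa < BURIED_FRACTION:
        return "blue"    # buried core position
    return "red"         # surface-exposed candidate allosteric site

# Hypothetical input values, for illustration only.
examples = [
    Residue("pos_a", 3.2, 0.05),
    Residue("pos_b", 12.0, 0.02),
    Residue("pos_c", 18.0, 0.45),
]
for r in examples:
    print(r.name, classify(r))
```

Only positions in the "red" class are candidates for the connection sites discussed in the text.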

SCA for a multiple sequence alignment of 1104 core PAS domains mirrors the experimental findings. A spatially contiguous network of correlated amino acids links the ligand-binding pocket to surface-exposed residues at the N- and C-terminal regions of the core domain that, in LOV2, make direct interactions with the Jα and N-terminal helices (Fig. 2A and fig. S4, B and C). A second surface-exposed site is also evident (the α3 helix and β4-β5 linker); this site undergoes light-dependent conformational change in another member of the PAS family, the photoactive yellow protein, but is close to the chromophore-binding pocket (22). These data uniquely identify the N- and C-terminal regions of the LOV2 domain as the logical connection sites in our design experiment.

As the output module, we chose E. coli dihydrofolate reductase (EcDHFR), an enzyme system in which extensive prior structural, biochemical, and theoretical work has established the basic catalytic mechanism and evidence for long-range control of activity. DHFR is an essential enzyme required for folate metabolism in all organisms; it catalyzes the stereospecific reduction of 7,8-dihydrofolate (H2F) to 5,6,7,8-tetrahydrofolate (H4F; Fig. 2B, yellow stick bonds), using nicotinamide adenine dinucleotide phosphate (NADPH; Fig. 2B, green stick bonds) as a cofactor (23). The structure of DHFR comprises a central eight-stranded β sheet (β strands A to H) and four flanking α helices (αB, αC, αE, and αF) that make up an active-site cleft that positions the substrate and cofactor for the catalytic step: transfer of the pro-R hydride from NADPH to position C-6 of H2F (fig. S5A) (24). Changes in both the structure and dynamics of loops surrounding the active site are implicated in DHFR activity (25); in particular, the dynamics of the βF-βG loop (residues 116 to 132) is thought to control transition-state stabilization and to enhance the rate of hydride transfer (khyd). Consistent with these findings, SCA for an alignment of 418 members of the DHFR family uniquely identifies the βF-βG loop as the most distant surface-exposed site showing strong correlated evolution with the enzyme active site (Fig. 2B). [See (26) and fig. S5B for details about SCA mapping and known allosteric mechanism in DHFR.]

To functionally couple the light-dependent allosteric mechanism in the LOV2 domain with the DHFR catalytic mechanism, we created sets of chimeric proteins in which the core LOV2 domain is inserted via its N- and C-terminal helical extensions into DHFR at two different surface sites (sites A and B; Fig. 2B and Fig. 3, A and B). Site A chimeras interrogate the computationally and experimentally defined allosteric surface (A120 to A122, βF-βG loop), and site B chimeras interrogate another surface site (B86 to B89, αC-βE loop) that is similarly distant but statistically uncorrelated with the active site; site B serves as a control for potential nonspecific coupling between the two domains.

Fig. 3.

Light-dependent catalysis in a LOV2-DHFR chimera. (A and B) Schematics of the chimeric constructs in sites A and B and associated hydride transfer rates (khyd) carried out under single-turnover conditions. Data are measured either upon dark adaptation (black bars) or immediately after a 5-min exposure to intense white light (white bars) (26). The A120-noJ chimera lacks the C-terminal Jα helix (a major site of light-dependent conformational change in LOV2), and the A120-C450S chimera carries a point mutation that locks LOV2 in the dark state. No site B chimeras show light-dependent catalysis, but one site A chimera (A120) shows a modest but clear increase in khyd upon light activation. (C) Dark (black curves) and light-activated (red curves) absorbance spectra in the A120 and A120-C450S chimeras show the characteristic 447-nm peak of the FMN chromophore in the dark state of LOV2; in A120 alone, light activation shows the expected transition to the 390 nm–absorbing lit-state species. (D) The kinetics of dark recovery of the catalytic rate follows a single-exponential process (red curve), which closely mimics the kinetics of recovery of the 447 nm–absorbing dark state of LOV2 (blue curve). In all panels, error bars indicate SD.

We began by evaluating the independent activities of the DHFR and LOV2 domains in the context of the chimeras. All LOV2-DHFR chimeras were well expressed and rescued growth in the DHFR auxotrophic E. coli strain (ER2566ΔfolΔthy) under minimal media conditions (27) (fig. S6), which indicated that insertion of the LOV2-Jα domain did not abolish DHFR activity in any instance. Consistent with the known mutagenic sensitivity of the βF-βG loop (25), site A chimeras showed a factor of ∼1000 reduction in hydride transfer rate (Fig. 3B, wild-type khyd = 220 s–1), a value that translates to a factor of ∼60 change in the overall turnover rate kcat. Site B chimeras showed a more modest decrease in catalytic activity (by a factor of 5 to 7), consistent with the prediction that this site is less influential for active-site function. With regard to LOV2 domain function, the absorbance spectra of dark-equilibrated proteins showed clear evidence of a 447-nm peak consistent with a noncovalently bound FMN chromophore (28), and light activation of all the chimeric enzymes triggered a characteristic spectral shift to a 390 nm–absorbing species due to formation of a covalent thiol-FMN adduct (Fig. 3C and fig. S7). This species showed single-exponential relaxation to the dark-state spectrum (Fig. 3D and table S4) with a rate constant (0.019 s–1) nearly identical to that reported for the isolated domain (29). These findings show that the basic intrinsic features of the PAS domain and DHFR are structurally and functionally intact in the LOV2-DHFR chimeras.

We next examined the chimeras for light-dependent DHFR activity by comparing khyd for matched samples either maintained in the dark or immediately after exposure to light. Rates were measured under single-turnover conditions designed to minimize the effects of dark relaxation of the LOV2 domain. Consistent with the prediction that site B is uncoupled from active-site function, none of the site B chimeras showed light-dependent changes in enzyme activity (Fig. 3A). In contrast, one of the site A chimeras (A120) showed modest but clear light-dependent DHFR activity (Fig. 3B and table S3). By varying temperature and pH conditions (fig. S8), we determined that light increased the catalytic rate of the enzyme by a factor of 2 at 17°C and by a factor of 1.6 at 25°C (Fig. 3B). The light-dependent effect depends on the known mechanism of signaling in the LOV2 domain; an A120 variant lacking the Jα helix (19) (A120-noJ) or an A120 variant carrying a point mutation in LOV2 that is known to lock the molecule in the dark state (A120-C450S) (30, 31) showed no light dependence, although both constructs showed absolute DHFR activities similar to that of A120 (Fig. 3B). Shifting light-exposed A120 to the dark caused a decay of the light-dependent increase in khyd that followed a single-exponential relaxation (kΔk(hyd), 0.016 s–1) (Fig. 3D), observable through several cycles of excitation. The relaxation of enzyme activity nearly matched the rate of thermal relaxation of the LOV2 domain to the dark state (0.019 s–1, Fig. 3D) (29); this result implies that the light-dependent enzymatic activity in A120 is due to establishment of allosteric communication between LOV2 and DHFR.
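
To make the comparison of the two relaxation rates concrete, the sketch below converts the reported first-order rate constants into half-lives and evaluates how much of a single-exponential signal remains after a given delay. The rate constants are taken from the text; the function names and the 5-s example delay are hypothetical and are not part of the reported experiments.

```python
import math

K_ACTIVITY = 0.016  # s^-1, relaxation of the light-dependent increase in khyd (from the text)
K_LOV2 = 0.019      # s^-1, thermal relaxation of LOV2 to the dark state (from the text)

def half_life(k: float) -> float:
    """Half-life in seconds of a single-exponential process with rate constant k (s^-1)."""
    return math.log(2) / k

def fraction_remaining(k: float, t: float) -> float:
    """Fraction of the initial signal left after t seconds, exp(-k * t)."""
    return math.exp(-k * t)

print(f"activity half-life: {half_life(K_ACTIVITY):.1f} s")
print(f"LOV2 half-life:     {half_life(K_LOV2):.1f} s")

# Hypothetical delay between illumination and measurement, for illustration only.
delay_s = 5.0
print(f"lit-state fraction after {delay_s:.0f} s: {fraction_remaining(K_LOV2, delay_s):.2f}")
```

Both half-lives fall in the range of roughly 36 to 43 s, which is the sense in which the two relaxations nearly match, and it shows why a measurement that takes much longer than this would lose most of the light-induced state.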

The magnitude of the effect is small and hardly optimal, given the factor of ∼600 allosteric effect intrinsic to LOV2-Jα (32). In addition, it is clear that not every chimera made at site A showed light-dependent activity (Fig. 3B). Nonetheless, these results show that site-specific connections at allosteric surfaces can, even without directed optimization or mechanism-based design, begin to produce coupled activities in designed chimeric proteins.

In addition to the hydride transfer step, the overall turnover cycle of DHFR involves several other reactions, including NADP+/NADPH exchange after reduction of substrate and rate-limiting release of the product H4F (26, 33). To examine potential allosteric control of these processes in the A120 chimera, we measured the light dependence of the H4F off-rate and of the equilibrium dissociation constant for NADPH (Fig. 4). The H4F off-rate, measured via methotrexate (MTX) competition under conditions saturating in H4F, NADPH, and MTX (34), showed a factor of ∼3 decrease in the A120 chimera, a value comparable to a previously characterized point mutation in the βF-βG loop [G121V (25)] (Fig. 4A). We observed a small but statistically significant light dependence on H4F off-rate that was abrogated in the background of the dark-locked LOV2 mutant (C450S) (table S4). In contrast, we observed no light dependence in the equilibrium dissociation constant for the NADPH cofactor, although the LOV2 insertion into the βF-βG loop showed an effect on NADPH binding similar to that of the G121V mutant (Fig. 4B). Previous work shows that, separate from the structural contributions of the βF-βG loop to cofactor or substrate binding, the dynamics of this loop specifically controls khyd and product release (35). Thus, the engineered interdomain allostery in the A120 chimera likely works mechanistically through light-dependent modulation of βF-βG loop dynamics.

Fig. 4.

Light dependence of product off-rate (A) and cofactor binding (B) in the A120 chimera. In (A), the product dissociation rate (koff, H4F) shows a small light dependence in A120 (factor of ∼1.3) that is abrogated in the background of the dark-locked C450S mutation. In (B), cofactor binding (Kd, NADPH) shows no light dependence. In both assays, the overall effect of LOV2 domain insertion between positions 120 and 121 in the βF-βG loop is similar to that of the point mutation G121V. Error bars indicate SD.

Taken together, the data presented here are consistent with the notion that modular allosteric networks in each protein can be brought together to initiate the formation of new allosteric control. The installation of light-dependent enzymatic activity in the A120 LOV2-DHFR chimera occurred with minimal disruption of the internal biochemical features of each module. Few discernible alterations to the photodynamics of LOV2 were noted, and the changes in DHFR properties were no greater than the effect of point mutation at the surface site used for connection. Although the number of sites tested here is small, the emergence of interdomain allostery through insertion at the βF-βG loop but not at the αC-βE loop is in agreement with the proposal that specific surface locations might act as evolutionarily conserved hotspots for allosteric control. Allosteric effects in proteins could also arise through idiosyncratic variation in individual family members (36), but these results suggest that the linkage of conserved networks of amino acid interactions might represent a statistically preferential strategy for the evolution of allosteric signaling in multidomain proteins.

The engineering of light-dependent allosteric control in the LOV2-DHFR chimera represents an initial step toward a general scheme for the creation of allosteric multidomain systems. As methods become more refined, the computational prediction of potential allosteric surface sites should be combined with physics-based interface design and experimental screening to design high-performance allosteric systems.

Supporting Online Material

Materials and Methods

Figs. S1 to S8

Tables S1 to S4


References and Notes
