Computational Design of an Enzyme Catalyst for a Stereoselective Bimolecular Diels-Alder Reaction


Science  16 Jul 2010:
Vol. 329, Issue 5989, pp. 309-313
DOI: 10.1126/science.1190239


The Diels-Alder reaction is a cornerstone in organic synthesis, forming two carbon-carbon bonds and up to four new stereogenic centers in one step. No naturally occurring enzymes have been shown to catalyze bimolecular Diels-Alder reactions. We describe the de novo computational design and experimental characterization of enzymes catalyzing a bimolecular Diels-Alder reaction with high stereoselectivity and substrate specificity. X-ray crystallography confirms that the structure matches the design for the most active of the enzymes, and binding site substitutions reprogram the substrate specificity. Designed stereoselective catalysts for carbon-carbon bond-forming reactions should be broadly useful in synthetic chemistry.

Intermolecular Diels-Alder reactions are important in organic synthesis (1–3), and enzyme Diels-Alder catalysts could be invaluable in increasing rates and stereoselectivity. No naturally occurring enzyme has been demonstrated (4) to catalyze an intermolecular Diels-Alder reaction (1, 2), although catalytic antibodies have been generated for several Diels-Alder reactions (3, 4). We have previously used the Rosetta computational design methodology to design novel enzymes (5, 6) that catalyze bond-breaking reactions. However, bimolecular bond-forming reactions present a greater challenge, because both substrates must be bound in the proper relative orientation in order to accelerate the reaction and impart stereoselectivity. Also, previous successes with computational enzyme design have involved general acid-base catalysis and covalent catalysis, but the Diels-Alder reaction provides the opportunity to alter the reaction rate by modulation of molecular orbital energies (7). To investigate the feasibility of designing intermolecular Diels-Alder enzyme catalysts, we chose to focus on the well-studied model Diels-Alder reaction between 4-carboxybenzyl trans-1,3-butadiene-1-carbamate and N,N-dimethylacrylamide (Fig. 1, substrates 1 and 2, respectively) (8).

Fig. 1

The Diels-Alder reaction. Diene (1) and dienophile (2) undergo a pericyclic [4 + 2] cycloaddition (3) to form a chiral cyclohexene ring (4). Also shown in (3) is a schematic of the design target active site, with hydrogen bond acceptor and donor groups activating the diene and dienophile and a complementary binding pocket holding the two substrates in an orientation optimal for catalysis.

The first step in de novo enzyme design is to decide on a catalytic mechanism and an associated ideal active site. For normal-electron-demand Diels-Alder reactions, frontier molecular orbital theory dictates that the interaction of the highest occupied molecular orbital (HOMO) of the diene with the lowest unoccupied molecular orbital (LUMO) of the dienophile is the dominant interaction in the transition state (7). Narrowing the energy gap between the HOMO and LUMO will increase the rate of the Diels-Alder reaction. This can be accomplished by positioning a hydrogen bond acceptor to interact with the carbamate NH of the diene (thus raising the HOMO energy and stabilizing the positive charge accumulating in the transition state), and a hydrogen bond donor to interact with the carbonyl of the dienophile (lowering the LUMO energy and stabilizing the negative charge accumulating in the transition state) (9). Quantum mechanical (QM) calculations predict that these hydrogen bonds can stabilize the transition state by up to 4.7 kcal mol⁻¹ (fig. S1). In addition to electronic stabilization, binding of the two substrates in a relative orientation optimal for the reaction is expected to produce a large increase in rate through entropy reduction (10). Thus, a protein with a binding pocket (Fig. 1) that positions the two substrates in the proper relative orientation and has appropriately placed hydrogen bond donors and acceptors is expected to be an effective Diels-Alder catalyst.
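To give a sense of scale for the predicted transition state stabilization, the short sketch below converts a barrier reduction into a rate ratio. It assumes simple transition state theory with an unchanged prefactor, so that the ratio is exp(ΔΔG‡/RT); the function name and the choice of 298 K (the assay temperature reported later in the text) are illustrative assumptions, and the result is an upper bound because the QM value is quoted as "up to" 4.7 kcal mol⁻¹.

```python
import math

# Gas constant in kcal mol^-1 K^-1
R_KCAL = 1.987204e-3


def rate_enhancement(ddg_kcal_per_mol: float, temperature_k: float = 298.0) -> float:
    """Fold increase in rate for a barrier lowered by ddg_kcal_per_mol.

    Assumes transition state theory with the same prefactor for both
    reactions, so only the exponential term changes (illustrative only).
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


# Upper-bound stabilization quoted from the QM calculations (fig. S1)
fold = rate_enhancement(4.7)
print(f"Barrier lowered by 4.7 kcal/mol -> about {fold:.0f}-fold faster at 298 K")
```

Under these assumptions the hydrogen bonds alone could account for an acceleration on the order of a thousandfold; any entropic contribution from preorganizing the two substrates would come on top of that.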

We used the Rosetta methodology to design in silico enzyme models containing active sites with the desired properties (Fig. 1). The design methodology starts from three-dimensional atomic models of minimal active sites (theozymes) consisting of the reaction transition state and protein functional groups involved in binding and catalysis. We chose the carbonyl oxygen from a glutamine or asparagine to hydrogen bond with the N-H of the diene carbamate and the hydroxyl from a serine, threonine, or tyrosine to hydrogen bond with the carbonyl oxygen of the dienophile amide moiety (Fig. 1). QM calculations were carried out to determine the geometry of the lowest free energy barrier transition state between substrates and product in the presence of these hydrogen bonding groups. Starting from these coordinates, a large and diverse ensemble of distinct minimal active sites was then generated by systematically varying the identity and rotameric state of the catalytic side chains, the hydrogen bonding geometry between these residues and the transition state, and the internal degrees of freedom of the transition state (figs. S4 and S5).

By using RosettaMatch (11), we searched a set of 207 stable protein scaffolds for backbone geometries that allow the two catalytic residues and the two substrates, oriented as in one of the minimal active sites, to be placed without making substantial steric clashes with the protein backbone. A hashing technique allows efficient searching through the very large number of distinct sites (4). From the set of 10¹⁹ possible active site configurations, about 10⁶ could be matched in a stable protein scaffold. Each match was then optimized by using RosettaDesign (12) to maximize transition state binding while not clashing with bound substrates or product (4). These designs were filtered on the basis of satisfaction of catalytic geometry, transition state binding energy, and shape complementarity between designed pocket and the transition state (4). A total of 84 designs were selected for experimental validation.
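The general idea behind hashing-based matching can be illustrated with a toy example. The sketch below is not the RosettaMatch implementation; it assumes that each candidate placement of a catalytic side chain implies a position for the transition state, bins those positions on a coarse grid, and reports pairs of placements (one per catalytic residue type) that land in the same bin. The function names, labels, coordinates, and the 1 Å bin size are all hypothetical.

```python
import math
from collections import defaultdict


def voxel_key(position, bin_size=1.0):
    """Map an (x, y, z) position to the integer index of its grid cell."""
    return tuple(int(math.floor(c / bin_size)) for c in position)


def find_matches(placements_a, placements_b, bin_size=1.0):
    """Pair placements whose implied transition state positions share a cell.

    Each placement is (label, (x, y, z)), where the coordinates are the
    transition state position implied by putting that catalytic side chain
    at that scaffold site. The hash table makes the lookup for each
    placement in placements_b a constant-time operation on average.
    """
    table = defaultdict(list)
    for label, pos in placements_a:
        table[voxel_key(pos, bin_size)].append(label)

    matches = []
    for label_b, pos in placements_b:
        for label_a in table.get(voxel_key(pos, bin_size), []):
            matches.append((label_a, label_b))
    return matches


# Hypothetical placements for the hydrogen bond acceptor and donor residues
acceptor_sites = [("Gln@siteA", (1.2, 3.4, 5.6)), ("Gln@siteB", (10.0, 0.0, 0.0))]
donor_sites = [("Tyr@siteC", (1.7, 3.1, 5.9)), ("Tyr@siteD", (20.0, 1.0, 1.0))]

print(find_matches(acceptor_sites, donor_sites))
```

A single grid can miss two positions that are close together but fall on opposite sides of a cell boundary, and a real matcher would also need to compare orientation, not only position; both are left out here for brevity.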

Genes encoding these 84 designs were synthesized with a C-terminal six-histidine affinity tag and expressed in Escherichia coli. Fifty of the designed proteins were soluble; these were purified by using affinity chromatography, and Diels-Alder activity was monitored by using a liquid chromatography–tandem mass spectrometry assay in a phosphate-buffered saline (PBS) solution at pH 7.4 and 298 K (4). Two designs (DA_20_00 and DA_42_00) were found to have Diels-Alderase activity. The active design DA_20_00 was created from a six-bladed β-propeller scaffold [Protein Data Bank identification code (PDB ID) 1E1A; a diisopropylfluorophosphatase from Loligo vulgaris, 13 mutations, fig. S6A]. As observed for many native β-propeller enzymes, the functional groups that play key roles in catalysis (a glutamine carbonyl group and a tyrosine hydroxyl group that provide the activating hydrogen bonds) are located in the middle of one side of the propeller. The rest of the pocket is lined with hydrophobic residues that form a tight shape-complementary surface (Fig. 2A). The active design DA_42_00 was created from the ketosteroid isomerase scaffold (PDB ID 1OHO, 14 mutations, fig. S4B). The active site is quite different from that of DA_20_00 in that only the carbon-carbon bond-forming portion of the diene and dienophile is actually buried within the protein.

Fig. 2

Structure of a designed Diels-Alderase. (A) Surface view of the design model (DA_20_00, green) bound to the substrates (diene and dienophile, purple). The catalytic residues making the designed hydrogen bonds are depicted as sticks. (B) Overlay of the design model (DA_20_00, brown) and crystal structure of DA_20_00_A74I (green). (C) Contribution of designed residues to catalysis assessed through reversions back to the native amino acid at each position individually. Colors indicate the reduction in activity upon reversion to the native amino acid; twofold (blue) to >10 fold (red). The figure was generated with PyMol (17).

To further improve the catalytic activity of DA_20_00 and DA_42_00, we mutated residues that were in direct contact with the transition state in each designed enzyme individually to sets of residues that were predicted to retain or improve transition state binding and bolster the two catalytic residues. A set of six mutations [A21 → T21 (A21T) (13), A74I, Q149R, A173C, S271A, and A272N] was found to increase the overall catalytic efficiency of DA_20_00 by over 100-fold relative to the original design model (Table 1; we refer to the DA_20_00 protein with these six additional mutations as DA_20_10, fig. S6C). Three of the mutations improve the packing around the transition state (A74I and A21T) and the catalytic glutamine (A173C). Two of the mutations likely improve the overall electrostatic complementarity with the bound substrates: Q149R hydrogen bonds to the carboxylate on the diene, and S271A makes the dienophile environment more nonpolar. The last mutation (A272N) reverts a designed alanine residue back to the native asparagine: Molecular dynamics simulations (4) suggested that the catalytic tyrosine can flip into an alternative conformation not positioned to activate the dienophile, and a larger residue at 272, such as the native asparagine, was predicted to hold the tyrosine in the conformation required for catalysis.

Table 1

Kinetic parameters for DA_20_00, DA_20_10, and DA_42_04. Reactions for DA_20_00, DA_20_10, and DA_42_04 were carried out at 298 K (4). The errors represent the calculated 95% confidence interval. Kinetic parameters for catalytic monoclonal antibodies (mAb) 7D4 and 4D5 at 310 K were taken from (9, 14). The kuncat for the Diels-Alder reaction at 298 K was found to be 2.44 × 10⁻² M⁻¹ hour⁻¹, in good agreement with the previously reported value at 310 K of 4.29 × 10⁻² M⁻¹ hour⁻¹.


For DA_42_00, a set of four mutations (Q58R, L61M, A99N, and V101I) (13) was found to increase the observed catalytic activity roughly 20-fold over the original design (Table 1; we refer to the DA_42_00 protein with four additional mutations as DA_42_04, fig. S6D). As in the case of DA_20_00, all of these mutations increase the size of the amino acid and either improve packing or electrostatic interactions with the ligand.

To investigate the contributions of the two catalytic residues in DA_20_10 to catalysis, we mutated glutamine 195 into a glutamate (Q195E) and tyrosine 121 into a phenylalanine (Y121F) (13). We had originally incorporated a glutamine rather than a glutamate at position 195, despite the fact that the carboxylate is more effective than the amide at increasing the energy of the diene HOMO, because we were concerned about the unfavorable contribution of carboxylate desolvation to substrate binding. Furthermore, QM calculations predict that the amide group of glutamine can simultaneously interact with the diene and dienophile, resulting in a 2 kcal mol⁻¹ lower activation barrier than if glutamate were used as the catalytic residue (fig. S1C). Indeed, the Q195E mutation caused an almost complete loss of activity (a 450-fold decrease), illustrating the sensitivity of the enzyme to the details of the designed active site. The Y121F mutation decreases catalytic activity 27-fold, consistent with the removal of a hydrogen bond that contributes to dienophile binding and a lowering of its LUMO.

The kinetic parameters of the DA_20_00, DA_20_10, and DA_42_04 catalyzed reactions were determined by measuring the dependence of the reaction velocity on the concentration of both diene and dienophile (4). The kinetic parameters are summarized in Table 1, and double reciprocal plots for DA_20_10 and DA_42_04 are shown in Fig. 3, A and B. DA_20_10 has an effective molarity [catalytic rate constant/uncatalyzed rate constant (kcat/kuncat) = 89 M] 20 times greater than those of the catalytic antibodies 7D4 (9) and 4D5 (14) previously elicited for the same reaction. DA_42_04 binds both the diene and the dienophile more tightly [significantly lower Michaelis constant (KM)] than DA_20_10, but the kcat is 100-fold lower, suggesting that the orientation of the two substrates relative to each other and/or to the catalytic groups is not optimal.
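The effective molarity quoted above is simply the ratio of the first-order enzymatic rate constant to the second-order uncatalyzed rate constant, which is why it carries units of concentration. The sketch below back-calculates the kcat implied by the two numbers given in the text (89 M, and the kuncat of 2.44 × 10⁻² M⁻¹ hour⁻¹ from the Table 1 caption); the function names are illustrative, and the measured values in Table 1 remain the authoritative figures.

```python
def effective_molarity(kcat_per_hour: float, kuncat_per_molar_hour: float) -> float:
    """Effective molarity (M): kcat (hour^-1) divided by kuncat (M^-1 hour^-1)."""
    return kcat_per_hour / kuncat_per_molar_hour


def implied_kcat(effective_molarity_m: float, kuncat_per_molar_hour: float) -> float:
    """kcat (hour^-1) implied by an effective molarity and kuncat."""
    return effective_molarity_m * kuncat_per_molar_hour


K_UNCAT_298K = 2.44e-2  # M^-1 hour^-1, uncatalyzed reaction at 298 K (Table 1 caption)
EM_DA_20_10 = 89.0      # M, effective molarity quoted for DA_20_10

kcat = implied_kcat(EM_DA_20_10, K_UNCAT_298K)
print(f"Implied kcat for DA_20_10: {kcat:.2f} per hour")
```

One way to read the result: a substrate pair bound in the DA_20_10 active site reacts as fast as the free diene would if it were surrounded by dienophile at a concentration of 89 M.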

Fig. 3

Kinetic characterization. (A) Dependence of reaction velocity for DA_20_10 on diene concentration for different fixed dienophile concentrations. The diene concentration was varied from 3.0 to 0.18 mM with fixed concentrations of 100 mM (◢), 66 mM (■), 44 mM (◆), 30 mM (▲), 20 mM (▼), and 13 mM (●) dienophile. (B) Dependence of reaction velocity for DA_42_04 at varying diene concentrations for different fixed dienophile concentrations. The diene concentration was varied from 2.0 to 0.06 mM with fixed concentrations of 100 mM (◢), 50 mM (■), 25 mM (◆), 13 mM (▲), 7 mM (▼), and 3 mM (●) dienophile. Reaction conditions are described in (4).

At high substrate concentrations, DA_20_10 proceeds for more than 30 turnovers with some loss of activity over time due to aggregation (4). At high enzyme concentrations, more than 80% of the diene substrate is converted to product (fig. S7). These properties suggest that de novo designed enzymes could be useful as catalysts in production-level chemical syntheses.

Some Diels-Alder reactions can be accelerated by binding within a nonspecific hydrophobic pocket (15). This, however, does not appear to be the case for the reaction studied here: E. coli cell lysate, cyclodextrins, and bovine serum albumin either have no effect on the reaction or actually inhibit it (table S2). The importance of the active site binding geometry is highlighted by a comparison of DA_42_04 and DA_20_10: DA_42_04 binds the substrates much more tightly but has a much lower kcat. To further probe the sensitivity of DA_20_10 catalysis to the details of the active site geometry, we reverted each of the 15 residues constituting the active site one at a time to its identity in the original scaffold. Remarkably, nine of the reversions completely abolished activity; the other six mutations decreased activity by 1.5-fold to 10-fold (Fig. 2C and table S5). The reversions that significantly reduced the catalytic activity of DA_20_10 are primarily in the core of the binding site, whereas mutations that had less of an effect on activity are closer to the active site rim. Similar sensitivities were observed for mutations that disrupt binding in DA_42_04 (4). Thus, although the catalytic efficiencies of the computationally designed Diels-Alderases are small in comparison with those of native enzymes, they exhibit similar sensitivity to the details of the active site and provide much more than a general hydrophobic environment.

To determine how well the structure of DA_20_00 matched the design model, we solved the crystal structure of one of the active variants of DA_20_00 (harboring the A74I mutation; Fig. 2B). The crystal structure solved to 1.5 Å resolution (table S4 and Fig. 2B) shows atomic-level agreement with the design model, with an all-atom root mean square deviation (RMSD) of 0.5 Å. The major deviation between the crystal structure and the design model is in a surface loop, which appears to be pulled back from the predicted active site (RMSD on residues 32 to 46, 0.93 Å). The conformations of the side chains at the active site in the crystal structure are close to those in the design model; taken together with the reversion data described above and the complete lack of activity observed for the starting scaffold (fig. S8), these results strongly suggest that the experimentally observed activity is generated by the designed active site.
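For readers less familiar with the metric, RMSD is the square root of the mean squared distance between corresponding atoms in two structures. The sketch below is a minimal illustration that assumes the two coordinate sets are already superimposed and listed in the same atom order; it does not perform the superposition step, and the coordinates are made up rather than taken from the deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two pre-aligned coordinate lists.

    Each argument is a sequence of (x, y, z) tuples in angstroms, with atoms
    in the same order in both lists. No superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up example: every atom displaced by 0.5 angstrom along x
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
crystal = [(x + 0.5, y, z) for x, y, z in model]

print(f"RMSD = {rmsd(model, crystal):.2f} angstrom")
```

Restricting the atom lists to a subset, such as the surface loop residues, gives a local RMSD of the kind quoted for residues 32 to 46.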

The Diels-Alder reaction studied here can, in principle, produce eight different isomeric products, four of which are experimentally observed in the reaction in solution (9). The computational design was directed at the transition state that yields the 3R,4S endo product, which comprises only 47% of the total product mixture formed in the uncatalyzed reaction. To determine the stereoselectivity of DA_20_10 (Fig. 4), we used a previously described liquid chromatography–tandem mass spectrometry assay with a chiral column (16). Consistent with the design, DA_20_10 catalyzes the formation of only the expected 3R,4S product (>97%, fig. S9).

Fig. 4

Absolute stereoselectivity of DA_20_10. The transition states that lead to the four possible ortho-stereoisomers are shown above the reaction chromatograms. Background reaction: 2 mM diene and 70 mM dienophile in a PBS solution for 24 hours at 298 K. DA_20_10 reaction: 50 μM protein, 0.5 mM diene, and 10 mM dienophile in a PBS solution for 48 hours at 298 K (4).

Besides stereoselectivity, the level of control over a chemical reaction by a designed enzyme is reflected by its substrate specificity. To investigate the substrate specificity of DA_20_10, we characterized product formation with six different dienophiles that share the same acrylamide core but have different nitrogen substituents (Fig. 5). The catalytic activity against each of the substrates was measured by using a liquid chromatography mass spectrometry assay (4). DA_20_10 was observed to strongly favor the substrate for which it was designed. Even slight changes, such as adding a methyl group to the N,N-dimethylacrylamide (Fig. 5, 2A versus 2B), significantly decreased the activity of DA_20_10, consistent with the tight packing of the active site around the two substrates.

Fig. 5

Control of substrate specificity. Reactions were carried out with 0.2 mM diene 1 (Fig. 1), and 10 mM of one of the six dienophiles depicted above in PBS at 298 K in the absence or presence of 60 μM DA_20_10 or DA_20_10_H287N. The values in the figure are the mean (colored bars) and standard deviation (error bars) of four independent measurements of the product peak area (arbitrary units) formed per hour (4).

In addition to the ability to catalyze new reactions with high substrate specificity and stereoselectivity, one of the promises of de novo enzyme design is that once an initial active enzyme is engineered it can be easily modified to catalyze similar reactions with alternate substrates. To explore this possibility, we mutated histidine 287 on one side of the dienophile binding pocket in DA_20_10 to asparagine and several other residues. The H287N mutation has a substrate specificity profile different from DA_20_10; in particular there is a 13-fold switch in specificity for dienophile 2E relative to 2A, while the selectivity against 2F is maintained (Fig. 5). The specificity switch may have two origins: The histidine in the crystal structure clashes with the larger substrates, and the amino group on the asparagine can hydrogen bond with the hydroxyl in 2E.

Although we have succeeded in computationally designing an enzyme that catalyzes an enantio- and diastereoselective intermolecular reaction, there is much room for improvement in our computational design methods. Only 2 of the 50 designed enzymes tested had measurable activity, and a much higher success rate and higher overall activities are desirable. The differences in kcat of the two designed enzymes suggest that more precise control over the orientation of the two substrates relative to one another and to the catalytic residues could result in considerably more active designs. On the experimental side, by analogy with our previous results with computationally designed Kemp eliminases (5), it should be possible to increase the activity of these enzymes by directed evolution.

The agreement between the designed and the experimentally observed substrate specificity and stereoselectivity of DA_20_10 is notable given the importance of selectivity in organic chemistry reactions. The capability to rationally control both substrate specificity and stereoselectivity via designed enzymes opens up new avenues of research in both basic and applied chemistry.

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S13

Tables S1 to S5


  • * These authors contributed equally to this work.

  • Arzeda Corporation, 2722 Eastlake Avenue East, Suite 150, Seattle, WA 98102, USA.

References and Notes

  4. Materials and methods are available as supporting material on Science Online.
  13. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.
  17. The PyMOL Molecular Graphics System, Schrödinger, LLC, Version 1.2r3pre.
  18. This work was supported by the Defense Advanced Research Projects Agency (DARPA), the Howard Hughes Medical Institute (HHMI), a Molecular Biophysics traineeship from the NIH for J.B.S., and the Lawrence Livermore National Laboratory Lawrence Scholars program for G.K. We thank M. Toscano (ETH) and C. Rosewall (University of Washington) for chemical synthesis and B. Siegel and Y.-h. Lam for helpful comments on the manuscript. The x-ray crystallographic coordinates have been deposited in the Protein Data Bank with accession ID 3I1C. The University of Washington has submitted a patent application on the protein sequence coding for the engineered enzymes reported here as well as some of the design methodology.