Research Article

A data-intensive approach to mechanistic elucidation applied to chiral anion catalysis


Science  13 Feb 2015:
Vol. 347, Issue 6223, pp. 737-743
DOI: 10.1126/science.1261043

Optimizing a catalyst many ways at once

Optimization strategies are often likened to hikes in a hilly landscape. If your goal is to get to the top of the highest hill, and you only take steps toward higher ground, you might never find a peak on a route that requires a preliminary descent. So it is in chemistry, where optimizing each structural feature of a catalyst consecutively might gloss over subtle tradeoffs that in combination offer the best performance. Milo et al. use multidimensional analysis techniques to generate a predictive model of how selectivity depends on multiple characteristics of the catalyst and substrate in a C-N bond-forming reaction (see the Perspective by Lu). They then apply this model to improve the catalyst globally.

Science, this issue p. 737; see also p. 719


Knowledge of chemical reaction mechanisms can facilitate catalyst optimization, but extracting that knowledge from a complex system is often challenging. Here, we present a data-intensive method for deriving and then predictively applying a mechanistic model of an enantioselective organic reaction. As a validating case study, we selected an intramolecular dehydrogenative C-N coupling reaction, catalyzed by chiral phosphoric acid derivatives, in which catalyst-substrate association involves weak, noncovalent interactions. Little was previously understood regarding the structural origin of enantioselectivity in this system. Catalyst and substrate substituent effects were probed by means of systematic physical organic trend analysis. Plausible interactions between the substrate and catalyst that govern enantioselectivity were identified and supported experimentally, indicating that such an approach can afford an efficient means of leveraging mechanistic insight so as to optimize catalyst design.

Catalyst discovery and development often rely on empirical observations gained through the laborious evaluation of multiple potential reaction variables (1). Although high-throughput methods can streamline this process (1–9), the ability to rationally design catalysts that affect chemical reactions in a predictable manner would be a transformative step forward. In catalysis, the principal challenge lies in inferring how catalyst structural features affect the mechanistic aspects of a given chemical reaction, including those that govern selectivity when multiple products are possible (10–12). Although mechanistic studies are able to guide the rational design of catalytic systems, traditional approaches are not often suited to address the complexity of modern catalytic transformations. This limitation is especially apparent in cases in which selectivity is affected by subtle catalyst and substrate structural features (13), and/or the product-determining step of the reaction occurs after the rate-determining step. In order to address such systems, we envisioned a strategy for mechanistic study involving the application of modern data analysis techniques. This approach relies on the generation of mathematical correlations between quantifiable properties describing the interacting reaction partners’ molecular structures (molecular descriptors) and a measurable outcome of the reaction {for example, enantioselectivity, which is represented as the energy difference between transition states leading to either enantiomer, ΔΔG = –RT ln[(S)/(R)], in kilocalories per mole} (14). Combining appropriate experimental design, data organization, and trend analysis techniques provides the basis to distinguish causal relations, producing testable hypotheses regarding the structural origin of the reaction outcome. New catalysts can be designed, and the ability of the models to predict new experimental outcomes can be used as validation of the mechanistic hypotheses.
Here, we demonstrate that this approach enables in-depth mechanistic analysis of interactions that govern enantioselectivity, affording nonintuitive insight into the origin of asymmetric induction and guiding rational catalyst design.

Choice of case study

To assess the applicability of this data-driven approach toward mechanistic analysis, a system was sought with a poorly understood mechanism that would be difficult to probe by using standard techniques. Accordingly, the field of chiral anion catalysis was appealing because diverse catalyst-substrate interactions contribute to enantioselectivity, but their distinctive effects are difficult to deconvolute. Particularly, the oxoammonium salt (3)-mediated enantioselective cross dehydrogenative coupling (CDC) catalyzed by chiral 1,1′-Bi-2-naphthol (BINOL)–based phosphoric acids (PAs) (Fig. 1, 1) bearing triazoles at the 3 and 3′ positions reported in 2013 by Toste and coworkers (15) was identified as a prototypical example. This type of reaction could benefit from such an analysis because of the following challenges. First, high levels of enantioselectivity were necessarily predicated on the oxidant’s (16–19) insolubility under the reaction conditions, precluding rigorous kinetic analyses. Second, the enantioselectivity trends with respect to both catalyst and substrate were not obvious, with even modest structural modifications resulting in substantial differences (Fig. 1C). Last, we hypothesized that enantioselectivity was governed by attractive noncovalent interactions (15). These subtle interactions are ubiquitous in biological and catalytic systems (13) but are difficult to study or apply toward rational catalyst design, especially if several such interactions are involved in determining reaction outcomes.

Fig. 1 System under study: asymmetric C-N dehydrogenative coupling.

(A) BINOL-based phosphoric acid scaffold, enantioselective cross dehydrogenative coupling reaction scheme, and nitrogen-deletion experiment. (B) Proposed mechanism involving a chiral phosphate-substrate ion pair. (C) Enantiomeric excess (ee) values obtained by using various substrate/catalyst combinations. (D) Library design and parameter identification strategy.

The distinct mechanistic features of triazole-PA catalysts are highlighted by the observation that they result in opposite and enhanced enantioselectivities (Fig. 1A) relative to more conventional PAs such as C8-TRIP (5) and TCYP (6), which are representative of BINOL-based PAs that have seen the most extensive use (20–22). Additionally, electronically distinct pyrazolyl (pyr-1e) and imidazolyl (imid-1e) PAs afford products with significantly reduced enantioselectivities relative to the parent triazolyl (Fig. 1A, 1e), despite having nearly identical steric profiles. Although these data allude to selectivity determination via attractive, noncovalent interactions between the catalyst and substrate, such interactions are difficult to further characterize. This limitation is not uncommon in enantioselective catalysis. Thus, our goal was to develop a general, data-driven technique for the evaluation of how subtle structural features affect selectivity, using this reaction as a challenging case study.

Kinetic isotope effects

Before any mechanistic study focused on the origins of selectivity, we sought to establish the enantioselectivity-imparting step (or steps) in the catalytic cycle. Bearing in mind the aim of mathematically relating structural features of the reacting components to enantioselectivity, this knowledge would reveal the elementary step that is being represented by the catalyst and substrate molecular descriptors (vide infra). With respect to the general mechanism in Fig. 1B, we sought to distinguish two scenarios: (i) Enantioselectivity is determined during oxidation of substrate 2, or (ii) enantioselectivity is determined during the cyclization of an oxidized intermediate (Fig. 2A, A). With respect to the former scenario, it was conceivable that although the stereogenic center is formally set in the cyclization from the oxidized intermediate (A), the interactions between the substrate and catalyst during the oxidation event may preorganize the system for effective enantioselection.

Fig. 2 Kinetic isotope effect studies and mechanistic implications.

(A) Considerations regarding the origin of enantioselectivity. (B) KIEs of deuterated enantiomerically enriched substrate 2a-d1 with (R) and (S) PA catalysts: adamantyl-substituted triazolyl (1e) and pyrazolyl (pyr-1e), and TCYP (6). (C) Revised mechanism of enantioselectivity determination.

To distinguish between these possibilities, a set of kinetic isotope effect (KIE) experiments was performed by using 2a-d1 [90% D incorporation, 74:26 enantiomeric ratio (er)]. We expected that if the chiral phosphate were involved in substrate oxidation, different KIEs would be observed for the formation of 4a-d1 when using enantiomeric catalysts. Indeed, (R)- and (S)-1e promoted the reaction with KIEs of 3.42 and 1.08, respectively, suggesting the involvement of the catalyst in the oxidation (Fig. 2B). If enantioselectivity were also established in this step, different enantiomeric excess (ee) values would be expected when using the enantiomeric catalysts. The observation that the final products exhibited equal but opposite levels of enantioselectivity is consistent with enantioselection occurring during the bond-forming event from an oxidized intermediate such as A. This result indicated that the two key steps of this reaction (oxidation and cyclization) proceed under independent Curtin-Hammett control (10, 11), with similar interactions presumably governing selectivity in both cases (Fig. 2C). The KIE values resulting from the less-selective catalysts pyr-1e and 6 (Figs. 1A and 2B) do not display the same enantiomer-dependency as those from 1e. This effect is consistent with the assertion that similar catalyst-substrate interactions are involved throughout the mechanism, as well as the triazole substituent’s superior ability to interact with substrate 2. However, the specific nature of this interaction remained undefined. To this end, a thorough analysis of an extensive data set containing structural perturbations to the catalyst and substrate could serve to illuminate these enantioselectivity-directing interactions.

Experimental design and analysis

The collection and organization of diverse data sets is at the foundation of data-driven analysis strategies (23). Therefore, an effective experimental library should include rational changes to various structural features that affect the reaction outcome of interest. To this end, substrates (2) were modified at positions hypothesized to influence enantioselectivity (at the 2-, 4-, and 6- positions of the benzyl ring and the distal aryl ring) (Fig. 1D), using substituents with varied electronic and steric properties (according to their Hammett σpara and Sterimol B1 values, respectively; additional details are provided in the supplementary materials, p4-8). Similarly, catalysts (1) were modified at the 2-, 4-, and 6- positions of the aryl ring attached to the triazole. Adamantyl-substituted catalysts 1e, pyr-1e, and imid-1e were also included to explore the effect of changes to the heterocyclic ring. In total, 12 substrates and 11 triazolyl catalysts were selected (Fig. 1D). These libraries were then synthesized, and the enantioselectivity of each catalyst-substrate combination was obtained. Simultaneously, a diverse array of molecular descriptor values was collected to describe the structural features of each catalyst and substrate, including Sterimol parameters (24), length measurements from geometry optimized structures, and computationally derived vibrational frequencies and intensities (details are provided in the supplementary materials, p4-8) (Fig. 1D) (25). Linear regression algorithms were then applied to various subsets of the data to identify correlations between molecular structure and the experimentally determined enantioselectivity. Subsequently, analysis and refinement of the resulting models were used to produce explicit mechanistic hypotheses that were then tested and validated experimentally.
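The regression step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it uses NumPy, the descriptor values are synthetic stand-ins rather than data from the study, and the function names zscore and fit_linear_model are hypothetical.

```python
import numpy as np

def zscore(X):
    """Normalize each descriptor column to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def fit_linear_model(X, y):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef

# Synthetic stand-in data (NOT from the paper): 6 reactions, 2 descriptors.
rng = np.random.default_rng(0)
descriptors = zscore(rng.normal(size=(6, 2)))

# Hypothetical response built from known coefficients so the fit can be checked.
ddg = 0.5 + 1.2 * descriptors[:, 0] - 0.7 * descriptors[:, 1]

coef = fit_linear_model(descriptors, ddg)
print(coef)
```

Because the descriptors are normalized before fitting, the magnitudes of the fitted coefficients can be compared directly, which is what allows a statement such as "the largest coefficient belongs to a given descriptor" to be meaningful.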

Modeling catalyst heterocyclic rings

Given the clear importance of the catalyst heterocyclic ring in enantioselectivity determination (vide supra), we initially sought to understand the subset of results obtained by using catalysts 1e, pyr-1e, and imid-1e. Accordingly, by using linear regression techniques the correlation depicted in Fig. 3B was identified from a training set of 10 different substrate-catalyst combinations (Fig. 3, A and C, black squares). Of the large number of steric (26) and vibrational (25) terms initially investigated as molecular descriptors, four discrete vibrational parameters were sufficient to produce a correlation with enantioselectivity: one catalyst descriptor (νY–N, a stretching frequency on the heterocyclic ring), and two substrate descriptors (the stretching frequency of the amide C=O bond, νC=O, and stretching frequency/intensity of the C–H bond undergoing oxidation, ν/iC–H) (Fig. 3, B and D). A cross-term between the catalyst and substrate descriptors improves the overall quality of the model (νY–N x iC–H), suggesting a synergistic structural effect. The model was evaluated by plotting measured versus predicted ΔΔG values (Fig. 3C), and as a validation of its robustness, the enantioselectivities of 10 catalyst-substrate combinations not included in the training set were well-predicted (Fig. 3, A and C, red crosses). A slope approaching unity and intercept approaching zero over the training and validation sets indicate an accurate and predictive model, and the R² value of 0.90 demonstrates a high degree of precision. The largest coefficient in this normalized model belongs to the heterocyclic ring vibrational frequency, signifying its substantial role in the quantification of enantioselectivity.
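The roles of the cross-term and of the measured-versus-predicted diagnostics can be illustrated with a small sketch. It assumes NumPy and uses synthetic, normalized descriptor values (not the data of Fig. 3); the helper names design_matrix and r_squared are hypothetical, and the 10/10 split only mirrors the sizes quoted above.

```python
import numpy as np

def design_matrix(cat, sub):
    """Columns: intercept, catalyst term, substrate term, catalyst x substrate cross-term."""
    cat = np.asarray(cat, dtype=float)
    sub = np.asarray(sub, dtype=float)
    return np.column_stack([np.ones_like(cat), cat, sub, cat * sub])

def r_squared(measured, predicted):
    """Coefficient of determination between measured and predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((measured - predicted) ** 2)
    ss_tot = np.sum((measured - measured.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic normalized descriptors and response (NOT the paper's data).
rng = np.random.default_rng(1)
cat = rng.normal(size=20)
sub = rng.normal(size=20)
ddg = 0.3 + 0.9 * cat - 0.4 * sub + 0.5 * cat * sub + rng.normal(scale=0.02, size=20)

# Fit on 10 combinations, hold out the other 10 as external validation.
train, valid = np.arange(10), np.arange(10, 20)
coef, *_ = np.linalg.lstsq(design_matrix(cat[train], sub[train]), ddg[train], rcond=None)
pred_valid = design_matrix(cat[valid], sub[valid]) @ coef

# Slope and intercept of measured versus predicted for the held-out set.
slope, intercept = np.polyfit(pred_valid, ddg[valid], 1)
r2 = r_squared(ddg[valid], pred_valid)
print(slope, intercept, r2)
```

A slope near 1 and an intercept near 0 on the held-out points indicate that the model is not systematically over- or under-predicting, which is a separate question from how tightly the points scatter (R²).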

Fig. 3 Impact of heterocyclic catalyst substituent on enantioselectivity.

(A) Predicted and measured enantioselectivities for various substrates with adamantyl-substituted triazolyl (1e), pyrazolyl (pyr-1e), and imidazolyl (imid-1e) PAs. Values identified with an asterisk are external validations. (B) Mathematical correlation of normalized catalyst and substrate vibrational parameters to enantioselectivity. (C) Predicted versus measured ΔΔG plot. (D) Illustration of the structural features implicated by the identified parameters.

This model is capable of predicting results whose origins are not obvious upon inspection. For example, comparison of the reaction outcomes using 1e and pyr-1e with substrate 2a (Fig. 3, entries 1 versus 13) may lead to the conclusion that pyr-1e generally affords inferior selectivity. Indeed, experimental results for several additional substrates support this notion and are accurately predicted by the model (for example, 2-OMe benzyl substrate 2e, entries 5 versus 16). However, with substrate 2i (R1 = 2-OMe, R2 = Ph, entries 9 versus 17), the triazolyl and pyrazolyl PAs afford the product with similar levels of enantioselectivity. This counterintuitive result is precisely predicted, indicating that the divergent enantioselectivity displayed by 1e, as compared with pyr-1e and imid-1e, is adequately addressed by the model.

Trend analysis

Although the model in Fig. 3B establishes the capacity of the chosen parameters to describe subtle aspects of this system, the ultimate goal of this approach was to discern underlying mechanistic phenomena. This objective could not be achieved by using merely the above correlation because it was produced by using a small subset of data, in which the catalyst heterocyclic rings bore the same substituent (adamantyl). We hypothesized that the complete data set contained invaluable information to this end because it was produced by using strategically modified catalysts and substrates with substituents intentionally introduced to probe subtle effects, resulting in 132 enantioselectivities between –54 and 94% ee, which corresponds to a ΔΔG range of 2.8 kcal/mol. In accordance with a data-intensive strategy, none of these measured values was discarded because even low enantioselectivity carries information regarding the catalyst-substrate interactions at the origin of asymmetric induction.
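The quoted ΔΔG range can be checked with a short calculation. The sketch below makes two assumptions that are not stated in the text: a temperature of 298.15 K, and a sign convention in which ΔΔG takes the sign of the ee value. The function name ee_to_ddg is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ee_to_ddg(ee_percent, temperature_k=298.15):
    """Convert % ee to ΔΔG in kcal/mol via ΔΔG = RT ln(er).

    er = (100 + ee) / (100 - ee); the temperature is an assumed value,
    and the sign of the result follows the sign of ee.
    """
    er = (100.0 + ee_percent) / (100.0 - ee_percent)
    return R_KCAL * temperature_k * math.log(er)

# Span between the two extremes quoted in the text (-54% and 94% ee).
span = ee_to_ddg(94) - ee_to_ddg(-54)
print(round(span, 1))
```

Under these assumptions the span comes out at about 2.8 kcal/mol, in agreement with the value given above. The logarithmic relationship also shows why low ee values remain informative: they occupy a distinct region of the ΔΔG scale rather than collapsing to zero.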

Yet before producing a global, predictive model, we considered that a series of focused correlations, coupled with an evaluation of overall trends, might serve to reveal fundamental features of the system. To this end, for each individual substrate a correlation was produced relating its observed enantioselectivity values for the entire set of catalysts, with parameters describing the catalyst structure (2a-2l, 12 models in total) (fig. S3). The same strategy was applied to all aryl-substituted triazole catalysts by using parameters describing the substrate structure (1b-1d, 1f-1k, 9 models total) (table S7). This organizational scheme was viewed as a means to facilitate the identification of catalyst features that affect particular substrate types (and vice versa). Substrates or catalysts with similar structures behave analogously not only in a qualitative manner, but also in terms of the molecular descriptors that effectively serve to predict their enantioselectivities (individual substrate and catalyst measured versus predicted ΔΔG plots and equations are available in figs. S2, S3, and S4 and table S7). These quantitative correlations, together with systematically organized trends of experimental outcomes, can guide the development of testable mechanistic hypotheses.
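The idea of one focused model per substrate can be sketched as a single least-squares solve with several right-hand sides. This is an illustration only: it assumes NumPy, every number is synthetic, and the two descriptor columns are merely stand-ins for catalyst parameters such as a torsion angle and a stretching frequency.

```python
import numpy as np

rng = np.random.default_rng(2)
n_catalysts, n_substrates = 11, 12

# Synthetic normalized catalyst descriptors (NOT the paper's values).
cat_desc = rng.normal(size=(n_catalysts, 2))
A = np.column_stack([np.ones(n_catalysts), cat_desc])  # intercept + 2 descriptors

# Synthetic ΔΔG table: rows are catalysts, columns are substrates.
true_coef = rng.normal(size=(3, n_substrates))
ddg_table = A @ true_coef

# One solve fits every substrate column at once;
# column j of per_substrate_coef is the focused model for substrate j.
per_substrate_coef, *_ = np.linalg.lstsq(A, ddg_table, rcond=None)
print(per_substrate_coef.shape)
```

Comparing the columns of the coefficient table is one simple way to see which substrates respond to the same catalyst descriptors. In practice each focused model may use a different descriptor subset, which this shared-design sketch does not attempt to capture.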

To simultaneously inspect multiple aspects of large and intricate data sets, a communicative visualization of data is crucial (27). Thus, we elected to present information gained from these focused mathematical models, alongside multiple observed enantioselectivity results, organized according to trends in catalyst or substrate structural features. Demonstrating this visualization technique, the enantioselectivity trend for each catalyst (in Fig. 4, each line represents a catalyst) was plotted as a function of the substrates (in Fig. 4, each x axis tick-line represents a substrate), and vice versa (Fig. 5). To afford a quantitative trend analysis, the plots were arranged according to which positions were modified on the catalyst or substrate structures, and the corresponding R1 or R2 substituent’s Sterimol B1 value (additional visualizations are provided in figs. S5 and S6) (24). For example, in the purple frame in Fig. 4, 2-substituted benzyl substrates are displayed from the largest to smallest substituent. The catalyst molecular descriptors required for correlating each subset of substrates are also depicted (Fig. 4), along with the substrate descriptors for each subset of catalysts (Fig. 5). For example, the catalyst molecular descriptors used as parameters for the correlation of enantioselectivity obtained for 2-substituted benzyl substrates are presented below the blue frame in Fig. 4 (νN–N, ∠tor).

Fig. 4 Graphical representation of catalyst structure-selectivity trends as a function of substrate.

Enantioselectivity trends for every catalyst against all substrates. Each trend line represents a catalyst, and each x axis tick-line represents a substrate.

Fig. 5 Graphical representation of substrate structure-selectivity trends as a function of catalyst.

Enantioselectivity trends for every substrate against all catalysts. Each trend line represents a substrate, and each x axis tick-line represents a catalyst.

Analysis of this systematic data arrangement reveals that in general, catalyst performance correlates with the aryl substitution pattern in the order 2,4,6 > 2,6 > 4 (Fig. 5, gray, orange, and blue frames, respectively). Additionally, by juxtaposing Figs. 4 and 5, it appears that the reaction is mainly under catalyst control because catalyst features affect enantioselectivity in a more considerable and systematic manner (Fig. 5), whereas for each substrate, the spread of observed enantioselectivity is broader (Fig. 4). All substrates bearing a 2-substituted benzyl group, even those with substitution at R2 (Fig. 4, purple and blue frames, respectively), can be modeled by using the torsion angle between the triazole and its substituent, and a triazole vibration frequency (∠tor, νN–N) (individual models are provided in table S7). The torsion angle represents a steric effect yet also contains information concerning the conjugation between the triazole ring and its substituent. The vibration frequency can serve as a correction to both of these effects because it takes into account nonadditive features of the substituents’ charge and mass distribution (25).

The models for substrates with 4-benzyl substitution (Fig. 4, green frame) contain the same two terms (∠tor, νN–N) and an additional steric descriptor (the catalyst aryl ring minimal width, B1). Similar interactions with the triazole ring should still be present for these 4-benzyl substrates, but the presence of a B1 term suggests an additional steric interaction between the substrate and catalyst substituents, which is avoided in the case of hydrogen at the 4-benzyl position. This claim is supported by the lower enantioselectivities observed for substrates with larger 4-benzyl substituents, especially when using catalysts with larger 2,6-groups. Thus, the lack of the catalyst B1 term in the models describing 2- relative to 4-benzyl substrates, and their overall higher enantioselectivities, are thought to arise from a better accommodation of the former substrates’ shapes in the catalyst active site.

The individual catalyst trends and models carry complementary information, revealing the substrate descriptors relevant to each catalyst subset (Fig. 5). The parameters that correlate with 4-substituted aryl catalysts’ enantioselectivity (Fig. 5, blue frame) include the substrate carbonyl-stretching vibrational frequency and intensity (ν/iC=O), the amide N–H vibrational frequency (νN–H), and a cross-term between the two (νC=O x νN–H). These values vary in response to substitution on the substrate benzyl ring or the distal aryl ring, with the former having a greater effect (table S2). These same parameters effectively correlate with the enantioselectivity observed for 2,6-disubstituted aryl catalysts (Fig. 5, orange frame), along with an additional vibrational term (νBn) describing a benzyl ring stretch. The enantioselectivity spread of catalysts with larger substituents at the 2,4,6-position [for example, 38 to 93% ee for 2,4,6-(Cy)3-Ph catalyst 1d (Fig. 5, gray frame)] can be described by using two terms associated with the benzyl ring (νBn, iN–H), stressing this ring’s role in determining enantioselectivity.

Trend interpretation

Collectively, these results suggest that a π interaction is established between the triazole ring and substrate during the enantioselectivity determining step, the strength of which is modulated by local steric and electronic structural features of both interacting partners (13, 28–34). Furthermore, π interactions are often strengthened by heteroatoms (35–37), which could explain the reduced enantioselectivity values obtained by using imid-1e, pyr-1e, and TCYP catalysts compared with 1e (Fig. 1A), as well as the similar KIE values obtained when using both the (R)- and (S)-enantiomers of pyr-1e and TCYP as catalysts, compared with the divergent ones displayed by (R)- and (S)-1e (Fig. 2B). In relation to the substrate, participation of both the benzyl group and the distal aryl group in putative π interactions is supported by the presence of molecular descriptors that are sensitive to substitution on these rings in every catalyst model (νC=O, iC=O, νN–H, and νBn) (Fig. 5 and table S2).

The energy stabilization gained from π interactions is affected by the distance and geometry of the rings involved (13, 30–37). If a π interaction between the substrate and triazole is at the origin of enantioselectivity determination, the directionality of the triazole, represented by the torsion angle between the triazole and its substituent, is expected to directly affect enantioselectivity. In agreement with this hypothesis, catalysts with more pronounced torsional effects lead to higher enantioselectivity values for substrates with relatively small substituents at the benzyl 4-position. The torsion angle approaches perpendicularity (90°) owing to larger 2,6-substituents on the catalyst aryl ring connected to the triazole (Fig. 4, purple and blue frames, blue lines). Moreover, large catalyst 2,6-aryl substituents are presumed to serve as a steric barrier, docking the substrate in place for an improved overlap with the catalyst triazole ring. Correspondingly, substrates with elongated 4-substituents (R = Me, OMe) lead to lower enantioselectivities by using catalysts with large substituents at the 2,6-position (Fig. 4, green frame, blue lines). For these substrates, the steric repulsion exerted by the large 2,6-substituents affords a weaker or less directing π interaction and, subsequently, lower enantioselectivity. Thus, the importance of the torsion angle and vibration parameters for correlating enantioselectivity in the individual models and overall trends (Fig. 4, fig. S2, and table S7) is proposed to reflect the angle at which the triazole engages the substrate and the steric role of the catalyst aryl group. Lending further credence to this proposal, catalysts with reduced torsion angles (such as catalysts with triazole R substituents: Ph, 4-NO2Ph, 4-OMePh, or 4-SO2MePh) that do not introduce the proposed directional and steric effects lead to diminished enantioselectivities overall (Fig. 5).

Comprehensive model and probes of mechanistic hypotheses

On the basis of these hypotheses, we set out to design a series of new catalysts to specifically probe putative interactions. To facilitate catalyst development, a predictive model (Fig. 6) was generated for the entire substrate set with the aryl-substituted catalysts (1b-1d and 1f-1k). This model contains 108 combinations (9 catalysts times 12 substrates) from the initial library of experiments, where half were used as a training set (Fig. 6B, black squares) and the other half as external validations (Fig. 6B, red crosses). New catalysts were proposed to address hypotheses raised by the focused models and trend analysis, and their enantioselectivity was predicted before synthesis by the comprehensive model (Fig. 6).
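The size of the comprehensive data set and its division into equal training and validation halves can be sketched as follows. The catalyst and substrate labels follow the ranges given in the text (1b-1d, 1f-1k and 2a-2l); how the halves were actually chosen is not stated, so the seeded random shuffle used here is purely an assumption for illustration.

```python
import itertools
import random

# Labels taken from the ranges quoted in the text.
catalysts = ["1b", "1c", "1d", "1f", "1g", "1h", "1i", "1j", "1k"]
substrates = [f"2{letter}" for letter in "abcdefghijkl"]

# Every catalyst-substrate combination: 9 x 12 = 108 entries.
pairs = list(itertools.product(catalysts, substrates))

# Assumed splitting scheme: reproducible random shuffle, then cut in half.
shuffled = pairs[:]
random.Random(0).shuffle(shuffled)
half = len(shuffled) // 2
training, validation = shuffled[:half], shuffled[half:]

print(len(pairs), len(training), len(validation))
```

Keeping the validation half entirely out of the fit is what allows its measured-versus-predicted agreement to count as an external check rather than a restatement of the training error.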

Fig. 6 Validation of mechanistic hypotheses through directed catalyst design.

(A) Normalized equation for the prediction of enantioselectivity by using parameters that describe catalyst and substrate structural features. (B) Predicted versus measured ΔΔG plot. (C) Hypothesis-driven external predictions and their comparison with other relevant catalysts.

First, to probe whether the aryl substituent on the triazole ring plays a primarily steric role, rather than directly engaging the substrate in a π interaction, perfluorophenyl catalyst 1l was designed and evaluated. Substituent local sterics and charge distribution have been shown to strongly affect noncovalent π interactions (28–30, 32–39). Therefore, we expected that if enantioselectivity were predominantly dependent on the aryl substituent directly engaging as a partner in a π interaction (as opposed to taking an ancillary role in π interactions involving the triazole), perfluorophenyl catalyst 1l would deviate significantly from its Ad (1e), 2,6-(F)2-Ph (1k) or 2,6-(MeO)2-Ph (1j) counterparts. However, all four catalysts behave similarly with respect to the magnitude and sign of enantioselectivity (Fig. 6C, entries 1 to 9 and 24 to 26). This result is well predicted by the model (Fig. 6B) and is consistent with the hypothesis that the main function of the aryl substituent is steric.

Next, catalyst 1m, bearing a single isopropyl group at the 2-position of the triazole aryl substituent, was prepared to probe the hypothesis that steric repulsion exists between larger catalyst 2,6-substituents and elongated substrate 4-substituents. We anticipated that an isopropyl group would induce the torsion necessary to enforce the proposed benzyl-triazole π interaction, while avoiding a direct steric interaction between the substrate benzyl 4-position and the catalyst aryl 2,6-substituents. Indeed, for all 4-substituted substrates tested (Fig. 6C, entries 10 to 12), 1m provided the corresponding product in higher enantioselectivity than that of 1c, which bears isopropyl groups at both ortho positions of the triazole aryl substituent (Fig. 6C, entries 13 to 15). For 4-NO2-Bn substrate 2d, the 2-iPr catalyst 1m resulted in the highest enantioselectivity observed to date (Fig. 6, B and C, entry 12), as predicted by the model.

Last, in order to evaluate the capacity to obtain improved enantioselectivity as a result of a data-intensive approach, and the hypothesis that torsion leads to enhanced enantioselectivity for the 2-substituted substrates, several proposed catalysts were evaluated by using the model in Fig. 6. Catalyst 1n was selected because it is synthetically feasible, accommodates a torsion angle close to 90°, and was predicted to give improved enantioselectivity for all substrates bearing hydrogen at the 4-benzyl position. This prediction was verified in practice for all eight substrates evaluated, affording the highest enantioselectivities observed to date (Fig. 6C, entries 16 to 23). These results confirm that a perpendicular geometry of the triazole and the aryl ring can indeed lead to higher enantioselectivities, supporting the premise that the orientation of the triazole ring coupled with its R group’s steric constraints control triazole π interactions.

The overall analysis of the triazole-PA case study demonstrates the complementary manner in which classical physical organic techniques and modern data analysis strategies can be merged toward a more complete mechanistic assessment (40). This approach is based on the use of empirical data, which is often a prerequisite for a rational reaction optimization process, to concomitantly conduct a mechanistic investigation. Information of this sort that could be used for an in-depth analysis is often omitted from reports in the field of catalysis because only results leading to the desired outcomes are generally presented and pursued. Yet because high-throughput (2, 4, 8), automated (7) methods for reaction development are now common, data analysis strategies could be applied in parallel to optimization procedures, allowing for simultaneous mechanistic and structural analysis.

Creatively collecting and organizing data to examine proposed hypotheses affords improved generalizations, particularly as data sets become larger and more complex (23). This idea holds true for the analysis of reaction trends by parameters that reflect structural modification. Indeed, the focused catalyst and substrate models—and their organization according to fundamental, quantitative, physical-organic trends—provided nonintuitive insights regarding interactions involved in enantioselectivity determination. Although this approach is general for the prediction and study of chemical reaction outcomes, this case study was chosen as a stringent benchmark because it uses weak, noncovalent interactions for asymmetric induction. These interactions are typically in the energy range required to distinguish a highly enantioselective reaction from its racemate forming counterpart (2 to 3 kcal/mol) (12, 13, 32, 37), providing seemingly endless approaches to rational catalyst design. However, controlling noncovalent interactions represents a notable challenge in the design of catalytic systems because of multiple energetically accessible orientations (13). Complemented with rigorous experimental analysis, the disclosed data-intensive approach is suited to addressing such intricacies and holds potential for the analysis of increasingly complicated catalytic systems streamlining both reaction and catalyst development.

Supplementary Materials

Materials and Methods


Figs. S1 to S7

Tables S1 to S6

References (41–53)

References and Notes

  1. J. A. Quinn, Molecular Modeling Pro, 6.36 (Norgwyn Montgomery Software, North Wales, PA).
  2. MATLAB (The MathWorks, Natick, MA).
  3. Acknowledgments: We thank the NSF (CHE-0749506 and CHE-1361296) and the National Institute of General Medical Sciences (R01 GM104534) for partial support of this work. The support and resources from the Center for High Performance Computing at the University of Utah are gratefully acknowledged. A.J.N. gratefully acknowledges an Amgen Fellowship in Organic Chemistry for funding and Jörg Hehn for early contributions to this work.