Report

The Biochemical Architecture of an Ancient Adaptive Landscape

Science  21 Oct 2005:
Vol. 310, Issue 5747, pp. 499-501
DOI: 10.1126/science.1115649

Abstract

Molecular evolution is moving from statistical descriptions of adaptive molecular changes toward predicting the fitness effects of mutations. Here, we characterize the fitness landscape of the six amino acids controlling coenzyme use in isopropylmalate dehydrogenase (IMDH). Although all natural IMDHs use nicotinamide adenine dinucleotide (NAD) as a coenzyme, they can be engineered to use nicotinamide adenine dinucleotide phosphate (NADP) instead. Intermediates between these two phenotypic extremes show that each amino acid contributes additively to enzyme function, with epistatic contributions confined to fitness. The genotype-phenotype-fitness map shows that NAD use is a global optimum.

The role of epistasis (interactions among mutations that produce nonadditive effects on phenotype and fitness) in evolution remains hotly debated (1–8). Although epistasis is routinely detected in natural and in experimental populations (4, 9, 10), its presence need not imply the existence of multiple peaks in an adaptive landscape (11). Indeed, the question remains: Are adaptive landscapes rugged, or are they smooth?

Characterizing the adaptive landscape of an enzyme is conceptually simple. Mutations controlling a phenotype must be identified. Mutants of intermediate phenotype must be engineered so that the connections between genotype and phenotype (the genotype-phenotype map) can be explored. The fitness of each mutant must be determined so that the relationships between genotype and fitness (the genotype-fitness map) can be established. Finally, a model relating phenotype to fitness (the phenotype-fitness map) is needed to specify the mechanism of selection.

We characterized the adaptive landscape governing coenzyme use by isopropylmalate dehydrogenase (IMDH), an enzyme that catalyzes a step in the biosynthesis of leucine, an essential amino acid. All IMDHs use nicotinamide adenine dinucleotide (NAD) as a coenzyme, although some related isocitrate dehydrogenases (IDHs) lie at the other phenotypic extreme and use nicotinamide adenine dinucleotide phosphate (NADP) instead. Six amino acid residues critical to coenzyme use have been identified (12–15) (Fig. 1). Enzyme performance (P = k_cat/K_m) and preference (P_NAD/P_NADP, the number of NADs turned over for each NADP turned over when both coenzymes are present in equimolar concentrations) are phenotypes relevant to fitness (16). The fitnesses of engineered mutants are estimated using the Escherichia coli chemostat competition assay (17). Finally, the physiological basis of fitness is described using a simple model of metabolism.

Fig. 1.

Crystallographic structures identify amino acids determining coenzyme use. Only key residues are shown (gray, carbon; red, oxygen; blue, nitrogen; yellow, phosphorus). (A) Structural alignment of E. coli IMDH (13) (brown main-chain; labels designate the amino acid followed by the site number) and Thermus thermophilus IMDH (14) (blue main-chain) showing the double H-bond (pink lines) critical to NAD use. (B) Structural alignment of E. coli IMDH and E. coli IDH (green main-chain) with NADP bound (15) showing IDH residues (following the IMDH site number) H-bonding to the 2′-phosphate (2′P) of bound NADP (H-bonds from the disordered 289Lys not shown).

Protein engineering (18) was used to switch the coenzyme specificity of E. coli IMDH from NAD to NADP. Unlike most IMDHs, E. coli IMDH already has the Arg-341 found in all NADP-dependent IDHs. The remaining five replacements (Asp236Arg, Asp289Lys, Ile290Tyr, Ala296Val, and Gly337Tyr) were introduced into the coenzyme-binding pocket of E. coli IMDH by site-directed mutagenesis. Specificity was changed by a factor of 20,000, from a 100-fold preference for NAD to a 200-fold preference for NADP (P_NAD = 0.18 × 10^3 M^-1 s^-1 and P_NADP = 37 × 10^3 M^-1 s^-1). The engineered “RKYVYR” enzyme is both as active and as specific toward NADP as the wild-type enzyme is toward NAD.
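
The arithmetic behind the quoted specificity change can be checked with a short sketch. It uses only the two RKYVYR performances and the rounded 100-fold wild-type preference given above; the variable names are illustrative and not from the original report.

```python
# Performances (k_cat/K_m) of the engineered RKYVYR enzyme as quoted in the
# text, in units of 10^3 M^-1 s^-1.
p_nad_rkyvyr = 0.18
p_nadp_rkyvyr = 37.0

# Preference is the ratio of the performances with the two coenzymes.
nadp_preference_rkyvyr = p_nadp_rkyvyr / p_nad_rkyvyr  # roughly 200-fold

# The text gives the wild-type preference only as a rounded "100-fold" for NAD.
nad_preference_wt = 100.0

# Overall change in specificity: wild-type NAD preference multiplied by the
# mutant's NADP preference.
specificity_change = nad_preference_wt * nadp_preference_rkyvyr

print(f"RKYVYR preference for NADP: {nadp_preference_rkyvyr:.0f}-fold")
print(f"Change in specificity: about {specificity_change:.0f}-fold")
```

The product comes out slightly above 20,000 because the text rounds both preferences.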

To characterize the genotype-phenotype map, we engineered various combinations of amino acids at the six sites (table S1) (18). The kinetic performances of 164 mutant enzymes toward NAD and NADP were estimated. Nested analyses of variance (NANOVA) (19) of log_e-transformed performances and preferences show that a simple linear additive model of the form

$$y = m + \sum_{j} a_{i.j} \qquad (1)$$

explains most of the data (y is performance or preference, m is the sample mean, and a_{i.j} is the additive deviation caused by amino acid i at site j): r^2 = 0.95 for log_e(P_NAD), r^2 = 0.92 for log_e(P_NADP), and r^2 = 0.97 for log_e(P_NADP/P_NAD). Performance and preference are dominated by additive effects (Table 1). There is no evidence for epistasis in these genotype-phenotype maps.

Table 1.

Additive effects (ai.j) of amino acid replacements on coenzyme use.

| Site | Residue | Performance effect, NAD | SE | Performance effect, NADP | SE | Preference effect*, NAD-NADP | SE |
|------|---------|-------------------------|--------|--------------------------|--------|------------------------------|--------|
| 236 | Arg | -0.250 | ±0.039 | 0.735 | ±0.046 | -0.985 | ±0.041 |
| | Asp | 0.250 | ±0.039 | -0.735 | ±0.046 | 0.985 | ±0.041 |
| 289 | Lys | -0.657 | ±0.060 | 1.547 | ±0.071 | -2.204 | ±0.062 |
| | Asp | 0.850 | ±0.057 | -0.722 | ±0.069 | 1.572 | ±0.062 |
| | Asn† | -0.019 | ±0.059 | 0.506 | ±0.072 | -0.509 | ±0.045 |
| | Glu† | -0.183 | ±0.074 | -1.332 | ±0.089 | 1.148 | ±0.079 |
| 290 | Tyr | -0.680 | ±0.082 | 0.659 | ±0.092 | -1.367 | ±0.102 |
| | Ile | 2.218 | ±0.078 | 0.255 | ±0.094 | 1.963 | ±0.083 |
| | His† | -0.824 | ±0.099 | 0.684 | ±0.119 | -1.508 | ±0.106 |
| | Leu† | 2.267 | ±0.076 | 1.058 | ±0.120 | 1.209 | ±0.109 |
| | Phe† | -0.516 | ±0.099 | -0.869 | ±0.119 | 0.353 | ±0.102 |
| | Lys† | 0.159‡ | ±0.109 | 0.911 | ±0.123 | -0.729 | ±0.107 |
| | Asn† | -1.981 | ±0.096 | -1.713 | ±0.116 | -0.268 | ±0.112 |
| | Gln† | -0.633 | ±0.094 | -1.039 | ±0.112 | 0.406 | ±0.109 |
| 296 | Val | -0.672 | ±0.038 | -0.577 | ±0.047 | -0.095 | ±0.041 |
| | Ala | 0.672 | ±0.038 | 0.577 | ±0.047 | 0.095 | ±0.041 |
| 337 | Tyr | -0.193 | ±0.094 | 0.058‡ | ±0.046 | -0.139 | ±0.040 |
| | Gly | 0.193 | ±0.094 | -0.058‡ | ±0.046 | 0.139 | ±0.040 |
| 341 | Arg | 0.291 | ±0.047 | 0.375 | ±0.058 | -0.084‡ | ±0.050 |
| | Ser | -0.291 | ±0.047 | -0.375 | ±0.058 | 0.084‡ | ±0.050 |
| m | | -0.095 | ±0.055 | 0.013 | ±0.065 | -0.108 | ±0.056 |

* The preference effect is defined as awt.jai.j.

† Possible transitional replacements attributable to multiple base substitutions needed to exchange Asp for Lys at site 289 and Ile for Tyr at site 290.

‡ Not significantly different from zero.

Statistical additivity implies thermodynamic additivity. Simple enzyme transition state theory (16) suggests

$$\Delta\Delta G^{\ddagger}_{mut} = RT \ln\!\left(\frac{P_{mut}}{P_{wt}}\right) = \sum_{j} \Delta\Delta G^{\ddagger}_{i.j} \qquad (2)$$

where ΔΔG‡_mut = ΔG‡_wt – ΔG‡_mut is the total difference in free energies between the enzyme transition states of the mutant (mut) and the wild type (wt), RT is the product of the gas constant and the absolute temperature, and ΔΔG‡_{i.j} represents the difference attributable to replacing a single wt amino acid at site j with an amino acid i. Thermodynamic additivity has been seen in a number of studies of protein folding (20), protein-protein interactions (21), and catalysis (22). The lack of epistasis in coenzyme performance by IMDH is typical of many molecular genotype-phenotype maps, although nonadditive effects arise in some (23–25).
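
As a rough illustration of what an additive effect in Table 1 means energetically, the sketch below converts a difference in log_e(k_cat/K_m) into a free-energy difference using the standard transition-state relation k_cat/K_m ∝ exp(–ΔG‡/RT). The temperature is an assumption (the assay temperature is not stated in this text), and the function name is hypothetical.

```python
R_GAS = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1
T_ASSUMED = 298.15      # ASSUMPTION: 25 C; the assay temperature is not given here


def ddg_from_log_effect(log_effect, temperature=T_ASSUMED):
    """Free-energy equivalent (kJ/mol) of a change in log_e(k_cat/K_m)."""
    return R_GAS * temperature * log_effect


# Additive effects on log_e(P_NAD) at site 290, taken from Table 1.
a_ile_290 = 2.218
a_tyr_290 = -0.680

# Difference in transition-state stabilization between Ile and Tyr at site 290
# for NAD use, under the assumed temperature.
ddg_ile_vs_tyr = ddg_from_log_effect(a_ile_290 - a_tyr_290)
print(f"Ile vs. Tyr at site 290: about {ddg_ile_vs_tyr:.1f} kJ/mol")
```

Under these assumptions a difference of about 2.9 log_e units corresponds to roughly 7 kJ/mol, about the size of a single hydrogen bond.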

No enzyme performs well with both coenzymes (Fig. 2A). Given thermodynamic additivity, the performances of each of the remaining 512 – 164 = 348 mutant intermediates can be predicted by summing the additive effects (Table 1). Again, the interior of the plot is empty (Fig. 2B). Evidently, a performance trade-off severely restricts the possible phenotypes upon which selection can act.
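
The prediction step can be sketched as follows: enumerate every combination of the residues listed in Table 1 and sum the additive effects. This is a minimal illustration rather than the authors' code. The one-letter genotype strings (in the style of "RKYVYR") and the function name are assumptions made for the example, and the wild type is taken to be DDIAGR from the replacements listed in the text.

```python
from itertools import product
from math import exp

# Additive effects from Table 1 as (NAD, NADP) pairs on log_e(performance),
# keyed by site and one-letter residue code.
EFFECTS = {
    236: {"R": (-0.250, 0.735), "D": (0.250, -0.735)},
    289: {"K": (-0.657, 1.547), "D": (0.850, -0.722),
          "N": (-0.019, 0.506), "E": (-0.183, -1.332)},
    290: {"Y": (-0.680, 0.659), "I": (2.218, 0.255), "H": (-0.824, 0.684),
          "L": (2.267, 1.058), "F": (-0.516, -0.869), "K": (0.159, 0.911),
          "N": (-1.981, -1.713), "Q": (-0.633, -1.039)},
    296: {"V": (-0.672, -0.577), "A": (0.672, 0.577)},
    337: {"Y": (-0.193, 0.058), "G": (0.193, -0.058)},
    341: {"R": (0.291, 0.375), "S": (-0.291, -0.375)},
}
MEAN = (-0.095, 0.013)  # sample means m for log_e(P_NAD) and log_e(P_NADP)
SITES = sorted(EFFECTS)


def predict_log_performance(genotype):
    """Return predicted (log_e P_NAD, log_e P_NADP) for a six-letter genotype."""
    log_nad, log_nadp = MEAN
    for site, residue in zip(SITES, genotype):
        nad_effect, nadp_effect = EFFECTS[site][residue]
        log_nad += nad_effect
        log_nadp += nadp_effect
    return log_nad, log_nadp


# Every combination of the tabulated residues: 2 * 4 * 8 * 2 * 2 * 2 genotypes.
all_genotypes = ["".join(g) for g in product(*(EFFECTS[s] for s in SITES))]
print(len(all_genotypes))

wt_nad, wt_nadp = predict_log_performance("DDIAGR")   # assumed wild type
mut_nad, mut_nadp = predict_log_performance("RKYVYR")
print(f"wild type: predicted NAD preference ~{exp(wt_nad - wt_nadp):.0f}-fold")
print(f"RKYVYR:    predicted NADP preference ~{exp(mut_nadp - mut_nad):.0f}-fold")
```

The enumeration yields the 512 genotypes mentioned in the text, and the predicted wild-type preference for NAD is close to the 100-fold figure quoted earlier.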

Fig. 2.

Performances (10^3 M^-1 s^-1) of engineered IMDH mutants toward NAD and NADP reveal a trade-off in enzyme function. (A) Distribution of performances for the 164 engineered enzymes constructed. (B) Distribution of performances for 512 genotypic intermediates predicted on the assumption of thermodynamic additivity. Symptomatic of a trade-off in performance, the interiors of both plots are devoid of mutants.

The genotype-fitness map reveals strong epistasis in fitness. Ninety IMDH mutants, representing a stratified sample of kinetic performances, were recombined individually into the leu operon on the E. coli chromosome and their fitnesses relative to those of the wild type determined in chemostat competition (17, 18). A NANOVA (19) (residues within sites) of fitness assuming only additive effects produced a poor fit (r^2 = 0.85). Interactions were not modeled because they required many more degrees of freedom than our NANOVA design permitted. Eliminating “transitional” residues (table S1) (18) from the analysis allows pairwise interactions to be modeled. The resulting NANOVA included six significant pairwise interactions (r^2 = 0.99) and was a marked improvement over the strictly additive model (r^2 = 0.87). Hence, epistasis is present in the genotype-fitness map.

The phenotype-fitness map shows how epistasis, absent in the genotype-phenotype map, arises in the genotype-fitness map. Fitness is commonly a concave function of enzyme performance (26, 27). Assume fitness (w) is a hyperbolic function of intracellular IMDH performance toward isopropylmalate, V_max/K_m:

$$w = \frac{w_{max}\,(V_{max}/K_m)}{K + V_{max}/K_m} \qquad (3)$$

where w_max is maximum fitness when V_max/K_m ≫ K, K is the performance necessary to produce w_max/2, V_max is the maximum intracellular rate when isopropylmalate is saturating, and K_m is the concentration of isopropylmalate necessary to produce V_max/2. The concave nature of Eq. 3 is typical of the nonlinear responses in metabolic flux to changes in enzyme activities that produce genetic dominance, phenotypic robustness, and epistasis at higher levels of biological organization (28, 29). Epistasis in fitness arises because the same mutation producing the same proportional increase in activity in a wild type as in a mutant compromised by mutation will cause a smaller increase in fitness in the wild type (because w ≈ w_max when V_max/K_m ≫ K) than it will in the mutant [because w ≈ w_max (V_max/K_m)/K when V_max/K_m ≪ K].
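
A small numerical sketch makes the source of this epistasis concrete. It implements the hyperbolic form described above (fitness reaches half of its maximum when performance equals K) and applies the same twofold increase in performance to a high-performance and a low-performance background. All numerical values are hypothetical and chosen only for illustration.

```python
def fitness(performance, w_max=1.0, k=1.0):
    """Hyperbolic fitness: w = w_max * P / (K + P), so w = w_max / 2 at P = K."""
    return w_max * performance / (k + performance)


def fitness_gain(performance, fold=2.0):
    """Fitness gained when performance is multiplied by `fold`."""
    return fitness(fold * performance) - fitness(performance)


# HYPOTHETICAL performances, in units of K (not measured values).
wild_type_like = 100.0  # P >> K: already on the fitness plateau
compromised = 0.1       # P << K: on the steep, nearly linear part of the curve

gain_wild_type = fitness_gain(wild_type_like)
gain_compromised = fitness_gain(compromised)

print(f"gain on wild-type-like background: {gain_wild_type:.4f}")
print(f"gain on compromised background:    {gain_compromised:.4f}")
```

The same proportional change in the phenotype gives a much larger fitness gain on the compromised background, so effects that are additive on the log scale of performance become nonadditive in fitness.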

Substituting a kinetic model describing the IMDH random bi-ter kinetic mechanism for the performance term in Eq. 3 and collecting terms produces the phenotype-fitness map in terms of the coenzyme kinetics (Eq. 4), where A, B, C, D, and R are constants associated with kinetic terms and coenzyme pools unaffected by our mutations (18). Equation 4 is a hypothesis that describes fitness in terms of the kinetic parameters obtained for each mutant enzyme. It fits the data well: nonlinear regression yields r^2 = 0.97.

    Noting that the Math values are necessarily correlated with—and hence can be collapsed into—the Math values allows the phenotype-fitness map to be visualized (Fig. 3A). IMDH fitness is maximized exclusively by high performance with, and high preference for, NAD (the wild type on the right fitness plateau). NADP use is suboptimal.

Fig. 3.

The phenotype-fitness map of IMDH. (A) The fitnesses (green spheres of fitness radius w = 0.5 ≈ 2 SE) of 90 engineered mutants plotted against their coenzyme performances (10^3 M^-1 s^-1). The fitted surface is the estimated phenotype-fitness map (Eq. 4). It reveals a single broad adaptive peak on which resides the NAD-specific wild-type enzyme (red sphere). NADP use is advantageous only in mutants with very poor NAD performance (e.g., the RKYVYR mutant, white sphere). (B) Escape from the lower NADP-use plateau to the higher NAD-use plateau is possible because some single amino acid replacements (blue spheres from the RKYVYR mutant, pink spheres from the wild type; asterisks denote fitness values predicted from kinetic data) produce sufficiently large effects on performance and fitness that the maladaptive valley near the origin is bypassed.

Product inhibition by NADPH lowers the fitness of NADP users. Most intracellular NADP is in the reduced form, NADPH, which has a 30-fold higher affinity for IMDH. Thermodynamic additivity ensures that mutations in the coenzyme-binding pocket that improve performance with NADP also increase affinity for NADPH (fig. S1) (18). Consequently, any benefit gained by improved performance with NADP is offset by intensified product inhibition by abundant NADPH. A similar correlation for NAD use does not generate a measurable cost because so little intracellular NAD is in the reduced NADH form (B « C, Eq. 4) (18).

    The phenotype-fitness map has a single peak (the broad NAD-use plateau in Fig. 3A). In principle, the trade-off in performance (Fig. 2) could combine with mutations of small functional effect to force all paths from the NADP-dependent RKYVYR mutant to the NAD-dependent wild type through the maladaptive valley at the origin. The result would be two peaks on the genotype-fitness map, with the higher NAD-use plateau inaccessible from the lower NADP-specific allele.

    The genotype-fitness map has just one adaptive peak. The fitnesses of all 512 mutant genotypes were predicted using Eq. 4 with enzyme performances calculated assuming thermodynamic additivity (Table 1 and Fig. 2B). A fitter genotype was mutationally accessible to every genotype—except for the NAD-dependent wild type, which was predicted to be fittest. Only one peak is expected because some mutations have sufficiently large phenotypic effects that the maladaptive valley at the origin of the phenotype-fitness map is bypassed (Fig. 3B).
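
The accessibility test described here (does every genotype except the fittest have a fitter single-replacement neighbor?) can be sketched generically. The code below is not the authors' analysis and does not use Eq. 4; it runs on two toy landscapes with made-up fitness values, and the function names are hypothetical. A "neighbor" is assumed to differ by one residue at one site.

```python
def single_replacement_neighbors(genotype, residues_by_site):
    """Yield every genotype that differs from `genotype` at exactly one site."""
    for site, residues in enumerate(residues_by_site):
        for residue in residues:
            if residue != genotype[site]:
                yield genotype[:site] + residue + genotype[site + 1:]


def local_peaks(fitness_by_genotype, residues_by_site):
    """Return genotypes with no fitter single-replacement neighbor."""
    peaks = []
    for genotype, w in fitness_by_genotype.items():
        neighbors = single_replacement_neighbors(genotype, residues_by_site)
        if all(fitness_by_genotype[n] <= w for n in neighbors):
            peaks.append(genotype)
    return sorted(peaks)


# TOY DATA: two sites with two residues each; fitness values are invented.
toy_sites = ["ab", "ab"]
smooth_landscape = {"aa": 1.0, "ab": 0.8, "ba": 0.7, "bb": 0.4}
rugged_landscape = {"aa": 1.0, "ab": 0.2, "ba": 0.3, "bb": 0.6}

print(local_peaks(smooth_landscape, toy_sites))  # one peak
print(local_peaks(rugged_landscape, toy_sites))  # two peaks
```

On the rugged toy landscape, "bb" is a second peak because both of its neighbors are less fit, which is the situation the trade-off could have produced if all mutations had small effects. Applying the same check to the 512 predicted IMDH fitnesses would require the fitted Eq. 4, which is not reproduced in this text.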

    Defining all mutational connectivities between the genotypes on the phenotype-fitness map completes the IMDH adaptive landscape. With its single peak, the landscape is far less rugged than those envisioned by Wright (1). With epistasis consigned to a minor role, this landscape lies closer to Fisher's conception (2) than to any other. Ironically, the landscape might have been more rugged, with two adaptive peaks, had another of Fisher's assumptions, that of many mutations each of small effect (2), proven correct.

Coenzyme use by IMDH is likely representative of a large class of adaptive landscapes in which thermodynamic additivity in molecular function (Table 1) (20–22) combines with concave fitness functions at the organismal level (Eq. 3) (26, 27). Nevertheless, landscapes are likely to be more rugged whenever epistasis in genotype-phenotype maps (23–25) combines with complex phenotype-fitness maps (26).

    Our landscape provides a mechanism sufficient to explain why all IMDHs use NAD. Conservation of this phenotype implies that we have characterized an ancient adaptive landscape—unchanged in all lineages, in all habitats, since the last common ancestor. Such ancient landscapes can explain adaptive processes at the very dawn of life's diversity.

    Supporting Online Material

    www.sciencemag.org/cgi/content/full/310/5747/499/DC1

    SOM Text

    Materials and Methods

    Figs. S1 and S2

    Table S1

    References and Notes
