Assembly of an Evolutionarily New Pathway for α-Pyrone Biosynthesis in Arabidopsis


Science  24 Aug 2012:
Vol. 337, Issue 6097, pp. 960-964
DOI: 10.1126/science.1221614


Plants possess arrays of functionally diverse specialized metabolites, many of which are distributed taxonomically. Here, we describe the evolution of a class of substituted α-pyrone metabolites in Arabidopsis, which we have named arabidopyrones. The biosynthesis of arabidopyrones requires a cytochrome P450 enzyme (CYP84A4) to generate the catechol-substituted substrate for an extradiol ring-cleavage dioxygenase (AtLigB). Unlike other ring-cleavage–derived plant metabolites made from tyrosine, arabidopyrones are instead derived from phenylalanine through the early steps of phenylpropanoid metabolism. Whereas CYP84A4, an Arabidopsis-specific paralog of the lignin-biosynthetic enzyme CYP84A1, has neofunctionalized relative to its ancestor, AtLigB homologs are widespread among land plants and many bacteria. This study exemplifies the rapid evolution of a biochemical pathway formed by the addition of a new biological activity into an existing metabolic infrastructure.

As sessile organisms, land plants evolved the ability to synthesize specialized metabolites that are key to their adaptation to terrestrial ecosystems (1). The specialized metabolic pathways in plants typically comprise multiple catalytic steps that are spatially and temporally regulated and range from being widespread across land plants examined to date to lineage-specific (2). For example, flavonoids are ubiquitous in land plants, but the anticancer drug taxol is made only in certain yew species (3). The latter observation, and others like it, suggest that specialized metabolic systems are highly evolvable and can readily explore new chemical space in a relatively short evolutionary time scale (1, 3). However, it remains unclear how elaborate specialized metabolic pathways emerge.

Plant phenylpropanoids are phenylalanine-derived metabolites with diverse functions, which mediate plants’ interaction with biotic and abiotic environments, regulate growth, and serve as an important component of the secondary cell wall in vascular plants: lignin. Phenylpropanoid metabolism contains a core pathway conserved among land plants (fig. S1) (4) and additional metabolic offshoots derived from various intermediates of this core pathway (5). Some branches evolved relatively early and have become widespread in major lineages, such as syringyl (S) lignin biosynthesis in angiosperms (6). In contrast, later innovations remain restricted to a few species, such as hydroxycinnamoyl spermidine biosynthesis in the Brassicaceae family (7) (fig. S1).

All vascular plants contain p-hydroxyphenyl lignin and guaiacyl (G) lignin, whereas angiosperms have evolved to synthesize S lignin through additional aromatic ring modifications of G lignin precursors (fig. S1). The first step of this branch is catalyzed by the cytochrome P450 enzyme ferulate 5-hydroxylase (F5H, or CYP84A1 in Arabidopsis), which converts coniferaldehyde and coniferyl alcohol to their 5-hydroxylated derivatives (8, 9). Arabidopsis mutants defective in CYP84A1 (fah1 mutants) do not accumulate downstream products, including S lignin and sinapate esters (10). Arabidopsis has a paralog of CYP84A1, designated CYP84A4 (At5g04330). To elucidate its function, we identified a CYP84A4 transferred DNA (T-DNA) insertional mutant (SALK_064404), which we named apd1-1 for arabidopyrone deficient1 (fig. S2). apd1 accumulates normal amounts of sinapate esters (Fig. 1A and fig. S3), which, together with the strong phenotype of fah1, indicate that CYP84A1 and CYP84A4 are not redundant genes. We found that overexpression of CYP84A4 under the control of the cinnamate 4-hydroxylase (C4H) promoter does not complement the fah1 mutant phenotype (Fig. 1A and fig. S3), whereas the pC4H::CYP84A1 construct does (8). Finally, yeast-expressed CYP84A4 does not show detectable activity toward either coniferyl alcohol or coniferyl aldehyde, the primary substrates of CYP84A1. These data suggest that the catalytic activity of CYP84A4 is distinct from that of CYP84A1.

Fig. 1

(A) Rosette-stage wild-type, mutant, and transgenic Arabidopsis plants under visible light (upper panel) and ultraviolet (UV) light (lower panel). (B) High-performance liquid chromatography (HPLC) chromatograms of Arabidopsis stem extracts showing accumulation of four APs (labeled by dotted lines). Compounds 3 and 4 partially overlap chromatographically, but can be differentiated by their mass spectra (fig. S7). Because of the difference in the maximum UV absorbance wavelengths among the APs (fig. S7), their peak areas do not perfectly reflect their relative abundance [see fig. S4B for quantification of APs from mass spectrometry (MS)]. (C) Structures of the four APs, corresponding to compounds 1 to 4 as indicated in (B). (D) qRT-PCR analysis of CYP84A4 and AtLigB expression in various tissue types. Error bars, ±SE for six to eight biological replicates.

We identified four unknown metabolites present in stems of wild-type and fah1, but absent in both apd1-1 (Fig. 1B) and another T-DNA insertional mutant line of CYP84A4 (SALK_076723 or apd1-2) (figs. S2 and S4A). When apd1-1 plants were transformed with a CYP84A4 genomic construct, but not a p35S::CYP84A1 construct, the accumulation of the four compounds was restored (fig. S4A). These data indicate that CYP84A4, but not CYP84A1, is required for biosynthesis of the four compounds. When plants carrying the pC4H::CYP84A4 construct were analyzed, the compounds accumulated to higher amounts in stems and also appeared in leaves (Fig. 1B and fig. S5), suggesting that CYP84A4 catalyzes the rate-limiting step in their biosynthesis and that the tissue specificity of its expression limits the distribution of these metabolites. All CYP84A4-overexpressing plants in a fah1 background exhibited various levels of stunted growth, whereas such an effect was not observed in wild type (Fig. 1A and fig. S6).

Using liquid chromatography–mass spectrometry (LC-MS) and nuclear magnetic resonance, we found that these unknown compounds are 6-carboxy-2-pyrones with side chains reminiscent of those of phenylpropanoid alcohols or acids (Fig. 1C, fig. S7, and table S1). We named them arabidopyl alcohol, iso-arabidopyl alcohol, arabidopic acid, and iso-arabidopic acid and refer to them collectively as arabidopyrones (APs). The structure of arabidopyl alcohol was confirmed by synthesis (fig. S8 and table S1).

APs are nine-carbon metabolites that, like phenylpropanoids, bear a three-carbon side chain, suggesting that they may be derived from phenylalanine or tyrosine. Using isotopic labeling, we showed that all nine carbons of arabidopyl alcohol are derived from phenylalanine, but not tyrosine (Fig. 2 and fig. S9) (11). We further demonstrated efficient labeling of arabidopyl alcohol using deuterium-labeled cinnamic acid (Fig. 2 and fig. S9). These results suggest that APs are synthesized from phenylalanine via phenylalanine ammonia-lyase (PAL), the first enzyme in phenylpropanoid metabolism.

Fig. 2

Isotopic tracing of arabidopyl alcohol biosynthesis with 2H- or 13C-labeled precursors. Arabidopsis seedlings grown on media containing 100 μM isotopically labeled precursors were extracted and analyzed by LC-MS. (Left) Schematic summary of the results showing that the ortho-deuteriums from 2H-5-Phe and 2H-5-cinnamic acid, as well as the side-chain 13C from 13C-1-Phe, were incorporated efficiently into the corresponding positions on arabidopyl alcohol (depicted as dotted circles), whereas deuteriums from 2H-4-Tyr were not. (Right) Quantification of the isotopic distribution of the [M-H] negative ions of arabidopyl alcohol was based on peak areas for each isotope as determined by LC-MS. Error bars, ±SE for three replicate measurements.
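The quantification described in the Fig. 2 legend amounts to normalizing the LC-MS peak area of each isotopic species to the summed area. A minimal sketch of that arithmetic follows; the function name, the labels "M+0" to "M+2", and the peak areas are hypothetical placeholders chosen for illustration and are not measurements from this study.

```python
def isotopologue_fractions(peak_areas):
    """Convert per-isotopologue peak areas into fractional abundances."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("summed peak area must be positive")
    return {label: area / total for label, area in peak_areas.items()}


# Hypothetical peak areas (arbitrary units), NOT data from the paper.
example_areas = {"M+0": 2.0e5, "M+1": 3.0e4, "M+2": 7.7e5}

for label, fraction in isotopologue_fractions(example_areas).items():
    print(f"{label}: {fraction:.3f}")
```

Efficient incorporation of a labeled precursor would appear as a shift of the distribution toward the heavier species; an unincorporated precursor would leave the unlabeled species dominant.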

Known metabolic routes leading to heterocyclic compounds like APs include those that involve enzyme-catalyzed ring cleavage of catechol-substituted intermediates, although pyrones can also be synthesized by polyketide synthases (12). For example, in Portulaca grandiflora, 3,4-dihydroxyphenylalanine (DOPA) is cleaved by an extradiol ring-cleavage dioxygenase, and the product of this reaction nonenzymatically rearranges to form betalamic acid, a naturally occurring pigment in plants (13). Similarly, stizolobinic acid and stizolobic acid, phytoalexins in Stizolobium hassjoo, are synthesized from DOPA via a ring-cleavage dioxygenase (14). Knowing that CYP84A1 is a phenylpropanoid 5-hydroxylase, we speculated that CYP84A4 might have evolved into a phenylpropanoid hydroxylase responsible for the production of a 3,4-dihydroxylated phenolic intermediate that is subsequently cleaved by a downstream extradiol ring-cleavage dioxygenase. To test this hypothesis, we assayed CYP84A4 against a range of phenylpropanoid pathway intermediates. Among the compounds tested, only p-coumaraldehyde was a substrate for CYP84A4, yielding caffealdehyde as a product with a Michaelis constant (Km) of 41 ± 5.3 μM (Fig. 3A and fig. S10). To test these in vitro data for their relevance in planta, we attempted to biochemically complement the CYP84A4-deficient apd1-1 mutant by exogenous application of caffealdehyde. This treatment restores AP accumulation, but application of the structurally related caffeic acid and caffeyl alcohol does not (Fig. 3B). It is interesting that CYP84A1, which presumably represents the ancestral form from which CYP84A4 is derived, exhibits slight activity as a p-coumaraldehyde 3-hydroxylase in vitro, with Km values several hundred times those for its optimal substrates, coniferaldehyde and coniferyl alcohol, and 10 times the Km of CYP84A4 toward p-coumaraldehyde (fig. S10) (15). 
This observation, in agreement with the theory that catalytic promiscuity serves as the starting point for divergence of new enzyme activities (16), may represent a general means by which rapid functional diversification of specialized metabolic enzymes can occur.
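To make the Km comparison concrete, the sketch below evaluates the Michaelis-Menten rate law, v = Vmax[S]/(Km + [S]), for the two enzymes acting on p-coumaraldehyde. The Km of 41 μM is the value reported above for CYP84A4; the CYP84A1 value is only the rough tenfold multiple stated in the text, Vmax is normalized to 1 for both enzymes as a simplifying assumption, and the function name is hypothetical.

```python
def michaelis_menten_rate(substrate_um, km_um, vmax=1.0):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate_um / (km_um + substrate_um)


KM_CYP84A4_UM = 41.0                 # reported in the text
KM_CYP84A1_UM = 10 * KM_CYP84A4_UM   # approximate, from "10 times" in the text

# Compare fractional saturation (v / Vmax) at a few substrate concentrations.
for s_um in (10.0, 41.0, 410.0):
    v_a4 = michaelis_menten_rate(s_um, KM_CYP84A4_UM)
    v_a1 = michaelis_menten_rate(s_um, KM_CYP84A1_UM)
    print(f"[S] = {s_um:6.1f} uM  CYP84A4: {v_a4:.2f}  CYP84A1: {v_a1:.2f}")
```

Under these assumptions, CYP84A4 is half-saturated at 41 μM substrate, whereas CYP84A1 would need roughly ten times that concentration to reach the same fractional rate. Differences in Vmax, which the sketch ignores, would also affect the comparison.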

Fig. 3

(A) HPLC chromatograms of in vitro enzyme assays showing the p-coumaraldehyde 3-hydroxylase activity of yeast-expressed CYP84A4. Microsomes prepared from yeast containing pYeDP60 empty vector were used as a negative control. (B) HPLC analysis of apd1-1 seedling extracts. Seedlings were grown on media containing 100 μM caffeic acid, caffealdehyde, or caffeyl alcohol for 4 days before analysis. Only caffealdehyde restored the accumulation of APs in apd1-1. AP numbering is the same as in Fig. 1C.

Amino acids that are important for maintaining protein function are often under evolutionary constraint (17). On the basis of multiple sequence alignment, we calculated the degree of conservation for each position of the aligned sequences and mapped this information onto a homology model of CYP84A1. We found that the protein inner core encompassing the active site and the substrate channel is highly conserved in the CYP84 family (fig. S11A). However, CYP84A4 contains several mutations in these highly conserved regions (figs. S11B and S12). These residues have apparently escaped from the constraints imposed by F5H function and are potentially key residues that are associated with the neofunctionalization of CYP84A4.
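The text does not state which conservation measure was used. As an illustration only, the sketch below scores each alignment column with one common choice, a normalized Shannon entropy, where 1 means a fully conserved position. The scoring scheme, the function name, and the toy alignment are all assumptions and do not come from the CYP84 alignment.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Score each column as 1 - H/log2(20); gaps ('-') are ignored."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column: treat as unconserved
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Toy alignment, purely hypothetical.
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLA"]
print([round(score, 2) for score in column_conservation(toy_alignment)])
```

Per-position scores of this kind can then be mapped onto residues of a structural model to highlight conserved regions such as the active site and substrate channel.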

In addition to the deviation of enzymatic function, CYP84A4 also exhibits a tissue-specific expression pattern distinct from that of CYP84A1. Using a promoter::GUS reporter line, we detected CYP84A4 promoter activity in seedlings, roots, stems (mainly in phloem), and inflorescence nodes, but this staining was low or absent from leaves, flowers, seeds, and lignifying tissue in which CYP84A1 is expressed (fig. S13). Quantitative real-time polymerase chain reaction (qRT-PCR) analysis indicates that CYP84A4 is more highly expressed in seedlings and stems than in leaves or flowers, consistent with the distribution of APs (Fig. 1D).

In the Arabidopsis genome, At4g15093 or AtLigB is annotated as encoding an extradiol ring-cleavage enzyme, a protein that may be involved in AP biosynthesis. We identified two independent AtLigB T-DNA insertional lines, SALK_141715 (apd2-1) and SAIL_7_F11 (apd2-2) (fig. S2), and in these plants AP accumulation is near or below the detection limits (Fig. 1B and fig. S4A). Furthermore, overexpression of AtLigB under the control of the 35S promoter in apd2-1 restored AP biosynthesis to the wild-type level (fig. S4A), indicating that AtLigB is required for AP biosynthesis (Fig. 4A). qRT-PCR data and microarray data show that AtLigB is expressed not only in seedlings and stems, but also in roots, leaves, and flowers where AP accumulation was not detected (Fig. 1D and fig. S14), consistent with the hypothesis that tissue-specific expression of CYP84A4 is the major limiting factor for the distribution of APs.

Fig. 4

(A) A proposed pathway for AP biosynthesis in Arabidopsis. Enzymes that may be involved, but have not yet been identified, are italicized and denoted with question marks. 4CL, 4-hydroxycinnamoyl coenzyme A (CoA) ligase; CCR, (hydroxy)cinnamoyl-CoA reductase; GOLP, glucose oxidase-like protein; ADH, alcohol dehydrogenase; ALDH, aldehyde dehydrogenase. (B) Maximum likelihood (ML) phylogenetic analysis of CYP84A subfamily and (C) plant extradiol ring-cleavage dioxygenases (DOX). The CYP84A tree was rooted with the Selaginella moellendorffii CYP788A1, an independently evolved lycophyte F5H (6), and the DOX tree was rooted on a DOX from the moss Physcomitrella patens. Bootstrap values (based on 500 replicates) are indicated at the tree nodes. Nodes supported with bootstrap values and posterior probabilities above 60% in both ML and Bayesian analyses (fig. S15), respectively, are indicated as red circles. The scale measures evolutionary distance in substitutions per amino acid.

Phylogenetic analysis suggests that CYP84A4 originated from a recent gene-duplication event after the divergence of Arabidopsis from other members of the Brassicaceae family: Arabidopsis lyrata contains a CYP84A4 ortholog, whereas Brassica napus, Thellungiella halophila, and Capsella rubella do not (Fig. 4B and fig. S15A). The elongated branch length underlying the CYP84A4 cluster suggests accelerated evolution, which could be due to positive selection or neutral drift after escape from the evolutionary constraints imposed on its ancestral version. To examine the first scenario, we performed a codon-based test, which allows detection of positive selection on residues in the CYP84A4 cluster against a background of CYP84A members that are likely undergoing purifying selection (11, 18). No significant sign for positive selection was revealed from this analysis (fig. S16). In contrast to the recent duplication of CYP84A4 members in Arabidopsis, AtLigB is widespread among land plants and can be traced back to a single gene, predating the mosses (Fig. 4C and fig. S15B). This suggests that LigB homologs have conserved functions and have only recently been recruited to serve in AP biosynthesis in Arabidopsis.
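The details of the codon-based test are given in the supplementary materials rather than here. Codon-based tests of this kind are commonly framed as a likelihood-ratio test between a null model without positive selection and an alternative model that allows it on the branches of interest. The sketch below shows that generic calculation, assuming one degree of freedom and a chi-square reference distribution; the function name and log-likelihood values are hypothetical and are not results from this study.

```python
import math


def lrt_p_value(lnl_null, lnl_alt):
    """Likelihood-ratio test for nested models, assuming 1 degree of freedom.

    Returns (statistic, p_value) with statistic = 2 * (lnL_alt - lnL_null).
    For a chi-square distribution with 1 df, the upper-tail probability
    is erfc(sqrt(x / 2)).
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods, NOT values from the paper.
statistic, p_value = lrt_p_value(lnl_null=-10250.4, lnl_alt=-10249.9)
print(f"2*delta(lnL) = {statistic:.2f}, p = {p_value:.3f}")
```

A small statistic, and hence a large p-value, would mean that the model allowing positive selection does not fit significantly better than the null model.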

The evolution of AP biosynthesis in Arabidopsis illustrates the emergence of a multistep metabolic pathway in plant specialized metabolism. A gene-duplication event yielded an extra enzyme copy in the ancestral genome, which was relieved from the evolutionary constraints imposed on its progenitor. The catalytic promiscuity intrinsic to this specialized enzyme was subsequently exploited by mutations, resulting in neofunctionalization and the synthesis of novel metabolites. In the context of a complex metabolic system, these emergent metabolites could be further converted by preexisting enzymes and gave rise to the pathway’s final products. These traits, if selectively neutral or advantageous, may be fixed within the lineage. Conjugation of the new activity of a new enzyme to a preexisting catalytic repertoire maximized the exploitable chemical space and may be representative of a fundamental evolutionary mechanism underlying the rapid expansion of chemodiversity in plants.

Supplementary Materials

Materials and Methods

Figs. S1 to S16

Tables S1 and S2

Multiple sequence alignments MSA1 to MSA3

References (19–31)

* Present address: Howard Hughes Medical Institute, Jack H. Skirball Center for Chemical Biology and Proteomics, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA.

References and Notes

  1. Supplementary materials are available on Science Online.
  2. Acknowledgments: This work is supported by NSF grant MCB-1121925 and the Global Climate and Energy Project (to C.C.). We thank D. A. Colby for helpful discussions about the chemical synthesis of APs and J. Ralph and H. Kim for providing authentic standards. Supporting data and figures relevant to this paper are presented online in the supplementary materials.