Genetic Properties Influencing the Evolvability of Gene Expression

Science  06 Jul 2007:
Vol. 317, Issue 5834, pp. 118-121
DOI: 10.1126/science.1140247


Identifying the properties of gene networks that influence their evolution is a fundamental research goal. However, modes of evolution cannot be inferred solely from the distribution of natural variation, because selection interacts with demography and mutation rates to shape polymorphism and divergence. We estimated the effects of naturally occurring mutations on gene expression while minimizing the effect of natural selection. We demonstrate that sensitivity of gene expression to mutations increases with both increasing trans-mutational target size and the presence of a TATA box. Genes with greater sensitivity to mutations are also more sensitive to systematic environmental perturbations and stochastic noise. These results provide a mechanistic basis for gene expression evolvability that can serve as a foundation for realistic models of regulatory evolution.

Regulatory variation underlies much of phenotypic diversity, and gene expression is the first step in making ecologically and evolutionarily relevant phenotypes. Differences among genes both in standing genetic variation and in interspecies divergence in gene expression have been linked to their particular roles in biological networks (1–4) and may reflect a history of selection. However, the influence of specific evolutionary forces cannot be inferred solely from the distribution of natural variation, because selection interacts with demography and mutation to shape polymorphism and divergence (5). Measuring the effects of spontaneous mutations without the confounding effect of natural selection makes it possible to isolate the contribution of mutation to natural variation and is a fundamental step toward building models for the evolution of gene expression. The relationship between divergence and mutational effects on gene expression has been measured in the fruit fly Drosophila melanogaster and the worm Caenorhabditis elegans (6, 7), revealing that stabilizing selection plays a dominant role in limiting the extent of polymorphisms in gene expression in nature (8). We used Saccharomyces cerevisiae to investigate how the structural properties of genes and regulatory networks shape the relation between mutations and gene expression and thereby affect the course of evolution.

We performed a mutation-accumulation (MA) experiment (Fig. 1A) in S. cerevisiae in order to isolate the contribution of the mutational process to gene expression evolution. With serial transfer of random colonies, we accumulated spontaneous mutations by maintaining parallel lines with effective population sizes of ∼10 individuals. The lines diverged from an isogenic common ancestor for 4000 generations. At this population size, the fate of most nonlethal mutations is largely governed by random genetic drift (9), and the divergence observed among the lines allows us to estimate the rate at which gene expression would evolve in the near absence of selection. Lethal mutations would be eliminated through our experimental protocol, but they are unlikely to contribute to standing genetic variation produced by mutations in natural populations. We randomly selected four MA lines, measured their gene expression levels with DNA microarrays (10), and estimated rates of gene expression evolution.

Fig. 1.

(A) MA experimental design. (B) Number of genes differentially expressed among the four MA lines as a function of the Bayesian posterior probability of differential expression. Black bars indicate the estimated fraction of genes expected by chance. (C) Relative-fold change in expression level for genes with significant differences among the four MA lines.

The rate of phenotypic evolution due to mutation alone can be measured by the mutational variance (Vm), which is defined as the increase in the variance of a trait introduced by mutations each generation. It can be calculated from the variance of traits among MA lines. For haploid asexual organisms, Vm = 2σ_b²/t, where σ_b² is the between-line variance and t is the number of generations (5). We estimated the Vm of gene expression for genes that showed statistically significant differences (Bayesian posterior probability > 0.99) in expression among any pair of the four MA lines by using log-transformed relative expression levels (Fig. 1B). This resulted in 2031 genes differentially expressed across strains, with 85 showing differences higher than threefold (Fig. 1C). We found that the median Vm in gene expression in yeast is 4.7 × 10⁻⁵ (fig. S1), which is comparable to that previously estimated in fruit flies [∼2 × 10⁻⁵ (6)] and about two orders of magnitude below those typically observed for morphological phenotypes (11). Hence, there are common characteristics that determine the mutational variation in gene expression in spite of large differences between these organisms. Furthermore, our estimates of Vm correlate positively with genetic variation in gene expression among natural isolates of S. cerevisiae (12) (ρ = 0.25, P < 2.2 × 10⁻¹⁶, n = 1888). Therefore, variation in levels of expression among genes and regulatory pathways in natural populations is shaped in part by variation in the transcriptional sensitivity to mutations. Also, we found that genes with high Vm tend to be underrepresented in biological processes such as cell growth and maintenance, metabolic process, cell cycle, and transcription (fig. S2).
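The Vm formula above can be illustrated with a minimal sketch. It assumes that the between-line variance is estimated as the sample variance of natural-log expression values across lines, which ignores within-line measurement error; the log base, the function name, and the example values are hypothetical and are not taken from the study.

```python
import math
from statistics import variance


def mutational_variance(relative_expression, generations):
    """Estimate Vm = 2 * sigma_b^2 / t for one gene.

    relative_expression: one relative expression level per MA line (must be > 0).
    generations: number of generations of mutation accumulation (t).
    Assumption: sigma_b^2 is the sample variance of natural-log values.
    """
    if len(relative_expression) < 2:
        raise ValueError("need at least two MA lines to estimate a variance")
    log_values = [math.log(v) for v in relative_expression]
    between_line_variance = variance(log_values)  # unbiased (n - 1) estimator
    return 2.0 * between_line_variance / generations


if __name__ == "__main__":
    # Hypothetical relative expression of one gene in four MA lines.
    example_lines = [1.00, 1.10, 0.95, 1.05]
    print(mutational_variance(example_lines, generations=4000))
```

A gene whose expression is identical in every line gets Vm = 0 under this estimator, and Vm grows with the spread of log expression among lines.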

Three main factors influence the probability that a mutation affects the expression level of a gene: (i) the number of other genes that influence the expression of the focal gene (trans-mutational target size), (ii) the number of regulatory elements controlling the expression of the gene (cis-mutational target size), and (iii) the distribution of effects of mutations on expression. We examined whether features of these first two components could affect the sensitivity of expression levels to mutation (Fig. 2A).

Fig. 2.

(A) Schematic of trans- and cis-mutational target sizes. On the left of each image are cases of smaller mutation target sizes, and on the right are larger mutational target sizes. The trans-mutational target size does not solely include transcription factors but all genes acting upstream of the focal gene. (B) Positive relationship between trans-mutational target size and Vm. The averages of 10 bins are plotted, with error bars denoting two standard errors. (C) Mean Vm of genes with and without a TATA box in their promoters. Error bars denote two standard errors.

The trans-mutational target size of a gene is composed of the number of genes in the genome that affect the expression level of the focal gene, weighted by their influence and their own mutational parameters (Fig. 2A). We used expression profiling of 297 gene knockouts (13, 14) to estimate the trans-mutational target size of a gene as the fraction of deletions of other genes in the genome that affect its expression level. We found that Vm correlates strongly with the trans-mutational target size (ρ = 0.33, P < 2 × 10⁻¹⁶, n = 1951) (Fig. 2B and fig. S3). Hence, larger trans-mutational target sizes may indeed result in higher sensitivities of gene expression to mutations.
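As a rough illustration of this estimate, the sketch below computes a target size as the fraction of knockouts that change a gene's expression, and a Spearman rank correlation between two variables. The function names and the boolean input format are assumptions for illustration; how a knockout is scored as affecting a gene is not specified here and is left to the caller.

```python
from math import sqrt


def trans_target_size(affected_flags):
    """Fraction of knockouts (True/False per knockout) that alter the focal gene."""
    return sum(1 for flag in affected_flags if flag) / len(affected_flags)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

Given one target size and one Vm per gene, `spearman_rho(target_sizes, vm_values)` yields a rank correlation of the kind reported above. A rank-based measure is a natural choice here because both quantities are strongly skewed.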

The cis-mutational target size of a gene scales with the number and sizes of transcription factor binding sites, either directly through the number of nucleotides in the sites or indirectly through the number and variety of regulatory molecules binding to these sites. We mapped transcription factor binding sites to yeast promoters and determined the number of binding sites per promoter (10, 15). Genes that changed significantly in expression among the MA lines had a larger number of binding sites than those that did not change significantly (2.9 versus 2.4, Wilcoxon rank test, P = 1 × 10⁻⁵), and the Vm globally increased with the number of binding sites (ρ = 0.14, P = 0.0007, n = 608). Genes with a large number of transcription factor binding sites are more sensitive to spontaneous mutations affecting the level of gene expression.

Eukaryotic genes differ in the composition of their cis-regulatory targets. About one-fifth of yeast genes contain a TATA box (16), which modifies several aspects of their transcriptional regulation (17, 18). TATA-containing genes are more likely to be subtelomeric, highly regulated by nucleosomes and chromatin regulators (16), and associated with elevated rates of gene-expression divergence among species (4) and adaptation during experimental evolution (16). This divergence may be the result of diversifying selection (4, 16), but it could also reflect a bias in the sensitivity of TATA-containing genes to spontaneous mutations.

We found that genes with a TATA box were significantly more likely to change in expression among the MA lines (49% versus 32%; Fisher's exact test, P = 2.5 × 10⁻¹⁶) (table S1) and had a mean Vm that was about twice as high as that of genes lacking a TATA box (Vm of 1.17 × 10⁻⁴ versus 0.52 × 10⁻⁴; Wilcoxon rank test, P < 2 × 10⁻¹⁶) (Fig. 2C). Although stress-response genes are particularly enriched for TATA boxes (4, 18), eliminating stress-response genes from the analysis did not change the result (Vm of 1.0 × 10⁻⁴ versus 0.51 × 10⁻⁴; Wilcoxon rank test, P < 2.2 × 10⁻¹⁶). Hence, genes with a TATA box are more sensitive to genetic perturbations, and their overrepresentation among genes responding rapidly to artificial selection (16) and genes that show increased divergence among species (4) can be partly explained by their higher regulatory evolvability. Indeed, the larger trans-mutational target sizes of TATA-containing genes (0.02 versus 0.007; Wilcoxon rank test, P = 3 × 10⁻¹⁶) suggest a mechanism by which this may be achieved.
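The comparison of proportions above relies on Fisher's exact test on a 2 × 2 table (TATA box present or absent, expression changed or unchanged). The sketch below is a generic two-sided version of that test. The gene counts behind the 49% and 32% figures are not given in the text (they are in table S1), so the counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    total = 0.0
    for x in range(low, high + 1):
        p = table_probability(x)
        if p <= observed * (1.0 + 1e-9):  # tolerance for floating-point ties
            total += p
    return min(total, 1.0)


if __name__ == "__main__":
    # Hypothetical counts: 49 of 100 TATA genes changed, 32 of 100 TATA-less genes.
    print(fisher_exact_two_sided(49, 51, 32, 68))
```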

Because TATA box–containing genes have large cis- and trans-mutational target sizes relative to TATA-less genes, we used a series of generalized linear models to simultaneously assess the effects of the trans- and cis-mutational target sizes and the presence of a TATA box on the sensitivity of expression levels to mutations. First, we found that the larger number of binding sites in the promoters of TATA-containing genes [(4); in our data set, 3.3 versus 2.2; Wilcoxon rank test, P = 7 × 10⁻¹¹] could fully account for the previous correlation between cis-mutational target size and Vm. When other factors are considered simultaneously, the number of transcription factor binding sites has no effect on the Vm of the gene (0.5% of the variance explained, P = 0.11). Second, we found that the larger trans-mutational target size of TATA-containing genes cannot fully account for the relationship between trans-mutational target size and Vm. Instead, trans-mutational target size and Vm are associated even when the effects of the TATA box are first removed (tables S2 and S3). Furthermore, we find a significant correlation between the Vm and the trans-mutational target size even after excluding TATA-containing genes (ρ = 0.2, P < 0.00001, n = 811), thus lending unambiguous support to the conclusion that effects of trans-mutational target size are independent of the TATA box.
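The logic of removing the TATA effect first and then asking how much variance a second predictor explains can be sketched with ordinary least squares. This is a simplified stand-in, not the generalized linear models used in the study, whose exact specification is given in the supporting material rather than here. The function name and the input layout are hypothetical.

```python
def sequential_variance_explained(y, has_tata, x):
    """Fraction of variance in y explained by TATA status, then by x after TATA.

    y: response per gene (for example, log Vm).
    has_tata: 0/1 indicator per gene.
    x: second predictor per gene (for example, trans-mutational target size).
    Returns (fraction_tata, fraction_x_after_tata) for the least-squares
    model with an intercept, the TATA indicator, and x.
    """
    def group_centered(values):
        # Subtract the mean of each TATA group, which removes the TATA effect.
        centered = list(values)
        for group in (0, 1):
            idx = [i for i, t in enumerate(has_tata) if t == group]
            if idx:
                group_mean = sum(values[i] for i in idx) / len(idx)
                for i in idx:
                    centered[i] = values[i] - group_mean
        return centered

    grand_mean = sum(y) / len(y)
    ss_total = sum((v - grand_mean) ** 2 for v in y)
    if ss_total == 0:
        raise ValueError("y has no variance to explain")

    y_res = group_centered(y)
    x_res = group_centered(x)
    ss_after_tata = sum(v ** 2 for v in y_res)

    # Regress residual y on residual x (Frisch-Waugh-Lovell decomposition).
    sxx = sum(v ** 2 for v in x_res)
    slope = sum(a * b for a, b in zip(x_res, y_res)) / sxx if sxx > 0 else 0.0
    ss_after_both = sum((b - slope * a) ** 2 for a, b in zip(x_res, y_res))

    return ((ss_total - ss_after_tata) / ss_total,
            (ss_after_tata - ss_after_both) / ss_total)
```

If the second fraction stays clearly above zero, the predictor carries information beyond TATA status, which is the pattern described above for trans-mutational target size but not for the number of binding sites.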

A fundamental feature of organisms is their capability to cope with genetic and environmental perturbations (19). Whereas genetic and environmental canalization have often been conceptualized (20) and modeled (21) as distinct phenomena, mechanisms that produce canalization may act simultaneously to modulate the effects of both kinds of perturbations (22). Hence, phenotypes that are buffered against the effects of environmental perturbations might also be buffered against the effects of mutations. With public data on the amount of variation in gene expression over different environmental conditions (4), we found that Vm in the expression of a gene and its transcriptional plasticity to macro-environmental perturbations are positively correlated (Fig. 3A) (ρ = 0.37, P = 2 × 10⁻¹⁶, n = 1735). Furthermore, we found that protein expression noise, a measure of the sensitivity of gene expression to microenvironmental variation such as fluctuations in the amount of upstream cellular components (23–25), and Vm are also positively correlated (Fig. 3B) (ρ = 0.27, P = 1 × 10⁻¹⁴, n = 776). These relationships are not confounded by the effects of gene expression level, because neither mRNA nor protein abundances correlate with Vm (fig. S4, A and B). Hence, the effects of mutations, environmental perturbations, and stochastic noise are related such that mechanisms that evolve to promote or buffer transcriptional responses to one source of variation may also affect the others. Lastly, the strength of the relationship between environmental and genetic perturbations varies across sets of genes (table S4 and SOM), indicating that the relative contributions of these sources of perturbation toward the evolution of canalization may differ substantially from one gene or metabolic network to the next.

Fig. 3.

Mutational variance of gene expression correlates with plasticity of transcriptional response (A) and stochastic noise in protein abundance (B). In each case, the averages of 10 bins of equal sizes are plotted, with error bars denoting two standard errors.
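The binned summaries used in these plots can be reproduced along the following lines. This is a minimal sketch that assumes "bins of equal sizes" means bins holding equal numbers of genes after sorting by the x variable, and that the standard error is the sample standard deviation divided by the square root of the bin size. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def binned_means(x, y, n_bins=10):
    """Sort genes by x, split into n_bins equal-count bins, summarize each bin.

    Returns a list of (mean_x, mean_y, two_standard_errors_of_y) tuples.
    Bins with fewer than two genes are skipped because no standard error
    can be computed for them.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    summary = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if len(chunk) < 2:
            continue
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        standard_error = stdev(ys) / sqrt(len(ys))
        summary.append((mean(xs), mean(ys), 2.0 * standard_error))
    return summary
```

Each returned tuple corresponds to one plotted point and its error bar.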

We show that not all genes are equally sensitive to the effects of random spontaneous mutations and identify structural properties (presence of a TATA box and trans-mutational target sizes) that greatly influence a gene's potential to undergo regulatory change. These determinants provide a mechanistic basis to serve as a foundation for more-realistic models of gene expression evolution that account for levels of polymorphism and divergence in cis and trans gene regulation.

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S6

Tables S1 to S4

References and Notes
