Genetic Determinants and Cellular Constraints in Noisy Gene Expression


Science  06 Dec 2013:
Vol. 342, Issue 6163, pp. 1188-1193
DOI: 10.1126/science.1242975


In individual cells, transcription is a random process obeying single-molecule kinetics. Often, it occurs in a bursty, intermittent manner. The frequency and size of these bursts affect the magnitude of temporal fluctuations in messenger RNA and protein content within a cell, creating variation or “noise” in gene expression. It is still unclear to what degree transcriptional kinetics are specific to each gene and determined by its promoter sequence. Alternative scenarios have been proposed, in which the kinetics of transcription are governed by cellular constraints and follow universal rules across the genome. Evidence from genome-wide noise studies and from systematic perturbations of promoter sequences suggests that both scenarios, namely gene-specific and genome-wide regulation of transcription kinetics, may be present to different degrees in bacteria, yeast, and animal cells.

The advent of rapid, inexpensive DNA sequencing methods allows scientists to map not only the protein-coding genes in the genomes of many organisms, but also the regulatory sequences present in those genomes. A key challenge for biologists in the next few decades is understanding how these regulatory sequences control the expression of every gene in the cell and how they collectively determine the topology and dynamics of gene regulatory networks.

The regulation of gene expression traditionally has been studied in experiments that measured the average gene expression level in populations containing millions of cells. These studies relate the average rate of gene expression for a gene to its regulatory DNA sequence (the promoter architecture) (1). This approach has a major shortcoming, however, because averaging over populations masks differences in gene expression that may occur between individual cells (2). These differences may in turn have consequences for the whole multicellular community or organism, which makes it important to understand gene expression in single cells.

Fig. 1 Gene-specific versus global determinants of transcription kinetics.

Two alternative scenarios are presented, representing different ways in which stochastic promoter activity may be governed. (Top) In the first scenario, the rates of promoter activation and inactivation are controlled exclusively by gene-specific mechanisms. (Bottom) In the second scenario, gene-specific regulation occurs in the presence of a global constraint on the kinetics. We illustrate the consequences of the two scenarios using simple mathematical “toy models” of promoter kinetics (supplementary materials). For each model, we calculate both the mean expression level and the size of transcription bursts for a large set of promoters and plot these two variables against each other. In the plot, black markers denote 300 randomly chosen individual promoters, the red curve represents the smoothed behavior of 10^4 promoters, and the shaded region is a guide to the eye that depicts the full range of possible burst sizes for a given mean expression (supplementary materials).

Within a single cell, gene expression is inherently stochastic, or random (2). Protein-coding genes are typically present in only one or two copies per cell. Whether a gene is transcribed at any given moment depends on the arrival, by diffusion, of multiple regulatory proteins to their designated binding sites, as well as the occurrence of multiple biochemical steps required for initiation of transcription (3). These biochemical reactions are all essentially single-molecule events and thus stochastic, resulting in substantial randomness in the production of mRNA. Broadly, two stochastic kinetic modes of transcription have been observed in individual cells: “Poissonian,” in which mRNAs are synthesized in random, uncorrelated events, with a probability that is uniform over time (4, 5); and “bursty,” in which mRNA is produced in episodes of high transcriptional activity (bursts) followed by long periods of inactivity (Fig. 1 and Box 1) (5–8). The kinetic features of mRNA production are in turn propagated to the proteins translated from them. The end result is temporal fluctuations, and corresponding cell-to-cell variability, in mRNA and protein numbers. This cell-to-cell variability is referred to as gene expression noise (Box 2) (2).

Box 1

Methods for probing gene expression at the single-cell level.


(A) Transcription can be followed in real time in live cells by labeling nascent mRNA with fluorescently tagged RNA-binding proteins that are strongly expressed in the cell (4–6). (Transcriptional kinetics in E. coli, L. So and I. Golding; adapted with permission from Physical Biology of the Cell, R. Phillips, J. Kondev, and J. Theriot; Garland Science.)

(B) Distributions of the numbers of mRNA molecules per cell can be measured with single-mRNA counting techniques, such as single-molecule fluorescent in situ hybridization (smFISH) applied to fixed cells (8, 27, 35). [mRNA distribution from animal cells (8)]. These mRNA distributions carry the signature of the transcriptional kinetics.

Bursty transcription typically leads to higher noise than does Poissonian transcription. In particular, the burst size controls the magnitude of the noise, and it is approximately proportional to the Fano factor (the ratio between the variance and the mean of the distribution of mRNA or protein numbers per cell). Thus, genes with large burst sizes are characterized by broader distributions of protein and mRNA, larger Fano factor, and higher noise as compared with those of genes with small burst sizes.
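The contrast between the two kinetic modes can be illustrated with a minimal stochastic simulation. The sketch below is not taken from this Review or its supplementary materials. It assumes a standard two-state ("telegraph") promoter that switches between an active and an inactive state, and the function name `simulate_two_state`, the rate names (`k_on`, `k_off`, `k_tx`, `k_deg`), and all parameter values are illustrative choices. Setting `k_off = 0` keeps the promoter permanently active, which gives Poissonian synthesis; the second parameter set is chosen to give the same average expression through short, intense bursts.

```python
import random


def simulate_two_state(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Gillespie simulation of a two-state promoter (illustrative sketch).

    Assumes k_on > 0 and k_tx > 0. Returns the time-averaged mean and
    Fano factor (variance / mean) of the mRNA copy number.
    """
    rng = random.Random(seed)
    t, active, m = 0.0, True, 0
    w_sum = m_sum = m2_sum = 0.0  # time-weighted moment accumulators
    while True:
        r_switch = k_off if active else k_on  # promoter state change
        r_make = k_tx if active else 0.0      # mRNA synthesis
        r_decay = k_deg * m                   # mRNA degradation
        total = r_switch + r_make + r_decay
        wait = rng.expovariate(total)
        dwell = min(wait, t_end - t)          # clip the last interval
        w_sum += dwell
        m_sum += dwell * m
        m2_sum += dwell * m * m
        t += wait
        if t >= t_end:
            break
        u = rng.random() * total              # pick which reaction fires
        if u < r_switch:
            active = not active
        elif u < r_switch + r_make:
            m += 1
        elif m > 0:
            m -= 1
    mean = m_sum / w_sum
    variance = m2_sum / w_sum - mean * mean
    return mean, variance / mean


# Always-active promoter (Poissonian) versus a bursty promoter with the
# same expected mean; time is measured in units of the mRNA lifetime.
poisson_mean, poisson_fano = simulate_two_state(1.0, 0.0, 10.0, 1.0, 2000.0)
bursty_mean, bursty_fano = simulate_two_state(0.5, 4.5, 100.0, 1.0, 2000.0)
print("Poissonian: mean", poisson_mean, "Fano", poisson_fano)
print("Bursty:     mean", bursty_mean, "Fano", bursty_fano)
```

Under these assumed parameters both promoters average about 10 mRNAs per cell, yet the bursty promoter should show a much larger Fano factor, in line with the broader distributions described above.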

The inherent randomness associated with gene expression raises an important question: Are the stochastic kinetics of transcription, and therefore the resulting variability in mRNA and protein levels, encoded by the promoter regulatory sequence, just as the mean expression level of a gene appears to be? Two alternative answers have been put forward. One view is that the stochastic kinetics of gene activity are genetically determined by the promoter architecture and governed by the binding and unbinding of various regulatory elements such as histones and transcription factors to their corresponding binding sites (9–15). In this view, it is the process of gene regulation, as it acts on each promoter individually, that causes bursting in some genes but not in others.

An alternative view is that transcription kinetics are dominated by genome-wide constraints that lead to general, as opposed to gene-specific, modulation of transcriptional kinetics (16–19). These constraints may reflect any number of physiological or biophysical mechanisms. Proposed mechanisms include cell-cycle–dependent regulation of promoter activity (20) as well as inherent features of the transcription process, such as the cooperative recruitment of RNA polymerases (21). Notwithstanding the specific details, this view implies that gene-specific transcriptional regulation acts on top of these gene-nonspecific constraints and has only a secondary effect on the observed kinetic features, such as transcription bursts.

In the first view described above, transcriptional kinetics (in particular, bursting) are gene-specific and free from global constraints. One consequence of this is that, at both very low and very high rates of transcription, promoter activity is expected to be regular in time and well described by Poisson statistics (Fig. 1, top). At intermediate rates of transcription, when genes are neither expressed at full capacity nor very infrequently, different genes may vary greatly in their temporal kinetics and exhibit either regular (Poissonian) or bursty behavior, depending on the particular mechanisms of transcriptional regulation for each gene (Fig. 1, top).

In contrast, global constraints on transcriptional kinetics that affect all genes (Fig. 1, bottom) result in a more limited space of possible kinetic behaviors. For instance, one such constraint that has been recently reported (16–18) is the presence of inherent and global bursting kinetics even in fully active gene loci. Thus, highly expressed genes are not Poissonian but instead characterized by large burst sizes (Fig. 1, bottom) (16, 17) and therefore large cell-to-cell variability in mRNA and protein expression. Because this constraint operates globally, all genes in the cell follow a characteristic trend between the burst size and the mean amount of expression (Fig. 1, bottom).
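The expectation for the unconstrained scenario at full capacity can be made explicit with the standard two-state promoter model. This textbook model is used here as an assumption; the toy models in the supplementary materials may differ. The symbols are introduced for illustration only: k_on and k_off are the promoter activation and inactivation rates, k_tx is the synthesis rate in the active state, δ is the mRNA degradation rate, ⟨m⟩ is the mean mRNA number per cell, and F is the Fano factor.

```latex
\begin{align}
\langle m \rangle &= \frac{k_{\mathrm{tx}}}{\delta}\,
  \frac{k_{\mathrm{on}}}{k_{\mathrm{on}}+k_{\mathrm{off}}}, \\
F = \frac{\sigma_m^2}{\langle m \rangle} &= 1 +
  \frac{k_{\mathrm{tx}}\,k_{\mathrm{off}}}
       {(k_{\mathrm{on}}+k_{\mathrm{off}})(k_{\mathrm{on}}+k_{\mathrm{off}}+\delta)}, \\
F &\to 1 \quad (k_{\mathrm{on}} \gg k_{\mathrm{off}}), \\
F &\approx 1 + \frac{k_{\mathrm{tx}}}{k_{\mathrm{off}}}
  \quad (k_{\mathrm{off}} \gg k_{\mathrm{on}},\ \delta).
\end{align}
```

In this model, a promoter that is almost always active approaches the Poissonian limit F = 1, whereas short, rare active periods give a Fano factor of roughly one plus the mean number of mRNAs made per burst. A global constraint of the kind described above would amount to keeping the second term in F from vanishing even at the highest expression levels.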

The two views described above represent two limiting cases; it is possible that single-cell transcriptional kinetics are affected, to varying degrees, by both gene-specific (promoter architecture) and genome-wide processes. This Review critically examines both views and the evidence supporting each of them. An organism for which there is strong evidence for a genetic origin of transcriptional noise is the yeast Saccharomyces cerevisiae. In the prokaryote Escherichia coli, as well as other eukaryotic systems, the picture is less definitive than in yeast, but some evidence exists pointing to the presence of global constraints.

Transcriptional Kinetics in Yeast Are Gene-Specific

The relation between noise and the mean expression level has been examined in several studies that measured gene expression at the single-cell level for a complete yeast genomic library (22) or a large set of promoters (14, 23, 24). As noted above, it is possible to estimate the burst size from the degree of cell-to-cell variability in protein concentration (supplementary materials). The results of all of these experiments are consistent in that they find no obvious trend between the mean expression level from a promoter and the estimated burst size (Fig. 2A). The only global constraint observed in the noise measurements is the one corresponding to the limiting case of Poissonian transcription, followed by Poissonian translation of mRNA into protein (2).

Fig. 2 Evidence for genome-wide constraints on promoter kinetics in bacteria and mammalian cells, but not yeast.

(A) Data from three studies in yeast (14, 23, 24) show no obvious correlation between the mean expression level of a gene and the transcription burst size. In particular, both low-expression and high-expression genes can exhibit nonbursty behavior, which is consistent with a scenario of gene-specific regulation. In contrast, (B) two studies in E. coli (17, 32) and (C) three in mammalian cells (16, 34, 38) show that higher expression is accompanied by increased burstiness, which is consistent with the presence of a global cellular constraint on promoter kinetics. Excluding live-cell measurements [(B), iii and (C), i], the burst sizes were estimated by using the Fano factor of the corresponding distribution [protein measurements, excluding (B), i], after correcting for the level of extrinsic noise. Black circles designate the individual measurements. Red curves are calculated trend lines. The shaded area highlights the full range of burst sizes covered by each data set. More details are available in the supplementary materials.

These global noise experiments also reveal relationships between noise measured for a given promoter and the known properties of that promoter. The majority of low-noise promoters have a characteristic architecture [depleted proximal nucleosome (DPN)] that is defined by a nucleosome-free region immediately upstream of the initiation site. In contrast, the majority of high-noise promoters have a second type of architecture [occupied proximal nucleosome (OPN)], characterized by the lack of a nucleosome-free region. They are also enriched for strong TATA boxes (25, 26). The model emerging from these findings is that promoter switching between inactive (promoter occluded by nucleosome) and active (nucleosome free, preinitiation complex formed) states may result in bursty transcription, in turn leading to the higher noise observed in nucleosome-covered promoters. Within this picture, the strong TATA box would ensure that the promoter is expressed strongly when active, increasing the burst size (11).

This model is further supported by single-mRNA counting experiments in fixed cells (Fig. 3A) (27, 28). These studies examined the mRNA distributions for 12 different constitutive promoters that have low nucleosome density and were characterized as “low noise” in genomic studies. All of these promoters exhibited close-to-Poissonian mRNA copy-number distributions and sub-Poissonian (with lower variance than a Poissonian distribution of the same mean) nascent RNA copy-number distributions, which is consistent with the absence of bursts. In contrast, promoters that are not constitutive, but regulated by nucleosomes and transcription factors, exhibited broader mRNA distributions consistent with bursty transcription (Fig. 3A and Box 1). These single-cell mRNA counting experiments substantiated the notion that the observed differences in protein noise between promoters [as in the studies above (22–24)] reflect the fluctuations of the mRNA species (which in turn are driven by transcriptional kinetics). The mRNA-counting studies specifically predicted that in the low-noise, nucleosome-depleted promoters, transcription events should occur regularly in time, in a simple Poissonian manner. This prediction was confirmed in experiments in which the kinetics of transcription were followed in real time by using temporal correlation analysis of fluorescently labeled mRNA molecules in live yeast cells (4). Both the constitutive promoter MDN1 and the cell-cycle–activated promoter POL1 were transcribed in random, uncorrelated events with a single rate of initiation that varied during the cell cycle (4).

Fig. 3 Bursting kinetics in yeast are promoter-dependent and not subject to strong constraints.

(A) Both Poissonian and bursty kinetics are found in endogenous yeast promoters. Copy-number statistics of mature (left) and nascent (right) mRNA are consistent with Poissonian promoter activity for MDN1 (top) while indicating bursty activity of PDR5 (bottom) [adapted by permission from Macmillan Publishers: Nature Structural & Molecular Biology (27) copyright 2008]. (B to E) Manipulating promoter architecture leads to a different relation between the mean expression and the burst size, so that two promoters with the same mean expression level can exhibit different burst sizes. The burst size is plotted as a function of the mean amount of expression for various perturbations of promoter architecture—specifically, (B) changing of the number of operator sites from one to seven (29); (C) changing of the position of the operator site from a location proximal to the first transcribed nucleotide, to a more distant location (15); (D) the presence (red dots) or absence (black dots) of a TATA box (24), including spontaneous mutations that delete it (red dots that overlap with the promoters lacking the TATA box); or (E) promoters engineered to yield the same mean level of expression by either adding a nucleosome disfavoring poly dA:dT sequence or by increasing the strength of a transcription factor binding site (+BS) (13). (F) The same transcription factor acting as a repressor (red dots) or as an activator (black dots) of a promoter leads to different burst sizes for the same mean level of expression (14). More details are available in the supplementary materials.

The evidence above indicates that the kinetic mechanism of transcription for a yeast promoter is mainly encoded by the DNA sequence of the promoter. To substantiate this picture, investigators deliberately altered yeast promoter architecture and examined the resulting change in gene expression noise. The systematic alterations included the presence or absence of a TATA box and its strength (12, 24); the number (15, 29), location (15), and nucleosomal coverage (30) of transcription-factor binding sites; the presence of nucleosome-disfavoring sequences (13); and the mode of action of a transcription factor [whether it was acting as an activator or as a repressor (14)]. All of these architectural elements were found to strongly affect the relationship between the mean amount of expression and the burst size, in a way that is consistent with the expectation from simple models of transcriptional kinetics (Fig. 3) (10).

The evidence indicates that the stochastic transcriptional kinetics for a given gene in yeast are mainly determined by its promoter architecture, and no strong global constraints have been observed. Promoter switching introduced by transcription factors and nucleosomes stochastically associating and dissociating leads to bursty transcription and a correspondingly higher degree of noise than that observed in constitutive promoters (11). Additional promoter features such as the strength (12, 13) and copy number (15, 29) of transcription-factor binding sites, their location within the promoter (15), and the specific mechanism of gene regulation by the transcription factors bound to them (14) all affect noise in gene expression.

Evidence for Global Noise Constraints in E. coli

A small number of studies have directly examined transcriptional kinetics in E. coli. Live-cell mRNA labeling (Box 1) has been used to visualize the synthesis of individual mRNA molecules in real time from a synthetic E. coli promoter (6, 17). Analysis of how adding an inducer of gene expression affected the bursting parameters revealed that the burst frequency was modulated at low concentration of inducer, whereas the burst size was modulated at high concentration of inducer (Fig. 2B, iii) (17). Transcription was bursty even for fully induced conditions (6). The Plac/ara promoter investigated in this study is a derivative of the lac promoter (Plac), whose mechanism of regulation by lac repressor is thought to be well understood (1). However, simple theoretical models, based on these well-characterized mechanisms of gene regulation, failed to explain the observed effect of inducer on the bursting parameters (9). The transcriptional kinetics of the wild-type lac promoter have also been investigated as reflected in protein synthesis (31). These studies also indicate an increase in transcriptional burst size at high inducer concentration (31).

A predicted consequence of the observed modulation of the bursting parameters described above is that the Fano factor of the mRNA distribution should increase as the mean increases. This was observed for the lac promoter in the presence of various concentrations of inducer (17). Six other promoters were also studied under diverse conditions. Unexpectedly, it was found that all of them yielded values of the Fano factor that were very similar to those observed for the lac promoter (Fig. 2B, i) (17). The authors interpreted this finding as indicating that the promoters investigated all modulate their bursting parameters in a similar fashion: modulation of the burst frequency when expression is low and of the burst size when expression is higher.

An examination of the protein copy-number distribution for a genomic library in E. coli (32) found that the Fano factor did indeed increase with the mean level of gene expression, in a pattern very reminiscent of the RNA data described above (Fig. 2B, i and ii) (17) and consistent with the earlier protein distributions measured for Plac (31). The authors offered an alternative interpretation for the observed relationship between the Fano factor and the mean protein concentration: that the increase in the Fano factor does not result from transcription bursts of increased size, but rather reflects the dominance of extrinsic noise (that caused by cell-to-cell differences in gene expression parameters) at high rates of expression. In support of this hypothesis, measurements of extrinsic noise coincided with the noise baseline observed at high rates of expression (32).
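The extrinsic-noise interpretation can be made concrete with a short calculation. The sketch below assumes, as a common simplification that is not necessarily the analysis used in (32), that intrinsic and extrinsic contributions add in the squared coefficient of variation (CV^2) and that the extrinsic CV^2 is the same at every expression level. The function names and the numbers are hypothetical.

```python
def total_fano(mean, intrinsic_fano, extrinsic_cv2):
    """Fano factor expected when intrinsic and extrinsic noise add in CV^2.

    CV^2_total = intrinsic_fano / mean + extrinsic_cv2, and the Fano
    factor is CV^2_total multiplied by the mean.
    """
    return intrinsic_fano + extrinsic_cv2 * mean


def intrinsic_fano_estimate(mean, observed_fano, extrinsic_cv2):
    """Invert total_fano: subtract the assumed extrinsic contribution."""
    return observed_fano - extrinsic_cv2 * mean


# Hypothetical case: purely Poissonian intrinsic kinetics (Fano = 1)
# combined with an extrinsic CV^2 of 0.1 at every expression level.
for mean in (1, 10, 100, 1000):
    print(mean, total_fano(mean, intrinsic_fano=1.0, extrinsic_cv2=0.1))
```

Under this assumption the Fano factor grows linearly with the mean even though the intrinsic kinetics never become bursty, which is why a rising Fano factor alone cannot distinguish larger bursts from extrinsic noise.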

The difficulty of discriminating between alternative models on the basis of the relationship between the noise and the mean amount of expression underscores the potential pitfalls associated with overinterpreting static copy-number distributions. In addition to extrinsic noise, other effects may be confounded with stochasticity in transcription and translation. For instance, the statistics of protein and mRNA partitioning during cell division may in some cases lead to noise-mean scaling that is indistinguishable from that of stochastic transcription and translation (33). Effects due to the cell cycle may also be mistakenly attributed to stochastic transcription (20).
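A small simulation shows how partitioning alone can mimic this scaling. The sketch below assumes the simplest case, in which every molecule in a dividing cell is assigned independently to one of two daughters with probability 1/2; the function name `partition_cv2` and the parameter values are illustrative and are not taken from (33).

```python
import random
import statistics


def partition_cv2(n_molecules, n_divisions, seed=0):
    """CV^2 of the copy number inherited by one daughter cell.

    Each of n_molecules is assigned to the tracked daughter with
    probability 1/2, independently (binomial partitioning).
    """
    rng = random.Random(seed)
    inherited = [
        sum(rng.random() < 0.5 for _ in range(n_molecules))
        for _ in range(n_divisions)
    ]
    mean = statistics.mean(inherited)
    return statistics.pvariance(inherited) / mean ** 2


# For binomial partitioning CV^2 = 1 / n_molecules, so the product
# n_molecules * CV^2 should stay close to 1 at every copy number.
for n in (20, 80):
    print(n, n * partition_cv2(n, n_divisions=4000))
```

Because the resulting CV^2 falls as the inverse of the copy number, just as it does for Poissonian synthesis and degradation, a static distribution cannot by itself separate the two sources of variability.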

To achieve a mechanistic understanding of how promoter architecture affects transcriptional kinetics and noise in E. coli, the path previously taken in yeast will have to be emulated by measuring gene expression noise in promoters whose architecture is systematically perturbed (12–15, 29). Such noise studies need to be accompanied by direct measurements of transcription kinetics in live cells.

Mammalian Cells and Other Eukaryotes

The studies reviewed above support the idea that constitutive genes in budding yeast are expressed in a Poissonian, nonbursty manner. However, it would be wrong to assume that this result is typical of all eukaryotic cells. Transcriptional kinetics of multiple genes in Dictyostelium discoideum (another single-celled eukaryotic organism), including housekeeping genes, were found to be bursty (7). Beyond protozoans, bursting kinetics seem to be the rule, rather than the exception, in many different types of animal cells, including cultured mammalian cells (5, 8, 16, 34), fly embryos (18, 35, 36), and the mouse (37).

Real-time transcriptional kinetics of both natural and synthetic promoters inserted into mouse fibroblasts (34) revealed transcriptional bursting from these promoters, and a refractory period after the bursts. Synthetic promoters engineered to differ in the number of binding sites for a transcriptional activator or their binding strength showed that both of those architectural features can affect the burst size. This might be interpreted as evidence that promoter architecture can allow decoupling the mean amount of expression from the noise. However, when the bursting parameters were measured for these different promoters as a function of their mean level of expression, a single trend line was followed closely by all of the promoters regardless of their regulatory sequence (Fig. 2C, i).

The existence of a nontrivial trend line between the noise and the mean amount of protein in the cell was also observed in a study that analyzed single-cell gene expression from a weak viral promoter randomly inserted at over 8000 genomic loci in a line of human T lymphocytes (16). Evidence of bursting was found across the whole set of loci in which the lentiviral reporter was able to integrate. Similar to the modulation of bursting discussed above for E. coli (17), the data were consistent with modulation of burst frequency at low expression, and of burst size at high expression (Fig. 2C, ii). A similar study of HIV promoters integrated at random locations in the genome of T cells also showed a trend line between the burst size and the mean expression level (Fig. 2C, iii) (38).

Cellular Constraints That Impose Bursting?

The widespread observation of transcriptional bursting from bacteria to animal cells has prompted the idea that bursting may be a beneficial trait, possibly allowing an optimal allocation of resources or better processing of information from the environment (17). Bursting might also be an unavoidable kinetic feature of transcription, reflecting biophysical constraints that apply to most genes within a given organism (19, 21, 39).

But are transcription bursts really unavoidable? A good test case is the behavior of promoters acting at full capacity. In the absence of any constraints, such promoters are expected to exhibit Poissonian, nonbursty kinetics (Fig. 1). In E. coli, ribosomal promoters are transcribed at an average rate that is very close to the upper boundary imposed by RNA polymerase elongation (40). This suggests that they can be expressed continuously and without large bursts at full capacity. On the other hand, the hypothesis that biophysical or cellular constraints impose bursting even in highly expressed promoters is consistent with the finding that fully induced mRNA-coding promoters display strong bursting (6) and highly non-Poissonian mRNA distributions (17). More experiments in which the architecture of a promoter is systematically varied are needed for elucidating the mechanisms governing transcription kinetics in bacteria—in particular, whether bursting is governed by universal or gene-specific constraints.

As for animal cells, bursting is prevalent in endogenous promoters (5, 8, 18, 34–36), but a few counterexamples also exist. A highly expressed viral promoter inserted in mammalian cells was found to be expressed continuously in a burst-free manner (5). RNA counting in Drosophila embryos also suggests that highly expressed genes can act close to full transcriptional capacity and exhibit nearly Poissonian statistics (36) [a counterexample is available at (18)]. These counterexamples appear to indicate that bursting is not an unavoidable by-product of transcription in all animal cells. Only a limited number of promoters, many of which are synthetic, have been studied in animal cells. Moreover, most experiments have been done in vitro with cell lines. In their natural context, animal cells are arranged in a complex multicellular environment, which may affect gene expression. Experimental advances that make it possible to count individual mRNA molecules in fixed animals or embryos (18, 35–37) offer a promising avenue to better understand transcriptional kinetics in multicellular environments.

Supplementary Materials

Box 2


Fano factor: The ratio between the variance (standard deviation squared) and the mean of a measured quantity. The Fano factor of mRNA number per cell can serve as an estimate for the size of transcription bursts.

Gene expression noise: Variability in the level of gene expression between genetically identical cells, due in part to random fluctuations in the production of mRNA and protein.

Poisson process: The simplest random process, in which events occur with a constant probability over time.

Promoter: A region of DNA that controls transcription from a gene. An important element of eukaryotic promoters is the TATA box, which strongly affects the strength of the promoter.

Transcription bursts: The production of multiple mRNAs within a short time, followed by a period of promoter inactivity.

References and Notes

  1. Acknowledgments: We apologize to our colleagues whose work could not be cited because of space limitations. We thank the following for sharing data and insights with us: A. Bar-Even, N. Barkai, A. Boettiger, M. Dadiani, A. Depace, T. Gregor, J. Kondev, D. Larson, R. Phillips, A. Raj, E. Segal, and D. Zenklusen. I.G. thanks his laboratory members for help preparing the manuscript. Work in the Sanchez lab is supported by a Junior Fellowship from the Rowland Institute at Harvard. Work in the Golding lab is supported by the U.S. National Institutes of Health grant R01 GM082837, U.S. National Science Foundation grants 082265 (Physics Frontiers Center: Center for the Physics of Living Cells) and PHY-1147498 (CAREER), and Welch Foundation grant Q-1759. The authors declare no conflicts of interest.