Probing Gene Expression in Live Cells, One Protein Molecule at a Time


Science  17 Mar 2006:
Vol. 311, Issue 5767, pp. 1600-1603
DOI: 10.1126/science.1119623


We directly observed real-time production of single protein molecules in individual Escherichia coli cells. A fusion protein of a fast-maturing yellow fluorescent protein (YFP) and a membrane-targeting peptide was expressed under a repressed condition. The membrane-localized YFP can be detected with single-molecule sensitivity. We found that the protein molecules are produced in bursts, with each burst originating from a stochastically transcribed single messenger RNA molecule, and that protein copy numbers in the bursts follow a geometric distribution. The quantitative study of low-level gene expression demonstrates the potential of single-molecule experiments in elucidating the workings of fundamental biological processes in living cells.

The central dogma of molecular biology states that DNA is transcribed into mRNA, which is then translated into protein. Ever since the pioneering work on the lac operon (1), our knowledge of gene expression has come primarily from genetic and biochemical studies (2–4) conducted with large populations of cells and molecules. Recently, many in vitro single-molecule experiments have probed real-time dynamics and yielded valuable mechanistic insights into macromolecules (5–8), including transcriptional (9) and translational (10) machineries. In order to understand the workings of these machineries in their physiological contexts, we set out to probe gene expression at the single-molecule level by real-time monitoring of protein production in live cells.

Gene expression is often stochastic (11–14), because most genes exist at single or low copy numbers in a cell. Some genes are expressed at high levels and others at low levels. Expression of mRNA can now be tracked in a single cell with single-molecule sensitivity (15, 16). Protein expression has traditionally been characterized by averages of cell populations, in which stochasticity is masked. More information is available from both the distribution of expression levels among a cell population (17–19) and the temporal evolution of a single cell by using fluorescent reporters (20). However, these studies have been restricted to high expression levels because of the low sensitivity for protein detection, yet many important proteins are produced at small copy numbers (21, 22). Here, we demonstrate probing protein expression in individual Escherichia coli cells under the control of a repressed lac promoter, one molecule at a time (23).

The most popular reporters for monitoring gene expression in live cells are green fluorescent protein (GFP) and its derivatives, such as yellow fluorescent protein (YFP) (24–26). We use a YFP variant, Venus, as the reporter because of its short maturation time (27). However, it is difficult to image a single GFP or YFP molecule in cytoplasm, because its fluorescence signal spreads to the entire cytoplasm by fast diffusion during the image acquisition time and is overwhelmed by cellular autofluorescence. On the other hand, single YFP fusion protein molecules on cell membranes can be detected (28, 29) because their diffusion is slowed. Therefore, we designed a fusion protein consisting of Venus and a membrane protein, Tsr, as the reporter for monitoring lac promoter activity. A well-studied methyl-accepting chemotaxis protein (MCP) (30), Tsr contains two transmembrane domains and is fused to the N terminus of Venus.

We constructed an E. coli strain SX4 in which a single copy of the chimeric gene tsr-venus was incorporated into the E. coli chromosome, replacing the native lacZ gene. The endogenous tsr gene of E. coli was left intact. Because the tsr gene is highly expressed (30), a small amount of exogenous Tsr-Venus poses minimal perturbation to cells' normal functions. Western assay of induced SX4 cells showed the presence of Venus only in the membrane fraction and not in the cytoplasmic fraction, suggesting efficient membrane localization of Tsr-Venus. We also compared the levels of induced expression of Tsr-Venus and Venus in two strains, both under the control of the lac promoter [Supporting Online Material (SOM) Text and fig. S1]. No notable difference was observed, indicating that the introduction of the tsr sequence does not change the yield of Venus production, which is not the case for many other membrane-targeting sequences that we tested.

We first show the ability to detect single Tsr-Venus fluorescent protein molecules expressed in SX4 cells (Fig. 1). Figure 1A shows two diffraction-limited fluorescent spots [full width at half maximum (FWHM) ∼ 300 nm] in the left cell. A line cross section of the fluorescence image along the cells' long axes shows the signal distinctly above the cells' autofluorescence background (Fig. 1C). We attribute each fluorescent peak to an individual Tsr-Venus molecule on the basis of abrupt disappearance of the signal upon photobleaching, which is characteristic of single molecules. Figure 1D shows such a photobleaching time trace. Had the signal arisen from multiple molecules, its disappearance would be in multiple steps. In addition, the fluorescence intensity of each peak is consistent with in vitro measurements of purified single Venus molecules (fig. S2).

Fig. 1.

Single-molecule detection of a fluorescent fusion protein, Tsr-Venus, in live E. coli cells. (A) Fluorescence and (B) DIC images of two E. coli cells (strain SX4) expressing Tsr-Venus. Two single fusion protein molecules were detected as diffraction-limited fluorescent spots (FWHM at ∼300 nm) in the left cell. The fluorescence image is taken with 514-nm laser excitation and a 100-ms exposure time at 0.3 kW/cm². (C) Line cross section of the fluorescence signal along long axes of the two E. coli cells. a.u., arbitrary units. (D) Fluorescence time trace of a single Tsr-Venus molecule in an E. coli cell, showing abrupt photobleaching (40-ms exposure at 0.5 kW/cm²).

A sketch of our live-cell experiment is shown in Fig. 2. Upon an infrequent and spontaneous dissociation event of the repressor from the operator region of DNA, transcription by RNA polymerase is initiated, generating one mRNA molecule. A few ribosome molecules bind to the mRNA, producing a burst of fusion protein molecules. These molecules can be detected after the completion of their assembly process, including protein folding, incorporation onto the inner cell membrane, and maturation of the Venus fluorophore. Meanwhile, the repressor quickly rebinds to the operator under the highly repressing condition until the next event of protein production.

Fig. 2.

Experimental design for live-cell observations of gene expression. Tsr-Venus is expressed under the control of lac repressor, which binds tightly to the lac operator on DNA. Transcription of one mRNA by an RNA polymerase results from an infrequent and transient dissociation event of repressor from DNA. Multiple copies of protein molecules are translated from the mRNA by ribosomes. Upon being assembled into E. coli's inner membrane, Tsr-Venus protein molecules can be detected individually by a fluorescence microscope.

An epifluorescence microscope and a charge-coupled device (CCD) camera were used to image Venus with 514-nm laser excitation, while differential interference contrast (DIC) images were taken simultaneously to record the cell contours during growth. To count the fusion protein molecules as they were continuously generated, we photobleached the Venus fluorophores after their detection. Specifically, we applied a 1200-ms laser exposure every 3 min. The laser power used was 0.3 kW/cm², at which the sample photobleaching time constant is ∼250 ms. Fluorescence images were recorded only in the first 100 ms, during which photobleaching is minimal, and were discarded in the following 1100 ms in order to avoid variation in the integrated signal and reduction of the signal-to-background ratio due to photobleaching. The 3-min dwell time, which defines the temporal resolution, was chosen to avoid photodamage to the cells. Cells grown under such laser illumination in a temperature-controlled sample chamber have an average cell division time of τcell = 55 min, the same τcell as in shaking M9 liquid culture without laser illumination. In each image, one cell usually produces no more than five fluorescent protein molecules, which can be spatially resolved. In principle, higher expression levels can be quantified with integrated fluorescence signals.
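The timing of this illumination protocol can be checked with a short calculation. The sketch below assumes that photobleaching follows a single-exponential decay with the quoted ∼250-ms time constant (an idealization; the function and constant names are ours, not from the original work). It estimates the fraction of fluorophores still unbleached at the end of the 100-ms imaging window and at the end of the full 1200-ms exposure.

```python
import math

TAU_BLEACH_MS = 250.0   # photobleaching time constant at 0.3 kW/cm2 (from the text)
IMAGING_MS = 100.0      # window during which images are recorded (from the text)
EXPOSURE_MS = 1200.0    # total laser exposure per 3-min cycle (from the text)


def surviving_fraction(t_ms, tau_ms=TAU_BLEACH_MS):
    """Fraction of fluorophores not yet bleached after t_ms.

    Assumes single-exponential photobleaching, which is a simplification.
    """
    return math.exp(-t_ms / tau_ms)


after_imaging = surviving_fraction(IMAGING_MS)
after_exposure = surviving_fraction(EXPOSURE_MS)

print(f"unbleached after {IMAGING_MS:.0f} ms: {after_imaging:.2f}")
print(f"unbleached after {EXPOSURE_MS:.0f} ms: {after_exposure:.4f}")
```

Under this assumption, fewer than 1% of the fluorophores survive the full exposure, so a molecule counted in one frame is unlikely to be counted again in the next.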

The activity of the lac promoter in the SX4 strain under the highly repressing condition was monitored for cells immobilized by an agarose gel pad of M9 media maintained at 37°C through several cell cycles, and time-lapse movies were recorded (movies S1 and S2). A sequence of images from one of them is shown in Fig. 3A. In each fluorescence image, the fluorescent spots correspond to newly synthesized fluorescent molecules during the last 3 min. Although Tsr is known to cluster at one of the cell poles (SOM Text and fig. S6), we found that Tsr-Venus protein molecules initially land on random positions in the membrane and migrate to and cluster at the cell poles at a longer time scale (SOM Text and fig. S6). Time traces of the fluorescent protein molecules along cell lineages were extracted from the time-lapse fluorescence-DIC movies (Fig. 3B). More than 60 time traces have been collected for statistical analyses.

Fig. 3.

Real-time monitoring of the expression of tsr-venus under the control of repressed lac promoter. (A) Sequence of fluorescent images (yellow) overlaid with simultaneous DIC images (gray) of E. coli cells expressing Tsr-Venus on agarose gel pad of M9 medium. The cell cycle is 55 ± 10 min in a temperature-controlled chamber on a microscope stage. The eight frames are from time-lapse fluorescence movie S1 taken over 195 min with 100-ms laser exposures (0.3 kW/cm²) every 3 min. An 1100-ms exposure is applied after each image collection to photobleach the Venus fluorophores. (B) Time traces of the expression of Tsr-Venus protein molecules (left) along three particular cell lineages (right) extracted from the time-lapse fluorescence-DIC movie of (A). The time resolution is 3 min. The vertical axis is the number of protein molecules newly synthesized during the last three minutes. The dotted lines mark the cell division times. The time traces show that protein production occurs in random bursts, within which variable numbers of protein molecules are generated. Each gene expression burst lasts ∼3 to 15 min.

Several qualitative features are evident from these time traces. First, protein molecules are generated in bursts. Second, the number of protein molecules in each burst varies. Third, the bursts exhibit particular temporal spreads. Analysis of the data allows us to address the following four questions: Do these gene expression bursts occur randomly in time? How many mRNA molecules are responsible for each gene expression burst under the repressed condition? What is the distribution of the number of protein molecules in each burst? And what is the origin of the temporal spread of the individual bursts?

To address the first question, we show (Fig. 4A) the distribution of the number of gene expression bursts per cell cycle for all cells. The histogram is well fit with a Poisson distribution, which suggests that gene expression bursts occur randomly and are uncorrelated in time. We also observed a weak cell cycle dependence of the burst frequency (fig. S3), which might arise from an increase of gene copy number associated with DNA replication during cell growth and does not change the Poissonian distribution. The average number of bursts is nburst = 1.2 per cell cycle, yielding an average time of 46 min between two adjacent bursts. This time is comparable to in vitro dissociation times of lac repressor from lac operator O1 (20 to 50 min) (31, 32). However, the dissociation time in live cells can be different, and not every repressor dissociation event necessarily leads to successful transcription, because of either temporary unavailability of RNA polymerase or failed transcription.
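As an illustration of the numbers quoted above, the following sketch evaluates a Poisson distribution with a mean of 1.2 bursts per cell cycle and recovers the 46-min mean interval from the 55-min division time. It uses only the values stated in the text; the function name is ours, and the probabilities are model predictions, not the measured histogram of Fig. 4A.

```python
import math

N_BURST = 1.2        # average bursts per cell cycle (from the text)
TAU_CELL_MIN = 55.0  # average cell division time in minutes (from the text)


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


# Mean waiting time between bursts if bursts are spread over the cell cycle.
mean_interval_min = TAU_CELL_MIN / N_BURST
probabilities = [poisson_pmf(k, N_BURST) for k in range(6)]

print(f"mean time between bursts: {mean_interval_min:.0f} min")
for k, p in enumerate(probabilities):
    print(f"P({k} bursts per cell cycle) = {p:.3f}")
```

With a mean of 1.2, this model predicts that about 30% of cell cycles show no burst at all.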

Fig. 4.

Statistical analyses of the protein production time traces. (A) Histogram (gray bars) of the number of expression events per cell cycle. The data fit well to a Poisson distribution (solid line) with an average of 1.2 gene expression bursts per cell cycle. (B) Distribution of the number of fluorescent protein molecules detected in each gene expression burst, which follows a geometric distribution (solid line), giving a probability of ribosome binding of 0.81 ± 0.05 and an average number of molecules per burst of 4.2. (C) Autocorrelation function of the protein production time traces calculated according to Eq. S9. The result is averaged from 30 individual cell lineages because of the insufficient statistics of a single time trace. The fitting to Eq. 2 (solid line) gives 1/κ = 7.0 ± 2.5 min, which is attributed to posttranslational assembly of the fluorescent fusion protein.

In order to answer whether the bursts arise from one copy or multiple copies of mRNA, we determined the average number of mRNA molecules per burst (m) according to m = (nmRNA × τcell)/(nburst × τmRNA), where nmRNA is the steady-state abundance of tsr-venus mRNA molecules averaged over a cell population, τcell is the average cell division time, nburst is the average number of expression bursts per cell division cycle, and τmRNA is the cellular lifetime of the tsr-venus mRNA. By using real-time reverse transcription polymerase chain reaction (RT-PCR), we obtained nmRNA = 0.037 ± 0.013 for SX4 cells (table S1 and fig. S5). We also measured the cellular lifetime of tsr-venus mRNA to be τmRNA = 1.5 ± 0.2 min (fig. S4) by a real-time RT-PCR assay after a pulse induction of the mRNA production. It follows that m = 1.14 ± 0.42 (a 95% confidence interval; see SOM Text for more details). We thus conclude that under the repressed condition each gene expression burst results from one mRNA molecule, implying that Lac repressor quickly rebinds the exposed operator region of DNA, allowing transcription initiation of one mRNA molecule.
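The estimate of m can be reproduced from the rounded values quoted in the text. The reasoning behind the formula is a steady-state balance: the mRNA production rate needed to sustain the measured abundance is nmRNA/τmRNA, and the burst rate is nburst/τcell, so their ratio is the number of mRNA molecules per burst. The sketch below is our own illustration with hypothetical names; it does not reproduce the error analysis of the SOM Text.

```python
N_MRNA = 0.037       # steady-state tsr-venus mRNA copies per cell (from the text)
TAU_CELL_MIN = 55.0  # average cell division time, min (from the text)
N_BURST = 1.2        # average bursts per cell cycle (from the text)
TAU_MRNA_MIN = 1.5   # cellular mRNA lifetime, min (from the text)


def mrna_per_burst(n_mrna, tau_cell, n_burst, tau_mrna):
    """Average number of mRNA molecules per burst.

    n_mrna / tau_mrna is the mRNA production rate at steady state;
    n_burst / tau_cell is the burst rate. Their ratio is m.
    """
    production_rate = n_mrna / tau_mrna
    burst_rate = n_burst / tau_cell
    return production_rate / burst_rate


m = mrna_per_burst(N_MRNA, TAU_CELL_MIN, N_BURST, TAU_MRNA_MIN)
print(f"m = {m:.2f}")
```

With these rounded inputs the result is about 1.13, close to the reported 1.14 ± 0.42; the small difference presumably reflects rounding of the quoted values.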

Next, we show the histogram of the number of protein molecules (n) produced per mRNA molecule (Fig. 4B). The distribution fits well with a single exponential decay, which is termed the geometric distribution for integer n. This distribution arises from the stochastic cellular lifetime of an mRNA molecule with mean τmRNA = 1.5 min (fig. S4) due to degradation of mRNA by ribonuclease (RNase) E, which competes with ribosomes for mRNA binding (33). It was shown theoretically that the probability of generating n protein molecules from one mRNA follows a geometric distribution (11, 12, 34): $P(n) = \rho^{n}(1 - \rho)$ (1), where ρ is the probability of the ribosome binding and 1 – ρ is the probability of RNase E binding to the overlapping site on mRNA (33). We found that this model is consistent with our single-molecule measurement. Data fitting of Fig. 4B to Eq. 1 yields ρ = 0.8 ± 0.1 and an average of 4.2 ± 0.5 molecules produced per burst. This number multiplied by the number of bursts per cell cycle (1.2) results in a steady-state protein abundance of 5.0 ± 0.8 molecules per cell in a cell population. Consistently, we experimentally measured this abundance to be 4.1 ± 1.8 molecules per cell by counting the molecules in ∼300 individual cells under the microscope at the same time.
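The relation between ρ and the mean burst size can be made concrete with a few lines of code. The sketch below assumes the standard geometric form P(n) = ρⁿ(1 − ρ) for n = 0, 1, 2, …, whose mean is ρ/(1 − ρ); the helper names are ours. It checks that ρ near 0.8 gives a mean of about 4 molecules per burst, and it repeats the multiplication of burst size by burst frequency described in the text.

```python
RHO = 0.8               # fitted probability of ribosome binding (from the text)
MEAN_BURST_SIZE = 4.2   # reported average molecules per burst (from the text)
BURSTS_PER_CYCLE = 1.2  # average bursts per cell cycle (from the text)


def geometric_pmf(n, rho):
    """P(n) = rho**n * (1 - rho): n ribosome bindings, then one RNase E binding."""
    return rho**n * (1.0 - rho)


def geometric_mean(rho):
    """Mean number of proteins per mRNA for the distribution above."""
    return rho / (1.0 - rho)


def rho_from_mean(mean):
    """Invert mean = rho / (1 - rho) to recover rho."""
    return mean / (1.0 + mean)


abundance = MEAN_BURST_SIZE * BURSTS_PER_CYCLE

print(f"mean burst size for rho = {RHO}: {geometric_mean(RHO):.1f}")
print(f"rho implied by a mean of {MEAN_BURST_SIZE}: {rho_from_mean(MEAN_BURST_SIZE):.2f}")
print(f"molecules per cell cycle: {abundance:.1f}")
```

A mean of 4.2 corresponds to ρ ≈ 0.81, which matches the value given in the caption of Fig. 4B and lies within the quoted uncertainty of 0.8 ± 0.1.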

Lastly, the temporal spread of the expression bursts can be characterized from the autocorrelation function of the fluctuation in protein expression, C⁽²⁾(τ) (Fig. 4C), averaged from 30 different cell lineages from 15 different movies. The single exponential fit of C⁽²⁾(τ) gives a decay time constant of 7.0 ± 2.5 min, corresponding to the average spread of the stochastic arrival times of fluorescent reporter proteins within a burst, despite the fact that the polypeptides are generated within the short lifetime of an mRNA (τmRNA = 1.5 min). We show (SOM Text) that, under the condition that there is one rate-limiting step for the posttranslation assembly of the fusion protein, C⁽²⁾(τ) is a single exponential that decays with rate κ and has an amplitude determined by s and ρ (Eq. 2), where s is the average rate of the expression burst and κ is the rate constant of the Tsr-Venus assembly process, consisting of protein folding, incorporation into the inner cell membrane, and fluorophore maturation. The fitting of Fig. 4C with Eq. 2 gives s = (29 ± 8 min)⁻¹, in agreement with the average number of expression bursts per cell cycle of 1.2 ± 0.3 (Fig. 4A); ρ = 0.7 ± 0.1, consistent with the value of 0.8 ± 0.1 determined from Fig. 4B; and 1/κ = 7.0 ± 2.5 min, corresponding to the rate-limiting step of the protein assembly process. Considering the fast transcription (∼45 bases/s) and translation (∼15 residues/s) rates, we tentatively assign 1/κ to the fluorophore maturation process (SOM Text). Although we can only spatially resolve a few molecules within an E. coli cell because of the diffraction limit, the long spread of the stochastic arrival times of Venus allows many more protein molecules per expression burst to be counted in several consecutive images.
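The picture described above (random bursts, a geometric number of molecules per burst, and a single rate-limiting assembly step that spreads arrivals over several frames) can be sketched as a toy simulation. This is our own illustration, not the analysis of the SOM Text. It assumes that bursts arrive as a Poisson process with a 46-min mean interval, that all polypeptides of a burst are made at the burst time (the 1.5-min mRNA lifetime is ignored), and that each molecule becomes detectable after an exponentially distributed delay with mean 1/κ = 7 min. All names are hypothetical.

```python
import random

FRAME_MIN = 3.0      # imaging interval, min (from the text)
MEAN_GAP_MIN = 46.0  # mean time between bursts, min (from the text)
RHO = 0.8            # ribosome-binding probability (from the text)
ASSEMBLY_MIN = 7.0   # 1/kappa, assembly time constant, min (from the text)


def simulate_trace(total_min, rng):
    """Return the number of newly detectable molecules in each 3-min frame."""
    n_frames = int(total_min // FRAME_MIN)
    counts = [0] * n_frames
    t = rng.expovariate(1.0 / MEAN_GAP_MIN)  # time of the first burst
    while t < total_min:
        # Geometric burst size: each round, a ribosome binds with probability RHO.
        n = 0
        while rng.random() < RHO:
            n += 1
        for _ in range(n):
            # Each molecule is detected after an exponential assembly delay.
            arrival = t + rng.expovariate(1.0 / ASSEMBLY_MIN)
            frame = int(arrival // FRAME_MIN)
            if frame < n_frames:
                counts[frame] += 1
        t += rng.expovariate(1.0 / MEAN_GAP_MIN)  # waiting time to the next burst
    return counts


rng = random.Random(0)  # fixed seed so the sketch is reproducible
trace = simulate_trace(195.0, rng)
print(trace)
```

In traces generated this way, the molecules of one burst typically appear over several consecutive frames, which is the qualitative behavior that makes it possible to count more molecules per burst than can be spatially resolved in a single image.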

Gene expression, central to life processes, is intrinsically a stochastic process involving low copy number of biomolecules. Our real-time assay allows probing of low copy number proteins in single live cells not accessible by current technologies. This approach, together with other emerging single-molecule techniques (35), will yield further insight into not only gene expression but also other fundamental biological processes.

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S6

Table S1


Movies S1 and S2

References and Notes
