Perspective: Molecular Biology

Translation Goes Global


Science  16 Dec 2011:
Vol. 334, Issue 6062, pp. 1509-1510
DOI: 10.1126/science.1216974

Global information about the extent of proteins being synthesized in a given cell has lagged behind our knowledge of genomic sequence information and RNAs. Now, a recent study highlights the extent to which this deficit is being rectified (1). Ingolia et al. expand the use of a technique they call ribosome profiling (2) to take “snapshots” of the complexity of mammalian proteomes—the constellation of proteins present in a cell or one of its compartments—focusing on messenger RNAs (mRNAs) being actively decoded to synthesize protein products and revealing several surprises.

Although the translational output of individual mRNAs in a given eukaryotic cell can vary by a factor of 10 or more, it is the ratio of multiple specific mRNAs to proteins being synthesized that is important for understanding global gene expression (3). Earlier efforts to measure this ratio included microarray analysis of RNA isolated from polysomes (mRNAs with multiple translating ribosomes) (4, 5) and translational profiling using affinity purification of epitope-tagged ribosomes (6, 7). However, these techniques do not yield the positional and quantitative information revealed by ribosome profiling. Given the number of interactions relevant to most human traits, attention is increasingly focused on regulatory rather than structural gene mutations. In this regard, the study of Ingolia et al. provides important pointers to the unexpected diversity of translational regulatory mechanisms, such as short upstream open reading frames (uORFs).

Where the ribosomes lie.

Ribosomes are “frozen” on actively translating mRNAs, after which RNase digestion (arrows) leaves small (30-nucleotide) ribosome footprints that are converted to DNA and sequenced. Aligning these footprints to the genome reveals ribosome positioning during translation. Compounds that block translation initiation cause footprints to accumulate at start codons (AUG) and uORFs within 5′ untranslated regions (UTRs). Genome-wide analysis reveals the locations of unannotated translation products, ribosomal pause sites, and an unexpected class of short, polycistronic ribosome–associated coding RNAs.

CREDIT: Y. HAMMOND/SCIENCE

Ribosome profiling has been enabled by the recent ability to sequence vast numbers of DNA fragments at once. It builds on an old discovery that the segment of mRNA contained within a ribosome can be isolated on the basis of its resistance to nucleases that destroy unprotected regions of the mRNA. Ribosome profiling takes a global snapshot of all ribosomes engaged with mRNA in a cell at one instant. It does so by quantitatively converting the nuclease-protected mRNA to DNA and sequencing all the different DNAs at once. By using appropriate inhibitors of either initiation or ribosome progression, or first the initiation inhibitor and later the progression inhibitor, it is possible to identify the translation start sites, their distribution, and the speed of translating ribosomes. The technique was initially applied in yeast, revealing an unexpected number of uORFs (2), and subsequently in mammalian cells, to study the effects of microRNAs on translation (8).
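The positional readout described above can be illustrated with a minimal sketch. It assumes the sequenced footprints have already been aligned and reduced to the transcript coordinate of each footprint's 5′ end. The function name, the coordinate convention, and the 12-nucleotide offset from the 5′ end to the decoded codon are illustrative assumptions, not details given in the study.

```python
from collections import Counter

# Assumed distance (nucleotides) from a footprint's 5' end to the codon being
# decoded. Illustrative value only; it is not taken from the article.
ASSUMED_OFFSET = 12


def codon_occupancy(footprint_starts, cds_start, cds_length, offset=ASSUMED_OFFSET):
    """Count ribosome footprints per codon of one coding sequence.

    footprint_starts: 0-based transcript coordinates of footprint 5' ends.
    cds_start: 0-based coordinate of the first nucleotide of the start codon.
    cds_length: length of the coding sequence in nucleotides.
    """
    counts = Counter()
    for start in footprint_starts:
        # Position of the decoded nucleotide relative to the start codon.
        decoded = start + offset - cds_start
        if 0 <= decoded < cds_length:
            counts[decoded // 3] += 1
    n_codons = cds_length // 3
    return [counts.get(i, 0) for i in range(n_codons)]


# Hypothetical example: four footprints on a 4-codon coding sequence.
profile = codon_occupancy([88, 88, 91, 100], cds_start=100, cds_length=12)
print(profile)  # [2, 1, 0, 0]; the last footprint falls outside the coding sequence
```

Under these assumptions, a sharp peak at one codon after treatment with an initiation inhibitor would mark a candidate start site, whereas the profile along the rest of the coding sequence reflects where elongating ribosomes sit.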

In the new study, Ingolia et al. analyzed a mouse embryonic stem cell line and assessed the disparity between the amounts of individual mRNAs and the amount of their protein products (1) (see the figure). The translation rate was generally constant at close to 6 amino acids per second, without numerous small pauses at rare codons; 1500 major pauses, defined as a dwell time up to 25 times that of a standard translation event, were found in 1100 genes. These pause sites have a characteristic 3-codon signature (Pro-Pro-Glu), and analysis of their importance may reveal critical features for the subcellular localization and functions of the encoded proteins.
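To put the quoted rate in perspective, the short calculation below converts it into time per codon and time per protein. It is a back-of-the-envelope sketch that assumes a strictly constant rate of 6 amino acids per second and treats each major pause as lasting the full 25 standard dwell times; the 300-residue protein is a hypothetical example.

```python
ELONGATION_RATE = 6.0  # amino acids per second, as quoted in the text
PAUSE_FACTOR = 25      # upper bound on pause dwell time, in standard dwell times


def dwell_time(rate=ELONGATION_RATE):
    """Seconds a ribosome spends on one codon at a constant rate."""
    return 1.0 / rate


def synthesis_time(n_residues, n_major_pauses=0, rate=ELONGATION_RATE):
    """Seconds to translate n_residues, with each pause at the 25x upper bound."""
    base = n_residues / rate
    # Each pause replaces one standard dwell with PAUSE_FACTOR dwells.
    extra = n_major_pauses * (PAUSE_FACTOR - 1) * dwell_time(rate)
    return base + extra


print(round(dwell_time(), 3))                 # about 0.167 s per codon
print(round(PAUSE_FACTOR * dwell_time(), 2))  # about 4.17 s for a maximal pause
print(round(synthesis_time(300), 1))          # 50.0 s for a hypothetical 300-residue protein
print(round(synthesis_time(300, 1), 1))       # about 54.0 s with one maximal pause
```

On these assumptions a single maximal pause adds roughly 4 seconds, a modest fraction of the total synthesis time for a protein of this size.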

A startling feature was the extent of the new translational start sites identified. Of the ∼5000 genes examined, 13,454 likely start sites [adenine-uracil-guanine (AUG)] were identified, with 65% of the mRNAs containing more than one start site and 16% with four or more. This greatly expands the list of mRNAs containing uORFs. uORFs have major regulatory functions and have been implicated in the synthesis of peptides that stimulate the immune system (9). There was a strong enrichment for start codons that, while not the canonical AUG, were related to it by single-nucleotide base substitutions. Only a small number had been identified previously, and the new number increases the potential importance of the hypothesis that start codon selection regulates the initiation of translation (10). Even among known coding sequences, 570 likely upstream starts were identified that would yield proteins with extra sequence at their amino-terminal ends. Also detected were 870 likely downstream starts that would yield amino-terminal truncations. These newly identified sites could potentially play partially competing roles. The number of upstream starts is 10 times greater than the number identified in a recent analysis based on coding sequence conservation (11), perhaps indicative of a wider regulatory role for upstream starts. The annotation implications for the human genome and regulatory studies are large. Most studies of the proteome identify products predicted to be present and rely on searches for a peptide of a specific mass. The work of Ingolia et al. expands our vision of the proteome, even though it does not address protein stability or modifications.
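The non-canonical start codons mentioned above, those related to AUG by a single-nucleotide substitution, form a small set that is easy to enumerate. The sketch below lists them; the function name is hypothetical, and the code says nothing about which of these codons were actually enriched in the data.

```python
def single_substitution_variants(codon="AUG", alphabet="ACGU"):
    """Return every codon that differs from `codon` at exactly one position."""
    variants = []
    for i, base in enumerate(codon):
        for alt in alphabet:
            if alt != base:
                variants.append(codon[:i] + alt + codon[i + 1:])
    return variants


near_cognates = single_substitution_variants("AUG")
print(len(near_cognates))  # 9
print(near_cognates)
# ['CUG', 'GUG', 'UUG', 'AAG', 'ACG', 'AGG', 'AUA', 'AUC', 'AUU']
```

Three positions with three alternative bases each give nine candidate codons, so allowing single-substitution variants makes ten possible start codons instead of one.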

Another major finding concerns the relatively recently discovered class of more than a thousand lincRNAs, large mammalian RNAs that don't contain the characteristics of known protein-coding sequences (12). Ingolia et al. show that many putative lincRNAs have successive short segments that are translated at a rate similar to comparable classical protein coding sequences. This class of lincRNAs is renamed as short polycistronic ribosome–associated coding RNAs (sprcRNAs). Still, a substantial number of lincRNAs were found not to engage ribosomes. The new information will be crucial for investigations of the linc/sprcRNAs to clarify cytoplasmic versus nuclear functions.

Ingolia et al. also address protein synthesis changes that occur when pluripotent embryonic stem cells undergo differentiation; these findings should inform studies of numerous cell types of diverse species and issues. Ribosome profiling seems poised to inaugurate a new era of studies aimed at genome-wide information on protein synthesis (GWIPS), adding a new star in the firmament of genome-wide analysis. It is a fitting way to mark the 50th anniversary of the determination of the general nature of readout of the genetic code (13) and of identification of the first code word (14).

References
