Essays on Science and Society | Cell and Molecular Biology

A key component of gene expression, revealed


Science  23 Nov 2018:
Vol. 362, Issue 6417, pp. 904
DOI: 10.1126/science.aav6875

In eukaryotes, genetic information stored in DNA is first transcribed into precursor messenger RNAs (pre-mRNAs), in which noncoding regions (introns) and coding regions (exons) alternate. Introns must be removed and exons ligated to generate a mature mRNA that contains a continuous protein-coding sequence. This is accomplished through a sophisticated process called pre-mRNA splicing, which was discovered 40 years ago (1, 2). Splicing comprises two sequential transesterification reactions, branching and exon ligation, executed by a multi-megadalton ribonucleoprotein (RNP) complex known as the spliceosome (3).

The spliceosome is composed of five uridine-rich, small nuclear RNPs (U1, U2, U4, U5, and U6 snRNPs—each of which contains one U snRNA and associated proteins), the NineTeen complex (NTC), the NTC-related complex (NTR), and various protein factors. To catalyze splicing, the spliceosome must recognize the three conserved splice sites (5′ and 3′ splice sites at exon-intron junctions and the branch point sequence) and undergo precise stepwise assembly and remodeling (Fig. 1A).

In the three decades between the discovery of the spliceosome (4–6) and the beginning of this century, scientists sought to identify its constituent components and biochemically analyze the various spliceosomal complexes by immunoprecipitation, cross-linking mass spectrometry, in vitro splicing assays, and so on. However, our understanding of the molecular mechanism was impeded by the lack of high-resolution structures of the fully assembled spliceosome.

The spliceosome exhibits exceptional conformational flexibility and compositional dynamics. Because it is far less stable than RNA polymerase or the ribosome, determining an atomic structure of the spliceosome was long thought to be an insurmountable challenge. Electron microscopy (EM) studies achieved only moderate resolutions, with the best at 20 to 29 Å, which unveiled only the overall shape and global features (7). For my Ph.D. thesis research, I chose to confront this challenging target, investigating the molecular mechanism of the pre-mRNA splicing process by capturing the atomic structures of the spliceosome using the rapidly developing cryo-EM technology.

By tagging Cdc5 with a TAP tag in Schizosaccharomyces pombe, my co-workers and I were able to purify the intact spliceosome, which eventually yielded a structure at an overall resolution of 3.6 Å (8, 9) (Fig. 1B). This structure, containing 37 proteins and four RNA molecules, reveals for the first time the organizational principles of a functional spliceosome and the arrangement of the RNA-based splicing active site (Fig. 1B, C).

The active site is located in the heart of the spliceosome, consisting of the intramolecular stem-loop (ISL) of U6 snRNA, the associated Mg2+ ions, helix I of the U2/U6 duplex, and loop I of U5 snRNA. Two catalytic metal ions, M1 and M2, are coordinated by zigzagged U6 snRNA in the active site. Thus, the spliceosome is in essence a protein-directed metalloribozyme. Analysis suggests that the structure represents a late-stage spliceosome, known as the intron lariat spliceosome or ILS complex, because exon sequences are absent and the lariat junction lies 20 Å away from the catalytic ions.

To further investigate the mechanism of spliceosome assembly and catalysis, we sought to capture the structures of the spliceosome in other working states by tagging different protein components and overexpressing adenosine triphosphatase (ATPase)–defective mutants. Prior to my thesis defense, we solved the atomic or near-atomic structures of nearly all key intermediates of the spliceosome, including pre-B and B (10), Bact (11), C (12), C* (13), P (14), and ILS complexes (15) from S. cerevisiae (Fig. 1D). We also determined the structure of the preassembled U4/U6.U5 tri-snRNP from S. cerevisiae (16). Local resolution of all the structures in the core regions reaches at least 3.0 Å, allowing us to accurately identify the catalytic metal coordination during the splicing reaction (Fig. 1E).

Molecular mechanism of pre-mRNA splicing revealed by structures of the spliceosome at different states. (A) A scheme of the assembly and catalysis cycle of the spliceosome. This process includes four phases: assembly based on the pre-mRNA substrate, activation to prepare for the reaction, two-step splicing reaction, and disassembly of the spliceosome. State transition of the spliceosome is mainly driven by eight DExD/H ATPases/helicases (colored red). (B) The cryo-EM structure of the ILS complex from S. pombe (PDB code 3JB9). (C) A close-up view of the active site of S. pombe ILS complex. (D) Cryo-EM structures of the U4/U6.U5 tri-snRNP, pre-B, B, Bact, C, C*, P, and ILS complex from S. cerevisiae. Structures are color-coded by subcomplexes related to (A). RNA components are highlighted in cartoon style. The orientation and scale are presented the same for comparison. (E) Coordination of catalytic metal ions during splicing reaction (from Bact to ILS complex, B* is predicted). The RNA elements are shown in the same orientations.


These structures provide unprecedented clarity toward a mechanistic understanding of pre-mRNA splicing. It appears that the spliceosome catalyzes both steps of the splicing reaction using a single active site, which remains almost unchanged once formed in the Bact complex. The U6 snRNA coordinates the catalytic metal ions and plays a notable role in catalyzing the two steps of splicing. U1, U2, U5, and U6 snRNAs serve to recognize and transfer the RNA elements on the pre-mRNA, including the 5′ and 3′ splice sites, the branch point sequence (BPS), and the 5′ exon (Fig. 1D). U4 snRNA plays an important role in sequestering U6 snRNA before the reaction.

Throughout the splicing reaction, at least 20 components (mainly from U6 snRNA, U5 snRNP, NTR, and a small portion of U2 snRNA and NTC) act as a rigid scaffold, maintaining the conformation of the active site. The 3′ end of U2 snRNA and ∼13 proteins (part of U2 snRNP and NTC) are compositionally unchanged but mobile, facilitating the delivery of critical reaction groups on the RNA molecules into the active site for the two-step reactions. The compositional change and remodeling of the spliceosome are mainly driven by ATPases/helicases and their cofactors. Thus, a near-complete cycle of pre-mRNA splicing can be recapitulated in atomic detail (Fig. 1).

As an essential step in eukaryotic gene expression, splicing is relevant to many diseases (17). For example, Hsh155 is frequently mutated in hematological malignancies (18). Based on the structure of the Bact complex, we mapped 36 cancer-derived mutations to Hsh155. Most of these residues bind the region upstream of the BPS, so mutating them would compromise binding and adversely affect splicing.

Forty years after the discovery of pre-mRNA splicing, we have finally begun to understand its underlying structural mechanism. This work will enable the understanding of the intricacies of an essential biological process. It will also aid drug discovery for numerous diseases that are linked to splicing malfunction, including many cancers, blindness, and other genetic diseases. The next challenge is to investigate the more complex regulatory mechanisms of the spliceosome, such as alternative splicing.



Ruixue Wan

Ruixue Wan received her undergraduate degree from Sun Yat-sen University in Guangzhou, China, and her Ph.D. from Tsinghua University in Beijing. She is currently a postdoctoral fellow at Tsinghua, where she is conducting research on structural and biochemical investigations of the spliceosome to elucidate the mechanism of pre-mRNA splicing and the regulation of the spliceosome.

References and Notes

Acknowledgments: I am extremely grateful to my Ph.D. adviser, Y. Shi, for his mentorship. I also thank all of the members of the Shi lab, both past and present, especially L. Zhou and R. Bai.
