Research Articles

Crystal structures of a group II intron lariat primed for reverse splicing

Science  02 Dec 2016:
Vol. 354, Issue 6316, aaf9258
DOI: 10.1126/science.aaf9258

Tie me up, cut me down

Group II introns are mobile genetic elements found in all domains of life. They are large ribozymes that can excise themselves from host RNA. Costa et al. determined the structure of an excised group II intron in its branched conformation. This conformation is comparable to the branched “lariat” seen during the splicing of nuclear RNA transcripts. The lariat conformation helps assemble the group II active site for the reverse splicing reaction. The lariat in spliceosomal splicing may also have a similar role in the second step of messenger RNA intron removal.

Science, this issue p. 10.1126/science.aaf9258

Structured Abstract


Self-splicing group II introns are catalytic RNAs (ribozymes) that can excise by themselves from precursor RNA molecules. These ribozymes are widespread in the bacterial world and can also be found in the bacterial-derived organelles (mitochondria and chloroplasts) of some higher organisms. Group II self-splicing is believed to have evolved into nuclear pre-mRNA splicing, a fundamental step in the expression of eukaryotic genes during which a large ribonucleoprotein machinery (the spliceosome) catalyzes the removal of introns from nascent premessenger transcripts. Both group II and pre-mRNA splicing proceed via two sequential phosphoryl transfer reactions. First, a 2′-5′ phosphodiester bond is created between a conserved intron adenosine and the first intron nucleotide. The resulting splicing intermediate with a branched conformation is called a “lariat.” In a second step, completion of splicing leads to the ligation of the flanking 5′ and 3′ exons and the release of the intron lariat. Bacterial group II introns are composite elements that, in addition to their ribozyme core, carry an open reading frame encoding a reverse transcriptase (RT) enzyme. In association with their RT, freed group II intron lariats behave as mobile genetic elements that colonize genomes through retrotransposition. The group II mobility pathway is initiated by “reverse splicing” of the intron lariat into a DNA target, followed by synthesis of a DNA copy of the integrated intron by the RT enzyme.


Eukaryotic pre-mRNA splicing relies entirely on formation of the 2′-5′ branch structure. In group II introns, the same branch structure is required for efficient and faithful catalysis of the second step of splicing and for complete, accurate reverse splicing of the intron into DNA targets during intron mobility. Moreover, in both splicing systems, the branched nucleotides must translocate within the active site between the two steps of splicing. To understand the molecular mechanisms at play, we crystallized the lariat form of a group II intron either alone or bound to a nonreactive analog of the 5′ exon.


Our crystal structures at 3.4 and 3.5 Å resolution reveal that the 2′-5′ branched nucleotides are part of a network of hydrogen bonds and stacking interactions that involve highly conserved nucleotides at the intron core and boundaries. The resulting architecture organizes the second-step active site by juxtaposing the intron 5′ and 3′ ends and promotes positioning of the last intron nucleotide into the catalytic center. After the ligated exons have been released, the terminal ribose of the lariat intron remains docked in the reaction center, with its 3′-hydroxyl group activated by a highly coordinated metal ion and poised for catalysis of the reverse-splicing reaction. Stable docking of the 2′-5′ branch structure into the active site is promoted by a rearrangement of the base-pairing pattern within the helix that contains the adenosine branchpoint. This rearrangement operates between the two steps of splicing and is essential for recognition of the proper 3′ splice site. Comparison of lariat structures in the presence and absence of the 5′ exon reveals that substrate binding results in an induced fit that extends into the catalytic center and contributes to coordination of a second catalytic metal ion. This “exon-sensing” device could ensure that catalysis of reverse splicing is dependent on the accuracy of intron-exon pairings.


The present crystal structures bring to light the crucial role of the 2′-5′ branch in organizing the lariat intron catalytic site for efficient and accurate ligation of the flanking exons during the last stage of splicing. Making use of the branch structure to build the second-step active site results in coupling the two steps of splicing and contributes decisively to the fidelity of the overall process. Moreover, the presence of the 2′-5′ branch locks the active site into a near-transition-state configuration for catalysis of reverse splicing, which must have contributed to selection of the lariat conformation during the evolution of mobile group II introns. As all nucleotides involved in the catalytic center have potential homologs in the spliceosomal system, a group II-based model in which the 2′-5′ branch fulfills the same organizational role is proposed for the spliceosome second-step active site. As in group II introns, the postulated architecture implies a notable conformational rearrangement of the spliceosome active center between the two steps of pre-mRNA splicing. Our findings rationalize the extreme conservation of the branched conformation both during the diversification of group II introns and along the evolutionary path that gave rise to the nuclear pre-mRNA splicing apparatus of eukaryotes.

Structural basis for reverse splicing by group II introns.

Self-splicing group II introns and their evolutionary descendants, the spliceosomal introns of eukaryotes, are excised as branched molecules (lariats) with a 2′-5′ phosphodiester bond. The 2′-5′ branch organizes the lariat active site, priming it for catalysis of splicing and reverse splicing. Reverse splicing into DNA is used to initiate group II intron mobility in bacteria.


The 2′-5′ branch of nuclear premessenger introns is believed to have been inherited from self-splicing group II introns, which are retrotransposons of bacterial origin. Our crystal structures at 3.4 and 3.5 angstrom of an excised group II intron in branched (“lariat”) form show that the 2′-5′ branch organizes a network of active-site tertiary interactions that position the intron terminal 3′-hydroxyl group into a configuration poised to initiate reverse splicing, the first step in retrotransposition. Moreover, the branchpoint and flanking helices must undergo a base-pairing switch after branch formation. A group II–based model of the active site of the nuclear splicing machinery (the spliceosome) is proposed. The crucial role of the lariat conformation in active-site assembly and catalysis explains its prevalence in modern splicing.

Group II introns are large ribozymes (catalytic RNAs) with the capacity to self-excise from their host precursor RNAs in vitro (1). The group II ribozyme, which is composed of six structural domains (DI to DVI; fig. S1), self-splices through two consecutive transesterification reactions (Fig. 1). First, a 2′-5′ phosphodiester bond is formed between the 2′-hydroxyl group of an adenosine located in DVI and the first intron nucleotide. The resulting splicing intermediate with a branched conformation is called a “lariat.” Subsequently, the terminal 3′-hydroxyl group of the 5′ exon attacks the 3′ splice site, generating the ligated exons and freeing the intron lariat. Although splicing may occasionally be initiated by hydrolysis, which leads to the release of the excised intron in linear instead of branched form, the use of a branched splicing intermediate is the hallmark of group II introns. Because the spliceosome, a large and dynamic ribonucleoprotein machinery, uses the same reaction mechanism to remove introns from pre-mRNA transcripts in the nucleus of eukaryotes, group II and spliceosome-catalyzed splicing are widely believed to share a common origin [reviewed in (2)].

Fig. 1 Splicing and reverse-splicing pathways catalyzed by group II introns.

Red wavy line, group II intron; blue and pink boxes, 5′ and 3′ exons (for splicing) or DNA sequences flanking the target site (for reverse splicing). Conserved nucleotides at the branchpoint and intron boundaries are shown. Splicing proceeds through two consecutive transesterification reactions. Dashed lines and arrowheads indicate nucleophilic attack at each reactional step, diamonds at intron ends stand for reactive phosphate groups, and the 2′-5′ branch is highlighted by a green background. The group II splicing pathway as depicted here is valid for nuclear pre-mRNA splicing catalyzed by the major spliceosome subtype, except that the latter’s intron substrates end with a G. The freed group II lariat intron is ready to undergo reverse splicing, in which the two steps of splicing are performed in the opposite direction. The intron lariat in our crystal structures is in the right conformation to carry out the first step of reverse splicing, here shown on a blue background. For clarity, the intron-encoded reverse transcriptase that assists splicing and reverse splicing in vivo by stabilizing the ribozyme is not depicted.

Group II introns are widespread in the bacterial world. Most bacterial group II introns encode a multifunctional reverse transcriptase that associates with the intron to promote its genomic mobility through retrotransposition, a pathway initiated by “reverse splicing” (3). The latter process, by which the freed intron lariat catalyzes its own insertion into a DNA target (Fig. 1), rests on the chemical reversibility of the two transesterification reactions of splicing. Introns in linear form, which lack the 2′-5′ bond, require host functions to complete reverse splicing and insert into DNA: They transpose much less efficiently and in a predominantly imprecise manner when tested in vivo (4).

Aside from its contribution to retrotransposition, the 2′-5′ branch is also important for the second step of splicing, because branched intron–3′ exon reaction intermediates are much more efficient than their linear counterparts at carrying out exon ligation (5). Despite the central role of the lariat conformation, the exact arrangement of nucleotides that participate in and surround the group II 2′-5′ branch has nonetheless remained elusive. Available x-ray structures of the Oceanobacillus iheyensis group II ribozyme (6, 7) correspond to a linear form of the intron truncated at its 3′ end; they lack the small helical DVI that carries the branchpoint adenosine (bpA). Structures of lariat introns were more recently generated by crystallography (8) and cryogenic electron microscopy (cryo-EM) (9), but in both of them, the resolution of the intron active site is insufficient to identify some essential nucleotides and ascertain that they are correctly assembled.

To understand the structural basis for the supremacy of a 2′-5′ branched structure in group II splicing, we crystallized the lariat form of a chimeric group II ribozyme derived from the O. iheyensis intron. Like most members of structural subclass IIC (3), the wild-type O. iheyensis ribozyme self-splices exclusively through 5′ splice-site hydrolysis in vitro (6), resulting in linear molecules. However, we recently succeeded in activating the branching pathway for O. iheyensis intron constructs in which DVI and its first-step, ι (iota) binding site (10) in substructure IC1, had been replaced by their counterparts in Azotobacter vinelandii intron I2 (construct Oc19; fig. S1B) (11). The intron lariat excised from the Oc19 chimeric precursor by in vitro splicing was purified under denaturing conditions and subsequently refolded and crystallized, either alone or in the presence of a nonreactive 5′-exon analog RNA (12). The two structures were solved at 3.4 and 3.5 Å resolution, respectively, by molecular replacement (12), which allowed us to locate DVI in both electron density maps and unambiguously model the active-site region.

The 2′-5′ branch organizes the reactive site

The Oc19 lariat crystallized alone or in the presence of an unreactive 5′ exon exhibits the same overall folding of intron subdomains around catalytic DV (Fig. 2A) as the previously published structures of the O. iheyensis ribozyme that lacked DVI (6). In particular, the “catalytic triplex,” formed in the major groove of DV by the “catalytic” components of this domain (C358, G359, and C377; fig. S1) and G288 and C289 of the J2/3 strand, displays the same configuration as in the absence of DVI.

Fig. 2 Location of DVI and tertiary structure of the active site of the crystallized lariat.

(A) Overall three-dimensional structure of the Oc19 lariat showing the position of DVI relative to other intron components (color coding as in fig. S1A). (B) Close-up view of DV (red), DVI (purple), the intron 5′-terminal segment (green), and J2/3 linker (blue). (C) Base-stacking array that connects the branched nucleotides to the reaction center. M1, M3, and M4 are metal ions (fig. S3E); dashed yellow and brown lines indicate direct metal ion coordination and hydrogen bonding, respectively. pnt, penultimate intron nucleotide. (D) Base-base interactions that organize the active site. Dotted black lines are hydrogen bonds. Coloring by atom is superimposed onto coloring by domains in the last two panels.

In both our lariat structures, DVI is not bound to its ι receptor (fig. S1A). Instead, its basal section (the one immediately next to the secondary structure central wheel; fig. S1) is only slightly shifted with respect to the basal stem of DV, with whose axis it forms an angle of ~160° (Fig. 2B). This spatial arrangement suggested that the lariat was crystallized in its second-step conformation, which was confirmed by fitting the intron 3′ end into the density map. The last intron nucleotide (U419, position γ′) forms a Watson-Crick base pair with A287(γ) of J2/3, the highly conserved linker between DII and DIII (Fig. 2, C and D, and fig. S2). The γ-γ′ interaction, which positions the 3′ splice site for exon ligation (13), holds the terminal ribose in the catalytic center even after the ligated exons have been released. The 2′- and 3′-oxygens of this terminal ribose coordinate a metal ion (M1; Fig. 2C and figs. S2 and S3E), which is also bound to catalytically important oxygens in DV (6). The position of the essential A287(γ) base (Fig. 2C) differs markedly compared to previous structures (6–9). The particular backbone path adopted by the catalytically essential J2/3 linker is instrumental in positioning the A287(γ) and G288 bases sufficiently far apart to allow the second-step active site to form at the side of the catalytic triplex (fig. S3D).

Stabilization of γ-γ′ is achieved by A287(γ) being directly stacked on top of the 2′-5′ branch structure, which is formed by base-stacking of the first intron nucleotide (G1) on the bpA (Fig. 2, C and D, and figs. S2 and S3, B and C). Stable docking of the branched nucleotides, which involves hydrogen bonding between the donor groups of the G1 base and the O1P (pro-Sp) oxygen of C357 in DV, places G1 in the proper register to base pair with the penultimate intron nucleotide through a non–Watson-Crick interaction (Fig. 2, C and D) specific to the second step of splicing (14). The A412(bpA)-G1-A287(γ) purine stack is topped by the second intron nucleotide (U2), whose stretched backbone allows its base moiety to stack onto A287(γ) (Fig. 2C and figs. S2 and S3, B and C). The U2 base also interacts with the sugar edge of A376 in the catalytically critical DV bulge. This constrains the U2 nucleobase into a syn conformation, which is unusual for a pyrimidine (Fig. 2D) (15). Finally, two metal ions (M3 and M4) participate in organizing the active site by stabilizing the contorted backbone path between nucleotides 1 and 5 (Fig. 2C and fig. S3E).

By revealing the network of stacking and hydrogen bonding interactions that connects the highly conserved nucleotides of the J2/3 linker and intron boundaries to the 2′-5′ branch, our structures make visible the crucial role played by the branched nucleotides in assembling a functional active site for the second step of splicing. This role had been anticipated by biochemical experiments: A linear intron–3′ exon splicing intermediate carries out the second step of splicing ~800 times slower than its branched counterpart (5), and disruption of the interaction between G1 and the penultimate intron nucleotide affects splicing only of the lariat form of the intron, not its linear form (14). Having the 2′-5′ branch as an essential component of the second-step active site ensures coupling of the two steps of splicing and makes it possible to verify at the exon ligation stage that the proper 5′ splice site was selected during branch formation: Incorrectly assembled splicing intermediates will either get “debranched” through reversal of the first step reaction or eventually dissociate from the 5′ exon.

Alternate branchpoint conformations

On the basis of extensive comparative analyses (16, 17), the bpA of group II introns has generally been assumed to form a single-nucleotide bulge on the 3′ side of DVI (Fig. 3, top left). However, such a secondary structure is not supported by our electron density maps. These show instead an alternate conformation in which the DVI basal helix comprises only 3 base pairs (bp), and the branchpoint A412 is part of a 2-nucleotide (nt) bulge (Fig. 3, top right, and fig. S3F). A 2-nt branchpoint bulge was present in the crystal structure of a small construct intended to mimic DV and DVI of a group II intron (18). This observation and the estimated thermodynamic stabilities in Fig. 3 suggest that the conformation with a 2-nt bulge could constitute the DVI ground state in many group II introns.

Fig. 3 Experimental evidence for a rearrangement of DVI between the first (branching) and second (exon ligation) steps of splicing.

The section of DVI that is proposed to undergo conformational rearrangement by reciprocal strand shifting (red arrows) is in red; base substitutions in Oc19 mutants are in blue. Favored conformations are boxed; dotted arrows indicate unfavorable rearrangements; values of ΔG°37 were calculated with RNAstructure 5.6 (59). k2 and kbr are experimentally determined rate constants for steps 2 and 1, respectively (fig. S5B) (12); their ratio was proposed to depend on the thermodynamic equilibrium between step 1 and step 2 conformations (22).

Because our crystal structures imply that exon ligation makes use of a 2-nt bulge, the conformation with a 1-nt bulge (Fig. 3, top left) could be specific to the first (branching) step of splicing. We verified this by introducing base substitutions that selectively destabilize either one of the two DVI conformations. In constructs G395C:C410G and G396C:C409G, the postulated first-step conformation should be almost as stable as in Oc19, whereas a 2-nt bulge should be highly unfavorable (Fig. 3). As expected, neither of these substitutions interfered with the initiation of splicing: Branched products form rapidly under splicing conditions (figs. S4 and S5). However, both constructs fail to make efficient use of the 3′ splice site, as shown by the nearly complete absence of ligated exons and the transient accumulation of both the intron–3′ exon lariat intermediate and branched products of intermediate lengths, which are eventually converted into a molecule that migrates like the Oc19 intron lariat [fig. S4; the same defects were seen with an O. iheyensis–derived DVI whose sequence lacks the capacity for strand shifting (11)]. In contrast, construct G395C:C409G, which should overwhelmingly favor the conformation with a 2-nt bulge, is inefficient at initiating splicing by branching, as seen from the predominance of linear products generated by 5′ splice-site hydrolysis (fig. S4), but yields a single lariat band in a second-step reaction whose rate constant is at least equal to that estimated for Oc19 (fig. S5).

In the Oc19 crystal structure, the last six nucleotides of the intron form a continuous helical stack (Fig. 2B). Adding U413 to this stack, by pairing it with A393 in a 4-bp DVI basal helix, would drag not only A412 but also G1 out of place, which explains why, in a branched molecule, the 1-nt bulge conformation fails to use the correct 3′ splice junction efficiently. The crucial role played by the branched G1 in binding the intron 3′-terminal segment is confirmed by the behavior of linear intron–3′ exon molecules, in which the absence of the 2′-5′ bond and consequent destabilization of the intron ends can be invoked to explain why intron excision is both slow (5) and imprecise (fig. S4; G395C:C409G lanes).

As opposed to exon ligation, efficient branch formation was shown to require a 4-bp, rather than 3-bp, DVI basal helix (11, 19). This explains why the 2-nt bulge conformation reacts preferentially by 5′ splice-site hydrolysis and accounts for the need to reorganize the DVI bulge between the two steps of splicing. Other group II introns with a 7-nt spacer between the branchpoint and 3′ splice site should share with A. vinelandii I2—whose DVI was part of the Oc19 construct (fig. S1B)—the ability to toggle between conformations with a 4- and 3-bp DVI basal helix. This is true for all but one of the intron lineages with a 7-nt spacer (fig. S6). As in A. vinelandii I2, interconversion between the 1-nt bulge, first-step conformation and the 2-nt bulge, second-step conformation is achieved by reciprocal shifting of the two strands that constitute the middle part of DVI (in red in fig. S6) by one nucleotide. Such a mechanism requires that in a segment that undergoes rearrangement of its base-pairing pattern, one strand contains only G’s (except at its extremities) and the other one contains only pyrimidines. This explains the peculiar base distribution and the slow evolution of sequences surrounding the DVI branchpoint in these intron subgroups, which constitute about half of recognized group II lineages.
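The composition rule stated above lends itself to a simple check. The sketch below is illustrative only: the function name and the example sequences are hypothetical and are not taken from the article or its supplementary data. It tests whether a candidate DVI segment has one strand made of G's (extremities excepted) and a partner strand made only of pyrimidines, which is presumably what lets every G keep a partner (G-C or G-U wobble) after a one-nucleotide shift.

```python
# Illustrative sketch; sequences used with this function are hypothetical.
PYRIMIDINES = frozenset("CU")


def can_toggle_by_strand_shift(g_rich_strand: str, pyrimidine_strand: str) -> bool:
    """Return True if a segment satisfies the composition rule quoted in the text.

    One strand must contain only G's except at its extremities, and the
    other strand must contain only pyrimidines (C or U).
    """
    g_strand = g_rich_strand.upper().replace("T", "U")
    y_strand = pyrimidine_strand.upper().replace("T", "U")
    interior = g_strand[1:-1]  # the rule exempts both ends of the G strand
    if not interior or not y_strand:
        return False  # too short to evaluate
    return set(interior) == {"G"} and set(y_strand) <= PYRIMIDINES


if __name__ == "__main__":
    print(can_toggle_by_strand_shift("AGGGA", "CUCU"))  # rule satisfied
    print(can_toggle_by_strand_shift("AGAGA", "CUCU"))  # interior A breaks the rule
```

A check of this kind only screens base composition; it does not estimate the relative stabilities of the two conformations, for which a folding program such as the one cited in Fig. 3 would be needed.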

Conformational reorganization of DVI could be helped in vivo by proteins, either the intron-encoded reverse transcriptase or cellular helicases like yeast Mss116 (20), that would take advantage of the accessibility of this domain in the three-dimensional structure of the intron (Fig. 2 and fig. S3A) to promote rearrangement of its secondary structure. Although the recent cryo-EM structure of a group II intron bound to its reverse transcriptase (9) lacks any contact between the intron-encoded protein and DVI, footprinting and cross-linking data (21) do suggest a direct, possibly transient, physical interaction between these components.

Future work will investigate how the branchpoint rearrangement we identified here is mechanistically and structurally coupled to the formation of tertiary contacts between DVI and other intron components. These contacts include ι-ι′ (fig. S1A), which is specific to the branching step (10), and two interactions between DII and the basal and distal sections of DVI (8, 22). The latter interactions, which are specific to the exon ligation step (22), were removed in our Oc19 construct for the sake of efficient crystal packing.

Exon-driven induced fit and implications

Group II introns bind their 5′ and 3′ exons for splicing, or their DNA target site for retrotransposition, by base pairing to exon binding site (EBS) sequences located in DI (fig. S1) (3, 23, 24). The linear O. iheyensis ribozyme (DI to DV) had been reported to retain essentially the same structure when crystallized in the presence or absence of exons, except in the EBS1 segment, which is complementary to the last nucleotides of the 5′ exon (fig. S1B) and was found to be slightly disordered in ligand-free structures (6, 7). However, here, we report that whereas the 5′ exon–bound Oc19 lariat (12) differs little from previously published structures over DI to DV, our exon-free lariat structure reveals substantial local rearrangements.

The expected disorganization of the middle section of EBS1 in the absence of the 5′ exon causes the neighboring GAAC terminal loop of DV (ζ′) to adopt an alternate conformation in which all four loop bases are stacked (Fig. 4A and fig. S7A). The reorganized loop cannot interact with its receptor motif (ζ), which becomes partly disordered. Concomitantly, the symmetrical internal loop in the basal section of subdomain IC (fig. S1) adopts a fold that completely extrudes nucleotide A72 (Fig. 4A and fig. S7A). Stacking of the flipped A72 base on the highly conserved A106(λ) is made possible by a slight rotation of A106 away from its position in the exon-bound structure: This chain of rearrangements is ultimately triggered by the loss of the hydrogen bond between the N1 atom of A106 and the 2′-OH group at position −2 of the exon (Fig. 4, B and C, and fig. S7B).

Fig. 4 Conformational rearrangements induced by the 5′ exon.

(A) Conformation of the distal section of DV (red) and basal section of domain IC (bright orange) in the absence of the 5′ exon. The ζ motif (gray tube) in subdomain ID and the middle section of the EBS1 loop (not shown) are disordered. Black arrows indicate the wide movements undergone by A72 and G369 (in the GAAC tetraloop) upon 5′-exon binding. (B) Base-pairing of the 5′-exon substrate (cyan) to EBS1 (gray) promotes integration of A72 into the dynamic IC loop and formation of tertiary interaction between ζ and the GAAC tetraloop. (C) Rotated view from (B) highlighting the 2′-hydroxyl groups (pink dots) of EBS1 and the 5′ exon that are directly recognized by the distal section of DV and nucleotide A106(λ), respectively. See fig. S7 for the complete networks of stacking and hydrogen bonding interactions that stabilize these two structural states. IBS1, intron binding site 1.

These structural rearrangements are functionally relevant, as illustrated by data on the binding of the 5′ exon to a subgroup IIB intron lariat (25). Two binding modes whose Kd (dissociation constant) values differ by ~100-fold were identified, with the high-affinity configuration being dependent on the distal section of DV. Moreover, footprinting experiments revealed that the N1 position of the nucleotide homologous to A106(λ) becomes protected from modification only when the 5′ exon is tightly bound to the lariat, as expected from our crystal structures.

Exon-driven rearrangements extend to the active site, because the nonbridging oxygen (O1P) of U375, which participates in catalysis as a metal ion ligand, is properly positioned to fulfill this role only when the 5′-exon substrate is stably bound to the intron (fig. S7C). This observation explains why removal of the 2′-OH group at exon position −2 decreases the rate of catalysis by as much as 500-fold (25) and illustrates the crucial role of the 5′ exon in helping to bend DV into its catalytically competent conformation.
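To put the fold-changes quoted in this article (~100-fold in Kd, up to 500-fold in catalytic rate, and the ~800-fold slower second step of the linear intermediate) on a common energy scale, they can be converted to apparent free-energy differences with ΔΔG = RT ln(fold). The sketch below is a back-of-the-envelope aid, not part of the authors' analysis; the function name is hypothetical, and the temperature of 37 °C is an assumption, since the article does not state the temperature of these measurements.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change_to_ddg(fold: float, temp_celsius: float = 37.0) -> float:
    """Apparent free-energy difference (kcal/mol) for a fold-change in Kd or rate.

    Uses ddG = R * T * ln(fold). The default temperature is an assumption.
    """
    if fold <= 0:
        raise ValueError("fold-change must be positive")
    return R_KCAL * (temp_celsius + 273.15) * math.log(fold)


if __name__ == "__main__":
    for label, fold in [
        ("Kd difference between binding modes", 100),
        ("rate loss without the 2'-OH at exon position -2", 500),
        ("linear versus branched second step", 800),
    ]:
        print(f"{label}: {fold_change_to_ddg(fold):.1f} kcal/mol")
```

Under these assumptions all three effects fall in the range of roughly 3 to 4 kcal/mol, which is the order of magnitude expected for the loss of a small number of hydrogen-bonding or stacking contacts.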

Just like protein enzymes and the ribosome (26), therefore, group II introns bind their 5′ exon through an induced-fit mechanism, in which initial substrate binding induces structural rearrangements in the catalytic core that, ultimately, strengthen enzyme-substrate contacts and trigger catalysis. This device allows the system to “verify” the quality of intron-substrate pairings before engaging in splicing or reverse splicing. The interaction with the 2′-hydroxyl group at position −2 of the 5′ exon is of particular biological interest because it enables group II ribozymes to discriminate between RNA and DNA substrates (25). Because exon-intron pairings are directly recognized by the intron-encoded reverse transcriptase (9), the latter could ensure that DNA is favored over RNA as a target for reverse splicing during intron mobility.

Metal ions and catalysis of reverse splicing

The architecture of the second-step active site is not altered by binding of the 5′-exon substrate, and in both Oc19 lariat structures, the native electron density map revealed a large peak in the reaction center (Fig. 5A). This peak was modeled as metal ion M1 and further assigned to a Mg2+ ion based on ytterbium (Yb3+) soaks (Y1 site in Fig. 5A) (12). Coordination of M1 involves a remarkably high number of inner-sphere contacts. Besides the three coordinations with nonbridging phosphate oxygens in DV [(6) and references therein], two other inner-sphere contacts with the 2′- and 3′-oxygen atoms of the terminal U419(γ′) ribose are visualized for the first time (Fig. 5B). These newly seen ribose ligands had been predicted, based on biochemical evidence, to bind essential divalent metal ions in the transition state for the second step of splicing (27, 28). Our crystal structures support and extend these findings by demonstrating that a single ion, which is already observable in the Oc19 lariat ground state, is simultaneously bound to the 2′- and 3′-oxygen atoms and that this unusual coordination (28) is made possible by the C2′-endo conformation of the terminal ribose (Fig. 5B).

Fig. 5 The catalytic mechanism of group II reverse splicing.

(A) Ytterbium (Yb3+; an anomalous scatterer that mimics Mg2+) anomalous difference map (violet, contoured at 21σ) reveals two ions (Y1 and Y2) at the catalytic center in the structure of the Oc19 lariat complexed with an unreactive 5′-exon analog. The native density Fo − Fc omit map (olive, contoured at 7σ) for metal M1 perfectly superimposes with the anomalous difference map for Y1. (B) Inner-sphere coordinations directly observed in our structure for metals M1 and Y2 are shown as thick green dashed lines, whereas thin dotted lines are inner-sphere contacts inferred after docking [without further modeling (12)] a ligated exon substrate (from PDB entry 4E8K, colored by atoms). The nucleophile of the reverse-splicing reaction (the intron terminal 3′-oxyanion) is in yellow. (C) Crystallographically derived model of the transition state according to the two–metal ion mechanism for group II reverse splicing (O1P and O2P are pro-Sp and pro-Rp oxygens, respectively).

Because the unreactive 5′-exon analog we used to prevent lariat debranching during cocrystallization (12) lacks a terminal 3′-OH group (Fig. 5A), neither of the Oc19 native maps exhibits electron density that could be interpreted as the second divalent metal ion, which was shown biochemically (29) and crystallographically (30) to be bound to that 3′-oxygen. However, our anomalous difference map obtained in Yb3+, which binds with higher affinity than Mg2+ (31), does reveal a second strong Yb3+ binding site (Y2; Fig. 5A), which lies at the appropriate location to coordinate a terminal 3′-oxygen and stands 4.0 Å apart from Y1/M1 (Fig. 5A). The latter distance is typical for catalysis of phosphoryl transfer reactions by the two–metal ion mechanism (32).

When a ligated exon substrate is docked (12) into our 5′ exon–bound Oc19 structure (Fig. 5B), the scissile phosphate of that substrate falls into the reaction center, right between M1 and Y2, with its O1P (pro-Sp) oxygen properly positioned to make inner-sphere contacts with Y2 and the apical position of catalytic metal ion M1 (Fig. 5, B and C). This configuration of the scissile linkage is the one determined biochemically for reverse splicing by group II introns (33). Moreover, the 3′-oxygen of the terminal U419(γ′) ribose, which is the nucleophile of the reverse-splicing reaction, is correctly prepositioned in our crystal for inline nucleophilic attack (34) on the scissile phosphate (Fig. 5, B and C). The entire arrangement of metal ions and ligands is fully consistent with the two–metal ion mechanism of catalysis also in use for polymerases and group I introns (32). That essential features of this mechanism should directly be seen in, or readily deduced from, our ground-state crystal structure brings to light the efficiency with which the 2′-5′ branch structure enforces a near–transition state configuration to the reaction center. This ability to prime the intron for reverse splicing must have contributed decisively to the selection of the lariat over linear form during the emergence of mobile group II introns.

Implications for the spliceosome active site

Nuclear pre-mRNA splicing is catalyzed within the spliceosome, a highly dynamic ribonucleoprotein particle whose active core is composed of three small nuclear RNAs (snRNAs) named U2, U5, and U6 and numerous protein factors (35). In both group II and spliceosomal introns, the 2′-5′ linkage of the lariat results from attack of the 5′ splice site by a conserved adenosine that bulges out from a helical stem. This shared peculiarity and the presence of similar consensus sequences at intron boundaries (5′-GU…AG-3′ for the most common subtype of spliceosomal introns, 5′-GU…AY-3′ for group II introns) led to the hypothesis that the two systems have a common evolutionary origin (36). There is compelling biochemical evidence that part of the highly conserved U6 snRNA is homologous to DV of group II introns: The terminal GA of the invariant ACAGAGA sequence of U6 interacts with the major groove of U2-U6 helix Ib to form a group II intron–like catalytic triplex (37). The latter promotes binding of two catalytic metal ions to conserved, nonbridging phosphate oxygens of U6 in a manner similar to DV (Fig. 5) (38, 39).

We now propose that the architecture of the second-step active site of the spliceosome rests on a network of RNA-RNA interactions (Fig. 6) similar to the one our structures have revealed for the Oc19 lariat. As in group II, the 2′-5′ branch and the terminal nucleotides of spliceosomal introns are both required for the second step of pre-mRNA splicing (40). Moreover, the conserved guanosines at the boundaries of nuclear introns form a non–Watson-Crick base pair that is specifically required for exon ligation (41, 42) and probably fulfills the same function as the pair between G1 and the penultimate nucleotide of group II introns. Although the two pairings are not isomorphic, it is nevertheless possible to fit the spliceosomal G1:Gn interaction into the Oc19 active site in such a way that the Gn terminal ribose remains positioned in the catalytic center (Fig. 6A) (12). In the resulting model, G1 no longer stacks on the bpA (Fig. 6B). However, the stacking interaction between nucleotides U2 and A287(γ) is preserved, and we specifically propose that, in the spliceosome, the third invariant A (A45 in mammals, A51 in yeast) of the ACAGAGA motif of U6 snRNA (Fig. 6) is the counterpart of group II A287. This assignment is supported by several lines of evidence. First, just like A287(γ) in group II introns, A45/51 lies immediately 5′ to the dinucleotide engaged in the catalytic triplex. Second, substitution of A51 totally blocks the second step of pre-mRNA splicing in yeast (43). Third, A45/51 and nucleotide U2 of spliceosomal introns cross-link with each other specifically during the second step of pre-mRNA splicing in both humans (44) and yeast (45), and the chemical mechanism of the cross-link is suggestive of a stacking interaction between these nucleotides.

Fig. 6 Second-step active site of the spliceosome modeled after its group II counterpart.

(A) The second step–specific interaction between the terminal G’s of spliceosomal introns (light green) was modeled into the group II intron active site (12); numbering of snRNA sequences is according to the yeast spliceosome. The catalytically essential components of U6 snRNA (AGC triad, ISL bulge, and ACAGAGA box) are colored as their group II counterparts in DV and the J2/3 linker (Fig. 2). The orange lightning bolt stands for a second step–specific cross-link between nucleotide A51/A45 of U6 snRNA and the U at intron position 2 (see text). The location of group II metal M1 is potentially preserved in the spliceosome (12). (B) In this view, G1:Gn hydrogen bonds are shown as dashed yellow lines. A potential interaction between U6 snRNA A79 (homologous to Oc19 A376; Fig. 2) and intron nucleotide U2 is suggested. (C) Back view of the spliceosomal active site and the potential lariat recognition site for Prp8. The nucleotide groups highlighted as orange dots are required for the second step of pre-mRNA splicing (see text).

Genetic suppression data in yeast (46) led to the suggestion that the highly conserved Prp8 spliceosomal protein, which lies at the heart of the splicing machinery and is evolutionarily related to group II–encoded reverse transcriptases (47), recognizes a second-step active-site RNA structure composed of intron U2, A51 of U6 snRNA, and the AG dinucleotide at the 3′ intron boundary. These four nucleotides are tightly clustered in our model (Fig. 6) and could be recognized by the “catalytic cavity” of Prp8 (48). Moreover, the adenosine N1 atoms at the branchpoint and penultimate intron nucleotide, which are both known to be important for the exon ligation step (49, 50), yet without an identified RNA partner, constitute additional candidates for interaction with Prp8 (Fig. 6C).

Available cryo-EM structures of the catalytically activated spliceosome pertain to the first step of pre-mRNA splicing. They reveal the positioning of the 2′-5′ branch structure either immediately (51) or soon (52) after its formation. Comparison with our second-step spliceosomal model leads to the conclusion that, as already proposed for group II introns (10), the 2′-5′ branch needs to undergo a major translocation between the two transesterification reactions. Such a large-scale movement could be part of the extensive conformational rearrangement that is believed to occur between the two catalytic steps of pre-mRNA splicing (35), and although the structure of the spliceosome active site for exon ligation remains to be established, our model specifically predicts that the 2′-5′ branch structure will prove essential to its assembly. Just as in group II introns, such structural coupling of the two chemical steps of splicing provides an RNA-based “proofreading” device that could complement protein-dependent proofreading (53) in ensuring the fidelity of spliceosomal splicing.

On the other hand, implications for nuclear pre-mRNA splicing of the strand-shifting mechanism that we have uncovered in the group II system are not immediately obvious because, in spliceosomal introns, the presence of a “spacer” of variable length and sequence that generally separates the branchpoint from the 3′ splice site would seem to make it superfluous to rearrange the base-pairing pattern around the branchpoint. Nevertheless, it is possible that in some unicellular eukaryotes, rearrangement of the branch site takes place through an RNA-based mechanism similar to the one used to remodel DVI of group II introns when seven nucleotides separate the branchpoint from the 3′ splice site (fig. S8).

In conclusion, our crystal structures have brought to light key structural features that explain the prominent role of the 2′-5′ branch in the assembly of the intron active site for exon ligation and the initiation of reverse splicing. In doing so, they elucidate the reasons why the lariat bond was so stubbornly conserved not only during the diversification of group II introns but also all the way along the evolutionary path that led to the emergence of nuclear premessenger introns and their splicing machinery from a group II ancestor.

Methods summary

RNA preparation and crystallography

The Oc19 chimeric construct, in which DVI and part of the IC1 subdomain of the Oceanobacillus intron have been replaced by their counterparts in intron Av.I.2 (fig. S1) (12), was selected for crystallography based on its ability to generate abundant intron lariat during self-splicing (fig. S4). For crystallization purposes, RNA synthesis from linearized Oc19 plasmid DNA was performed at 37°C with home-prepared T7 RNA polymerase. To generate Oc19 lariat, posttranscription samples were desalted prior to incubation under self-splicing conditions. The Oc19 lariat RNA was purified by denaturing polyacrylamide gel electrophoresis and stored at −20°C. Prior to crystallization trials, the purified lariat was refolded and subsequently concentrated to 0.7 μg/μl. Crystals were grown in sitting drops by vapor diffusion at 28°C. Crystallization drops of the lariat alone contained 1 μl of reservoir solution [50 mM Na-cacodylate, pH 6.5, 225 mM NH4Cl, 95 mM MgCl2, and 22% 2-methyl-2,4-pentanediol (MPD)] and 1.5 μl of renatured RNA. For cocrystallization of the Oc19 lariat with the 5′ exon, an unreactive RNA analog of the latter was added to the drops at a final concentration of 12 μM. Ytterbium derivative crystals were obtained by soaking native crystals in the crystallization solution supplemented with 0.5 mM Yb3+ chloride. X-ray diffraction data (table S1) were collected on the PROXIMA 1 beamline at the synchrotron SOLEIL (Saint-Aubin, France), and both structures were solved by molecular replacement with MOLREP (54) using as a search model PDB entry 4FAW (7), which corresponds to a linear form of the O.i.I1 ribozyme lacking domain VI. Model building was done with COOT (55), and structure refinement was achieved with BUSTER (56) and PHENIX (57). The positions of ytterbium atoms were identified by MR-SAD (12) with PHASER (58).
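As a rough guide to the starting composition of the lariat-alone drops, the volumes quoted above (1 μl of reservoir plus 1.5 μl of RNA at 0.7 μg/μl) can be converted into initial concentrations by simple dilution. The Python sketch below does this under two explicit assumptions that are not stated in the text: that the RNA solution contributes none of the reservoir components, and that the values describe the drop immediately after mixing, before vapor diffusion changes its volume. The function and variable names are hypothetical.

```python
def initial_drop_concentration(stock, stock_volume_ul, total_volume_ul):
    """Concentration right after mixing, before vapor-diffusion equilibration."""
    return stock * stock_volume_ul / total_volume_ul

RESERVOIR_UL = 1.0   # volume of reservoir solution in the drop
RNA_UL = 1.5         # volume of renatured RNA in the drop
TOTAL_UL = RESERVOIR_UL + RNA_UL

# Reservoir composition as given in the text (mM, except MPD in percent).
reservoir = {
    "Na-cacodylate (mM)": 50.0,
    "NH4Cl (mM)": 225.0,
    "MgCl2 (mM)": 95.0,
    "MPD (%)": 22.0,
}

# ASSUMPTION: the RNA solution adds none of these components.
drop = {
    name: initial_drop_concentration(conc, RESERVOIR_UL, TOTAL_UL)
    for name, conc in reservoir.items()
}
drop["RNA (ug/ul)"] = initial_drop_concentration(0.7, RNA_UL, TOTAL_UL)

for name, conc in drop.items():
    print(f"{name}: {conc:.2f}")
```

If the refolding buffer itself contains salts or magnesium, the true initial concentrations would be correspondingly higher than these estimates.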

Kinetic analyses

Kinetic analyses of Oc19 and derived mutant constructs (figs. S4 and S5) were performed with 32P-labeled precursors generated from in vitro transcriptions. Self-splicing assays were carried out in 50 mM Tris-HCl, pH 7.5 (37°C), 2 M NH4Cl, 10 mM MnCl2, and 0.01% SDS, and reaction products were analyzed on 4% polyacrylamide–8 M urea gels (fig. S4). Rate constants for branching (kbr) and hydrolysis (khy), two parallel reactions at the 5′ splice site, were calculated from simple exponential fits (e.g., fig. S5A) (12).
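For two parallel first-order reactions that consume the same precursor, the textbook treatment is that the precursor decays with an observed rate constant equal to kbr + khy, while the two products accumulate in the ratio kbr:khy. The Python sketch below illustrates that scheme and how an observed rate constant can be partitioned from product fractions. It is an illustration of the general principle, not the fitting procedure of (12); the function names and all numerical values are hypothetical.

```python
import math

def parallel_fractions(k_br, k_hy, t):
    """Fractions of precursor, branched product and hydrolyzed product at time t,
    for two parallel first-order reactions sharing one precursor."""
    k_obs = k_br + k_hy
    reacted = 1.0 - math.exp(-k_obs * t)
    return 1.0 - reacted, (k_br / k_obs) * reacted, (k_hy / k_obs) * reacted

def partition_rate(k_obs, frac_branched, frac_hydrolyzed):
    """Split an observed rate constant according to the product ratio."""
    total = frac_branched + frac_hydrolyzed
    return k_obs * frac_branched / total, k_obs * frac_hydrolyzed / total

# Hypothetical rate constants (per minute), chosen only for illustration.
k_br_true, k_hy_true = 0.30, 0.10
precursor, branched, hydrolyzed = parallel_fractions(k_br_true, k_hy_true, t=5.0)

# Recover the individual constants from the observed decay rate and product ratio.
k_br_est, k_hy_est = partition_rate(k_br_true + k_hy_true, branched, hydrolyzed)
print(f"k_br = {k_br_est:.2f} per min, k_hy = {k_hy_est:.2f} per min")
```

In practice the observed rate constant would come from an exponential fit to the precursor time course, and the product ratio from the quantified gel bands.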


Materials and Methods

Figs. S1 to S8

Table S1

References (60–67)


Acknowledgments: We thank the SOLEIL Synchrotron (Saint-Aubin, France) for beamtime allocation and the scientists at the PROXIMA 1 beamline, especially P. Legrand, for advice and help in data collection and initial processing. We thank L. Sperling and D. Fourmy for comments on the manuscript. This work was supported by the French Agence Nationale de la Recherche (grant ANR-10-BLAN-1502 to F.M. and E.W.). Coordinates and structure factors have been deposited in the Protein Data Bank (PDB) under accession codes 5J01 (lariat intron) and 5J02 (lariat intron bound to the 5′-exon analog). The authors will provide coordinates of the model in Fig. 6 upon request.