Research Article

The Structural Basis of Ribozyme-Catalyzed RNA Assembly


Science  16 Mar 2007:
Vol. 315, Issue 5818, pp. 1549-1553
DOI: 10.1126/science.1136231


Life originated, according to the RNA World hypothesis, from self-replicating ribozymes that catalyzed ligation of RNA fragments. We have solved the 2.6 angstrom crystal structure of a ligase ribozyme that catalyzes regiospecific formation of a 5′ to 3′ phosphodiester bond between the 5′-triphosphate and the 3′-hydroxyl termini of two RNA fragments. Invariant residues form tertiary contacts that stabilize a flexible stem of the ribozyme at the ligation site, where an essential magnesium ion coordinates three phosphates. The structure of the active site permits us to suggest how transition-state stabilization and a general base may catalyze the ligation reaction required for prebiotic RNA assembly.

The discovery of RNA enzymes (or ribozymes) in the 1980s (1–3) reignited interest in the chemical basis of the origin of life and resulted in formulation of the RNA World hypothesis (4). If RNA, in principle, can be both a genome and a catalyst, prebiotic self-replicating RNA molecules are likely an immediate evolutionary precursor and possibly a constituent of the first living organisms. All extant protein-based RNA polymerases synthesize RNA by catalyzing the templated ligation of a nucleotide triphosphate to form a 5′ to 3′ phosphodiester bond (Fig. 1A). Although natural ribozymes catalyze various phosphodiester bond isomerizations, hydrolysis, and even peptide bond formation (5), no naturally occurring nucleotide triphosphate ligase ribozyme that catalyzes the reaction required of an RNA polymerase has been discovered.

Fig. 1.

Ligation reaction and L1 ligase secondary structures. (A) Ligation reaction catalyzed by the L1 ligase in which the 3′-hydroxyl of the 3′-terminal residue of the substrate oligonucleotide attacks the α-phosphorus of the ribozyme's 5′-terminal guanosine triphosphate, which creates a new phosphodiester bond. (B) The proposed secondary structure of the full-length L1 ribozyme. Nucleotides in lowercase are derived from the constant-sequence regions of the original N90 library; uppercase residues are derived from the randomized region of the pool. The substrate oligonucleotide is italicized. Positions within shaded boxes were invariant among clones isolated from a mutagenized reselection of the L1 ligase; positions in boldface were conserved in >85% of isolated clones. The dotted line indicates a base triple. [Figure adapted from (9).] (C) The secondary structure of the minimized crystallization construct, L1X6c.

Ribozymes that specifically and regioselectively catalyze the template-dependent 5′ to 3′ phosphodiester bond ligation reaction that would have been required for a prebiotic self-replicating ribozyme (6) have, however, been created in the laboratory with artificial evolution and selection techniques (7–13), thus providing a proof of principle that RNA can catalyze the chemical step required for self-replication. The absence of natural polymerase or replicase ribozymes may be simply a consequence of a selection process that favored protein-based enzymes over less efficient ribozymes, rather than evidence that disfavors their existence in a prebiotic RNA World. In addition, chimeric RNA molecules consisting of a ligase ribozyme joined to an exogenous template-binding domain have been created by using a combination of in vitro evolution and rational design (14, 15). These ligase-based ribozymes are capable of polymerizing an entire turn of an RNA helix and are therefore true RNA-dependent RNA polymerases (16).

To better understand the stereochemistry and mechanism of this reaction in the context of ribozyme catalysis, we have obtained the crystal structure of an L1 RNA ligase reaction product in two conformational states, one of which appears to be close to the likely transition-state geometry of the active ribozyme, and the other appears to be in a relaxed or undocked state.

The L1 ligase ribozyme. The L1 RNA ligase ribozyme was isolated from a population of synthetic random-sequence RNAs by in vitro selection (9). This ribozyme catalyzes nucleophilic attack by a 3′-hydroxyl group on the α-phosphorus of the ribozyme's 5′-triphosphate, creating a new phosphodiester linkage and releasing pyrophosphate (Fig. 1A). Although many in vitro selected ribozyme ligases produce unnatural 5′ to 2′ phosphodiester linkages, the L1 ligase is one of five ligase ribozymes known to catalyze the regiospecific formation of the natural 5′ to 3′ phosphodiester bond (8, 9, 11–13). The L1 ribozyme is highly flexible and, thus, responsive to structural perturbations that influence interconversion between active and inactive conformations. This has been exploited to engineer the L1 ligase into a molecular sensor for oligonucleotides (9), small molecules such as ATP (9) and FMN (17), proteins (18), and peptides (19). These allosteric molecular switch constructs have activation ratios as high as 50,000-fold over basal ligase activity.

Previous mutation and footprinting data (9, 20) revealed that the L1 ligase folds into a triple-stemmed secondary structure, with the majority of conserved residues located either in positions predicted to base pair with experimentally unvaried sequences or within a “catalytic core” region of ∼17 nucleotides adjacent to the three-helix junction (Fig. 1B). Stem A includes the highly conserved, complementary template that pairs with the substrate oligonucleotide and aligns the 3′ end of the substrate with the 5′ end of the ribozyme at the ligation junction. Stem A forms a simple helix with mostly Watson-Crick pairing except for three nonstandard pairings that occur directly at the ligation junction, namely, two G:U pairs on either side of a G:A pair. Although various types of unpaired nucleotides or nonstandard pairing at the ligation junction are seen in other ligase ribozymes (8, 10–12) and presumably function to contort the helical geometry in a way that facilitates catalysis, template-mediated proximity effects are not sufficient to account for the observed catalytic rate enhancement. An additional highly conserved region of the ribozyme consists of ∼17 mostly invariant or covariant nucleotides in stem C that reside immediately adjacent to the three-helix junction. This region, previously referred to as the ribozyme “core” (20), contains four predicted Watson-Crick base pairs, an absolutely conserved G:A pair, and several highly conserved but presumed unpaired nucleotides, whose function is unclear from the secondary structure. The remainder of stem C beyond the conserved core region, as well as the majority of stem B, can be replaced with stable tetraloops without reducing ribozyme activity.

To optimize crystallization, we created a 71-nucleotide construct, L1X6c (Fig. 1C), whose autocatalyzed ligation product is a covalently closed circular adduct, in which stems A, B, and C are each capped with GAAA tetraloops (21). The ligation rate of a bimolecular version of L1X6c having covalently distinct enzyme and substrate strands (instead of the tetraloop capping stem A) is 1.8/hour at pH 7.6 in the presence of 1.3 μM substrate and 60 mM MgCl2 [supporting online material (SOM)]; the more difficult to measure unimolecular reaction will likely be faster because of the lack of an unfavorable entropy contribution inherent in a bimolecular reaction.
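To put the reported rate in more intuitive terms, the following minimal sketch converts the 1.8/hour figure into a half-life and a time course. It assumes, purely for illustration, that the bimolecular reaction follows simple single-exponential (pseudo-first-order) kinetics under the stated conditions; the text reports only the rate, and the function names are hypothetical.

```python
import math

# Rate reported in the text for the bimolecular L1X6c construct
# (pH 7.6, 1.3 uM substrate, 60 mM MgCl2).
K_OBS_PER_HOUR = 1.8


def half_life_hours(k_obs: float) -> float:
    """Half-life of a first-order process with rate constant k_obs (per hour)."""
    if k_obs <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_obs


def fraction_ligated(k_obs: float, t_hours: float) -> float:
    """Fraction of product formed after t_hours.

    ASSUMPTION: single-exponential kinetics, 1 - exp(-k * t); the article
    does not state the kinetic model used to fit the rate.
    """
    return 1.0 - math.exp(-k_obs * t_hours)


if __name__ == "__main__":
    t_half = half_life_hours(K_OBS_PER_HOUR)
    print(f"half-life: {t_half:.3f} h ({t_half * 60:.1f} min)")
    for t in (0.25, 0.5, 1.0, 2.0):
        print(f"t = {t:4.2f} h -> fraction ligated = {fraction_ligated(K_OBS_PER_HOUR, t):.2f}")
```

Under this assumption the half-life works out to roughly 23 minutes, which gives a sense of scale for the slower bimolecular reaction against which the unimolecular reaction is compared.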

The L1 ribozyme crystal structures. The 71-nucleotide L1X6c ribozyme, when transcribed, folds into an active structure that autoligates to form a closed circular adduct (Fig. 1C). The 2.6 Å resolution crystal structure was solved (22) by piecewise molecular replacement using ideal A-form RNA model helices (Fig. 2, A to D). Two ligase molecules form an asymmetric unit within the crystal. Each is a roughly γ-shaped molecule in which stems A and B coaxially stack, and the shorter stem C forms an almost perpendicular branch from the three-strand junction. The secondary structure closely conforms to that previously predicted on the basis of biochemical mapping and footprinting, except A23 forms a reverse-Hoogsteen pair with U37 instead of pairing with U38.

Fig. 2.

The crystal structure of the L1 ligase ribozyme. (A) The refined all-atom structure of the crystallographic asymmetric unit of the L1 ribozyme, superimposed on a simulated annealing composite-omit sigma-A–weighted 2Fobs − Fcalc map contoured at 1.2 RMSD (root mean square deviation). The two RNA chains are each rainbow color-coded such that the 5′ terminus is blue and the 3′ terminus is red. (B) A complementary cartoon representation of the crystallographic dimer with the same color scheme, but with the electron density omitted for clarity. The phosphodiester backbone is shown as a ribbon, and the side chains are depicted schematically as sticks. (C) A least-squares superposition of molecules P and Q, with molecule P depicted in cyan and Q in magenta. Stems A, B, and C are labeled. (D) An all-atom stereograph of the docked conformation of the ligase ribozyme (Q) color-coded as in (A and B). In addition, the phosphate at the ligation site is shown in gray, the Mg2+ ion believed to be involved directly in ligation catalysis is represented as a magenta sphere, and a water molecule at the active site is shown in beige. Contacts between the metal ion and three nonbridging phosphate oxygens and between the RNA and the water molecule are depicted as blue-gray dotted lines.

The two ligase molecules in the asymmetric unit, despite their identical secondary structures, are not superimposable; stem C branches in opposite directions when stems A and B of the two conformers are overlaid (Fig. 2C). The two molecules of the asymmetric unit, designated P and Q, are in distinctly different conformations. Stem C of the ligase molecule P is angled away from the ligation site, and this molecule is in a relaxed, presumably inactive, conformation. In contrast, stem C of ligase molecule Q forms tertiary contacts with the ligation site, in which an invariant uridine (residue 38 in stem C) makes a reverse–Watson-Crick base pair with the invariant A51 at the ligation junction in stem A. We therefore conclude that molecule Q is crystallographically trapped in a docked, putatively active conformation that fulfills interactions inferred to be required from sequence invariance. This situation is somewhat similar to that observed for the hammerhead ribozyme. The truncated form of the hammerhead ribozyme (23, 24) lacks the distant tertiary contacts that stabilize the cleavage site in an active conformation. When these tertiary contacts are included, the structure of the three-stranded junction changes to reposition invariant active-site nucleotides to facilitate acid-base catalysis (25). Fortunately, in the present case, both a relaxed and a potentially active conformer were found to be present as two crystallographically independent molecules within a single crystal.

The superposition of coaxial stems A and B of molecules P and Q indicates that most of stem A is unchanged in the two conformers apart from the three-helix junction and the transition to stem B. As the phosphate backbone approaches and traverses the three-helix junction, the undocked conformer, P, begins to become slightly underwound relative to the docked conformer, Q, which more closely resembles an ideal A-form helix. Overlaying the stem A tetraloops of molecules P and Q allows comparison of the relative dispositions of the stem B tetraloops in the two conformers, revealing that conformer P has become unwound by ∼79° and is displaced by 8 Å relative to conformer Q.

The conserved catalytic core. The positions previously identified as the ribozyme “core” residues can now be subdivided into two functional clusters. The first functional cluster is made up of the unpaired nucleotides U19, A43, and G44 and the G18:C42 pair defining the helical junction. This region appears to function as a hinge, which allows stem C to pivot relative to stems A and B. Distortion of this hinge region accompanies helical unwinding of stems A and B in the undocked molecule. This prevents stem C from making the tertiary contacts with the catalytic site required to form the docked conformer. A43 and G44 stack on each other but do not pair or interact with other nucleotides in either conformer. Although these two positions are highly conserved among L1 ligase variants, they always occur in the context of a five-nucleotide CG···CAG motif encompassing positions C17, G18, C42, A43, and G44, with all but A43 and G44 involved in base pairing. A pentuple substitution of these positions observed in various selection experiments reveals a distinct UA···UGU motif that retains the same pairing layout as the CG···CAG motif but confers a 13-fold increase in the ribozyme's catalytic activity. In general, the L1 ligase variants contained either one motif or the other, but did not tolerate mutations of fewer than all five positions simultaneously (20). In light of the current crystal structure, it seems likely that these five positions have a significant effect on the hinging characteristics and positioning of stem C, and the UA···UGU motif either favors access to the active orientation of stem C or disfavors access to one or more inactive conformations. Also associated with this hinge region is the unpaired, yet highly conserved U19 that flips its base out of the helical stack of stem C. Despite being one of the most highly conserved positions in the molecule (invariant in at least 25 out of 26 sequences), this nucleotide, nonetheless, makes no apparent secondary or tertiary contacts. Although it is possible that an unobserved interaction has been disrupted by the crystallization process, given the context of the position and the global fold of the ligase, this residue may serve simply to add flexibility to the hinge domain.

Two Watson-Crick base pairs join the hinge to the other functional cluster within the core, composed of nucleotides G22 to A23 and U37 to A39. G22 forms a sheared G:A pair with A39, and A23 forms a reverse-Hoogsteen pair with U37. The contortion created by these adjacent, non–Watson-Crick pairs induces the invariant U38 to flip out of the helical stack (Fig. 3A). In the docked conformer, the U at position 38 is the single nucleotide base in stem C that interacts with the ligation junction of stem A as a specific tertiary base-pairing contact in a G:A:U base triple with G1 and A51, the ligation-site nucleotide and its pairing partner, respectively (Fig. 3B). G1 forms a sheared G:A pair with A51, and U38 pairs with A51 in a reverse–Watson-Crick orientation with G1 rotated by 12.8° relative to the U38:A51 plane. In the undocked structure, the orientation of stem C prevents U38 from participating in this base triple or any other interaction, and the resulting disorder of this residue is evident in the electron density map. The consequence of this interaction on the detailed conformation of the ligation site can be observed by overlaying the ligation junctions of the docked and undocked conformers (Fig. 4A). The active site phosphorus is shifted 1.86 Å between conformers P and Q. Although the construct crystallized for these experiments was the ligation product of the reaction, and specific conclusions regarding the preligation complex cannot be extrapolated, it is clear that the interaction with U38 to form the G:A:U base triple has a discernible effect on the positions of the active site atoms.

Fig. 3.

Architecture of the ribozyme core and interhelical base triple interaction. (A) Stereograph of the spatial disposition of the invariant nucleotides in the active site of the ribozyme Q, using the color-coding of Fig. 2A. The sheared G22:A39 and reverse-Hoogsteen A23:U37 exclude U38 from the helical stack, permitting it to make a tertiary interaction with the stem A ligation site in the form of a base triple. The ligation-site phosphate is shown in white; hydrogen bonds are indicated as magenta dotted lines, a Mg2+ ion that bridges the helices of stems A and C is shown as a magenta sphere, and its three direct coordinations to the A39, G40 (light green) and G1 (white) phosphates are shown as light blue dotted lines. U71 (shown in red) forms a wobble pair with G52 (shown in orange). (B) Close-up of the interhelical G1:A51:U38 base triple. The G:A interaction is a sheared base pair, and the A:U interaction is a reverse–Watson-Crick base pair.

Fig. 4.

Close-up view of important interactions in the ligation site. (A) The ligation junction in stem A of the undocked conformation (P) and the docked conformation (Q) are shown according to the color scheme used in Fig. 2C. In addition, the base and ribose of the docked conformation G1 is highlighted in red, and the ligation-site phosphate is shown in white. A conformational change is induced at the ligation site in the docked conformation relative to the undocked conformation by tertiary contacts with stem C (omitted in this figure for clarity). (B) A stereograph of the Mg2+ ion coordination to the A39, G40 and ligation-site (G1) phosphates. A difference Fourier peak contoured at 3.0 RMSD corresponding to the Mg2+ ion center of mass is shown in gray, and composite-omit electron density, as in Fig. 1, is shown in blue contoured at 2.0 RMSD. The Mg2+ ion appears in the composite-omit map with a peak height of 1.5 RMSD, and in refined sigma-A–weighted 2Fobs − Fcalc maps at about 5 RMSD. The binding environment is inconsistent with a water binding-site but is ideal for a Mg2+ ion. (C) A stereograph of the active site of ribozyme Q in which the blue to red color scheme of Fig. 2 is used, in conjunction with the ligation site phosphate highlighted in white. The blue mesh represents the final refined sigma-A–weighted 2Fobs − Fcalc map contoured at 1.5 RMSD. The ligation-site Mg2+ ion is shown in magenta, and water 13, believed to play a crucial role along with the Mg2+ ion in stabilizing the structure of the active site, is shown in red next to the “W.” A putative “active hydrogen bond” is shown as an orange dotted line; other hydrogen bonds are indicated as dark blue lines. (D) A schematic representation of a possible transition-state extrapolated from observed atomic positions in analogy to Fig. 4C. Potential hydrogen bonds and other noncovalent interactions are shown as thin dotted lines. Bonds that form or break are indicated as thicker dotted lines. The pyrophosphate leaving group, which is not observed in the structure, is designated as OPPi. Because of the observed 2.9 Å separation of 3′-O and the O1P of A39, which is also coordinated by the Mg2+ ion, the potential for an active hydrogen bond involved in nucleophile generation is indicated. As the bond between the 3′-O and the phosphorus of G1 forms, the (unobserved) pyrophosphate group departs.

Mg2+ binding and the ribozyme's chemical mechanism. The L1 ligase is an obligate metalloenzyme that is highly specific for Mg2+. It was selected in the presence of 60 mM MgCl2 and functions optimally in Mg2+ concentrations as high as 100 mM. None of the other common divalent metal ions, including manganese, are able to substitute for Mg2+, which suggests a role for Mg2+ in the chemical mechanism of catalysis. Inspection of 2Fobs − Fcalc and Fobs − Fcalc difference Fourier maps allowed identification of several potential high-occupancy Mg2+ sites above 5σ. The most prominent of these peaks (Fig. 4B) is located proximal to the ligation site of the docked structure Q and is centered 2.2 Å from the nonbridging phosphate oxygens of A39 and G40 of stem C, making an angle of 91° between them. This site is also 4.5 Å from the N7 of G40, a distance and orientation consistent with a water-mediated coordination (26). On the basis of the nearly perfect resemblance to the ideal geometry for an octahedrally coordinated Mg2+ ion, and its incompatibility with water hydrogen-bonding distances, we have assigned this as a Mg2+ site. In addition to coordinating the two adjacent phosphates in stem C, the Mg2+ ion also makes a 2.2 Å contact to the ligation-site phosphate of G1 (Fig. 4B), thus bridging the stem C and stem A helices. This bridge and the U38:G1:A51 base triple form the main tertiary contacts that stabilize the docking of stems C and A in the ligase ribozyme and lock the molecule into what appears to be an active conformation.

The bond distances (2.2 Å) and angle (91°) between the nonbridging phosphate oxygens of A39 and G40 and the ligation-site Mg2+ are both (within experimental error) ideal for an octahedrally coordinated complex. The geometry of the complex with respect to the ligation-site phosphate is less ideal, with a Mg-O distance of 2.2 Å and angles of 86° and 153° between the phosphate oxygen of G1 and the phosphate oxygens of A39 and G40, respectively. Based upon the location of the Mg2+ ion bound to the ligation-site phosphate, the previously observed strict requirement for Mg2+ ions in the ligase reaction, as well as the details of this ion's coordination geometry, we propose that the ligation-site Mg2+ ion binds tightly to three phosphates that make up a specific Mg2+ binding pocket that is formed when stem C makes specific tertiary contacts with the invariant ligation site nucleotides. In addition, we hypothesize that the catalytic role of this Mg2+ ion may include screening the excess negative charge that accumulates in the transition state.
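The comparison with ideal octahedral geometry described above can be made explicit with a small sketch. In an ideal octahedral complex, ligand-metal-ligand angles are 90° for adjacent (cis) ligands and 180° for opposite (trans) ligands. The code below computes such an angle from Cartesian coordinates and reports how far the three angles quoted in the text deviate from the nearest ideal value. The function names are hypothetical, and no coordinates from the deposited structure are used; only the angles stated in the text are taken from the article.

```python
import math

# Ideal ligand-metal-ligand angles in an octahedral complex (cis and trans).
IDEAL_OCTAHEDRAL_ANGLES = (90.0, 180.0)


def angle_deg(ligand_a, metal, ligand_b):
    """Angle ligand_a-metal-ligand_b in degrees, from (x, y, z) coordinates."""
    va = [a - m for a, m in zip(ligand_a, metal)]
    vb = [b - m for b, m in zip(ligand_b, metal)]
    norm = math.hypot(*va) * math.hypot(*vb)
    if norm == 0:
        raise ValueError("ligand coincides with metal position")
    cos_theta = sum(x * y for x, y in zip(va, vb)) / norm
    # Clamp to guard against floating-point values just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def octahedral_deviation(angle):
    """Return (nearest ideal angle, absolute deviation from it) in degrees."""
    ideal = min(IDEAL_OCTAHEDRAL_ANGLES, key=lambda ref: abs(angle - ref))
    return ideal, abs(angle - ideal)


# Angles quoted in the text for the ligation-site Mg2+ ion.
OBSERVED_ANGLES = {
    "O(A39)-Mg-O(G40)": 91.0,
    "O(G1)-Mg-O(A39)": 86.0,
    "O(G1)-Mg-O(G40)": 153.0,
}

if __name__ == "__main__":
    for label, angle in OBSERVED_ANGLES.items():
        ideal, dev = octahedral_deviation(angle)
        print(f"{label}: {angle:.0f} deg, nearest ideal {ideal:.0f} deg, deviation {dev:.0f} deg")
```

On these numbers the A39-G40 pair deviates by about 1° from the ideal cis angle, whereas the G1 contact deviates by 4° (cis) and 27° (trans), consistent with the statement that the geometry with respect to the ligation-site phosphate is less ideal.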

L1 ribozyme catalysis and regioselectivity. The ribose of U71 that contains the 3′-hydroxyl nucleophile of the ligation reaction participates in an extensive hydrogen-bonding network within the catalytic pocket including the A39 phosphate of stem C, and the 2′-hydroxyl of U38 via water W13 (Fig. 4C). This network suggests why the L1 ligase is regioselective for formation of the biologically relevant 5′ to 3′ phosphodiester bond rather than a 5′ to 2′ bond. The 3′-oxygen of U71 in the ligation product is only 2.9 Å from the nonbridging phosphate oxygen that is coordinated with the ligation-site Mg2+ ion. Although clearly within hydrogen-bonding distance, the 3′-oxygen cannot donate a hydrogen bond to the phosphate oxygen because it has already formed an ester linkage to the G1 phosphate. If the positions of these atoms are retained in the preligation complex, the O1P phosphate of A39 would be ideally positioned to abstract a proton from the 3′-oxygen, thus generating the attacking nucleophile. Although the 2′-hydroxyl of U71 also resides within potential hydrogen-bonding distance to this phosphate oxygen, it, unlike the 3′-oxygen, has a tightly bound water molecule as a hydrogen-bonding partner. This water molecule, W13, interacts specifically not only with the 2′-hydroxyl of U71, but also with the 2′-hydroxyl of U38, the exocyclic amine of G52, and possibly, the exocyclic oxygen of U71 (Fig. 4C). This hydrogen-bonding network may sequester the 2′-hydroxyl of U71 away from the ligation-site phosphate, and in the event of deprotonation, the 2′-alkoxide might be resupplied with a proton from water W13, which would effectively quench the side reaction (Fig. 4D).

The in vitro–evolved L1 RNA ligase ribozyme, therefore, appears to fold into a compact structure in which a set of invariant nucleotides interact to create a catalytic pocket capable of juxtaposing the ligation ends with a bound Mg2+ ion cofactor. The network of specific structural interactions that promote catalysis of phosphodiester bond formation, including transition-state stabilization interactions and functional group positioning for general base catalysis, as well as the propensity to fold into a preformed active site capable of binding a substrate and a metal ion cofactor, are each reminiscent of what has been observed in several natural ribozymes. The L1 ligase ribozyme thus demonstrates, in principle, that RNA indeed has the ability to evolve into a structure capable of catalyzing regiospecific phosphodiester bond ligation and appears to use strategies of transition-state stabilization and acid-base catalysis similar to those that exist for natural ribozymes and protein enzymes.

Supporting Online Material

Materials and Methods

Figs. S1 and S2

Table S1


References and Notes
