One Sequence, Two Ribozymes: Implications for the Emergence of New Ribozyme Folds


Science  21 Jul 2000:
Vol. 289, Issue 5478, pp. 448-452
DOI: 10.1126/science.289.5478.448


We describe a single RNA sequence that can assume either of two ribozyme folds and catalyze the two respective reactions. The two ribozyme folds share no evolutionary history and are completely different, with no base pairs (and probably no hydrogen bonds) in common. Minor variants of this sequence are highly active for one or the other reaction, and can be accessed from prototype ribozymes through a series of neutral mutations. Thus, in the course of evolution, new RNA folds could arise from preexisting folds, without the need to carry inactive intermediate sequences. This raises the possibility that biological RNAs having no structural or functional similarity might share a common ancestry. Furthermore, functional and structural divergence might, in some cases, precede rather than follow gene duplication.

Related protein or RNA sequences with the same folded conformation can often perform very different biochemical functions, indicating that new biochemical functions can arise from preexisting folds. But what evolutionary mechanisms give rise to sequences with new macromolecular folds? When considering the origin of new folds, it is useful to picture, among all sequence possibilities, the distribution of sequences with a particular fold and function. This distribution can range very far in sequence space (1). For example, only seven nucleotides are strictly conserved among the group I self-splicing introns, yet secondary (and presumably tertiary) structure within the core of the ribozyme is preserved (2). Because these disparate isolates have the same fold and function, it is thought that they descended from a common ancestor through a series of mutational variants that were each functional. Hence, sequence heterogeneity among divergent isolates implies the existence of paths through sequence space that have allowed neutral drift from the ancestral sequence to each isolate. The set of all possible neutral paths composes a “neutral network,” connecting in sequence space those widely dispersed sequences sharing a particular fold and activity, such that any sequence on the network can potentially access very distant sequences by neutral mutations (3–5).

Theoretical analyses using algorithms for predicting RNA secondary structure have suggested that different neutral networks are interwoven and can approach each other very closely (3,5–8). Of particular interest is whether ribozyme neutral networks approach each other so closely that they intersect. If so, a single sequence would be capable of folding into two different conformations, would have two different catalytic activities, and could access by neutral drift every sequence on both networks. With intersecting networks, RNAs with novel structures and activities could arise from previously existing ribozymes, without the need to carry nonfunctional sequences as evolutionary intermediates. Here, we explore the proximity of neutral networks experimentally, at the level of RNA function. We describe a close apposition of the neutral networks for the hepatitis delta virus (HDV) self-cleaving ribozyme and the class III self-ligating ribozyme.

In choosing the two ribozymes for this investigation, an important criterion was that they share no evolutionary history that might confound the evolutionary interpretations of our results. Choosing at least one artificial ribozyme ensured independent evolutionary histories. The class III ligase is a synthetic ribozyme isolated previously from a pool of random RNA sequences (9). It joins an oligonucleotide substrate to its 5′ terminus. The prototype ligase sequence (Fig. 1A) is a shortened version of the most active class III variant isolated after 10 cycles of in vitro selection and evolution. This minimal construct retains the activity of the full-length isolate (10). The HDV ribozyme carries out the site-specific self-cleavage reactions needed during the life cycle of HDV, a satellite virus of hepatitis B with a circular, single-stranded RNA genome (11). The prototype HDV construct for our study (Fig. 1B) is a shortened version of the antigenomic HDV ribozyme (12), which undergoes self-cleavage at a rate similar to that reported for other antigenomic constructs (13, 14).

Figure 1

Secondary structures of the prototype ribozymes. When designing ribozyme derivatives, we strived to maintain important Watson-Crick pairing (thick dashes), wobble pairing (oval), and the identity of key residues (pink letters). Secondary-structure landmarks are labeled as follows: P, paired region; J, joining segment; L, loop. (A) The class III self-ligating ribozyme. The substrate for the prototype ligase hybridizes to the ribozyme, forming the P1 helix and positioning the 2′-hydroxyl for ligation. The L3 and L5 tetraloops (UUUU) replace nonessential sequences deleted from the original class III isolate [isolate e3, (9)]. Important residues and pairing interactions were deduced by comparative analysis of active ligases that had been selected from large pools of class III variants (9). (B) The HDV antigenomic ribozyme cleaves itself at the indicated linkage (arrowhead). This sequence differs from typical antigenomic constructs in that the distal end of the P4 helix, known to be dispensable, was truncated by eight nucleotides to generate a construct 86 nucleotides long, one nucleotide shorter than the reacted ligase prototype. Important residues and pairing have been deduced from comparative analysis of HDV ribozymes and site-directed mutagenesis experiments (11). This information was rationalized and extended with the recent crystal structure of the genomic ribozyme (36).

The prototype class III and HDV ribozymes have no more than the 25% sequence identity expected by chance and no fortuitous structural similarities that might favor an intersection of their two neutral networks. Nevertheless, sequences can be designed that simultaneously satisfy the base-pairing requirements of both the HDV and ligase ribozymes, while preserving most of the residues important for the activity of each ribozyme. An example of such a sequence is shown (Fig. 2A). This sequence is 42 mutational steps away from the prototype ligase (39 base substitutions, one point deletion, and two single-nucleotide insertions) and 44 mutational steps from the prototype HDV ribozyme (40 substitutions, one deletion, and three insertions). In Fig. 2A, this sequence was color-coded in accordance with the ligase secondary structure and then threaded into the HDV secondary structure, illustrating that the two ribozyme folds have no base pairs in common. The two tertiary structures probably have no hydrogen bonds in common, yet sequences such as this can readily be designed that might lie within, or very near to, the neutral networks for both ribozymes.
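The two numerical claims in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis: it assumes uniform base composition and independent positions when estimating chance identity, and the function names are hypothetical.

```python
def chance_identity(base_freqs):
    """Probability that two independently drawn residues are identical.

    Assumes (for illustration) that positions are independent and that both
    sequences share the same base composition.
    """
    return sum(p * p for p in base_freqs.values())


def mutational_steps(substitutions, deletions, insertions):
    """Total single-nucleotide changes separating two sequences."""
    return substitutions + deletions + insertions


# Uniform composition over the four RNA bases (an assumption).
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}
identity = chance_identity(uniform)

# Counts taken from the text above.
steps_to_ligase = mutational_steps(substitutions=39, deletions=1, insertions=2)
steps_to_hdv = mutational_steps(substitutions=40, deletions=1, insertions=3)

print(f"chance identity: {identity:.0%}")
print(f"steps to prototype ligase: {steps_to_ligase}")
print(f"steps to prototype HDV ribozyme: {steps_to_hdv}")
```

With equal base frequencies, the chance identity is 4 × (1/4)² = 1/4, which matches the 25% figure quoted for the two prototypes.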

Figure 2

One sequence, two ribozymes. (A) An intersection sequence is shown threaded through both the class III ligase and HDV ribozyme secondary structures. Colored segments are base-paired in the ligase fold. The single-nucleotide substitutions that generate constructs LIG1 and HDV1 are indicated. (B) Ligation and cleavage reactions of the intersection sequence (INT) and neighboring sequences, measured under identical conditions. LIG1 and HDV1 have single-nucleotide substitutions designed to favor ligation function and the HDV fold, respectively, as shown in (A). Substituting C38 with G in LIG1 generates LIG2, whereas deleting G10 in HDV1 generates HDV2. The top gel resolves the substrate (5′-radiolabeled) and product of ligation (10). The bottom gel resolves the precursor RNA (5′-radiolabeled) and the 5′-cleavage product (14). For LIG2, LIG1, INT, and HDV1, the ligation products are represented by two bands because of an unanticipated ligation activity of truncated ribozyme molecules (25) in addition to the activity of full-length RNA. For these constructs, the lower product band was excluded when calculating ligation rates. For all other ligase constructs, only the full-length product band was observed. Some HDV transcripts have an extra, untemplated G at their 5′ terminus, explaining the appearance of the second, slower migrating cleavage product. The deletion of G10 accounts for the faster mobility of the 5′-cleavage products of HDV2 compared with those of the intersection and HDV1 ribozymes.

When this sequence was synthesized in the two formats of Fig. 2A, catalytic activity was detectable for both self-ligation and site-specific self-cleavage (Fig. 2B). Ligation occurred with formation of a 2′,5′-phosphodiester linkage (15), the regiospecificity of the class III ligase (9), indicating that the class III ligase fold was assumed by some of the molecules. Cleavage occurred with formation of a cyclic phosphate (15), as expected for the HDV fold (11, 16). The cleavage rate was 70 times faster than the rate of uncatalyzed RNA cleavage (17, 18). The ligation rate exceeded those of nonenzymatic ligation reactions in which analogous oligoribonucleotides are aligned by simple Watson-Crick pairing. Ligation was 460 times faster than nonenzymatic formation of 2′,5′ linkages and seven times faster than that of 3′,5′ linkages (17, 19, 20). Although ligation and cleavage are slower than for the prototype ribozymes, the single sequence does catalyze both ribozyme reactions.

The effects of point mutations on this sequence provided further evidence that its position in sequence space lies at a junction of the two ribozyme folds. Substituting C13 with A simultaneously restores a residue known to be critical for ligase activity and introduces a G:A mismatch in the HDV fold (Fig. 2A, construct LIG1). The C13A point substitution dramatically increased the ligation rate (90 times) and lowered the cleavage rate below detection (Fig. 2B). The U73C substitution, which is expected to stabilize the HDV fold, substantially increased site-specific cleavage (120 times), while lowering the ligation rate twofold (Fig. 2, HDV1). Hence, one sequence can assume the fold of either of two ribozymes and catalyze the two respective reactions. Although the activities of this sequence are too low to conclude that the intersection of ribozyme networks can be neutral, we call it an “intersection sequence” because it demonstrates that ribozyme networks do intersect in a qualitative manner.

Ribozyme activity provides a very sensitive readout for RNA folding, but it cannot provide quantitative information on the number of sequences populating each fold at a given time. A low activity could be explained either by a small fraction of molecules populating a ribozyme fold or by a suboptimal active site within molecules that do fold properly. Structural probing of the prototype, LIG1, intersection, and HDV1 RNAs with the Pb(II)-cleavage assay (21) indicated that both potential explanations are relevant. The dominant fold of the intersection sequence resembles that of the ligase, suggesting that suboptimal residues (e.g., C13) in the ribozyme core attenuate ligase activity. The converse is true for the HDV fold; far fewer molecules appear to assume the HDV fold, yet those that do populate the fold are highly active (22).

The substantial improvement seen with single-nucleotide substitutions suggested that the intersection sequence might be very close to both neutral networks. We searched for additional substitutions that would bring the level of activity within range of the prototype sequences (23). Substitution C38G, when combined with the C13A substitution (construct LIG2), brought the ligation rate within 15-fold of the rate of the prototype ligase, well inside the range of variability seen for isolates of the same ribozyme from different biological species (24). The intersection sequence is also very close to the HDV neutral network; combining the G10 point deletion with the U73C substitution (construct HDV2) brought the self-cleavage rate within twofold of the prototype's rate. Thus, only four mutations (three substitutions and one point deletion) separate LIG2 and HDV2, yet the two ribozyme sequences have totally different folds and near-prototype activities.
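The count of four changes separating LIG2 from HDV2 follows from the fact that each construct carries two changes relative to the intersection sequence and the two sets do not overlap. A minimal sketch of that bookkeeping follows; the mutation labels (including `delG10` for the G10 point deletion) are shorthand chosen here for illustration.

```python
# Changes relative to the intersection sequence (INT), as described in the text.
# "delG10" is a hypothetical label for the G10 point deletion.
LIG2_CHANGES = {"C13A", "C38G"}
HDV2_CHANGES = {"U73C", "delG10"}

# Changes present in one construct but not the other separate the two sequences.
separating = LIG2_CHANGES ^ HDV2_CHANGES  # symmetric difference

print(sorted(separating))
print(len(separating))
```

Because the two sets share no members, the symmetric difference contains all four changes.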

To confirm that both LIG2 and HDV2 are on the neutral network of their respective prototypes, we designed neutral paths in sequence space that link these minor variants of the intersection sequence to their prototype sequences (Fig. 3) (25). Each step along these paths changed no more than two residues, often as compensatory mutations. A few of the sequences designed during this effort had less than a tenth of the prototype activity; such sequences were discarded, and alternative steps were chosen that remained on a neutral path. Very smooth paths could be designed, gradually changing nearly half the ribozyme residues yet never deviating from the prototype activities by more than sevenfold (Fig. 3B). The ease with which we were able to design these neutral sequences reinforces the idea that there are a great many neutral paths for each ribozyme fold. The ability to find neutral paths to very distant sequences, with no more than two residues changed per step, supports the notion that each fold is indeed represented by a single, contiguous neutral network rather than by multiple, isolated networks (6, 7, 26). Because these two ribozymes share no evolutionary history or structural features, it is reasonable to expect that neutral networks for other pairs of ribozymes closely approach each other. For ribozymes of similar size, it appears plausible that each ribozyme neutral network might closely approach a great many, if not all, other ribozyme networks (27).
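The two criteria used to accept a path (no more than two residues changed per step, and no sequence below a tenth of prototype activity) can be expressed as a simple check. The sketch below is illustrative and is not the procedure used to design the paths: the sequences and activities are invented toy values, the function names are hypothetical, and sequences are assumed to be pre-aligned to equal length with `-` marking gaps.

```python
def residues_changed(a, b):
    """Number of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def is_neutral_path(path, max_changes=2, min_relative_activity=0.1):
    """Check a list of (aligned_sequence, activity_relative_to_prototype).

    Thresholds follow the criteria stated in the text: at most two residues
    changed per step, and at least a tenth of the prototype activity.
    """
    if any(activity < min_relative_activity for _, activity in path):
        return False
    return all(
        residues_changed(a, b) <= max_changes
        for (a, _), (b, _) in zip(path, path[1:])
    )


# Toy data (hypothetical sequences and activities, for illustration only).
good_path = [("GGACUC", 1.0), ("GGAUUC", 0.4), ("GCAUUG", 0.2)]
low_activity_path = [("GGACUC", 1.0), ("GGAUUC", 0.05), ("GCAUUG", 0.2)]
big_jump_path = [("GGACUC", 1.0), ("CCUCUC", 0.5)]

print(is_neutral_path(good_path))
print(is_neutral_path(low_activity_path))
print(is_neutral_path(big_jump_path))
```

The first toy path passes both criteria; the second fails on activity, and the third fails because three residues change in a single step.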

Figure 3

A close apposition of two ribozyme neutral networks. (A) Alignment of sequences spanning the distance between the two prototype ribozymes (25). Each sequence differs from its neighbors at no more than two residues. Each variant is named on the basis of whether it catalyzes ligation (LIG) or self-cleavage (HDV) and the number of residues that differ from the intersection sequence (boxed residues). The prototype ligase (LIG P) and HDV (HDV P) sequences are at the top and bottom of the alignment, respectively, with their secondary structures annotated as in Fig. 1. Positions are numbered with respect to the intersection sequence (INT), as in Fig. 2A. (B) Activities of the ribozyme sequences aligned in (A). Self-ligation activity is plotted in blue; self-cleavage activity is plotted in red. The horizontal axis represents the number of residues that differ from the intersection sequence. The vertical axis indicates the reaction rate of each ribozyme, normalized to that of the respective prototype ribozymes (37). The relative rate for uncatalyzed cleavage with formation of a cyclic phosphate (17) is indicated by the long-dashed line. The relative rates for nonenzymatic, template-directed oligonucleotide ligation (17) are indicated by the short-dashed line (ligation with formation of a 2′,5′-linkage) and the dotted line (ligation with formation of a 3′,5′-linkage). Both ligation and cleavage rates are plotted for the intersection sequence, demonstrating an intersection of the two ribozyme networks.

The intersection of two networks need not be neutral to provide a viable mechanism for accessing new ribozyme folds, particularly if neutral networks very closely approach each other, as seen for the ligase and HDV ribozymes (Fig. 3). Once neutral drift generates sequences for the preexisting activity that are near to sequences with a different fold having advantageous activity, the chance mutation to an intersection sequence could impart a net selective advantage by virtue of its new biochemical function, even if the preexisting activity is greatly diminished. At this point, the gene for the intersection sequence could duplicate, and the two genes could diverge by adaptive evolution to more optimal activities, as each is relieved of the constraint imposed by providing dual activities. Thus, with intersecting networks, functional and structural divergence can precede gene duplication. This differs from the canonical view of divergence following gene duplication (28).

The likelihood that this mechanism for acquiring new folds has played an important role in the emergence of new ribozymes is difficult to ascertain. It depends on the abundance of intersection sequences that link neutral networks with biologically relevant functions and the extent to which neutral networks are explored during the course of evolution. The competing mechanism, in which new folds emerge from nonfunctional, arbitrary sequences carried as “junk” in the genome, also relies on the chance occurrence of a series of rare events. Its likelihood depends on the abundance of useful ribozymes among arbitrary sequences and the abundance of nonfunctional sequences carried in the genome. Where there is a high cost of maintaining nonfunctional sequences (for example, in genomes of viruses, prokaryotes, and perhaps organisms of the RNA world), the intersection of ribozyme networks might provide an important mechanism for the emergence of new ribozyme folds. This could occur through a global alternative conformation, as is seen for the ligase and HDV folds, or perhaps more likely, through a series of smaller conformational switches, each involving different domains of the ancestral ribozyme. Thus, extant RNAs with no structural or functional similarity might belong to the same genealogical lineage, though no historical information has been retained at the level of sequence, structure, or function.

The finding that RNA networks intersect raises the question of whether networks for very different protein folds might also intersect (4). Some peptide segments can assume very different folds within larger protein or ribonucleoprotein contexts (29–31). However, no protein sequence is known to autonomously assume two different enzymatic folds and catalyze the two respective reactions (32). It is questionable whether such a protein sequence could be found; indeed, it has been an accomplishment to design protein sequences that have different folds yet retain 50% sequence identity (33). The chemical diversity of the 20 amino acid subunits may restrict the conformational options of protein sequences. The 20 amino acids have characteristic propensities to form α-helical or β-sheet secondary structure. They also differ markedly in water solubility, which may explain why the most dramatic protein conformational changes, such as those of prions, result in insoluble aggregates (34). In contrast, the roles of the four RNA nucleotides in forming secondary and tertiary structure are less specialized. Hence, the lack of chemical diversity among the four RNA nucleotides, often cited as a disadvantage for developing efficient catalysis, allows for comprehensive conformational flexibility, leading to the intersections of ribozyme networks and making RNA an attractive biopolymer for the birth of new functional folds in early evolution.

  • * To whom correspondence should be addressed. E-mail: dbartel{at}

