Research Article

The Complete Atomic Structure of the Large Ribosomal Subunit at 2.4 Å Resolution


Science  11 Aug 2000:
Vol. 289, Issue 5481, pp. 905-920
DOI: 10.1126/science.289.5481.905

Abstract

The large ribosomal subunit catalyzes peptide bond formation and binds initiation, termination, and elongation factors. We have determined the crystal structure of the large ribosomal subunit from Haloarcula marismortui at 2.4 angstrom resolution, and it includes 2833 of the subunit's 3045 nucleotides and 27 of its 31 proteins. The domains of its RNAs all have irregular shapes and fit together in the ribosome like the pieces of a three-dimensional jigsaw puzzle to form a large, monolithic structure. Proteins are abundant everywhere on its surface except in the active site where peptide bond formation occurs and where it contacts the small subunit. Most of the proteins stabilize the structure by interacting with several RNA domains, often using idiosyncratically folded extensions that reach into the subunit's interior.

In the last step of the gene expression pathway, genomic information encoded in messenger RNAs is translated into protein by a ribonucleoprotein called the ribosome (1). As in most other organisms, the prokaryotic ribosome (MW ≈ 2.6 × 10⁶) is about two-thirds RNA and one-third protein and consists of two subunits, the larger of which is approximately twice the molecular weight of the smaller (2). The small subunit, which sediments at 30S in prokaryotes, mediates the interaction between mRNA codons and tRNA anticodons on which the fidelity of translation depends. The large subunit, which sediments at 50S in prokaryotes, includes the activity that catalyzes peptide bond formation (peptidyl transferase) and the binding site for the G-protein (GTP-binding protein) factors that assist in the initiation, elongation, and termination phases of protein synthesis.

Because the structures of several DNA and RNA polymerases have been determined at atomic resolution, the mechanisms of DNA and RNA synthesis are both well understood. Determination of the structure of the ribosome, however, has proven a daunting task. It is several times larger than the largest polymerase, and 100 times larger than lysozyme, the first enzyme to be understood at atomic resolution. Until now an atomic resolution structure for the ribosome has not been available, and as a result the mechanism of protein synthesis has remained a mystery.

Electron microscopy has contributed to our understanding of ribosome structure ever since the ribosome was discovered. In the last few years, three-dimensional (3D) electron microscopic images of the ribosome have been produced at resolutions sufficiently high to visualize many of the proteins and nucleic acids that assist in protein synthesis bound to the ribosome (3). Earlier this year, an approximate model of the RNA structure in the large subunit was constructed to fit a 7.5 Å resolution electron microscopic map of the 50S subunit from Escherichia coli as well as biochemical data (4).

Crystallization studies of the ribosome begun two decades ago by Yonath and Wittmann (5) and by the group at Pushchino (6) opened the possibility of using x-ray crystallography to determine the structure of the ribosome at atomic resolution. The first electron density map of the ribosome that showed features recognizable as duplex RNA was a 9 Å resolution x-ray crystallographic map of the large subunit from Haloarcula marismortui published 2 years ago (7). A year later, extension of the phasing of that map to 5 Å resolution made it possible to locate several proteins and nucleic acid sequences, the structures of which had been determined independently (8). At about the same time, with the use of similar crystallographic strategies, a 7.8 Å resolution map was generated of the entire Thermus thermophilus ribosome, showing the positions of tRNA molecules bound to its A, P, and E sites (9), and a 5.5 Å resolution map of the 30S subunit from T. thermophilus was obtained, which allowed the fitting of solved protein structures and the interpretation of some of its RNA features (10). Subsequently, an independently determined, 4.5 Å resolution map of the T. thermophilus 30S subunit was published, which was based, in part, on phases calculated from a model corresponding to 28% of the subunit mass that had been obtained with a 6 Å resolution experimental map (11). The interpretation of the subunit packing in the two 30S structures is not the same, even though the crystals used by the two groups appear to be identical.

Using a 2.4 Å resolution, experimentally phased, electron density map, we have produced an atomic structure of the H. marismortui 50S ribosomal subunit. The model includes 2711 of the 2923 nucleotides of 23S ribosomal RNA (rRNA), all 122 nucleotides of its 5S rRNA, and structures for the 27 proteins that are well ordered in the subunit. Here, we describe the architecture of the subunit and the structure of its RNAs, and discuss the location, structures, and functions of its proteins.
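The nucleotide counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below is purely illustrative: the variable names are ours, and the numbers are the ones given in the text.

```python
# Nucleotide counts quoted in the text (variable names are illustrative).
RRNA_23S_TOTAL = 2923    # nucleotides in 23S rRNA
RRNA_23S_MODELED = 2711  # 23S rRNA nucleotides included in the model
RRNA_5S_TOTAL = 122      # nucleotides in 5S rRNA, all of which are modeled

total = RRNA_23S_TOTAL + RRNA_5S_TOTAL      # all rRNA nucleotides in the subunit
modeled = RRNA_23S_MODELED + RRNA_5S_TOTAL  # rRNA nucleotides in the model
fraction = modeled / total                  # fraction of the rRNA that is modeled

print(total, modeled)     # 3045 2833
print(f"{fraction:.1%}")  # 93.0%
```

The totals agree with the 2833 of 3045 nucleotides stated in the abstract.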

The secondary structures of both 5S and 23S rRNA are remarkably close to those deduced for them by phylogenetic comparison. The secondary structure of the 23S rRNA divides it into six large domains, each of which has a highly asymmetric tertiary structure. The irregularities of their shapes notwithstanding, the domains fit together in an interlocking manner to yield a compact mass of RNA that is almost isometric. The proteins are dispersed throughout the structure and mostly concentrated on its surface, but they are largely absent from the regions of the subunit that are of primary functional significance to protein synthesis: the 30S subunit interface and the peptidyl transferase active site. The most surprising feature of many of these proteins is the extended, irregular structure of their loops and termini, which penetrate between RNA helices. The primary role of most proteins in the subunit appears to be stabilization of the 3D structure of its rRNA.

Structure determination. Several experimental approaches were used to extend the resolution of the electron density maps of the H. marismortui 50S ribosomal subunit from 5 to 2.4 Å. A back-extraction procedure was developed for reproducibly growing crystals that are much thicker than those available earlier and that diffract to at least 2.2 Å resolution. The twinning of crystals, which obstructed progress for many years (8), was eliminated by adjusting crystal stabilization conditions (12). All of the x-ray data used for high-resolution phasing were collected at the Brookhaven National Synchrotron Light Source except for two native data sets, which were collected at the Advanced Photon Source at Argonne (13) (Table 1). Osmium pentamine (132 sites) and iridium hexamine (84 sites) derivatives proved to be the most effective in producing isomorphous replacement and anomalous scattering phase information to 3.2 Å resolution (14). Intercrystal density averaging, which had contributed significantly at lower resolution, was not helpful beyond about 5 Å resolution. Electron density maps were dramatically improved, and their resolutions were eventually extended to 2.4 Å with the solvent-flipping procedure in the CNS program (15, 16).

Table 1

Statistics for data collection, phase determination, and model construction. HA, heavy-atom concentration; ST, soaking time; Res, resolution; λ, wavelength; Obs, observations; Redun, redundancy; Compl, completeness; (*) last-resolution shell. $R_{\mathrm{iso}} = \sum |F_{PH} - F_{P}| / \sum F_{PH}$, where $F_{PH}$ and $F_{P}$ are the derivative and the native structure factor amplitudes, respectively. $R_{\mathrm{sym}} = \sum \sum_i |I(h) - I(h)_i| / \sum \sum_i I(h)_i$, where $I(h)$ is the mean intensity after reflections. Phasing power: rms isomorphous difference divided by the rms residual lack of closure. $R_{\mathrm{cullis}} = \sum \left( \big| |F_{PH} - F_{P}| - |F_{H(\mathrm{calc})}| \big| \right) / \sum |F_{PH} - F_{P}|$, where $F_{PH}$ is the structure factor of the derivative and $F_{P}$ is that of the native data. The summation is valid only for centric reflections. FOM (figure of merit): mean value of the cosine of the error in phase angles. Abbreviations: MIRAS, multiple isomorphous replacement, anomalous scattering; SAD, single wavelength anomalous diffraction.
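As an illustration of how two of these statistics are evaluated, the following minimal sketch computes R iso and R sym for toy data. It assumes the conventional forms of the definitions (sums over reflections in both numerator and denominator). The function names, the data layout, and the numbers are hypothetical and are not taken from the data-processing software used in this work.

```python
def r_iso(f_ph, f_p):
    """R_iso = sum|F_PH - F_P| / sum(F_PH) over matched reflections.

    f_ph, f_p: equal-length sequences of derivative and native
    structure factor amplitudes (hypothetical layout).
    """
    if len(f_ph) != len(f_p):
        raise ValueError("amplitude lists must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(f_ph, f_p))
    return numerator / sum(f_ph)


def r_sym(observations):
    """R_sym = sum_h sum_i |I(h) - I(h)_i| / sum_h sum_i I(h)_i.

    observations: dict mapping a reflection index h to the list of its
    repeated intensity measurements I(h)_i (hypothetical layout).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)  # I(h)
        numerator += sum(abs(mean_intensity - i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Toy amplitudes and intensities, for illustration only.
derivative = [110.0, 95.0, 210.0]
native = [100.0, 100.0, 200.0]
measurements = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}

print(f"R_iso = {r_iso(derivative, native):.4f}")
print(f"R_sym = {r_sym(measurements):.4f}")
```

Both quantities are dimensionless ratios. A small R sym indicates that repeated measurements of a reflection agree well, whereas R iso reflects the size of the isomorphous differences that carry the phase information.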

Except for regions obscured by disorder, the experimentally phased 2.4 Å resolution electron density map was of sufficient quality that both protein and nucleic acid sequencing errors could be identified and corrected. Each nucleotide could be fitted individually, and the difference between A and G was usually clear without having to refer to the chemical sequence, as was the distinction between purines and pyrimidines (Fig. 1). Only a few of the many water molecules and metal ions evident in the electron density have been positioned so far.

Figure 1

Portions of the experimental 2.4 Å resolution electron density map. (A) A stereo view of a junction between 23S rRNA domains II, III, IV, and V having a complex structure that is clearly interpretable. The electron density is contoured at 2σ. The bases are white and the backbones are colored by domain as specified in Fig. 4. (B) The extended region of L3 interacting with its surrounding RNA, where the red RNA density is contoured at 2σ and the blue protein density is contoured at 1.5σ. (C) Detail in the L2 region showing a bound Mg²⁺ ion. (D) Detail from L2 showing amino acid side chains. (E) Helices 94 through 97 from domain VI. The red contour level is at 2σ, and the yellow contour at 6σ shows the positions of the higher electron density phosphate groups.

Subtraction of the atomic model from the experimental electron density map leaves no significant density except water and ions, showing that the model accounts for all the macromolecular density. Preliminary refinement of the model was achieved with experimental phase restraint in the program CNS (16). The model was further refined in real space against the 2.4 Å electron density map with the program TNT (17), which yielded a model with an R factor of 0.33. One additional round of mixed target refinement of both atomic positions and B factors with CNS led to the structure described here. The current free R factor is 0.26 (Table 1).
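For readers unfamiliar with these statistics, the sketch below shows the conventional crystallographic R factor, Σ||F obs| − |F calc||/Σ|F obs|, and its evaluation on separate working and free (cross-validation) sets of reflections. It is a generic illustration and not the CNS or TNT implementation; the function names and toy amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional R factor: sum||F_obs| - |F_calc|| / sum|F_obs|."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, is_free):
    """Return (R_work, R_free).

    is_free flags reflections that were set aside before refinement;
    the free R is computed from those reflections only.
    """
    work_obs = [o for o, free in zip(f_obs, is_free) if not free]
    work_calc = [c for c, free in zip(f_calc, is_free) if not free]
    free_obs = [o for o, free in zip(f_obs, is_free) if free]
    free_calc = [c for c, free in zip(f_calc, is_free) if free]
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)


# Toy amplitudes, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
calculated = [90.0, 220.0, 300.0, 360.0]
free_flags = [False, False, False, True]

r_work, r_free = work_and_free_r(observed, calculated, free_flags)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the free reflections are excluded from refinement, the free R factor gives a less biased measure of agreement between model and data than the working R factor.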

Sequence fitting and protein identification. Guided by the information available on the secondary structures of 23S rRNAs (18), the sequence of 23S rRNA was fit into the electron density map nucleotide by nucleotide starting from its sarcin/ricin loop sequence [A2691 to A2702 (E. coli numbers A2654 to A2665)] whose position had been determined at 5 Å resolution (8). The remaining RNA electron density neatly accommodated 5S rRNA. The interpretation of electron density corresponding to protein was more complicated because each protein region had to be identified chemically before the appropriate sequence could be fit into it; with the assistance of D. Klein, L. Min, S. Antolić, and M. Schmeing, ∼4000 amino acid residues of 27 proteins were fit into electron density.

The H. marismortui 50S subunit appears to contain 31 proteins, and sequences for 28 of them exist in the Swiss-Prot data bank, including one called HMS6 or L7ae, which originally had been assigned to the small ribosomal subunit (19). The three remaining proteins were identified using the sequences of the ribosomal proteins from eukaryotes and other archaeal species as guides. No electron density was found for one of the H. marismortui large ribosomal subunit proteins in the sequence database, LX. Either the assignment of LX to the large subunit is in error, or LX is associated with a disordered region of the subunit. It is also possible that LX is absent from the subunits examined altogether.

The 2.4 Å resolution electron density map lacks clear electron density for proteins L1, L10, L11, and L12, the positions of which are known from earlier low-resolution x-ray and/or electron microscopic studies. These proteins are components of the two lateral protuberances of the subunit, which are both poorly ordered in these crystals. L1 is the sole protein component of one of them (20) and is visible in 9 Å resolution density maps of the subunit (7), but not at higher resolutions. L10, L11, and L12 are components of the other protuberance, which is often referred to as the L7/L12 stalk (20). L11 and the RNA to which it binds were located in the 5 Å resolution electron density map of the H. marismortui large subunit (8) using the independently determined crystal structures of that complex (21, 22). A protein fragment (∼100 residues) associated with the RNA stalk that supports the L11 complex can be seen in the 2.4 Å resolution map. On the basis of its location, the fragment must be part of L10. No electron density corresponding to L12 was seen at any resolution, but the L12 tetramer is known to be attached to the ribosome through L10, and the L10/L12 assembly is known to be flexible under some circumstances (23), which may explain its invisibility here.

The structures of eubacterial homologs of proteins L2, L4, L6, L14, and L22 have previously been determined in whole or in part (Table 2). L2, L6, and L14 were initially located in the 5 Å resolution map (8). L4 and L22 have now been identified and positioned the same way. Electron density corresponding to most of the remaining proteins was assigned by comparing chain lengths and sequence motifs deduced from the electron density map with known sequence lengths. Occasionally, these comparisons were assisted by the information available on relative protein positions (24) and protein interactions with 23S rRNA and 5S rRNA (25). Each of the protein electron density regions so identified is well accounted for by the amino acid sequence assigned to it.

Table 2

Large-subunit proteins from Haloarcula marismortui. The top block of proteins includes all those known to have eubacterial homologs of the same name. The second block lists proteins found in the H. marismortui large ribosomal subunit that have only eukaryotic homologs (19). Their names are all followed by the letter “e” to distinguish them from eubacterial proteins that would otherwise have the same name. The third block lists large-subunit proteins for which no H. marismortui sequence yet exists. They are identified by sequence homology with standard L names. ¹The structures of all or part of homologs of the following proteins were previously determined: L1 (28), L2 (43), L4 (44), L6 (58), L11 (21, 22, 59), L12 (60), L14 (61), L22 (62), and L30 (63). All other structures, except L10, have been newly determined in this study. ²Rat homolog. Rat equivalents to H. marismortui proteins are from (26). ³Sequence chain length. ⁴Conformation: glb, globular; ext, extension. ⁵The protein interactions with the six domains of 23S rRNA, 5S rRNA, and other proteins are specified. (+) implies that the interaction is substantial; (±) implies a weak, tangential interaction. Protein names in parentheses imply that the interactions are weak; otherwise, the interaction is substantial.

The most interesting of the proteins identified by sequence similarity was L7ae, which first appeared to be L30e. The L30e identification seemed plausible because the structure of yeast L30e superimposes neatly on the electron density of L7ae, and the structure of the RNA to which L7ae binds resembles that of the mRNA element to which yeast L30e binds (26). Nevertheless, the sequence of HMS6, which by sequence similarity is a member of the L7ae protein family, better fits the electron density. Four of the other proteins identified by sequence similarity, L24e, L37e, L37ae, and L44e, contain zinc finger motifs. The rat homologs of L37e and L37ae were predicted to be zinc finger proteins on the basis of their sequences (27), and this prediction helped identify their homologs in H. marismortui. Even though no H. marismortui sequences were available for the proteins L10e, L15e, and L37ae, they could be identified using the alignments of other available archaeal sequences.

General appearance of the subunit. In its rotated crown view (Fig. 2), the large ribosomal subunit, which is about 250 Å across, presents its surface that interacts with the small subunit to the viewer with the three projections that radiate from that surface pointed up. Although the protuberance that includes L1 is not visible in the 2.4 Å resolution electron density map, the structure of L1, which has been determined independently (28), has been positioned approximately in lower resolution maps (7) and is included here to orient the reader. It is evident that, except for its two lateral protuberances, the large ribosomal subunit is monolithic. There is no hint of a division of its structure into topologically separate domains. In addition, partly because it lacks obvious domain substructure but also because it is so large, the subunit is impossible to comprehend by looking at it as a whole. To convey a sense of how it is put together, the subunit must be dissected into its chemical components.

Figure 2

The H. marismortui large ribosomal subunit in the rotated crown view. The L7/L12 stalk is to the right, the L1 stalk is to the left, and the central protuberance (CP) is at the top. In this view, the surface of the subunit that interacts with the small subunit faces the reader. RNA is shown in gray in a pseudo–space-filling rendering. The backbones of the proteins visible are rendered in gold. The Yarus inhibitor bound to the peptidyl transferase site of the subunit is indicated in green (64). The particle is approximately 250 Å across.

RNA secondary structure. All the base pairs in H. marismortui 23S rRNA stabilized by at least two hydrogen bonds were identified with a computer program that searched the structure for hydrogen bond donors and acceptors separated by less than 3.2 Å. Bases linked by at least two such bonds were considered paired if the angle between their normals was less than 45° and if the angle between bonds and base normals was also less than 45°. On the basis of the results of this analysis, R. Gutell and colleagues prepared a secondary structure diagram (Fig. 3) in the format standard for 23S/28S rRNAs. The secondary structure predicted for this molecule by phylogenetic comparison was remarkably accurate, but it did not find all of the tertiary pairings and failed to identify interactions involving conserved bases. In addition to base pairs of nearly every type, the RNA contains numerous examples of well-known secondary structure motifs such as base triples, tetraloops, and cross-strand purine stacks, but no dramatically new secondary structure motifs have been identified so far.
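The pairing criteria described above can be sketched in a few lines of code. The example below is a simplified illustration and not the program used in this work: the `Base` data layout, the function names, and the toy coordinates are hypothetical. It applies the 3.2 Å donor-acceptor cutoff and the 45° limit on the angle between base normals, treating normals as unsigned directions (an assumption on our part), and it omits the additional test on the angle between bonds and base normals because the text does not specify its geometric convention.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec = Tuple[float, float, float]


@dataclass
class Base:
    donors: List[Vec]     # coordinates (Å) of hydrogen bond donor atoms
    acceptors: List[Vec]  # coordinates (Å) of hydrogen bond acceptor atoms
    normal: Vec           # vector perpendicular to the base plane


def count_hbonds(b1: Base, b2: Base, cutoff: float = 3.2) -> int:
    """Count donor-acceptor pairs between two bases closer than cutoff (Å)."""
    count = 0
    for first, second in ((b1, b2), (b2, b1)):
        for donor in first.donors:
            for acceptor in second.acceptors:
                if math.dist(donor, acceptor) < cutoff:
                    count += 1
    return count


def normal_angle_deg(n1: Vec, n2: Vec) -> float:
    """Acute angle (degrees) between two base normals, ignoring their sign."""
    dot = sum(x * y for x, y in zip(n1, n2))
    norm = math.hypot(*n1) * math.hypot(*n2)
    cosine = min(1.0, abs(dot) / norm)  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def is_paired(b1: Base, b2: Base, max_dist: float = 3.2,
              max_angle: float = 45.0) -> bool:
    """Two or more hydrogen bonds and nearly parallel base planes."""
    return (count_hbonds(b1, b2, max_dist) >= 2
            and normal_angle_deg(b1.normal, b2.normal) < max_angle)


# Toy geometry: two hydrogen bonds of 2.9 Å between roughly coplanar bases.
base_a = Base(donors=[(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)], acceptors=[],
              normal=(0.0, 0.0, 1.0))
base_b = Base(donors=[], acceptors=[(2.9, 0.0, 0.0), (2.9, 2.0, 0.0)],
              normal=(0.0, 0.2, 1.0))
print(is_paired(base_a, base_b))  # True
```

A production implementation would also need to assign donors and acceptors from atom names and to estimate each base normal from its ring atoms; both steps are left out of this sketch.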

Figure 3

The secondary structure of the 23S rRNA from H. marismortui is shown in a format made standard by R. Gutell and colleagues (65). It was prepared by Dr. Gutell to show all the base pairings seen in the crystal structure of the large subunit that are stabilized by at least two hydrogen bonds. Pairings shown in red were predicted and were observed. Those shown in green were predicted, but were not observed. Interactions shown in blue were observed, but were not predicted. Bases shown in black were not involved in pairing interactions. Sequences that cannot be visualized in the 2.4 Å resolution electron density map are depicted in gray with the secondary structures predicted for them.

The secondary structure of this 23S rRNA consists of a central loop that is closed by a terminal stem, from which 11 more or less complicated stem-loops radiate. It is customary to describe the molecule as consisting of six domains and to number its helical stems sequentially starting from the 5′ end (Fig. 4) (29). The division of the molecule into domains as shown in Fig. 4 deviates from standard practice with respect to helix 25, which is usually considered part of domain I. Here, it is placed in domain II because it interacts more strongly with domain II than with the other elements of domain I.

Figure 4

The tertiary and secondary structures of the RNA in the H. marismortui large ribosomal subunit and its domains. (A and B) The RNA structure of the entire subunit. Domains are color-coded as shown in the schematic (C). (A) The subunit particle in its crown view. (B) The crown rotated by 180° about a vertical axis in the plane of the image. (C) Schematic secondary structure diagram of 23S rRNA with the domain coloring used throughout the figures and the helices numbered according to Leffers et al. (29). (D) The secondary structure of 5S rRNA from H. marismortui. Bases joined by thick lines represent Watson-Crick pairing, and those joined by a lower case “o” indicate non–Watson-Crick pairing. Bases joined by thin lines interact via a single hydrogen bond, whereas those in black are unpaired. Base pairings shown in red are phylogenetically predicted pairings that are now confirmed (66). Pairs shown in blue were observed but were not predicted, and pairs shown in green were predicted but were not observed. (E through L) Stereo views of the RNA domains of the 23S rRNA and of 5S rRNA. Each domain is color-coded from its 5′ end to its 3′ end to help the viewer follow its trajectory in three dimensions. The backbones are shown as ribbons and the bases as sticks. The surfaces where the most important interdomain interactions occur are shown in mono to the right. (E), Domain I; (F), domain II; (G), domain III; (H), domain IV; (I), domain V, crown view; (J), domain V, back view; (K), domain VI; and (L), 5S rRNA.

There are five sequences longer than 10 nucleotides in 23S rRNA whose structures cannot be determined from the 2.4 Å resolution map because of disorder. Together they account for 207 out of the 212 nucleotides missing from the final model. The disordered regions are: all of helix 1, the distal end of helix 38, helix 43/44 to which ribosomal protein L11 binds, the loop end of stem-loop 69, and helix 76/77/78, which is the RNA structure to which L1 binds. For completeness, these regions are included in Fig. 3 (in gray) with the secondary structures determined for them phylogenetically.

Overall architecture of rRNA. The six domains of 23S rRNA and 5S rRNA all have complicated, convoluted shapes that fit together to produce a compact, monolithic RNA mass (Fig. 4, A and B). Thus, despite the organization of its RNAs at the secondary structure level, in three dimensions the large subunit is a single, gigantic domain. In this respect, it is quite different from the small subunit. Even in low-resolution electron micrographs the small subunit consists of three structural domains, each of which contains one of the three secondary structure domains of its RNA (30). This qualitative difference between the two subunits may reflect a requirement for conformational flexibility that is greater for the small subunit.

Domain I, which looks like a mushroom (Fig. 4E), lies in the back of the particle, behind and below the L1 region. The thin part of the domain starts in the vicinity of domain VI, which is the location of its first and last residues. Helices 1 and 25 span the particle in the back and then the domain expands into a larger, more globular structure below and behind the L1 region.

Domain II is the largest of the six 23S rRNA domains, accounting for most of the back of the particle. It has three protrusions that reach toward the subunit interface side of the particle (Fig. 4F). One of them (helix 42 to 44) is the RNA portion of the L7/L12 stalk, which is known to interact with elongation factors and is not well ordered in these crystals. The second domain II protrusion is helix 38, which is the longest unbranched stem in the particle. It starts in the back of the particle, bends by about 90°, and protrudes toward the small subunit between domain V and 5S rRNA. The third region (helix 32 to 35.1) points directly toward the small subunit, and its terminus, the loop of stem-loop 34, interacts directly with the small ribosomal subunit (31). This loop emerges at the subunit interface between domains III and IV.

Domain III is a compact globular domain that occupies the bottom left region of the subunit in the crown view (Fig. 4G). It looks like a four-pointed star with the origin of the domain (stem-loop 48) and stem-loops 52, 57, and 58 forming the points. The most extensive contacts of domain III are with domain II, but it also interacts with domains I, IV, and VI. Unlike all the other domains, domain III hardly interacts with domain V at all; the sole contact is a van der Waals interaction involving a single base from each domain.

Domain IV accounts for most of the interface surface of the 50S subunit that contacts the 30S subunit (Fig. 4H). It forms a large diagonal patch of flat surface on that side of the subunit and connects to domains III and V in the back of the particle. Helices 67 through 71 constitute the most prominent feature of domain IV and form the front rim of the active site cleft, which is clearly visible at low resolution (Fig. 2). This is one of the few regions of the 23S rRNA that is not extensively stabilized by ribosomal proteins. Helix 69 in the middle of this ridge interacts with the long penultimate stem of 16S rRNA in the small ribosomal subunit (9).

Domain V, which is sandwiched between domains IV and II in the middle of the subunit, is known to be intimately involved in the peptidyl transferase activity of the ribosome (32). Structurally, this domain can be divided into three regions (Fig. 4, I and J). The first starts with helix 75 and ultimately forms the binding site for protein L1. The second, which consists of helices 80 through 88, forms the bulk of the central protuberance region and is supported in the back by the 5S rRNA and domain II. The third region, which includes helices 89 through 93, extends toward domain VI and helps stabilize the elongation factor-binding region of the ribosome.

The smallest domain in 23S rRNA, domain VI, which forms a large part of the surface of the subunit immediately below the L7/L12 stalk, resembles a letter X with a horizontal bar at the bottom (Fig. 4K). The most interesting region of this domain is the sarcin-ricin loop (SRL) (stem-loop 95), the structure of which has been extensively studied in isolation (33,34). The SRL is essential for factor binding, and ribosomes can be inactivated by the cleavage of single covalent bonds in this loop (35). As suggested by nucleotide protection data, the major groove of this loop is exposed to solvent (36), and its conformation is stabilized by proteins and through interaction with domain V.

5S ribosomal RNA, which is effectively the seventh RNA domain in the subunit, consists of three stems radiating out from a common junction called loop A (Fig. 4D). In contrast to what is seen in the crystal structure of fragment 1 from E. coli 5S rRNA (37), the helix 2/3 arm of the molecule stacks on its helix 4/5 arm, not helix 1 (Fig. 4L). This arrangement results from a contorted conformation of loop A residues that involves two stacked base triples. Indeed, from the secondary structure point of view, the loop A–helix 2/3 arm of 5S rRNA is remarkable, with a high concentration of unusual pairings leading to a convoluted RNA secondary structure.

Sequence conservation and interactions in 23S rRNA. Although 23S/28S rRNAs contain many conserved sequences, they also vary substantially in chain length. Shorter 23S/28S rRNAs are distinguished from their longer homologs by the truncation of, or even the elimination of, entire stem-loops, and by comparing sequences, one can identify a minimal structure that is shared by all (38). The expansion sequences in the 23S rRNA of H. marismortui, i.e., the sequences it contains that are larger than the minimum, are shown in Fig. 5 in green. They are largely absent from the subunit interface surface of the particle, but they are abundant on its back surface, far from its active sites. This is consistent with low-resolution electron microscopic observations, suggesting that the region of the large subunit whose structure is most conserved is the surface that interacts with the small subunit (39).

Figure 5

Conserved residues and expansion sequences in the 23S rRNA of H. marismortui. The general, nonconserved RNA in these images is gray. Sequences that are found to be >95% conserved across the three phylogenetic kingdoms are shown in red. Sequences where expansion in the basic 23S structure is permitted are shown in green (65). (A) The particle rotated with respect to the crown view so that its active site cleft can be seen. (B) The crown view. (C) The back view of the particle, i.e., the crown view rotated 180° about its vertical axis.

There are two classes of conserved sequences in 23S rRNA. One contains residues concentrated in the active site regions of the large subunit. The second class consists of much shorter sequences scattered throughout the particle (Fig. 5, red sequences). The SRL sequence in domain VI and the cluster of conserved residues belonging to domain V located at the bottom of the peptidyl transferase cleft are members of the first class. They are conserved because they are essential for substrate binding, factor binding, and catalytic activity. Most of the residues in the second class of conserved residues are involved in the inter- and intradomain interactions that stabilize the tertiary structure of 23S rRNA. Adenosines are disproportionately represented in this class. The predominance of adenosines among the conserved residues in rRNAs has been pointed out previously (40). Throughout the particle, adenosines are observed to participate in tertiary interactions by exploiting the smooth N1-C2-N3 face of the adenine base, which allows for very close packing and additional backbone-backbone interactions. In particular, a recurring pattern of two or more stacked adenosines that dock into the minor grooves of receptor helices seems to reveal a very basic principle in tertiary RNA structure formation and could be regarded as an equivalent of hydrophobic core formation in globular protein domains. Common RNA structural motifs, such as the ribose zipper and the tetraloop-tetraloop receptor interaction, depend on this principle of adenosine packing. A manuscript in preparation describes these A-dependent interactions at greater length.

In addition to its reliance on A-dependent motifs, the tertiary structure of the domains of 23S rRNA and their relative positions are stabilized by familiar tertiary structure elements like pseudoknots and tetraloop-tetraloop receptor motifs (41,42). Thus, in many places, base pairs and triples stabilize the interactions of sequences belonging to different components of the secondary structure of 23S rRNA.

5S rRNA and 23S rRNA do not interact extensively with each other. The few RNA/RNA interactions that do occur involve the backbones of the helix 4/5 arm of 5S rRNA and of helix 38 of 23S rRNA. Most of the free energy and specificity of 5S rRNA binding to the large ribosomal subunit appears to depend on its extensive interactions with proteins that act as modeling clay, sticking it to the rest of the ribosome.

Proteins. We have determined the structures of 27 proteins found in the large ribosomal subunit of H. marismortui (Table 2). Twenty-one of these protein structures have not been previously established for any homologs, and the structures of the six that do have homologs of known structure have been rebuilt into the electron density map with their H. marismortui sequences. In addition, there are structures available for homologs of H. marismortui L1, L11, and L12, which cannot be visualized in the 2.4 Å resolution electron density map. Only the structure of L10 is still unknown among the 31 proteins of this subunit.

Almost all of these structures are complete. An entire domain of L5, however, is missing from the electron density, presumably because of disorder. L32e is also noteworthy. Its NH2-terminal 97 residues are not seen in the electron density map, and the electron density map suggests that its COOH-terminal residue may be covalently bonded to the most NH2-terminal of its visible residues.

Of the 30 large subunit ribosomal proteins whose structures are known, 17 are globular proteins, similar in character to thousands whose structures are in the Protein Data Bank (Table 2). The remaining 13 proteins either have globular bodies with extensions protruding from them (“glb+ext”) or are entirely extended (“ext”). Their extensions often lack obvious tertiary structure and in many regions are devoid of significant secondary structure as well (Fig. 6). These extensions may explain why many ribosomal proteins have resisted crystallization in isolation. The exceptions that prove the rule are L2 and L4, both of which are proteins belonging to the “glb+ext” class. Protein L2 was crystallized and its structure solved only after its extensions had been removed (43), and the large loop of L4 that is extended in the ribosome is disordered in the crystal structure of intact L4 (44).

Figure 6

The backbone structures of some large subunit ribosomal proteins that have nonglobular extensions. The globular domains of these proteins are shown in green, and their nonglobular extensions are depicted in red. The positions of the zinc ions in L44e and L37e are indicated by large dots in red.

Except for proteins L1, L7, L10, and L11, which form the tips of the two lateral protuberances, the proteins of the 50S subunit do not extend significantly beyond the envelope defined by the RNA (Fig. 7). Their globular domains are found largely on the particle's exterior, often nestled in the gaps and crevices formed by the folding of the RNA. Thus, unlike the proteins in spherical viruses, the proteins of the large ribosomal subunit do not form a shell around the nucleic acid with which they associate, and unlike the proteins in nucleosomes, they do not become surrounded by nucleic acid, either. Instead, the proteins act like mortar filling the gaps and cracks between “RNA bricks.”

Figure 7

Proteins that appear on the surface of the large ribosomal subunit. The RNA of the subunit is shown in gray and protein backbones are shown in gold. (A) The crown view of the subunit. (B) The back side of the subunit in the 180° rotated crown view orientation. (C) A view from the bottom of the subunit down the polypeptide tunnel exit, which lies in the center. The proteins visible in each image are identified in the small images at the lower left of the figure. Figures were generated using RIBBONS (67).

The distribution of proteins on the subunit surface is nearly uniform, except for the active site cleft and the flat surface that interacts with the 30S subunit. In the crown view, the proteins lie around the periphery of the subunit (Fig. 7A), but when viewed from the side opposite the 30S subunit binding site (the “back side”), they appear to form an almost uniform lattice over its entire surface (Fig. 7B). Similarly, the bottom surface of the subunit, which includes the exit of the polypeptide tunnel, is studded with proteins (Fig. 7C). Indeed, the six proteins that surround the tunnel exit may play a role in protein secretion because they are part of the surface that faces the membrane and the translocon when membrane and secreted proteins are being synthesized (45).

Although Fig. 7 shows protein chains disappearing into the ribosome interior, the degree to which proteins penetrate the body of the particle can be fully appreciated only when the RNA is stripped away. The interior of the particle is not protein-free, but it is protein-poor compared with the surface of the particle. Extended tentacles of polypeptide, many of which emanate from globular domains on the surface, penetrate into the interior, filling the gaps between neighboring elements of RNA secondary structure (Fig. 8E). The bizarre structures of these extensions are explained by their interactions with RNA. A detailed analysis of these proteins and their interactions with RNA will be presented elsewhere.

Figure 8

The protein extensions into the RNA and multiple domain interactions. (A) Some of the proteins in the neighborhood of the polypeptide tunnel exit, showing the unusual extended structure of L39e (green) that enters the tunnel and L37e (red) that interpenetrates the RNA. L29, which is on top of L37e, has been removed. Protein L22 extends a long β hairpin extension inside the 23S rRNA. L24 has a similar extension but the entire protein is on the surface of the particle. L39e is the only protein in the subunit that lacks tertiary structure, whereas L37e has both NH2- and COOH-terminal extensions. L19 is unique in having two globular domains on the surface of the subunit connected by an extended sequence that weaves through the RNA, shown as gray ribbons. (B) The nonglobular extensions of L2 and L3 reaching through the mass of 23S rRNA toward the peptidyl transferase site, which is marked by a CCdAp-puromycin molecule, the Yarus inhibitor (64). (C) L22 interacting with portions of all six of the domains of 23S rRNA. (D) Schematic of the 23S rRNA secondary structure showing the locations of sequences (red) that make contact with protein. (E) Stereo view of the proteins of the large ribosomal subunit without the RNA. Proteins are colored as an aid to visualization only. (F) A cross section of the subunit in the area of the tunnel exit. Protein L22 is shown as ribbons in red, and the β hairpin loop where mutations confer erythromycin resistance is in orange. Atoms on the surface are gray, protein atoms are green, and atoms at the slice interface are blue.

Although extended, nonglobular structures are rare in the protein database, they are not unknown. Extended protein termini often form interprotein contacts, e.g., in viral capsids, presumably adopting fixed structures only upon capsid formation (46). The basic “tails” of histones may behave the same way when nucleosomes form (47). The NH2-terminal sequences of capsid proteins are often positively charged, and in virus crystal structures, the electron density for these sequences often disappears into the interior of the virus where they presumably interact with asymmetrically arranged nucleic acid. The interactions observed in the ribosome could be useful models for these viral interactions.

The interactions between extended polypeptides and RNA in the large subunit, which stabilize its massive nucleic acid structure, result in an intertwining of RNA and protein in the center of the subunit (Fig. 8, A and B). It is hard to imagine such an object assembling from its components efficiently in anything other than a highly ordered manner. Chaperones may well be required to prevent the aggregation of the extended regions of these proteins, which are likely to be disordered outside the context provided by rRNA, and to manage the folding of rRNA.

Mutations in some ribosomal proteins render bacteria resistant to certain antibiotics. One example is a deletion of three amino acids in the β hairpin loop of protein L22 that renders bacteria resistant to erythromycin (48). Because this β hairpin forms part of the surface of the tunnel wall, the mutation changes the surface properties of the polypeptide exit tunnel and may prevent the antibiotic from binding; alternatively, the mutation could be acting indirectly through RNA.

Protein and RNA interactions. Because protein permeates the large subunit extensively, there are only a few segments of the 23S rRNA that do not interact with protein at all. Of the 2923 nucleotides in 23S rRNA, 1157 make at least van der Waals contact with protein (Fig. 8D), and there are only 10 sequences longer than 20 nucleotides in which no nucleotide contacts protein. The longest such sequence contains 47 nucleotides, and is the part of domain IV that forms the ridge of the active site cleft.
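Counts of this kind can be tallied from a per-nucleotide record of which residues touch protein. The following is a minimal sketch, not the authors' procedure: the function name `contact_free_runs` and the toy mask are hypothetical, and the mask is assumed to have been derived beforehand (for example, from a distance cutoff approximating van der Waals contact).

```python
from typing import List, Tuple


def contact_free_runs(in_contact: List[bool], min_length: int = 21) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of consecutive nucleotides
    that make no protein contact and span at least min_length residues.

    Assumption: in_contact[i] is True when nucleotide i+1 makes at least
    van der Waals contact with some protein atom. The default of 21
    corresponds to "longer than 20 nucleotides".
    """
    runs: List[Tuple[int, int]] = []
    start = None  # 1-based index where the current contact-free run began
    for i, touched in enumerate(in_contact, start=1):
        if not touched:
            if start is None:
                start = i
        else:
            # A contact closes the current run, which covers start..i-1.
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last nucleotide.
    if start is not None and len(in_contact) - start + 1 >= min_length:
        runs.append((start, len(in_contact)))
    return runs


def longest_run_length(runs: List[Tuple[int, int]]) -> int:
    """Length of the longest run, or 0 if there are none."""
    return max((end - begin + 1 for begin, end in runs), default=0)
```

Applied to a real contact mask for the 23S rRNA, `len(contact_free_runs(mask))` would give the number of protein-free stretches longer than 20 nucleotides and `longest_run_length` the size of the longest one.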

The extent of the interactions between RNA and protein that occur when the large subunit assembles can be estimated quantitatively. Using the Richards algorithm (49) and a 1.7 Å radius probe to compute accessible surface areas, it can be shown that 180,000 Å² of surface become buried when the subunit forms from its isolated, but fully structured components. This is about half their total surface area. The average is about 6000 Å² per protein. Although this is an enormous amount compared with the surface buried when most protein oligomers form, it should be recognized that ribosome assembly must be accompanied by a large loss in conformational entropy that does not occur when most proteins oligomerize. The extended protein termini and loops of the ribosomal proteins are almost certainly flexible in isolation, and in the absence of protein, the RNA is probably quite flexible as well. Thus, the burial of a large amount of surface area may be required to provide the free energy required to immobilize the structures of these molecules.
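A buried-surface estimate of this kind is conventionally obtained as the accessible area of the isolated components minus that of the assembled complex. The sketch below illustrates the bookkeeping only. It substitutes Shrake-Rupley point sampling for the Richards algorithm cited in the text, and the names `accessible_area` and `buried_area`, the atom tuples, and the radii are all hypothetical choices made for illustration.

```python
import math
from typing import List, Tuple

# Assumption: an atom is (x, y, z, van der Waals radius), all in angstroms.
Atom = Tuple[float, float, float, float]


def sphere_points(n: int) -> List[Tuple[float, float, float]]:
    """Roughly uniform points on the unit sphere (golden-spiral construction)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        pts.append((r * math.cos(golden * k), r * math.sin(golden * k), z))
    return pts


def accessible_area(atoms: List[Atom], probe: float = 1.7, n_points: int = 200) -> float:
    """Approximate solvent-accessible area by Shrake-Rupley sampling.

    Note: this is a stand-in for the Richards algorithm named in the text,
    and it uses a naive all-pairs loop suitable only for small toy inputs.
    """
    unit = sphere_points(n_points)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe  # radius of the probe-expanded sphere
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            occluded = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                limit = rj + probe
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < limit * limit:
                    occluded = True
                    break
            if not occluded:
                exposed += 1
        total += 4.0 * math.pi * big_r * big_r * exposed / n_points
    return total


def buried_area(part_a: List[Atom], part_b: List[Atom], probe: float = 1.7) -> float:
    """Surface buried on association: area(A) + area(B) - area(A and B together)."""
    return (accessible_area(part_a, probe)
            + accessible_area(part_b, probe)
            - accessible_area(part_a + part_b, probe))
```

With real coordinates, `part_a` and `part_b` would hold the atoms of the isolated components; the default probe radius of 1.7 Å matches the value quoted in the text, whereas the sampling density is an arbitrary choice.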

All of the proteins in the particle except L12 interact directly with RNA, and all but 7 of the remaining 30 proteins interact with two rRNA domains or more (Table 2). The “champion” in this regard is L22, which is the only protein that interacts with RNA sequences belonging to all six domains of the 23S rRNA (Fig. 8C). The protein-mediated interactions between 5S rRNA and 23S rRNA are particularly extensive. Protein L18 attaches helix 1 and helix 2/3 of 5S rRNA to helix 87 of 23S rRNA. Protein L21e mediates an interaction between the same part of 5S rRNA and domains II and V. Protein L30 binds the helix 4/5 region of 5S rRNA to domain II. Loop C is linked to domain V by protein L5, and loop D is attached to domains II and V by protein L10e. Whatever else they may do, it is evident that an important function of these proteins is stabilization of the relative orientations of adjacent RNA domains. Several also help secure the tertiary structures of the domains with which they interact.

Because most ribosomal proteins interact with many RNA sequences and the number of proteins greatly exceeds the number of RNA domains, it can hardly come as a surprise that every rRNA domain interacts with multiple proteins (Table 2). Domain V, for example, interacts with 15 proteins, some intimately and a few in passing.

It is clear that the oligonucleotide binding experiments long relied on for information about the RNA binding properties of ribosomal proteins have underestimated their potential for interacting with RNA. The high-affinity RNA binding site identified on a protein by such an experiment may indeed be important for ribosome assembly, but its many, weaker interactions with other sequences are likely to be missed, and they too may be vital for ribosome structure. Most ribosomal proteins crosslink RNA, and crosslinking is impossible without multiple interactions. Similar considerations may apply to proteins that are components of other ribonucleoproteins, such as the spliceosome.

Of the seven proteins that interact with only one domain, three (L1, L10, and L11) participate directly in the protein synthesis process. Rather than viewing these proteins as being included in the ribosome to ensure that the RNA adopts the proper conformation, it seems more appropriate to view the RNA as being structured to ensure the correct placement of these proteins. Another three (L24, L29, and L18e) interact with several secondary structure elements within the domains to which they bind, and presumably they function to stabilize the tertiary structures of their domains. The last of the single RNA domain proteins, L7ae, is puzzling. It cannot function as an RNA stabilizing protein because it interacts with only a single sequence in domain I, but it is far from the peptidyl transferase and factor binding sites. It is quite close to L1, however, which appears to be important for E-site function (50), and it may be involved in that activity. It could also be involved in the 70S assembly, because L7ae was originally assigned as a small subunit protein (HMS6).

While many ribosomal proteins interact primarily with RNA, a few interact significantly with other proteins. The most striking structure generated by protein-protein interactions is the protein cluster composed of L3, L6, L13, L14, and L24e that is found close to the factor binding site. The surface of these proteins provides important interactions with factors. It may prove to be more generally the case that ribosomal proteins interacting primarily with RNA are principally stabilizing RNA structure, whereas some of those showing extensive protein-protein interactions may have additional binding functions.

The structure presented above illuminates both the strengths and weaknesses of approaches to complex assemblies that depend on determining the structures of components in isolation. The structures of the globular domains of homologs of the proteins in the large ribosomal subunit from H. marismortui are largely the same as those of the corresponding domains in the intact subunit, though adjustments in domain positions are sometimes required. Consequently, these structures were very useful for locating proteins and interpreting lower resolution electron density maps. However, for obvious reasons, the structures of the extended tails and loops of ribosomal proteins cannot be determined in the absence of the RNAs that give them structure, and the feasibility of strategies that depend on producing low–molecular weight RNA-protein complexes that have all the RNA contacts required to fix the structures of such proteins seems remote. The structures of RNA fragments also depend on their context. Whereas the sarcin/ricin loop has much the same structure in isolation (33, 34) as it does in the ribosome, the structure of 5S rRNA in isolation (37) differs in some respects from what is seen in the ribosome, and the structure of the isolated P loop (51) shows no resemblance to the structure of the P loop in the ribosome. Clearly, a “structural genomics” approach to the ribosome, which would have entailed determining the structures of all of the proteins and all possible rRNA fragments, would neither have provided the relevant structures of all of the pieces nor have shown their relative positions. Indeed, the structure of the large ribosomal subunit highlights the importance of structural studies of entire assemblies that show biological activity.

The analysis of the 50S ribosomal subunit structure presented here describes the overall architectural principles of RNA folding and its interaction with proteins, but many exciting details remain to be explored. The principles of protein-RNA interaction that should emerge from the 27 protein complexes with RNA have yet to be developed. On average, each of the 27 proteins has 3000 Å² of surface area in contact with RNA, which is comparable to the 2700 Å² of glutaminyl-tRNA synthetase that contact tRNAGln (52), so the number of interactions between RNA and protein to be analyzed in the large subunit structure is 30 times the number in this synthetase complex. Further, because the RNA structure of the large subunit will increase the RNA structural database by a factor of 4 to 5, most of the important RNA secondary and tertiary structural motifs to be found in nature may be represented. It will be interesting to see whether a complete analysis of this RNA structural database will enable the prediction of structures for other RNA sequences. Unknown at this time is the ease with which it will be possible to model by sequence homology the 50S ribosomal subunit rRNA from other species and kingdoms. However, the extensive sequence conservation in the 23S rRNA that forms the core active site and peptide tunnel regions suggests that reasonably accurate homology modeling based on this H. marismortui subunit structure may be feasible. Finally, enormous numbers of monovalent and divalent metal ions as well as water molecules are visible in this map. Analysis of their interactions with RNA should elucidate their roles in the formation and stabilization of RNA structure.

* These two authors contributed equally to this work.

REFERENCES AND NOTES
