Research Article

Architecture of RNA Polymerase II and Implications for the Transcription Mechanism


Science  28 Apr 2000:
Vol. 288, Issue 5466, pp. 640-649
DOI: 10.1126/science.288.5466.640


A backbone model of a 10-subunit yeast RNA polymerase II has been derived from x-ray diffraction data extending to 3 angstroms resolution. All 10 subunits exhibit a high degree of identity with the corresponding human proteins, and 9 of the 10 subunits are conserved among the three eukaryotic RNA polymerases I, II, and III. Notable features of the model include a pair of jaws, formed by subunits Rpb1, Rpb5, and Rpb9, that appear to grip DNA downstream of the active center. A clamp on the DNA nearer the active center, formed by Rpb1, Rpb2, and Rpb6, may be locked in the closed position by RNA, accounting for the great stability of transcribing complexes. A pore in the protein complex beneath the active center may allow entry of substrates for polymerization and exit of the transcript during proofreading and passage through pause sites in the DNA.

RNA polymerase II (pol II), the central enzyme of gene expression, synthesizes all messenger RNA in eukaryotes. The intricate regulation of pol II transcription underlies cell growth and differentiation. The size and complexity of pol II befit this important role. The best characterized form of the enzyme, that from the yeast Saccharomyces cerevisiae, comprises 12 different polypeptides, with a total mass of about 0.5 megadaltons (MD) (Table 1). The human enzyme must be virtually identical, as the human genes for all subunits show a high degree of sequence conservation (Table 1), and at least 10 mammalian pol II genes can be substituted for their counterparts in yeast (1).

Table 1

Yeast RNA polymerase II subunits.


Pol II is the core of the transcription machinery. On its own, it can unwind the DNA double helix, polymerize RNA, and proofread the nascent transcript. In the presence of additional proteins, it assembles even larger initiation and elongation complexes, capable of promoter recognition and response to regulatory signals. A regulated initiation complex comprises pol II, five general transcription factors, and a multiprotein Mediator (2–4). It contains some 60 proteins, with a total mass of 3.5 MD. In transcription elongation complexes, Mediator and some of the general transcription factors are replaced by SII (TFIIS), Elongator, other elongation factors, and RNA processing proteins (3, 5, 6).

Determination of molecular models for the pol II transcription machinery has so far been limited to a half dozen of the smallest proteins and protein fragments (7–17). Detailed structural studies of the larger proteins and multiprotein complexes, essential for understanding the mechanism and regulation of transcription, pose a more formidable challenge. We report here the x-ray analysis of a 10-subunit yeast pol II. As nine of the subunits are conserved among RNA polymerases I, II, and III (18), our findings provide a basis for understanding the entire eukaryotic transcription machinery. They suggest roles for each of the many subunits and give insight into the remarkable features of the transcription mechanism.

Our investigation stemmed originally from the development of a yeast cell extract capable of accurately initiated pol II transcription (19) and the development of a general method of forming single-layer [two-dimensional (2D)] protein crystals (20). An active extract opened the way to the isolation of functional pol II (21), whereas the 2D crystallographic approach extended the reach of structure determination to such scarce, large, fragile multiprotein complexes. The first 2D crystallization trials gave crystals too small and too poorly ordered for structure determination (21). However, the ease and small amount of material required for 2D crystallization allowed its use as a structural assay to guide the preparation of pol II that would form better crystals. It soon emerged that heterogeneity of pol II, owing to substoichiometric levels of two small subunits, Rpb4 and Rpb7, was an impediment to crystallization. The problem was solved by the isolation of pol II from an RPB4 deletion strain of yeast, yielding a “deletion” enzyme lacking both Rpb4 and Rpb7, which together account for only 8% of the mass of the wild-type protein. The deletion enzyme, unimpaired in transcription elongation and also fully active in transcription initiation when supplemented with the missing subunits (22), formed exceptionally large, well-ordered 2D crystals (23). Structures of pol II alone, and complexed with general transcription factors and nucleic acids, were determined by 3D reconstruction from electron micrographs of 2D crystals to about 15 Å resolution (24–27). In the course of this work, it became apparent that even at the low protein concentration used for 2D crystallization, typically about 0.1 mg/ml, there was a tendency of the crystals to grow epitaxially, adding additional layers in register with the first (23). This tendency was exploited by the use of 2D crystals as seeds for growing 3D crystals (28), which are now readily obtained by conventional methods as well.

X-ray diffraction from 3D crystals of pol II was initially undetectable. The problem proved to be oxidation. Maintenance of an inert atmosphere during the final stages of protein purification and throughout crystal growth, as well as improvements in crystallization conditions, enabled the collection of diffraction data to 3.5 Å resolution (29). Because of the great size of the protein and unit cell, only large heavy atom clusters, such as an 18–tungsten-atom cluster, could be used for initial phase determination. The validity of the initial phases was shown by a close fit of the electron density map computed at 6 Å resolution to the pol II map from 2D crystallography (29). There was only one deviation between the two maps, which was attributed to movement of a protein domain, suggested to clamp nucleic acid in a transcribing complex (29).

With a 6 Å phase set, it should have been possible to locate individual heavy atoms in isomorphous derivatives and to extend structure determination to higher resolution. There were, however, three major obstacles. First, diffraction to 3.5 Å resolution could not be obtained reproducibly. Second, the crystals were nonisomorphous, varying by as much as 10 Å in one dimension of the unit cell. Very few crystals could be derivatized and matched with an isomorphous native crystal. Because of the low abundance of pol II, approximately 10,000 liters of cell culture had to be processed to obtain the 6 Å electron density map, and far more would have been required for extension to high resolution. The final obstacle was that heavy atom compounds commonly used for protein phase determination destroyed diffraction from the crystals.

A crystallographic backbone model for RNA polymerase II. These difficulties were overcome in the present work by a soaking procedure that shrank the crystals to an apparent minimum of the variable unit cell dimension (30). The resulting crystals were isomorphous and diffracted isotropically to 3.0 Å resolution (31). Because the improved crystals were nonisomorphous with the original crystals, initial phases were redetermined by multiple anomalous dispersion (MAD) with a six–tantalum-atom cluster derivative, which showed a single peak in difference Pattersons (Fig. 1) (32). These phases sufficed to reveal individual heavy atoms in other crystals by means of cross-difference Fouriers (Fig. 1) (33). An extensive search identified nonstandard mononuclear heavy atom compounds that gave useful derivatives (Table 2) (34). Phases were determined by multiple isomorphous replacement with anomalous scattering (MIRAS) from 10 data sets, ranging from 4.0 to 3.1 Å resolution (Table 2) (35). The resulting molecular envelope was in good agreement with that previously obtained at 6 Å resolution (29). After solvent flattening, an electron density map was obtained that revealed the course of the polypeptide chain and many amino acid side chains (Fig. 2) (36).

Figure 1

Localization of heavy atoms. (A) Harker sections of isomorphous and anomalous difference Patterson maps of the tantalum cluster derivative (Table 2). A single peak at the same position in the two maps is observed. Heights of the Harker peaks in the isomorphous and anomalous difference Pattersons were 6 σ and 5 σ, respectively. The resolution range of the data used is 40 to 5.5 Å. The contour levels are 3 σ (background) and 1 σ (steps). (B) Anomalous difference Fourier calculated with native data collected at the zinc anomalous peak energy using initial tantalum MAD phases (left) and final MIRAS phases (right). The projection of one asymmetric unit along the z axis is shown for tantalum and MIRAS phases at a contour level of 3 σ and 7 σ, respectively, with 1 σ steps. The eight strong peaks correspond to structural zinc atoms (Table 1). The ninth peak corresponds to the active site metal and likely arises from partial replacement of magnesium by zinc.

Figure 2

Subunit structures determined previously or rebuilt here fitted to the experimental pol II electron density. The solvent-flattened MIRAS electron density map (blue) is contoured at 1.0 σ. Experimental phases in the resolution range 40 to 3.1 Å were used to calculate the map. In (A) and (B), the map was filtered with program MAPMAN to reduce noise (84). This map facilitated fold recognition but appears to be at lower resolution, and side chain density is largely removed. In (C), the original map is shown, which is noisier but reveals many details. (A) Cα model of Rpb5 [black (47)] fitted to the density (blue). A loop that is involved in packing against Rpb1 is in a different conformation in pol II than in the structure of free Rpb5 (orange). Peaks of anomalous difference Fourier transforms of two mercury derivatives (pink, yellow, both contoured at 5 σ) coincide with the position of Cys83. (B) Cα traces of the NMR structure of the Rpb10 homolog from Methanobacterium thermoautotrophicum [orange (81)] fitted to the density (blue) and the rebuilt backbone model for yeast Rpb10 (black). The location of the zinc ion in the NMR structure coincides with a strong peak in the zinc anomalous Fourier (pink, contoured at 7 σ). (C) One of the β strands in Rpb11 (black, residues 68 to 75) fitted to the density (blue). Distinct electron density is present for several side chains. The model was obtained by placing the conserved core of E. coli α (69) and replacing the side chains with those in yeast Rpb11 using the most common rotamer. This figure was prepared with BOBSCRIPT (85) and MOLSCRIPT (86).

Table 2

Data collection and MIRAS phasing.


Available structures of pol II subunits and subunit fragments, comprising 14% of all pol II amino acid residues, were manually fit into the electron density (37). The complete structures of yeast Rpb5 and Rpb8 were used, whereas structures of Escherichia coli and archaebacterial homologs of yeast Rpb3, Rpb6, Rpb9, Rpb10, and Rpb11 were truncated to the conserved regions (Table 1). In all cases, a unique fit of the subunit fold to regions of the electron density map was observed. Subunit placement was facilitated by the location of eight zinc ions, revealed by a zinc anomalous difference Fourier (Fig. 1 and Table 1). Most parts of the yeast subunits missing from the homologous proteins could be modeled as polyalanine into adjacent regions of electron density. The remaining density, about 70% of the total volume, was attributed to the two large subunits, Rpb1 and Rpb2, with a minor contribution from the smallest subunit, Rpb12. It was modeled as polyalanine fragments, with the use of standard secondary structure elements wherever possible. Combination of phases from MIRAS and an initial polyalanine model resulted in an improved map, which allowed adjustment and extension of the model (38). The polyalanine fragments were assigned to Rpb1 or Rpb2 on the basis of (i) the location of the active-site metal bound by Rpb1 (see below); (ii) two zinc-binding motifs in the NH2-terminal region of Rpb1, connected by a linker of appropriate length; (iii) one zinc site in the COOH-terminal region of Rpb2; and (iv) cross-linking of Rpb5 to the COOH-terminal region of Rpb1 and of Rpb3 to residues 901 to 992 of Rpb2 (39).

The current backbone model comprises 8 polyalanine fragments for Rpb1, 10 fragments for Rpb2, and major portions of all small subunits (Table 1). It accounts for the entire molecular volume observed in the crystals and contains 3219 residues, about 83% of the total, assuming all residues are ordered except the COOH-terminal domain of Rpb1. Building of an atomic model is well advanced.
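The coverage figure quoted above can be checked with simple arithmetic. The following is a minimal sketch that uses only the two numbers given in the text (3219 modeled residues, about 83% of the total); the function and variable names are illustrative, and the result is only as precise as the rounded percentage.

```python
# Back-of-envelope check of the model coverage quoted in the text.
MODELED_RESIDUES = 3219   # residues in the backbone model (from the text)
FRACTION_MODELED = 0.83   # "about 83%" (a rounded value in the text)


def implied_total(modeled: int, fraction: float) -> float:
    """Total residue count implied by a modeled count and its fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return modeled / fraction


total = implied_total(MODELED_RESIDUES, FRACTION_MODELED)
unmodeled = total - MODELED_RESIDUES
print(f"implied total: ~{total:.0f} residues; not yet modeled: ~{unmodeled:.0f}")
```

Under these assumptions the implied total is a little under 3900 residues, which per the text excludes the COOH-terminal domain of Rpb1.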

General architecture and DNA binding. The two largest subunits, Rpb1 and Rpb2, form distinct masses with a deep cleft between them (Fig. 3). Each of the small subunits occurs in a single copy, arrayed around the periphery. The structure is cross-strutted by elements of Rpb1 and Rpb2 that traverse the cleft: A helix of Rpb1 bridges the cleft, and the COOH-terminal region of Rpb2 extends to the opposite side. The Rpb1-Rpb2 complex is anchored at one end by a subassembly of Rpb3, Rpb10, Rpb11, and Rpb12.

Figure 3

Architecture of yeast RNA polymerase II. Backbone models for the 10 subunits are shown as ribbon diagrams. Secondary structure has been assigned by inspection. The three views are related by 90° rotations as indicated. Downstream DNA, though not present in the crystal, is placed onto the ribbon models as 20 base pairs of canonical B-DNA (blue) in the location previously indicated by electron crystallographic studies (27). Eight zinc atoms (blue spheres) and the active site magnesium (pink sphere) are shown (Table 1). The box (upper right) contains a key to the subunit color code and an interaction diagram. The same views and color coding are used throughout the article. This and other figures have been prepared with RIBBONS (87).

The active site was located crystallographically by replacement of the catalytic Mg2+ ion with Zn2+, Mn2+, or Pb2+ (40). A native zinc anomalous Fourier showed a 10-σ peak that likely results from partial replacement of the active site Mg2+ by Zn2+ during protein purification (Fig. 1), and difference Fouriers obtained from crystals soaked with either Mn2+ or Pb2+ showed a single peak at the same location (41). The metal ion site occurs within a prominent loop of Rpb1 (Fig. 3), which, on the basis of preliminary sequence assignment, harbors the conserved aspartate residue motif (42). Only one catalytic metal ion was found, and only one was reported for a bacterial RNA polymerase (43), although a two-metal ion mechanism, as described for single-subunit polymerases (44), is not ruled out.

The location of duplex DNA downstream of the active site (ahead of the transcribing polymerase) was previously determined by difference 2D crystallography of an actively transcribing complex (27). Canonical B-form DNA placed in this location lies in the Rpb1-Rpb2 cleft, and can follow a straight path to the active site (Fig. 3). About 20 base pairs are readily accommodated between the edge of the polymerase and the active site, consistent with nuclease digestion studies showing the protection of about this length of downstream DNA (45). This proposal for the pol II–DNA complex is also consistent with results of protein-DNA cross-linking experiments: Rpb1 and Rpb5 cross-link to one side of the DNA and Rpb2 to the other; and in the case of Rpb5, the cross-links are located about 5 to 15 base pairs downstream of the active site (46).
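The base-pair counts in this paragraph translate into approximate distances along the cleft. The sketch below assumes straight, canonical B-DNA with a rise of about 3.4 Å per base pair (a textbook value, not a number taken from this article); the function name and constant are illustrative.

```python
# Rough axial lengths for the downstream DNA, assuming canonical B-DNA.
RISE_PER_BP_ANGSTROM = 3.4  # assumed textbook rise per base pair for B-DNA


def duplex_length(base_pairs: float, rise: float = RISE_PER_BP_ANGSTROM) -> float:
    """Axial length in angstroms of a straight duplex of the given size."""
    if base_pairs < 0:
        raise ValueError("base_pairs must be non-negative")
    return base_pairs * rise


footprint = duplex_length(20)                         # ~20 bp protected downstream
rpb5_window = (duplex_length(5), duplex_length(15))   # Rpb5 cross-links at 5 to 15 bp
print(f"downstream footprint: ~{footprint:.0f} angstroms")
print(f"Rpb5 cross-link window: ~{rpb5_window[0]:.0f} to ~{rpb5_window[1]:.0f} "
      "angstroms from the active site")
```

On this assumption, the 20 base pairs of downstream DNA span roughly 70 Å between the edge of the polymerase and the active site.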

Jaws position downstream DNA. Rpb5, and regions of Rpb1 and Rpb9 on the opposite side of the Rpb1-Rpb2 cleft, form “jaws” that appear to grip the DNA (Fig. 4). Both the upper and lower jaw may be mobile, opening and closing on the DNA. Mobility within Rpb5 is suggested by comparison with the x-ray crystal structure of the subunit alone (47). There was a nearly perfect fit of the subunit structure to the corresponding region of the pol II electron density map (Fig. 2A), except for a change in relative orientation of the NH2- and COOH-terminal domains, and a conformational change of a loop in the COOH-terminal domain (Fig. 4B). The solvent-exposed, NH2-terminal domain (residues 1 to 142) has apparently moved by as much as 5 Å in the direction of DNA in the pol II cleft, relative to the position in Rpb5 alone, with the COOH-terminal domain (residues 143 to 215) held fixed against the body of Rpb1 (Fig. 4B). The observed position of the NH2-terminal domain in pol II is defined by crystal contacts.

Figure 4

Jaws. (A) Stereoview of structural elements constituting the jaws (left) and the location of these elements within pol II (right). (B) Mobility of the larger, NH2-terminal domain of Rpb5. Backbone models of free Rpb5 [gray (47)] and Rpb5 in pol II (pink) are shown with their smaller, COOH-terminal domains superimposed. (C) Conservation of amino acid residues of Rpb5.

Residues in the Rpb5 loops facing the DNA are conserved (Fig. 4C). Two prolines that are strictly conserved present their side chains to the DNA with a spacing and relative orientation appropriate for contacting the DNA backbone. Proline residues have been seen to interact with backbone ribose moieties of DNA in other crystal structures (48, 49). Such nonspecific van der Waals interactions might favor a particular rotational setting of the DNA, without greatly impeding the helical screw rotation required to propel the DNA toward the active site and to unwind it for transcription.

Other conserved residues of Rpb5 are located in the linker between the NH2- and COOH-terminal domains and in the NH2-terminal helix (Fig. 4C). Since the linker is not involved in subunit-subunit interactions, conserved residues might ensure a directed movement of the NH2-terminal domain. Conserved residues in the NH2-terminal helix form a positive charge cluster that is too far from DNA to contact it directly, but might attract it through long-range interactions.

Rpb5 is likely to play a role in transcriptional activation (50). The NH2-terminal domain of Rpb5 binds to the transactivation domain of the hepatitis B virus X protein (51). Another Rpb5-interacting protein interferes with transactivation (52). Some activators might function by enhancing jaw-DNA interaction, thereby stabilizing transcription initiation or elongation complexes.

The upper jaw, formed by regions of Rpb1 and Rpb9, corresponds with a domain previously shown to be mobile by 2D crystallography (53). Rpb9 is composed of two zinc-binding domains separated by a 15-residue linker. A stretch of the linker adds a β strand to a sheet in the Rpb1 region of the jaw. Rpb9 therefore buttresses Rpb1, possibly constraining mobility of the jaw and strengthening its grip on DNA. Mutations in Rpb9 alter the locations of transcription start sites (54–56), which might be explained by a diminished grip on the DNA, or alternatively, by direct Rpb9-DNA interaction before entry of the DNA into the Rpb1-Rpb2 cleft.

A clamp retains DNA. A second mobile element of pol II, previously revealed by low-resolution structures and referred to as a “hinged” domain, was suggested to clamp nucleic acids in the cleft (29). This element, here termed the “clamp,” comprises NH2-terminal regions of Rpb1 and Rpb6 and the COOH-terminal region of Rpb2 (Fig. 5). All three polypeptides enter at the base of the clamp near the active site, allowing a degree of conformational freedom but not unrestricted movement of the clamp. Within the Rpb6 region, 17 out of 42 residues are negatively charged, forming a cluster near the bottom of the clamp. This region of Rpb6 is also phosphorylated by casein kinase II, suggesting a regulatory role (57).

Figure 5

Clamp. Structural elements constituting the clamp and their location in pol II are shown. The COOH-terminal region of Rpb2 and the NH2-terminal region of Rpb1 bind one and two zinc ions, respectively (blue spheres). The NH2-terminal tail region of Rpb6 extends from its main body (at the bottom in the front view) into the clamp. The direction of movement of the clamp revealed by comparison with electron crystal structures (29) is indicated (double-headed red arrow).

The clamp forms one side of the Rpb1-Rpb2 cleft, where it may interact with the DNA (and the DNA-RNA hybrid, see below) from the active site to about 15 residues downstream. This DNA region corresponds with a double-stranded DNA binding site, 3 to 12 residues downstream of the active site, defined by biochemical analysis of E. coli RNA polymerase (58–60). This binding site was referred to as a “sliding clamp” because of its importance for the great stability of a transcribing complex and processivity of transcription (60). Closure of the clamp over the DNA could account for this stability. Such a movement of the NH2-terminal region of the largest subunit was inferred from cross-linking studies of the E. coli enzyme (58). Although the clamp is seen here in an open conformation, it is involved in crystal contacts and the observed position is likely determined by the crystal lattice. The electron density in this region is of lower quality than elsewhere in the map, and the three zinc peaks associated with the region have the lowest heights (Zn6-8, Table 1), also consistent with mobility of the clamp.

DNA-RNA hybrid binding site, RNA binding site. Transcribing polymerases have been shown to harbor an unwound region of DNA, or “bubble,” within which is centered a DNA-RNA hybrid of 8 or 9 base pairs, with the 3′ or growing end of the RNA at the active site (Fig. 6A) (60). Linear extension of duplex DNA placed in our crystallographic model, to accommodate the DNA-RNA hybrid, is impossible because of an element from Rpb2 blocking the path (Figs. 3, 4, and 6). This blocking element corresponds with a “wall” of density previously noted in the structure of bacterial RNA polymerase (43). Because of the wall, and because the active site lies well beneath the level of the downstream DNA, the DNA-RNA hybrid must be tilted relative to the axis of the downstream DNA (dashed line in Fig. 6C). The exact orientation of the hybrid remains to be determined.

Figure 6

Topology of the polymerizing complex, and location of Rpb4 and Rpb7. (A) Nucleic acid configuration in polymerizing (top) and backtracking (bottom) complexes. (B) Structural features of functional significance and their location with respect to the nucleic acids. A surface representation of pol II is shown as viewed from the top in Fig. 3. To the surface representation has been added the DNA-RNA hybrid, modeled as nine base pairs of canonical A-DNA (DNA template strand, blue; RNA, red), positioned such that the growing (3′) end of the RNA is adjacent to the active site metal and clashes with the protein are avoided. The exact orientation of the hybrid remains to be determined. The nontemplate strand of the DNA within the transcription bubble, single-stranded RNA and the upstream DNA duplex are not shown. (C) Cutaway view with schematic of DNA (blue) and with the helical axis of the DNA-RNA hybrid indicated (dashed white line). An opening in the floor of the cleft that binds nucleic acid exposes the DNA-RNA hybrid (pore 1) to the inverted funnel-shaped cavity below. The plane of section is indicated by a line in (B), and the direction of view perpendicular to this plane (side) is as in Fig. 3. (D) Surface representation as in (B), with direction of view as in (C). The molecular envelope of pol II determined by electron microscopy of 2D crystals at 16 Å resolution is indicated (yellow line), as is the location of subunits Rpb4 and Rpb7 (arrow, Rpb4/7), determined by difference 2D crystallography (25).

At the upstream end of the DNA-RNA hybrid (5′ end of the RNA, remote from the active site), the strands must separate. Biochemical studies show that the RNA strand enters a binding site on the protein, extending from about 10 to 20 nucleotides upstream of the active site (61). There are two prominent grooves in the pol II structure exiting the hybrid binding site, each of which could accommodate one, but not two, nucleic acid strands. One groove winds around the base of the clamp (Fig. 7, groove 1). The other is between the lower part of the wall and Rpb1, and continues downward between Rpb1 and Rpb11 (Fig. 7, groove 2). We favor groove 1 as the RNA binding site for three reasons. First, the length and location of the groove are appropriate for binding a region of RNA 10 to 20 nucleotides from the active site, in agreement with biochemical studies. Second, the RNA path would lead back toward the downstream DNA, ending in close proximity to the NH2-terminal region of Rpb1 (defined by a zinc site). This path would accord with the reported cross-linking of RNA about 20 nucleotides upstream of the active site to the NH2-terminal region of the largest subunit of E. coli RNA polymerase (58–60). Finally, RNA in the groove at the base of the clamp could explain the great stability of transcribing complexes. The affinity of the polymerase for the DNA template is coupled to the presence of an RNA transcript (60). We speculate that closure of the clamp over DNA, assuring its retention in a transcribing complex, would enlarge the groove at the base of the clamp, and subsequent binding of RNA in the groove would prevent the clamp from reopening. RNA would act as a lock on the closed conformation of the clamp.

Figure 7

Possible RNA exit grooves and funnel beneath the active site. The model of Fig. 6B is shown in two perpendicular directions of view (side, back), and also viewed from the opposite side (bottom). To the side and back views have been added dashed lines corresponding to about 10 nucleotides of RNA, lying in well-defined grooves leading away from the hybrid-binding region (groove 1, red; groove 2, orange). The nontemplate strand of the DNA within the transcription bubble and the upstream DNA duplex are not shown. To the bottom view has been added a solid line indicating the rim of the funnel-shaped cavity.

Mobility of the clamp may also be modulated by interactions with other pol II subunits and transcription factors, for example, Rpb4 and Rpb7. Although these two small subunits were absent from the form of pol II analyzed here, their approximate location is known from electron microscopy of 2D crystals (25). A surface representation of the crystallographic backbone model corresponds closely with the molecular envelope from 2D crystals (Fig. 6D). On this basis, Rpb4 and Rpb7 occupy a crevice in the surface between the lower jaw and the clamp (Fig. 6D). Interaction with either of these mobile elements or with downstream DNA could underlie the requirement for Rpb4 and Rpb7 for the initiation of transcription (22).

A funnel for substrate entry, backtracking, and elongation factor access. The floor of the Rpb1-Rpb2 cleft, which supports duplex DNA and the DNA-RNA hybrid, is very thin and perforated, exposing the nucleic acids to the space below. The perforation is bisected by the helix that forms a bridge between Rpb1 and Rpb2, creating two pores, one of which lies beneath the active site (pore 1) and the other, beneath the downstream DNA (pore 2). Both pores are about 12 Å in diameter and lie at the apex of an inverted funnel-shaped cavity, which increases to about 30 Å in diameter at the opposite side of pol II (Fig. 7, bottom). As the Rpb1-Rpb2 cleft is occupied by duplex DNA and the DNA-RNA hybrid during transcription, nucleotides may be unable to enter above the active site and may instead gain access from below, through the funnel and pore 1, as previously suggested for both pol II and bacterial RNA polymerase (29, 43).

The funnel and pore 1 may play similar roles in other aspects of transcription. Bacterial and eukaryotic RNA polymerases oscillate between forward (polymerization) and backward (backtracking) movement during transcription (Fig. 6A) (60). Backtracking is important for proofreading and for traversing obstacles such as DNA damage, bound proteins, or natural pause sites in the DNA. During backtracking, the polymerase and associated transcription bubble move backward along both the DNA and the RNA. The region engaged in the DNA-RNA hybrid retreats like a zipper, releasing the 3′ end of the RNA in single-stranded form, and incorporating single-stranded RNA on the 5′ side of the transcription bubble into the hybrid (Fig. 6A). As mentioned above for access of nucleotides to the active site during polymerization, duplex DNA and hybrid in the Rpb1-Rpb2 cleft may block release of the 3′ end of the RNA into the cleft during backtracking. Rather, as suggested for entry of nucleotides, the 3′ end of the RNA may exit through the funnel and pore 1.
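The zipper-like bookkeeping of backtracking described above can be made concrete with a toy model. This is only a sketch of which RNA positions are paired or extruded, not a structural or kinetic model; the hybrid length of 9 is one of the two values given in the text (8 or 9 base pairs), and the function name and output keys are illustrative.

```python
# Toy bookkeeping model of backtracking: which RNA positions are in the hybrid?
HYBRID_LENGTH = 9  # assumed; the text gives 8 or 9 base pairs


def rna_segments(transcript_len: int, backtracked: int = 0,
                 hybrid_len: int = HYBRID_LENGTH) -> dict:
    """Partition RNA positions (1 = 5' end, transcript_len = 3' end).

    Returns the single-stranded RNA upstream of the hybrid, the positions
    paired in the DNA-RNA hybrid, and the 3' end extruded by backtracking.
    """
    if not 0 <= backtracked <= transcript_len:
        raise ValueError("backtracked must be between 0 and transcript_len")
    hybrid_end = transcript_len - backtracked        # last position still paired
    hybrid_start = max(1, hybrid_end - hybrid_len + 1)
    return {
        "upstream_single_strand": list(range(1, hybrid_start)),
        "hybrid": list(range(hybrid_start, hybrid_end + 1)),
        "extruded_3prime": list(range(hybrid_end + 1, transcript_len + 1)),
    }


polymerizing = rna_segments(30)                 # 3' end at the active site
backtracked = rna_segments(30, backtracked=3)   # polymerase moved back 3 nt
print(polymerizing["hybrid"])
print(backtracked["hybrid"], backtracked["extruded_3prime"])
```

In this sketch, backtracking by three nucleotides shifts the hybrid three positions toward the 5′ end and leaves the last three nucleotides single-stranded; these are the nucleotides proposed in the text to exit through the funnel and pore 1.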

Backtracking beyond a certain point can result in an arrested complex, unable to reverse direction, to restore the 3′ end of the RNA to the active site, and to resume transcription (60). We speculate that when a certain length of RNA has been extruded by backtracking, it may interact with a site in the funnel and be trapped, preventing reversal and recovery. For recovery from arrest, cleavage of the RNA is required to generate a new 3′ end at the active site (60). This cleavage is achieved with the help of transcript cleavage factors (62, 63). The funnel and pore 1 may provide access for such factors, for example, TFIIS. A small zinc-binding domain of TFIIS has an extended β hairpin at one end with two conserved residues that come near the active site of pol II and that are critical for RNA cleavage (15, 16, 64–66). Also included are tryptophan and arginine side chains involved in nucleic acid binding (67, 68). Modeling shows that this domain, only 20 Å in diameter, can be accommodated in pore 1 with the two conserved β hairpin residues reaching the active site, while still leaving room for an extruded strand of RNA.

Comparison with bacterial RNA polymerase. Most information about core bacterial RNA polymerase structure comes from x-ray diffraction studies of the α2 homodimer from E. coli (69) and the α2ββ′ polymerase from Thermus aquaticus (43). Regions of sequence similarity have been noted between α, Rpb3, and Rpb11 (69), between β and Rpb2 (70), and between β′ and Rpb1 (71). The crystallographic pol II model contains a conserved core of secondary structural elements similar to those in the bacterial enzyme, surrounded by divergent elements and eukaryote-specific subunits. Conserved elements are located in the vicinity of the DNA-RNA hybrid binding site, the adjacent downstream DNA binding site, and the sides of the funnel. Consistent with the conservation of these structural elements, similar modes of interaction with nucleic acids in the vicinity of the active site have been proposed for the eukaryotic and bacterial enzymes (72). The pore beneath the active site is conserved, and the bacterial enzyme may contain a clamp as well (73). On the other hand, the jaws, which include eukaryote-specific subunits and a domain of Rpb1, are found only in pol II, possibly reflecting their interaction with the eukaryote-specific transcription initiation factor TFIIE, as revealed by 2D crystallography (26). The occurrence of jaws in pol II, but not in the bacterial enzyme, presumably accounts for the nuclease protection of about 20 base pairs of downstream DNA by pol II, compared with only about 13 base pairs by the bacterial enzyme (45, 60).

A more detailed comparison is possible, at present, for the α2 dimer and its counterpart in pol II, the Rpb3-Rpb11 heterodimer. The α2 dimer nucleates assembly of bacterial polymerase, binding β to form a subcomplex, which then binds β′ to form a complete core enzyme (74). Similarly, the Rpb3-Rpb11 heterodimer binds Rpb2 to form a subcomplex (75). The location of the heterodimer in pol II is similar to that of α2 in the bacterial enzyme, and the domain conserved between Rpb3, Rpb11, and α exhibits an identical fold (motif of α helices and β sheets forming the lower half of the subcomplex in Fig. 8). The conserved domain represents almost the entirety of Rpb11 and is responsible for Rpb3-Rpb11 interaction (or dimerization in the case of α). The nonconserved domain of Rpb3 (upper half of the subcomplex in Fig. 8) interacts with the eukaryote-specific subunits Rpb10 and Rpb12. Contact of Rpb10 with Rpb3 is consistent with biochemical evidence for a stable Rpb3-Rpb11-Rpb10 subcomplex (76). Rpb12 binds through a tail, which adds a β strand to a sheet in the nonconserved region of Rpb3. Rpb12 also interacts with Rpb2 through its zinc-binding module. Consistent with this, Rpb12 has been shown to contact the second largest subunit in RNA polymerase I, and this interaction requires an intact zinc-binding motif (77). Moreover, a mutation in the COOH-terminal region of Rpb12 impairs assembly of RNA polymerase III (77). Thus, Rpb12 appears to play an essential role in the assembly or maintenance of all eukaryotic RNA polymerases by bridging between the Rpb3-Rpb11-Rpb10 subcomplex (or its homologs in polymerases I and III) and the second largest subunit.

Figure 8

The Rpb3-Rpb11-Rpb10 subcomplex and Rpb12. A stereoview of the arrangement of the four subunits is shown in the upper part, and the location of this subcomplex within pol II is shown in the lower part.

Transcription pathway. The crystallographic model of pol II also gives insight into the transcription pathway and the still larger multiprotein complexes involved. The pathway begins with the formation of a TFIIB–TFIID–promoter DNA complex and its interaction with pol II, followed by entry of TFIIE, and finally TFIIH, whose helicase activities melt DNA around the start site of transcription. The initial interaction of pol II with the promoter must be with essentially straight, duplex DNA. The pol II model, however, requires a considerable distortion for binding at the active site, which can only occur upon melting. The transition from an initial complex to a transcribing complex will therefore be accompanied by structural changes and movement of the DNA. Transcription begins with the repeated synthesis and release of short RNAs (“abortive cycling”), until a barrier at about 10 nucleotides is traversed, and chain elongation ensues. On reaching a transcript size of about 20 nucleotides, the full stability of a transcribing complex is attained. The barrier at 10 nucleotides corresponds to the point at which the 5′ end of the growing transcript must disengage from the template DNA and enter the proposed groove for RNA in the model. The transcript size needed for full stability corresponds with the length of RNA needed to fill the groove.
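The correspondence proposed above between transcript length and the state of the complex can be summarized as a simple classifier. This is a sketch under stated assumptions: the thresholds are the approximate values from the text ("about 10" and "about 20" nucleotides), the sharp cutoffs are a simplification, and the function name and stage labels are illustrative.

```python
# Sketch of the proposed relation between transcript length and complex state.
ABORTIVE_BARRIER_NT = 10  # "about 10 nucleotides" in the text
FULL_STABILITY_NT = 20    # "about 20 nucleotides" in the text


def transcript_stage(length_nt: int) -> str:
    """Map a transcript length to the stage suggested by the model."""
    if length_nt < 0:
        raise ValueError("length_nt must be non-negative")
    if length_nt < ABORTIVE_BARRIER_NT:
        return "abortive cycling: 5' end has not yet left the template DNA"
    if length_nt < FULL_STABILITY_NT:
        return "early elongation: 5' end entering the proposed RNA groove"
    return "stable elongation: proposed RNA groove filled"


for n in (5, 12, 25):
    print(n, "->", transcript_stage(n))
```

The hard boundaries are for illustration only; the text describes both transitions as occurring at approximate lengths.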

The interpretation along these lines may be extended and evaluated by the solution of pol II cocrystal structures, with the use of the pol II model for molecular replacement. Cocrystals with TFIIB and TFIIE (78) should reveal the trajectory of DNA in the initial pol II–promoter complex. Cocrystals containing pol II in the act of transcription (79) will show the locations of nucleic acids in an elongation complex. Cocrystals with TFIIS (80) may indicate the proposed exit pathway for RNA through a pore beneath the active site during backtracking. Other cocrystals may be sought to investigate the mechanism of transcriptional regulation by the multiprotein Mediator complex and associated activator and repressor proteins (4).

* To whom correspondence should be addressed. E-mail: kornberg{at}

