Structural Basis of Transcription: Separation of RNA from DNA by RNA Polymerase II


Science  13 Feb 2004:
Vol. 303, Issue 5660, pp. 1014-1016
DOI: 10.1126/science.1090839


The structure of an RNA polymerase II–transcribing complex has been determined in the posttranslocation state, with a vacancy at the growing end of the RNA-DNA hybrid helix. At the opposite end of the hybrid helix, the RNA separates from the template DNA. This separation of nucleic acid strands is brought about by interaction with a set of protein loops in a strand/loop network. Formation of the network must occur in the transition from abortive initiation to promoter escape.

X-ray crystallography has revealed the RNA-DNA hybrid at the center of an RNA polymerase II (Pol II)–transcribing complex (1). Besides confirming the existence of the hybrid, previously inferred from biochemical evidence, the x-ray structure showed the orientation of the hybrid helix at an angle of almost 90° to the incoming DNA double helix, and the position of the growing end of the hybrid above an opening in the floor of the Pol II active center, through which nucleotides are thought to enter for transcription. What the x-ray structure did not reveal were the critical events at the ends of the hybrid: the selection, at the growing or downstream end, of a ribonucleotide triphosphate complementary to the coding base in the template DNA; and the separation of strands at the upstream end, whereby the product RNA is disengaged from the template DNA.

These limitations of the previous x-ray structure derived from the method of generating the transcribing complex. It involved the use of a duplex DNA with a single-stranded “tail” protruding from one 3′ end. Pol II initiates transcription on the tail, about three residues from the junction with the duplex region (2). The advantage of this approach is that it does not require the many general transcription factors involved in initiation at a promoter. The disadvantages are that initiation on a tailed template is imprecise, leading to heterogeneity in transcript length, and the transcript fails to separate from the template, giving rise to an extended RNA-DNA hybrid. A further limitation of the previous work was that the 3′ end of the transcript lay in the nucleotide-addition site (also known as the i+1 site). Because transcription had been paused by withholding uridine 5′-triphosphate (UTP), required for pairing with the next base in the template, the 3′ end of the transcript must have advanced by translocation to the –1 site, exposed the requirement for UTP, and then backtracked to the i+1 site. Pausing by withholding UTP was also problematic because of misincorporation, resulting in additional heterogeneity of the transcribing complex.

We have overcome these limitations by assembly of transcribing complexes from Pol II and synthetic oligonucleotides, rather than by actual transcription of DNA. Others have shown that Pol II binds a single strand of DNA and eight- or nine-residue complementary RNA to form a stable complex, capable of extending the RNA (3). Addition of the complementary strand of DNA results in a complete transcribing complex, capable also of separating the RNA product from the DNA.

Ten-subunit Pol II from Saccharomyces cerevisiae was combined with a 15-residue DNA strand and 9-residue complementary RNA. A chain-terminating residue was added to the RNA by transcription with 3′-deoxyadenosine triphosphate (Fig. 1). The resulting complex crystallized in the form of thin, fragile, radiation-sensitive plates. Data were collected to 3.6 Å resolution (Table 1), and the structure was solved by molecular replacement with the previous transcribing-complex structure and rigid body refinement (4). An electron density map calculated with the current x-ray intensities and phases from the previous model with the RNA and DNA removed (“omit map”) showed clear density for the RNA-DNA hybrid (Fig. 1), confirming its presence in the current structure. The near identity of the current and previous structures establishes the equivalence of the complex assembled from oligonucleotides to that formed by transcription.

Fig. 1.

RNA and DNA in the structure of a Pol II–transcribing complex. (A) Model for RNA and DNA fitted to electron density for nucleic acids (2Fobs – Fcalc SigmaA-weighted map, with phases from Pol II alone, contoured at 0.8σ). The direction of view is from the Rpb2 side of the Pol II structure, the same as that previously shown of nucleic acids in the transcribing complex [figure 2C in (1)]. RNA is in magenta and template DNA is in cyan. A chain-terminating 3′-dA residue is shown in yellow. (B) Sequences of RNA and DNA in the transcribing complex. Nucleotide positions are numbered with respect to the addition site (i+1 site, denoted +1), with positions upstream extending from –1 and those downstream from +2. The separation of RNA and DNA strands upstream of –8 is indicated schematically. (C) Downstream end of the RNA-DNA hybrid in the previous transcribing-complex structure (1), showing occupancy of the nucleotide-addition (i+1) site. (D) Downstream end of the RNA-DNA hybrid in the present transcribing-complex structure, showing vacancy of the nucleotide-addition site. The “bridge helix” (in green), extending across the Pol II cleft between the two largest subunits, and the Mg2+ ion (pink sphere) provide landmarks of the active-center region and points of reference to previous structures. Electron density maps (2Fobs – Fcalc SigmaA-weighted, with phases from Pol II alone) are shown as gray nets. Figures were generated by PyMOL (12) or SPOCK (13).

Table 1.

Crystallographic data for Pol II elongation complex. Values in parentheses correspond to the highest resolution shell.

Space group                  C2
Unit cell dimensions (Å)     163 by 222 by 194
Wavelength (Å)               0.98
Mosaicity (°)                0.64
Resolution (Å)               40-3.6 (3.7-3.6)
Completeness (%)             92.9 (87.1)
Redundancy                   2.5
Unique reflections           73,591
I/σ(I)                       6.4
Rsym (%)*                    13.9 (31.2)

* Rsym = Σi,h |I(i,h) – 〈I(h)〉| / Σi,h |I(i,h)|, where 〈I(h)〉 is the mean of the i observations of reflection h. Rsym was calculated with anomalous pairs merged.
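The Rsym definition in the table footnote can be illustrated with a short calculation. The following is a minimal sketch, not the data-processing software used for this structure: the function name `r_sym` and the example intensities are hypothetical, and the sketch follows the formula as written, without the scaling, outlier rejection, or resolution-shell binning that real merging programs apply.

```python
from collections import defaultdict


def r_sym(observations):
    """Compute Rsym = sum|I(i,h) - <I(h)>| / sum|I(i,h)|.

    `observations` is an iterable of (hkl, intensity) pairs, where repeated
    hkl keys are the i symmetry-related observations of reflection h.
    """
    by_reflection = defaultdict(list)
    for hkl, intensity in observations:
        by_reflection[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in by_reflection.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


# Hypothetical intensities: two reflections, each measured twice.
example = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((0, 2, 0), 50.0),
    ((0, 2, 0), 40.0),
]
print(f"Rsym = {100 * r_sym(example):.1f}%")  # numerator 20, denominator 300
```

A lower value indicates better agreement among repeated measurements of the same reflection; the value usually rises in the highest-resolution shell, as in the table.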

Where the omit map differed from the previous structure was at the upstream and downstream ends of the RNA-DNA hybrid. The omit map lacked any density at the downstream end for a nucleotide in the i+1 site [Fig. 1; compare (C) and (D)]. The complex was evidently in the “posttranslocation” state, with the nucleotide that was just added to the RNA having advanced to the –1 site, leaving the i+1 site open for addition of the next nucleotide. Elsewhere we report on the interaction of nucleotides with the i+1 site, giving insight into the nucleotide-addition mechanism (5).

    The omit map differed from the previous transcribing-complex structure at the upstream end of the RNA-DNA hybrid by the presence of density for additional nucleic acid residues and protein loops (Fig. 2). The paths of the RNA and DNA strands clearly diverged, beginning at position –8, with residues –9 and –10 of the RNA completely separated from the complementary residues in the DNA (Fig. 2, A and B). The RNA-DNA hybrid was therefore eight base pairs in length, the same as the optimal length for stability of a complex assembled from oligonucleotides (3).
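The position numbering used here (addition site +1, upstream positions counted from –1, no position 0) can be made concrete with a small sketch. The function names are hypothetical, and the sketch assumes that the eight hybrid base pairs occupy positions –1 through –8, as the Fig. 1B caption's "upstream of –8" suggests. The RNA length of 10 is the 9-residue oligonucleotide plus the chain-terminating residue.

```python
HYBRID_LENGTH = 8  # base pairs in the RNA-DNA hybrid, as stated in the text


def rna_positions(rna_length, three_prime_site):
    """Number RNA residues 5'->3' relative to the addition site (+1).

    `three_prime_site` is the site holding the 3'-terminal residue:
    -1 for the posttranslocation state, +1 when the i+1 site is occupied.
    The numbering has no position 0, so +1 is followed upstream by -1.
    """
    if three_prime_site == 0:
        raise ValueError("the numbering scheme has no position 0")
    positions = []
    pos = three_prime_site
    for _ in range(rna_length):
        positions.append(pos)
        pos -= 1
        if pos == 0:  # skip the nonexistent position 0
            pos = -1
    return positions[::-1]  # report in 5'->3' order


def classify(positions, hybrid_length=HYBRID_LENGTH):
    """Split upstream positions into hybrid (paired) and separated residues."""
    paired = [p for p in positions if -hybrid_length <= p <= -1]
    separated = [p for p in positions if p < -hybrid_length]
    return paired, separated


# Present structure: 10-residue RNA with its 3' end in the -1 site.
present = rna_positions(10, -1)
paired, separated = classify(present)
print("paired:", paired)
print("separated:", separated)
```

With the 3′ end at –1, the ten residues span –10 to –1, so the two 5′-most residues (–10 and –9) fall outside the eight-base-pair hybrid, matching the separated residues described above.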

    Fig. 2.

    Separation of RNA transcript from DNA template: the loop/strand network. (A) Portion of Fig. 1A, from residues –2 to –10, viewed from the front of the transcribing complex (rotated 90° around the RNA-DNA hybrid helix axis in Fig. 1A). Unpaired bases are colored orange (–8), purple (–9), and gray (–10). (B) Close-up of residues –7 to –10 of the model in (A). Average distances (in angstroms) between groups ordinarily involved in hydrogen bonding between complementary bases are shown. (C) Electron density for protein loops involved in strand separation. Backbone models of fork loop 1 (orange), rudder (green), and lid (purple) are fitted to electron density as in Fig. 1A. RNA and DNA models are from Fig. 2A. (D) Some residues of protein loops (carbon atoms, yellow; nitrogen atoms, blue) interacting with RNA and DNA. Fork loop 1 (Rpb2) residues Lys471 and Arg476, rudder (Rpb1) residues Ser318 and Arg320, and lid (Rpb1) residue Phe252 are shown.

    Three protein loops that could not be traced in any previous polymerase structure because of disorder are now revealed: the “lid” (Rpb1 246–264), the “rudder” (Rpb1 310–324), and “fork loop 1” (Rpb2 461–480). All three loops play key roles in RNA-DNA strand separation (Fig. 2, C and D). Ordering of these loops is evidence of their interaction with RNA, DNA, and one another in the present structure. The lid serves as a wedge to drive the RNA and DNA strands apart and interacts with residues –8, –9, and –10 of the RNA. It forms a barrier to maintain the separation of the strands and guide the RNA along an exit path. Rpb1 residue Phe252, at the tip of the wedge, splits the RNA-DNA base pair at position –10, contacting the DNA base with the plane of the aromatic side chain perpendicular to the plane of the base (Fig. 2D). Rpb1 residue Phe264 may similarly contact the DNA base at position –10 or –11. (The residue at position –11 in the DNA strand was the only one of the 15 nucleotides in the crystal for which no density was present in the map.)

    The rudder is not directly involved in strand separation but rather interacts with the DNA at positions –9, –10, and possibly also –11 (Fig. 2, C and D), preventing reassociation with the RNA. Rpb1 residues Ser318 and Arg320 contact the sugar and 5′ phosphate at position –10. A primary role of the rudder in stabilizing the unwound DNA beyond the hybrid region is consistent with recent molecular genetic analysis (6). A contact of the rudder with the RNA at position –7 to –8 accounts for protein-RNA cross-linking at this position in a bacterial RNA polymerase complex (7).

    Fork loop 1 projects from Rpb2 and interacts with the RNA at positions –5, –6, and –7 in the hybrid region (Fig. 2, C and D). Rpb2 residues Lys471 and Arg476 appear to contact RNA phosphates. Fork loop 1 may serve to delimit the region of RNA-DNA strand separation, preventing unwinding of the hybrid past position –8. Fork loop 1 extends the region of protein contact with the RNA-DNA hybrid from the first three residues, seen in the previous transcribing complex structure (2), to essentially the entire hybrid. In consequence of the ordering of fork loop 1, Pol II almost completely surrounds the hybrid.

    The three protein loops interact not only with RNA and DNA but also with one another, the lid interacting with the rudder and the rudder with fork loop 1. The loops lie near the Pol II “saddle,” between the “clamp” and the “wall” (Fig. 3, A and B), where the lid interacts with additional protein elements to form an “arch” over the saddle (Fig. 3B). RNA exits the active-center region through an “exit pore” beneath the arch, whereas the DNA exits above the arch, preventing any possibility of reassociation. Beyond the saddle, two possible exit paths for RNA have been noted, on the basis of the surface charge distribution of Pol II (Fig. 3C). Path 1, a positively charged groove running around the base of the Pol II clamp, leads toward Rpb7, which contains a ribonucleoprotein fold, capable of binding single-stranded RNA (8). Path 2, a positively charged groove running down the back side of Pol II, leads toward Rpb8, which also contains motifs for possible interaction with single-stranded nucleic acids (9). (In bacterial RNA polymerase, only path 2 is positively charged.) RNA passing beneath the lid is on a trajectory for entering path 2. A sharp bend would be required for entry of the RNA into path 1.

    Fig. 3.

    Exit path of RNA from the transcribing complex. (A) The transcribing complex with Pol II in surface representation, viewed down the axis of the RNA-DNA hybrid [from above the hybrid helix in Fig. 1; this is the “top” view shown in figure 1 of (10)]. Fork loop 1 is in orange, rudder is in green, lid is in magenta, and the RNA backbone is in yellow. Locations of the “clamp” and “wall,” important landmarks in the Pol II structure, are indicated by dashed lines. (B) The transcribing-complex structure sectioned and viewed as indicated by the dashed line and arrow in (A) to reveal the “arch” of protein density above the “saddle” and the “exit pore” through which RNA (yellow backbone) passes, following separation from the template DNA. (C) Two possible RNA exit paths from Pol II. The view is the same as in (A), colored according to the electrostatic surface potential (negative in red, neutral in white, and positive in blue), with the Pol II wall, rudder, and lid removed to better reveal the sugar-phosphate backbone of the RNA spiraling upward from the active site. The two paths along which the RNA may be extended are the positively charged grooves, indicated by dashed yellow lines, labeled 1 and 2.

    The many interactions of the protein loops with one another, with other protein elements, and with RNA and DNA strands give rise to a complex “strand/loop network.” Establishment of this network is likely to be a concerted process. It begins when the RNA reaches a length of 9 or 10 residues, with thermal “breathing” of the end of the RNA-DNA hybrid and trapping of the transient RNA and DNA single strands by the protein loops. During initiation at a Pol II promoter, RNA longer than 9 or 10 residues clashes with the N-terminal region of general transcription factor TFIIB, which also traverses the Pol II saddle (10). A pause in transcription at the point of the clash would allow the time required to form the strand/loop network. The clash is then resolved by the network displacing TFIIB, whose release results in promoter escape. Alternatively, the clash may be resolved by retention of TFIIB and release of the RNA (abortive initiation) (11).

    The requirement for a pause in transcription to form the strand/loop network can account for the lack of strand separation during transcription of a tailed template. Pol II initiates on the tail without a requirement for general transcription factors, and in the absence of TFIIB, there is no pause in transcription at an RNA length of 9 or 10 residues. Once transcription has proceeded past this point, the opportunity for strand separation is lost.

    Formation of the strand/loop network also explains why the present structure is in the posttranslocation state, with an empty i+1 site, whereas the previous structure had apparently undergone translocation and backtracking, restoring the 3′-terminal nucleotide of the transcript to the i+1 site. In the present structure, the energy gain due to strand/loop network formation offsets the energy cost due to disruption of two base pairs and to loss of interactions in the i+1 site (the loss further diminished by the absence of a 3′-OH group on the chain-terminating nucleotide for interaction with the active-center Mg2+ ion). In the previous structure, formed by transcription of a tailed template, there was no strand/loop network, so the energetics favored occupancy of the i+1 site (including the 3′-OH–Mg2+ ion interaction).
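The energetic argument in the preceding paragraph can be summarized as a rough balance. The symbols are illustrative labels introduced here, not quantities measured or named in the text, and the relation is qualitative: it records only the signs of the contributions described above.

```latex
% Free-energy difference between the state with a vacant i+1 site
% (posttranslocation) and the state with an occupied i+1 site.
\Delta G_{\mathrm{vacant}} - \Delta G_{\mathrm{occupied}}
  \;\approx\;
  \underbrace{\Delta G_{\mathrm{network}}}_{<\,0\ \text{(gain)}}
  \;+\;
  \underbrace{\Delta G_{2\,\mathrm{bp}}}_{>\,0\ \text{(cost)}}
  \;+\;
  \underbrace{\Delta G_{i+1}}_{>\,0\ \text{(cost)}}
```

In the present structure the network term is taken to outweigh the two costs, so the sum is negative and the vacant i+1 site is favored. In the previous structure the network and base-pair terms are absent, leaving only the positive i+1 term, so occupancy of the i+1 site is favored.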
