Structural Basis of Transcription: Backtracked RNA Polymerase II at 3.4 Angstrom Resolution


Science  29 May 2009:
Vol. 324, Issue 5931, pp. 1203-1206
DOI: 10.1126/science.1168729


Transcribing RNA polymerases oscillate between three stable states, two of which, pre- and posttranslocated, were previously subjected to x-ray crystal structure determination. We report here the crystal structure of RNA polymerase II in the third state, the reverse translocated, or “backtracked” state. The defining feature of the backtracked structure is a binding site for the first backtracked nucleotide. This binding site is occupied in case of nucleotide misincorporation in the RNA or damage to the DNA, and is termed the “P” site because it supports proofreading. The predominant mechanism of proofreading is the excision of a dinucleotide in the presence of the elongation factor SII (TFIIS). Structure determination of a cocrystal with TFIIS reveals a rearrangement whereby cleavage of the RNA may take place.

RNA polymerases catalyze rapid RNA chain growth [20 to 70 nucleotides (nt) per second for RNA polymerase II (pol II)] in a template-directed manner (1–4). They do not, however, move monotonically forward on the template. Rather, they oscillate between forward and backward movement at every step in the process (5). Three states of a transcribing complex are observed (Fig. 1): a pretranslocation state, in which the nucleotide just added to the growing RNA chain is still in the nucleotide addition site; a posttranslocation state, in which the enzyme has moved forward on the template, which makes the nucleotide addition site available for entry of the next nucleoside triphosphate (NTP); and a backtracked state, in which the enzyme has retreated on the template, extruding the 3′-end of the RNA (6, 7). Forward movement is favored by NTP binding, which traps the complex in the posttranslocation state. Backtracking predominates when forward movement is impeded, for example, by damage in the template or by nucleotide misincorporation in the RNA (8, 9). Backtracking by one or a few residues is reversible, whereas backtracking a greater distance leads to arrest, from which recovery is only possible by cleavage of the transcript in the polymerase active center, induced by transcription factor SII (TFIIS) in eukaryotes and GreA and/or GreB in bacteria [reviewed in (10–13)]. Backtracking and cleavage enable proofreading of the transcript, through the excision of misincorporated nucleotides and resynthesis (9, 14–21).

Fig. 1

The three states of a pol II transcription elongation complex. RNA transcript is red, DNA template is blue. The nucleotide base just added to the 3′-end of RNA and the complementary base in the DNA template are represented by cyan and green bars, respectively. The dashed oval represents the empty nucleotide addition site in the posttranslocation state. The green circle represents the pol II bridge helix.

X-ray crystal structures of pol II transcribing complexes in the pre- and posttranslocation states have been obtained, illuminating the mechanisms of RNA synthesis and product release (22–25). Here, we report x-ray structures of backtracked complexes, showing that they are stable intermediates and completing the overall picture of the transcription process. Some of the backtracked structures include TFIIS, which reveals interactions responsible for transcript cleavage, proofreading, and recovery from arrest.

Backtracked complexes were produced by two approaches (26). DNA-RNA hybrids with mismatched nucleotides at the 3′-end of the RNA were bound to pol II, which directly recreated a backtracked state. Alternatively, DNA-RNA hybrids bearing DNA damage downstream of the 3′-end of the RNA were transcribed by pol II, with the expectation that the enzyme encountering the impediment would retreat to a backtracked state. In both cases, crystal structures were obtained showing extra electron density for the transcript downstream of the nucleotide addition site, indicative of an extruded 3′-end. The structures were, moreover, very similar (Fig. 2A and fig. S1).

Fig. 2

Structure of pol II elongation complex in the backtracked state. (A) Complex with one mismatched residue at the 3′-end of the RNA (12-nt oligomer RNA). The view is a standard one, from the “Rpb2 side,” as in the past (22–25). Difference electron density map (Fobs – Fcalc omit map, contoured at 3.0 sigma) between backtracked and posttranslocation complexes is shown in green mesh. RNA and DNA are red and cyan. Ribonucleotides at +1 and +2 positions in the RNA are yellow and blue. Parts of bridge helix (Rpb1 825 to 848) and trigger loop (Rpb1 1070 to 1100) are green and cyan. Rpb1 T827 is orange. Side chains of Rpb1 Q1078 and N1082 are also shown. Rpb2 527 to 532 is in light magenta. (B) The binding pocket for backtracked ribonucleotide at position +2. View is rotated from that in (A) to better reveal side-chain interactions. Rpb1 440 to 450, Rpb1 470 to 481, bridge helix (Rpb1 810 to 848), trigger loop (1070 to 1100), and Rpb2 760 to 776 are in orange, lime green, green, cyan, and hot pink, respectively. Side chains of Rpb1 (R446, N479, L824, T827, Q1078, and N1082) and Rpb2 (Y769) are shown. Other colors as in (A). The binding pocket is highlighted by a dashed green circle. (C) Backtracked RNA is kinked toward the bridge helix and differs from canonical A form RNA. Backbones of one-base–mismatch backtracked RNA (red, yellow, and blue) and canonical A form RNA (gray) are superimposed. The superimposed two-base–mismatch structure is shown in magenta. Color scheme as in (A). View is rotated 90° counterclockwise about a vertical axis from that in (A). (D) Molecular dynamics simulation of bases in the two-base–mismatch (13-nt oligomer RNA) complex. Deviations of bases three (A10) and one (G12) residues from the 3′-end, and of the 3′-terminal residue (C13). Root mean square deviation (RMSD) values with respect to the starting conformation were computed every 10 ps.

In a complex with a hybrid containing one mismatched residue at the 3′-end of the RNA (12-nt oligomer RNA), the last matched residue, uridine 5′-monophosphate (UMP), was in the nucleotide addition site (designated +1) and the mismatched residue, guanosine 5′-monophosphate (GMP), was in a location downstream (+2) not observed in any previous transcribing complex structure (Fig. 2A). The UMP base was paired with its complement in the DNA template but tilted 15° to 20° out of the plane. Between the UMP and GMP, the backbone of the backtracked RNA was bent over 120° out of the path of the hybrid helix, which enabled the GMP base to hydrogen-bond with the adenosine 5′-monophosphate (AMP) base two residues away (position –1) in the RNA chain (Fig. 2C).

Interactions between the RNA and the bridge helix, trigger loop, and other pol II residues created a binding pocket for the backtracked GMP (Fig. 2, A and B) and caused the deviations from hybrid helix geometry observed. Bridge helix residue [RNA pol II subunit (Rpb)] Rpb1 Thr827, Rpb2 Tyr769, and Rpb2 529 to 531 contacted the GMP. These contacts are consistent with previous biochemical evidence that the base of GMP at the +2 position participates in polymerase interaction (27). The N-2 amino group of the base helps fix the location of GMP through pol II interaction. Although the precise positions of Mg2+ and associated water could not be determined at the resolution of our analysis, the proposal that N-7 of GMP base and/or Rpb2 Tyr769 coordinate Mg2+ and water for nucleophilic attack during intrinsic cleavage is at least consistent with the structure (27). Rpb1 Asn479 and Arg446 contacted the UMP one nucleotide away from the GMP and the AMP two nucleotides away. The trigger loop was in a conformation intermediate between the “open” and “closed” conformations previously observed (22–25, 28–30). Trigger loop residues Rpb1 Asn1082 and Gln1078 contacted the phosphate group between the UMP and GMP, whereas His1085, believed to play a role in catalysis in the closed conformation (25), was directed away from the RNA and close to Rpb1 Ser769 and Gly772.

In a complex with a hybrid containing two mismatched residues at the 3′-end (13-nt oligomer RNA), the last matched residue (UMP) and the first of the mismatched residues (GMP) were in locations similar to those for the complex with one mismatched residue (Fig. 2C). The UMP base, paired with its complement in the DNA template, was tilted 20° to 30° out of the plane, and the backbone of the backtracked RNA was bent 80° to 90° out of the path of the hybrid helix (Fig. 2C). The next backtracked (second mismatched) residue was not revealed in the structure, because of motion or static disorder. In molecular dynamics simulations (Fig. 2D) [supporting online material (SOM) text (26)], this backtracked residue was highly mobile, which would argue in favor of motion rather than disorder.
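The RMSD traces in Fig. 2D report how far each base drifts from its starting conformation over the course of the simulation. A minimal Python sketch of that quantity follows. It assumes the two coordinate sets are already expressed in a common reference frame, so no superposition is performed; the function name and the toy coordinates are illustrative and are not taken from the simulation protocol in the SOM.

```python
import math


def rmsd(coords, reference):
    """Root mean square deviation between two equally sized sets of 3D points.

    Assumption: both sets are already in the same frame; no fitting is done.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = sum(
        (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
        for (x, y, z), (rx, ry, rz) in zip(coords, reference)
    )
    return math.sqrt(squared_sum / len(coords))


# Hypothetical two-atom "base": starting conformation and one later frame.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 3.0, 0.0), (1.0, 0.0, 4.0)]

print(f"RMSD vs. start: {rmsd(frame, start):.3f}")
```

Evaluating such a function on frames saved at a fixed interval (10 ps in Fig. 2D) gives one trace per residue; a trace that keeps growing or fluctuates widely corresponds to the mobility described above for the second mismatched residue.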

Backtracked complexes containing additional mismatched residues and with different mismatched sequences showed the same conformation for the first mismatched residue. A complex with three mismatched residues (14-nt oligomer RNA) also showed no ordering of residues beyond position +2, but a complex with seven mismatched residues (18-nt oligomer RNA) produced more electron density downstream (Fig. 3A). Backtracked nucleotides at positions +3 and +4 could be built into this density, although the base and sugar at +4 were disordered. The backbone of the backtracked RNA was sharply bent between positions +2 and +3, owing to salt bridges of Rpb2 residues Gln763 and Arg766 with the phosphate between +2 and +3. Rpb1 Lys752, Rpb2 Ser1019, and Rpb2 Arg1020 appeared to form hydrogen bonds or salt bridges with the phosphate between +3 and +4 (Fig. 3B). Three regions of pol II interact with the backtracked RNA; bridge helix residues Rpb1 824 to 827 and Rpb2 regions 760 to 772 and 529 to 531 also interact with one another and form a network that probably enhances the stability of the backtracked state. In the more extensively backtracked complex structures, the trigger loop was in an “open” conformation, remote from the nucleotide-addition site. Part of the trigger loop (Rpb1 1078 to 1081), however, remained near enough to interact with the backtracked RNA.

Fig. 3

Structure of backtracked complex with seven mismatched residues at the 3′-end of the RNA (18-nt oligomer RNA). (A) Same representation as Fig. 2A, except that ribonucleotide at +3 position is magenta and phosphate group at +4 position is orange. (B) Interactions of pol II with additional backtracked residues. Rpb1 750 to 754, Rpb2 760 to 772, and Rpb2 1018 to 1022 are lime green, hot pink, and marine blue, respectively. Other colors as in Fig. 2A, view as in Fig. 2C. Side chains of Rpb1 (K752) and Rpb2 (Q763, R766, Y769, S1019, and R1020) are shown.

To determine whether nucleotides other than GMP would be stably bound at the +2 position in the backtracked state, we also solved structures of backtracked complexes containing RNA of the same length (13-nt oligomer) but with UMP rather than GMP at this position. We observed significant electron density attributable to UMP in a similar location to that found for GMP. The UMP base was in the same binding pocket as the GMP base (between bridge helix residues Rpb1 824 to 827 and Rpb2 769), whereas the ribose sugar was in a different location, interacting with Rpb2 766 (fig. S5).

Cleavage of backtracked RNA in the presence of TFIIS depends on an Asp290-Glu291 dipeptide in a hairpin loop of domain III of the protein (31), a zinc ribbon motif. Domain II, a three-helix bundle, and the linker between domains II and III are responsible for binding to pol II (32–36). Domain I, an N-terminal four-helix bundle, is nonessential in vitro (34, 37, 38), but appears to play a role in transcription initiation (39, 40). A crystal structure of transcribing pol II in the posttranslocation state, complexed with TFIIS (28, 41), showed domain III inserted in the pol II secondary channel (pore and funnel) but gave limited insight into the TFIIS mechanism, because of the absence of backtracked RNA. Superposition of this structure [Protein Data Bank (PDB) no. 1Y1V] with our backtracked complex structure reveals a steric clash of the hairpin loop of TFIIS domain III with backtracked RNA (Fig. 4). In particular, the catalytic Asp290 and Glu291 residues of TFIIS come too close to the phosphate group between the –1 and +1 positions in the RNA (Fig. 4A) and clash with the side chain of Rpb2 Tyr769 (Fig. 4B). This TFIIS structure might reflect the state following RNA cleavage, but to investigate interactions relevant to cleavage itself, we determined the structure of pol II in the backtracked state, with 13-nt oligomer RNA, complexed with a point mutant of TFIIS (E291H), in which Glu291 is replaced by His, unable to cleave the RNA (34, 42). Initial phases were obtained by molecular replacement with the previous structure (1Y1V), with the omission of TFIIS, Rpb4/7, the trigger loop, and nucleic acids. The initial electron density map clearly showed the three-helix bundle of domain II of TFIIS, part of domain III, a long α helix of the interdomain linker, the pol II trigger loop, and nucleic acids. Trigger loop and nucleic acid models were manually built into the electron density. TFIIS was then manually docked, and rigid body refinement was performed, followed by manual adjustment of the conformation of the hairpin loop of domain III.

Fig. 4

Structure of backtracked complex with TFIIS. (A) Clash of backtracked RNA with published structure of TFIIS. Backtracked complex with one mismatched base (12-nt oligomer RNA), depicted as in Fig. 2A, was superposed with TFIIS (1Y1V), with the use of Coot, on the basis of secondary-structure matching (ssm) of RPB1. The tip of domain III of TFIIS is light blue. Side chains of TFIIS residues 290 and 291 are shown. Parts of bridge helix and trigger loop of 1Y1V complex are wheat-colored. (B) Structure of backtracked complex with two mismatched bases (13-nt oligomer RNA) complexed with TFIIS. Backtracked complex is depicted as in Fig. 2A, except slightly rotated, and with Rpb2 760 to 776 in hot pink. The tip of domain III of TFIIS is magenta. Corresponding regions from 1Y1V are superimposed; pol II in wheat and TFIIS in light blue.

Parts of the electron density map were of sufficient quality that side chains could be placed and specific interactions between TFIIS and pol II identified (for example, the interdomain linker–pol II interaction). In the case of the hairpin loop, only the main chain could be built, because of limitations of resolution and less defined secondary structure. The structure clearly showed a different position of the hairpin loop from that seen previously (Fig. 4B). The position of the backtracked RNA was also affected, as the residue at +2 was disordered. There was no evidence of a 4 Å shift of the upstream RNA in the RNA-DNA hybrid region previously reported (28, 41). The remainder of TFIIS was similar in conformation between the present and previous structures (28, 41). The trigger loop was in an “open” conformation in both structures (28, 41).

Superposition of our backtracked pol II–TFIIS complex structure with our structures of backtracked pol II alone revealed a steric clash of the hairpin loop with the RNA, but less severe than that observed with the previous posttranslocation complex–TFIIS structure. The clash was more pronounced for longer backtracked RNAs (15-nt, 18-nt, and 24-nt oligomer RNAs). Evidently, some rearrangement is required to accommodate both TFIIS and RNA in the complex, shown by the disorder of the residue at +2 in the 13-nt oligomer RNA cocrystal.

In the presence of wild-type TFIIS, backtracked complexes containing one, two, three, four, and seven mismatched residues were cleaved at positions two, three, four, five, and eight residues from the 3′-end of the RNA (fig. S6, A and C). On the basis of the structures of the complexes, we conclude that cleavage occurs between the nucleotide addition site (+1) and the preceding position (–1). In effect, cleavage represents the reversal of nucleotide addition, except with water rather than pyrophosphate as the nucleophile (43, 44).
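The relation between the number of mismatched residues and the observed cleavage position can be written as a one-line rule. The Python sketch below assumes a single scissile bond between the –1 and +1 positions, so that the released 3′ fragment holds the residue in the +1 site plus every backtracked residue beyond it. The function name is hypothetical, and the rule is only checked against the five cases reported here.

```python
def released_fragment_length(n_mismatched):
    """Residues released from the 3'-end, assuming cleavage between -1 and +1.

    The fragment contains the last matched residue (in the +1 site) plus the
    n mismatched residues backtracked to +2 and beyond.
    """
    if n_mismatched < 0:
        raise ValueError("number of mismatched residues cannot be negative")
    return n_mismatched + 1


# Mismatch counts for the complexes cleaved with wild-type TFIIS in the text.
for n in (1, 2, 3, 4, 7):
    print(f"{n} mismatched -> cleaved {released_fragment_length(n)} residues from the 3'-end")
```

For the complex with one mismatched residue the rule gives a fragment of two residues, that is, a dinucleotide.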

The cleavage rates of mismatched complexes were 15- to 30-fold higher than those of matched complexes (fig. S6A) (26), in keeping with results of previous studies (9). This preferential cleavage of mismatched complexes is the presumed basis for the error correction capability of TFIIS (21). Selective removal of mismatched residues has been observed for E. coli RNA polymerase stimulated by the TFIIS homolog GreA (45), archaeal RNA polymerase stimulated by TFS (46, 47), and pol III with its intrinsic counterpart of TFIIS (48).

Selective removal of mismatched residues has also been observed in the absence of GreA for T. aquaticus RNA polymerase (27). This “intrinsic” cleavage is much slower than the TFIIS-stimulated reaction (fig. S6). The half time for intrinsic cleavage of a backtracked complex with one mismatched residue (12-nt oligomer RNA) was about 440 s, compared with about 10 s for cleavage in the presence of 100 nM TFIIS [equilibrium constant (Ke) for TFIIS–pol II interaction is about 100 nM] (fig. S6C) (26). Intrinsic cleavage of longer backtracked complexes was barely detectable. More rapid rates of intrinsic cleavage reported for T. aquaticus RNA polymerase (27) likely reflect differences in reaction conditions, such as higher Mg2+ concentration, pH, and temperature, as well as the different RNA polymerases involved.
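The half times quoted above can be converted into rate constants for a rough comparison. The Python sketch below rests on two assumptions that are not stated in the text: cleavage follows simple first-order decay, so that k = ln 2 / t½, and TFIIS binds pol II as a 1:1 complex with free TFIIS approximately equal to total TFIIS. The function names are illustrative.

```python
import math


def rate_constant_from_half_time(t_half_s):
    """First-order rate constant (per second) from a half time in seconds."""
    if t_half_s <= 0:
        raise ValueError("half time must be positive")
    return math.log(2) / t_half_s


def fraction_bound(ligand_nM, k_nM):
    """Fractional occupancy for simple 1:1 binding (assumed model)."""
    return ligand_nM / (ligand_nM + k_nM)


k_intrinsic = rate_constant_from_half_time(440.0)  # no TFIIS
k_tfiis = rate_constant_from_half_time(10.0)       # 100 nM TFIIS

print(f"k (intrinsic)     = {k_intrinsic:.2e} per s")
print(f"k (100 nM TFIIS)  = {k_tfiis:.2e} per s")
print(f"fold stimulation  = {k_tfiis / k_intrinsic:.0f}")
print(f"pol II occupancy  = {fraction_bound(100.0, 100.0):.2f}")
```

Under these assumptions the stimulation at 100 nM TFIIS is the ratio of the two half times, and about half of the pol II would carry TFIIS at that concentration, so the measured rate is not a saturating one.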

In our system, longer backtracked RNAs were also cleaved more slowly in the presence of TFIIS. The cleavage rates of backtracked 12-nt, 13-nt, and 14-nt oligomer were 1200-, 480-, and 60-fold faster than that for backtracked 18-nt oligomer (26). This trend presumably relates to the rearrangement required to accommodate both TFIIS and backtracked RNA in the pol II secondary channel. Longer RNA may undergo rearrangement less readily, because of more extensive interaction in the secondary channel.
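The fold values above are all expressed relative to the 18-nt complex, so the step-to-step differences between successive RNA lengths follow by division. A short Python sketch of that arithmetic, using only the three ratios given in the text (the dictionary layout is an illustrative choice):

```python
# Cleavage rate with TFIIS relative to the 18-nt complex, as quoted in the text.
relative_rate = {12: 1200.0, 13: 480.0, 14: 60.0, 18: 1.0}

lengths = sorted(relative_rate)
# Ratio of each complex's rate to that of the next longer complex in the series.
step_ratios = {
    (shorter, longer): relative_rate[shorter] / relative_rate[longer]
    for shorter, longer in zip(lengths, lengths[1:])
}

for (shorter, longer), ratio in step_ratios.items():
    print(f"{shorter}-nt vs {longer}-nt: {ratio:g}-fold faster")
```

The ratios are not uniform across the series, which is worth keeping in mind when relating the trend to the rearrangement discussed above.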

To further investigate the influence of base mismatch on cleavage in the presence of TFIIS, we prepared a backtracked complex with a G<>U mismatch instead of a G<>G mismatch. Because G<>U can form a wobble base pair, it might occupy the nucleotide addition site at +1 rather than backtracking to +2. We observed two cleavage products, of two and three residues, instead of the single product of three residues found for a G<>G mismatch (fig. S6B). Evidently, a wobble base pair does form to a limited extent.

Our results lead to two conclusions. First, pol II backtracked by one residue represents a discrete, stable state of the transcribing enzyme. In the course of backtracking, pol II stalls at this position. The evidence is threefold: the observation of a defined structure of the 12-nt oligomer RNA complex, in which the backtracked nucleotide is revealed at full occupancy; the absence of density for additional nucleotides in 13-nt and 14-nt oligomer RNA complexes; and molecular dynamics simulations, showing that nucleotides beyond the first backtracked residue are expected to be highly mobile. The demonstration of a defined one-residue–backtracked state supports the long held, but largely conjectural, notion of polymerases in diffusional equilibrium between forward and retrograde motion during transcription. Backtracking by one residue is favorable, whereas backtracking by two or three more residues confers no greater energetic benefit. Longer backtracked RNAs do make additional interactions, possibly contributing to arrest (irreversible backtracking) and, by alteration of the RNA conformation, to the ability of TFIIS to rescue the complex from the arrested state (49).

The second conclusion concerns the significance of the one-residue–backtracked state. It is readily cleaved in the presence of TFIIS and releases a dinucleotide, which supports the previous ideas that cleavage occurs in the pol II active site, and that an important role of this cleavage is the removal of misincorporated nucleotides. The pol II structure is evidently well suited to the purpose. In the event of misincorporation, forward translocation is disfavored, because of distortion of the RNA-DNA hybrid helix, and the diffusional equilibrium of the enzyme is shifted towards the backtracked state. A significant lifetime in this state, due to binding of the misincorporated nucleotide in the +2 position, leads to cleavage. The stability of the one-residue–backtracked state thus underlies the proofreading capability of the pol II system. We refer to the nucleotide-binding site at +2, which defines the backtracked state, as the “P” site.

Cleavage in the one-residue–backtracked state can occur both in the presence of TFIIS and in its absence (“intrinsic cleavage”). Because the reaction with TFIIS is faster in vitro by more than 100-fold, it is likely the predominant mechanism in vivo. Intrinsic cleavage may nevertheless play an important role, as disruption of the gene for TFIIS increases the transcription error rate by at most 10-fold (50–53).

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S6

Tables S1 to S3


References and Notes

  1. Materials and methods are available as supporting material on Science Online.
  2. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.
  3. This research was supported by NIH grants GM049985 and GM036559 to R.D.K. D.W. was supported by a special fellow award from the Leukemia and Lymphoma Society and the NIH Pathway to Independence Award (K99 GM085136). X.H. and M.L. were supported by NIH Roadmap for Medical Research Grant U54 GM072970 and NIH grant GM041455. Portions of the research were carried out at the Stanford Synchrotron Radiation Laboratory, a national user facility operated by Stanford University on behalf of the U.S. Department of Energy (DOE), Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the DOE, Office of Biological and Environmental Research, and by the National Institutes of Health, National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences. Portions of this research were conducted at the Advanced Light Source, a national user facility operated by Lawrence Berkeley National Laboratory, on behalf of the DOE, Office of Basic Energy Sciences. The Berkeley Center for Structural Biology is supported in part by the DOE, Office of Biological and Environmental Research, and by the National Institutes of Health, National Institute of General Medical Sciences. We thank C. D. Kaplan for providing TFIIS E291H mutant for structural studies. PDB accession codes: 13-nt oligomer backtracked pol II, 3GTJ; 18-nt oligomer backtracked pol II, 3GTK; 12-nt oligomer backtracked pol II, 3GTG; and 13-nt oligomer backtracked pol II with TFIIS, 3GTM.