Report

Structural Basis of Transcription Initiation


Science  23 Nov 2012:
Vol. 338, Issue 6110, pp. 1076-1080
DOI: 10.1126/science.1227786

Abstract

During transcription initiation, RNA polymerase (RNAP) binds and unwinds promoter DNA to form an RNAP-promoter open complex. We have determined crystal structures at 2.9 and 3.0 Å resolution of functional transcription initiation complexes comprising Thermus thermophilus RNA polymerase, σA, and a promoter DNA fragment corresponding to the transcription bubble and downstream double-stranded DNA of the RNAP-promoter open complex. The structures show that σ recognizes the –10 element and discriminator element through interactions that include the unstacking and insertion into pockets of three DNA bases and that RNAP recognizes the –4/+2 region through interactions that include the unstacking and insertion into a pocket of the +2 base. The structures further show that interactions between σ and template-strand single-stranded DNA (ssDNA) preorganize template-strand ssDNA to engage the RNAP active center.

In transcription initiation, RNA polymerase (RNAP), together with at least one transcription initiation factor, binds to promoter DNA to yield an RNAP-promoter closed complex (RPc) and then unwinds ~12 base pairs of promoter DNA to form a “transcription bubble” and yield an RNAP-promoter open complex (RPo) (1). RPo is the critical, catalytically competent, intermediate in transcription initiation, and modulation of the formation, stability, and activity of RPo is an important means of regulation of gene expression (1). Structural models of RPo have been generated based on information from electron microscopy, fluorescence-resonance-energy-transfer measurements, and protein-DNA crosslinking (2–9). However, no high-resolution structural information for a functional, promoter-dependent, initiation-factor-dependent RPo previously has been reported.

Here, we report crystal structures of a bacterial RPo and a bacterial RPo in complex with a ribodinucleotide primer at 2.9 and 3.0 Å resolution, respectively. The results define the interactions of RNAP and the transcription initiation factor σ with the nontemplate and template strands of the transcription bubble in RPo, the interactions of RNAP with downstream double-stranded DNA (dsDNA) in RPo, and the RNAP clamp conformation in RPo.

To obtain a structure of RPo, we used a synthetic nucleic-acid scaffold corresponding to the transcription bubble and downstream dsDNA of RPo (10) (fig. S1). The top strand comprised a 14-nucleotide (nt) single-stranded DNA (ssDNA) tail containing a consensus promoter –10 element (11) and consensus discriminator element (12, 13), followed by a 13-nt duplex-forming segment. The bottom strand comprised a noncomplementary 6-nt ssDNA tail, followed by a 13-nt duplex-forming segment. The scaffold was functional in RNAP-DNA complex formation (fig. S1E) and in de novo transcription initiation with position +1 as the transcription start site (fig. S1F). Published work indicates that scaffolds of this form recapitulate all functional properties of the transcription bubble and downstream dsDNA of RPo (14).
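The scaffold geometry described above can be tabulated with a short sketch. The position assignments are an assumption inferred from the segment lengths and from the positions discussed elsewhere in the text; they are not stated explicitly in this paragraph. Promoter numbering has no position 0, the 14-nt nontemplate-strand tail is taken to span –12 to +2, the 6-nt template-strand tail is taken to span –4 to +2, and the 13-nt duplex segment is taken to begin at +3. All function and variable names are illustrative.

```python
def promoter_positions(first, count):
    """Return `count` consecutive promoter positions starting at `first`.

    Promoter numbering skips 0: position -1 is followed directly by +1.
    """
    positions = []
    pos = first
    while len(positions) < count:
        if pos != 0:
            positions.append(pos)
        pos += 1
    return positions


# Assumed layout (inferred from segment lengths, not stated in the text).
TOP_TAIL = promoter_positions(-12, 14)             # nontemplate-strand ssDNA tail
BOTTOM_TAIL = promoter_positions(-4, 6)            # noncomplementary template-strand tail
DUPLEX = promoter_positions(TOP_TAIL[-1] + 1, 13)  # downstream duplex-forming segment

if __name__ == "__main__":
    print("top-strand tail:   ", TOP_TAIL[0], "to", TOP_TAIL[-1])
    print("bottom-strand tail:", BOTTOM_TAIL[0], "to", BOTTOM_TAIL[-1])
    print("duplex segment:    ", DUPLEX[0], "to", DUPLEX[-1])
```

Under these assumptions, the two single-stranded tails end at the same position (+2), which matches the description of G+2 as the most-downstream nucleotide of the transcription-bubble nontemplate strand.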

We prepared complexes of the scaffold with Thermus thermophilus RNAP holoenzyme containing σA, identified crystallization conditions by use of robotic crystallization trials, grew crystals, collected data at a synchrotron, solved the structure by molecular replacement, and refined the structure, yielding a structure of RPo with a resolution of 2.9 Å and free R factor (Rfree) of 0.226 (fig. S1). Electron-density maps showed unambiguous density for nontemplate-strand nucleotides –12 to +12, template-strand nucleotides –4 to +12, 3139 RNAP residues, and 346 σ residues, and showed density for template-strand nucleotides +1 and +2 in the RNAP active-center “i” and “i + 1” sites. Analysis of crystals obtained using a derivative of the scaffold containing 5-bromouracil (5-BrU) in place of T at template-strand position +1 showed a single peak of Br anomalous difference density at the expected position, confirming that RNAP interacts with the scaffold in the crystals in a single translocational register and confirming that the translocational register places template-strand nucleotides +1 and +2 in the RNAP active-center “i site” and “i + 1 site” (fig. S2).
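For readers less familiar with crystallographic statistics, the free R factor quoted above is the standard residual between observed and calculated structure-factor amplitudes, evaluated over a test set of reflections withheld from refinement. The sketch below illustrates that definition only. The function names and the toy amplitudes are hypothetical, are not data from this work, and assume that the calculated amplitudes are already on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, is_free):
    """Split reflections by the test-set flag and return (R_work, R_free)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


if __name__ == "__main__":
    # Toy amplitudes for illustration only (not data from this work).
    f_obs = [100.0, 80.0, 60.0, 40.0]
    f_calc = [90.0, 85.0, 50.0, 44.0]
    is_free = [False, False, True, True]
    r_work, r_free = work_and_free_r(f_obs, f_calc, is_free)
    print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the test-set reflections do not drive refinement, Rfree serves as a cross-validation statistic and is less sensitive to overfitting than the working R factor.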

To obtain a structure of RPo in complex with a ribodinucleotide primer, we used an analogous nucleic-acid scaffold containing GpA, a ribodinucleotide complementary to template-strand positions –1 and +1 (Fig. 1A and fig. S3). Biochemical experiments verified that this scaffold is functional in primer-dependent transcription initiation (fig. S3, E and F). Crystals were prepared, and data were collected and processed as above, yielding a structure of RPo-GpA with a resolution of 3.0 Å and Rfree of 0.247 (Fig. 1, A and B, and fig. S3). Protein-DNA interactions were essentially identical in the structures of RPo and RPo-GpA (fig. S4).

Fig. 1

Structure of RPo-GpA. (A) Summary of protein–nucleic-acid interactions. Black residue numbers and lines, interactions by RNAP; green residue numbers and lines, interactions by σ; blue, –10 element of DNA nontemplate strand; light blue, discriminator element of DNA nontemplate strand; pink, rest of DNA nontemplate strand; red, DNA template strand; magenta, GpA; violet, active-center Mg2+; asterisks, water-mediated interactions; cyan boxes, bases unstacked and inserted into pockets. Residues are numbered as in E. coli RNAP and σ70. (B) Overall structure (RNAP β' nonconserved domain omitted for clarity). RNAP, gray; σ, yellow. Other colors as in (A). (C) Interactions of RNAP and σ with transcription-bubble nontemplate strand, transcription-bubble template strand, and downstream dsDNA (RNAP β subunit and β' nonconserved domain omitted for clarity). Colors as in (B). Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.

The structures show that RNAP and σ recognize promoter elements through sequence-specific interactions with transcription-bubble nontemplate-strand ssDNA (Figs. 1 to 3). σ makes sequence-specific interactions with the upstream part of the transcription-bubble nontemplate strand (positions –12 to –4) (Fig. 1, A to C, and Fig. 2), and RNAP makes sequence-specific interactions with the downstream part of the transcription-bubble nontemplate strand (positions –4 to +2) (Fig. 1, A to C, and Fig. 3).

Fig. 2

Structure of RPo-GpA: recognition by σ of –10 element and discriminator element. (A) Interactions between σ region 2 (σR2) and nontemplate-strand ssDNA of –10 element. Green, residues important for sequence recognition at position –12 (15, 16) (see discussion in legend to fig. S5). Other colors as in Fig. 1, B and C. (B) Effects on transcription of substitutions of σ residues that contact –10-element bases. (C) Interactions between σ region 1.2 (σR1.2) and nontemplate-strand ssDNA of discriminator element. Green, residue that crosslinks with position –5 (19). (D) Effects on transcription of substitutions of σ residues that contact discriminator-element bases. (E) Recognition of A-11 base by unstacking and insertion into pocket formed by residues of σ. (F) Recognition of T-7 and G-6 bases by unstacking and insertion into pockets formed by residues of σ.

Fig. 3

Structure of RPo-GpA: recognition by RNAP of CRE. (A) Interactions between RNAP β subunit and nontemplate-strand ssDNA of CRE (positions –4 through +2). RNAP β' subunit and σ omitted for clarity. Downstream dsDNA helix viewed end-on. Colors as in Fig. 1, B and C. (B) Effects on transcription of substitutions of RNAP β-subunit residues that contact CRE bases. (C) Recognition of G+2 base by unstacking and insertion into pocket formed by residues of RNAP β subunit, and stacking of T+1 base on RNAP β-subunit residue W183. (D and E) Effects on RNAP-DNA interaction of substitutions of G+2 base by A, T, or abasic site (X). (D) shows fluorescence-detected equilibrium binding. (E) shows fluorescence-detected high-salt–induced dissociation.

σ interacts with the promoter –10-element nontemplate strand through a deep, L-shaped groove that runs across σ region 2 (σR2) and part of σ region 1.2 (σR1.2) (Fig. 1, A to C, and Fig. 2A). σ interacts with all six nucleotides of the –10 element and potentially interacts with bases of five nucleotides of the –10 element (positions –12, –11, –9, –8, and –7) (Fig. 1, A to C, Fig. 2A, and figs. S5 and S6). The interactions with bases involve σ residues L110, E116, N383, R385, L386, K418, F419, E420, R423, Y425, S428, T429, Y430, T432, W433, and R436 (residues numbered throughout as in Escherichia coli RNAP and σ70). Alanine substitutions of these residues result in defects in transcription, indicating that these residues are functionally important (Fig. 2B). Model building indicates that σ residues Q437 and T440 could contact a nucleotide base-paired to nontemplate-strand position –12, providing a structural explanation for results indicating that these residues are important for sequence recognition at position –12 (15, 16) (Fig. 2A and fig. S5). The interactions between σ and the –10 element in RPo and RPo-GpA are essentially identical to those observed in a crystal structure of a σR2 fragment bound to a –10 element ssDNA oligonucleotide (17, 18).

σ interacts with the promoter discriminator-element nontemplate strand through a shallow groove that runs across the face of σR1.2 (Figs. 1A and 2C). σ interacts with bases of three nucleotides of the discriminator element (positions –6 through –4) (Figs. 1A and 2C, and fig. S7). The interactions involve σ residues D96, V98, R99, M102, R103, M105, G106, and R385. Alanine substitutions of these residues result in defects in transcription (Fig. 2D). σ residue M102 makes direct van-der-Waals contact with the base at nontemplate-strand position –5, providing a structural explanation for the observation that σ M102 can be crosslinked to this base (19) (Figs. 1A and 2C, and fig. S7B).

A striking feature of the interactions between σ and the –10 and discriminator elements is that σ unstacks three DNA bases, flips them out of base stacks, and inserts them into pockets formed by residues of σ (Figs. 1 and 2, and figs. S6, A and C, and S7A). σR2 unstacks, flips, and inserts into a deep pocket the adenine base at the second position of the –10 element (A-11) (Fig. 2, A and E, and fig. S6A). Within this deep pocket, A-11 makes stacking interactions with one aromatic amino acid (σ Y430), makes edge-edge interactions with two aromatic amino acids (σ F419 and Y425), and makes H bonds between base Watson-Crick atoms and two amino-acid backbone atoms (σ K418 and E420), enabling unambiguous sequence read-out (fig. S6A). In an analogous manner, σR2 unstacks, flips, and inserts into a deep pocket the thymine base at the sixth position of the –10 element (T-7) (Fig. 2, A, C, and F, and fig. S6C), and σR1.2 unstacks, flips, and inserts into a deep pocket the guanine base at the first position of the discriminator element (G-6) (Fig. 2, C and F, and fig. S7A). The insertion of flipped-out bases into pockets provides an effective means to read sequence, because it enables contacts with essentially all atoms of the bases. This mode of interaction accounts for the fact that A-11 and T-7 are the most important positions of the –10 element (11). The insertion of flipped-out bases into pockets also provides an effective means to use binding energy to drive DNA unwinding, because this mode of interaction can occur only when DNA is unwound. This mode of interaction accounts for the ability of σ to facilitate promoter unwinding.

RNAP core enzyme interacts with transcription-bubble nontemplate-strand positions –4 to +2, which we term the “core recognition element” (CRE) (Fig. 1, A and C, and Fig. 3), and which we show to be a sequence-specific promoter element (see below). RNAP core enzyme interacts with bases of five of six nucleotides of the CRE (positions –4, –3, –2, +1, and +2) (Figs. 1A and 3A, and fig. S8). The interactions involve RNAP β-subunit residues R151, W183, D199, R371, R394, I445, D446, R451, L538, and V547. Alanine substitutions of all except one of these residues result in defects in transcription (Fig. 3B). The interactions account for the previously observed crosslinking of nontemplate-strand positions –4 through +2 to β residues 84 to 642 (2).

The interactions between RNAP and the most-downstream nucleotide of the transcription-bubble nontemplate strand, G+2, are especially noteworthy. RNAP unstacks the G+2 base, flips it out of the base stacks, and inserts it into a deep pocket formed by RNAP β subunit (“β pocket”), in a manner analogous to the manner in which σ interacts with A-11, T-7, and G-6 (Fig. 3, A and C, and fig. S8C). Six residues of the β pocket make van-der-Waals interactions with the G+2 base (β R151, I445, D446, R451, L538, and V547), and two residues of the β pocket make H bonds with Watson-Crick atoms of the G+2 base (β D446 and R451) (fig. S8C). The structures suggest that the interactions between RNAP and G+2 are sequence-specific (fig. S8C), and equilibrium-binding and kinetic experiments confirm that the interactions between RNAP and G+2 are sequence-specific (Fig. 3, D and E, and fig. S9). RNAP exhibits an equilibrium dissociation constant lower by a factor of 5, and an off-rate lower by a factor of 5, for a promoter derivative having G at nontemplate-strand position +2 than for promoter derivatives having A, T, C, or an abasic site at nontemplate-strand position +2 (Fig. 3, D and E, and fig. S9). In principle, the sequence-specific interactions between RNAP and nontemplate-strand G+2 shown here to occur in transcription initiation complexes may occur also in transcription elongation complexes. More than 30 crystal structures of elongation complexes have been determined to date (fig. S10). However, only one includes nontemplate-strand position +2: namely, PDB 3PO2, a structure of a yeast RNAPII backtracked and arrested elongation complex (20). 
Strikingly, in this structure, although not noted in the publication on this structure, the nontemplate-strand position +2 base is inserted into a pocket formed by RNAPII residues equivalent to βR151, βI445, βD446, βR451, βL538, and βV547, adopting a conformation similar to, and making interactions similar to, G+2 in the structures of RPo and RPo-GpA (fig. S10). This observation, together with other observations (21), suggests that the RNAP-DNA interactions that mediate recognition of G+2 in initiation complexes may occur also in elongation complexes, where they may influence sequence-dependent translocational bias (22) and sequence-dependent pausing (23). This observation further suggests that these RNAP-DNA interactions may be made not only by bacterial RNAP but also by eukaryotic RNAPII.
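The fivefold differences reported above for G at nontemplate-strand position +2 can be put in energetic terms using standard relations, ΔΔG = RT ln(Kd ratio) and Kd = koff/kon. The sketch below is a back-of-the-envelope calculation, not an analysis from this work. It assumes a simple two-state binding model and a temperature of 25 °C, which is not stated in the text; the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_kcal_per_mol(kd_ratio, temperature_k=298.15):
    """Binding free-energy difference for a given fold difference in Kd.

    temperature_k defaults to 25 degrees C, an assumed value.
    """
    return R_KCAL * temperature_k * math.log(kd_ratio)


def implied_kon_ratio(kd_ratio, koff_ratio):
    """Fold difference in on-rate implied by Kd = koff / kon."""
    return koff_ratio / kd_ratio


if __name__ == "__main__":
    # Fold differences taken from the text: 5 for Kd and 5 for off-rate.
    print(f"ddG ~ {ddg_kcal_per_mol(5.0):.2f} kcal/mol")
    print(f"implied kon ratio ~ {implied_kon_ratio(5.0, 5.0):.1f}")
```

Under these assumptions, a fivefold lower Kd corresponds to roughly 1 kcal/mol of additional binding free energy, and equal fivefold changes in Kd and off-rate would imply an essentially unchanged on-rate, so that the preference for G+2 would be expressed mainly through slower dissociation.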

The interaction between RNAP and the adjacent nucleotide of the transcription-bubble nontemplate strand, T+1, also is noteworthy. RNAP β-subunit residue W183 makes an aromatic-amino-acid/base stacking interaction with T+1 (Fig. 3C and fig. S8B). This interaction forces the unstacking of T+1 and G+2—making G+2 available to interact with the β pocket—and also likely nucleates or stabilizes the stacking of nucleotides at nontemplate-strand positions –5 through +1 (Fig. 3C).

Although it has been established that RNAP-CRE interactions involving position +2 are sequence-specific (Fig. 3, D and E, and fig. S9), it remains to be determined whether RNAP-CRE interactions involving positions –4 through +1 are sequence-specific. We conclude that RNAP-CRE interactions contribute to sequence-specific promoter recognition, and we propose that RNAP-CRE interactions also contribute to the formation and maintenance of the transcription bubble. It seems likely that RNAP-CRE interactions enable RNAP to assist σ in promoter unwinding during σ-dependent initiation and enable RNAP to perform promoter unwinding during σ-independent initiation.

The structures also show that RNAP and σ “preorganize” the transcription-bubble template strand and downstream dsDNA (Fig. 4, A and B). The transcription-bubble template strand in the structures adopts the same A-form helical conformation, and makes the same interactions with the RNAP active-center “i” and “i + 1” sites, as in an elongation complex (Fig. 4A). The downstream dsDNA and the ribonucleotide primer GpA in the structures also exhibit the same conformations and interactions as the downstream dsDNA and the 3′ ribonucleotides of the RNA product in an elongation complex (Fig. 4A). We conclude that a promoter-dependent, initiation-factor–dependent transcription initiation complex is preorganized to proceed to transcription elongation without major changes in the conformation or interactions of the transcription-bubble template strand, downstream dsDNA, and RNA. The preorganization of the template strand in the promoter-dependent, initiation-factor–dependent initiation complex is in contrast to the situation in promoter-independent, initiation-factor–independent initiation complexes, in which template-strand ssDNA is disordered (24), or all but 5 nt of template-strand ssDNA is disordered (25), except in the presence of ≥4 nt of RNA. The presence of the initiation factor appears to account for the difference. The initiation factor σ makes direct interactions with the template strand that preorganize the template strand. The segment of σ region 3.2 (σR3.2) comprising σ residues 510 to 522—the “σ finger”—penetrates the RNAP active-center cleft, occupies part of the region occupied by RNA in a transcription elongation complex, makes an aromatic-amino-acid/base edge-edge interaction with the template-strand base at position –4, and makes four H bonds with Watson-Crick atoms of the template-strand bases at positions –4 and –3 (Fig. 4B and fig. S11). The interactions with DNA bases at template-strand positions –4 and –3 involve σ residues D514, D516, D517, and F522. 
Alanine substitutions of these residues result in defects in transcription (Fig. 4C). The interactions between the σ finger and the template strand constrain the template strand to adopt an A-form helical conformation and buttress the template strand to engage the RNAP active center in a manner compatible with binding of initiating NTPs. These interactions provide a structural explanation for the ability of the σ finger to facilitate the binding of initiating NTPs (26, 27). The interactions between the σ finger and the template strand must be disrupted, and the σ finger must be displaced, to synthesize >4 nt of RNA. The need to disrupt these interactions and displace the σ finger provides a structural explanation for effects of the σ finger on abortive initiation (26, 27).

Fig. 4

Structure of RPo-GpA: preorganization of transcription-bubble template-strand ssDNA, downstream dsDNA, and RNAP clamp. (A) Comparison of transcription-bubble template strand ssDNA, downstream dsDNA, and ribodinucleotide primer GpA in structure of RPo-GpA (colors as in Fig. 1, B and C) with those in structure of T. thermophilus transcription elongation complex (29) (cyan). (B) Interactions between σ region 3.2 (σR3.2; “σ finger”) and transcription-bubble template-strand ssDNA. (C) Effects on transcription of substitutions of σ residues that contact transcription-bubble template-strand bases. (D and E) Comparison of RNAP clamp conformations in structures of RPo-GpA (colors as in Fig. 1, B and C), T. thermophilus RNAP holoenzyme (30) (red), and T. thermophilus transcription elongation complex (29) (red).

The structures also define the conformational state of the RNAP clamp—the movable wall of the RNAP active-center cleft (28)—in the transcription initiation complex (Fig. 4, D and E). The clamp is closed by ~11° relative to the crystal structure of RNAP holoenzyme (Fig. 4D) and exhibits the same conformation as in the crystal structure of the elongation complex (Fig. 4E), consistent with fluorescence-resonance-energy-transfer results indicating that the clamp closes upon formation of RPo and remains closed during elongation (28). The finding that clamp conformations are the same in the initiation complex and the elongation complex provides further evidence that the initiation complex is preorganized to proceed to elongation.

The structures of RPo and RPo-GpA determined in this work reveal how RNAP and σ interact with the transcription-bubble nontemplate strand to accomplish promoter recognition and promoter unwinding and reveal how RNAP and σ interact with the transcription-bubble template strand and downstream dsDNA to preorganize the transcription initiation complex for subsequent reactions. The structures provide a foundation for understanding transcription initiation and transcriptional regulation.

Supplementary Materials

www.sciencemag.org/cgi/content/full/science.1227786/DC1

Materials and Methods

Figs. S1 to S11

References (31–48)

References and Notes

  1. Materials and methods are available as supplementary materials on Science Online.
  2. Acknowledgments: We thank the Brookhaven National Synchrotron Light Source and Cornell High Energy Synchrotron Source for beamline access; the Argonne Photon Source CCP4 School for training; and S. Borukhov, P. deHaseth, K. Kuznedelov, L. Minakhin, K. Severinov, and D. Temiakov for plasmids and discussion. This work was funded by NIH grants GM41376 and AI072766 and a Howard Hughes Medical Institute Investigatorship to R.H.E. PDB accession codes are 4G7H, 4G7Z, and 4G7O.