Structure of the Escherichia coli RNA Polymerase α Subunit Amino-Terminal Domain

Science  10 Jul 1998:
Vol. 281, Issue 5374, pp. 262-266
DOI: 10.1126/science.281.5374.262


The 2.5 angstrom resolution x-ray crystal structure of the Escherichia coli RNA polymerase (RNAP) α subunit amino-terminal domain (αNTD), which is necessary and sufficient to dimerize and assemble the other RNAP subunits into a transcriptionally active enzyme and contains all of the sequence elements conserved among eukaryotic α homologs, has been determined. The αNTD monomer comprises two distinct, flexibly linked domains, only one of which participates in the dimer interface. In the αNTD dimer, a pair of helices from one monomer interact with the cognate helices of the other to form an extensive hydrophobic core. All of the determinants for interactions with the other RNAP subunits lie on one face of the αNTD dimer. Sequence alignments, combined with secondary-structure predictions, support proposals that a heterodimer of the eukaryotic RNAP subunits related to Saccharomyces cerevisiae Rpb3 and Rpb11 plays the role of the αNTD dimer in prokaryotic RNAP.

Escherichia coli RNAP comprises an essential catalytic core of two α subunits (each 36.5 kD), one β subunit (150.6 kD), and one β′ subunit (155.2 kD), which are conserved in sequence from bacteria to human. In addition to playing key roles in transcription initiation, the α subunit initiates RNAP assembly (1) by dimerizing into a platform with which the large β and β′ subunits interact. Deletion mutagenesis and limited proteolysis indicate that the α subunit comprises two independently folded domains, the NH2-terminal domain (NTD; residues 8 to 235) and COOH-terminal domain (CTD; residues 249 to 329), connected by a flexible, 14-residue linker (Fig. 1A) (2). The αCTD is dispensable for RNAP assembly and basal transcription but is required for the interaction with an upstream promoter element (3) and is the target for a wide array of transcription activators (4). The solution structure of αCTD consists of a compact fold of four short α helices (5). The αNTD is essential in vivo and in vitro for RNAP assembly and basal transcription (2, 6). The regions of conserved sequence between α homologs of prokaryotic, archaebacterial, chloroplast, and eukaryotic RNAPs (α motifs 1 and 2) (Fig. 1A) are contained within the NTD (7), as are the determinants for α interaction with the RNAP β and β′ subunits (2, 6, 8–12). We crystallized a mutant αNTD with an Arg to Ala substitution at position 45 (αNTDR45A) because we were unable to obtain crystals of wild-type αNTD suitable for structure determination (13). The structure was determined by multiple isomorphous replacement (MIR) and refined to a resolution of 2.5 Å (Table 1).

Figure 1

Structure of the αNTD dimer. (A) Schematic diagram showing the domain structure of E. coli RNAP α (2). The black box indicates the NTD crystallized in this study (α residues 1 to 235). The gray boxes denote regions conserved in sequence between α homologs of prokaryotic, archaebacterial, chloroplast, and eukaryotic RNAPs. (B) RIBBONS (28) diagram of the three-dimensional structure of the αNTD dimer. One αNTD monomer is colored green and the other is yellow. Unmodeled, disordered regions are indicated as dotted lines. (Top) View along the dimer twofold axis; (bottom) view perpendicular to the dimer twofold axis.

Table 1

Summary of crystallographic analysis. αNTDR45A (19) (15 mg/ml) was crystallized by vapor diffusion against 5 mM β-mercaptoethanol, 0.2 M MgCl2, 100 mM tris-HCl, 100 mM NaCl (pH 8.6 to 8.9), and 18 to 22% polyethylene glycol 400 (PEG400) at 4°C. Heavy-atom derivatives were prepared by soaking crystals for 4 hours in 1 mM HgCl2, 12 hours in 1 mM MetHgCl and MetHgPO4, 24 hours in 5 mM K2PtCl4, and 1 week in 10 mM UO2Ac2, all dissolved in the crystallization solution. For cryocrystallography, crystals were soaked in steps of increasing PEG400 concentration (2% each step every 30 min) into 40% PEG400 before flash-freezing. Data were collected in the laboratory or at National Synchrotron Light Source beamline X4a (native II, HgCl2 derivative, and Se derivative only) on an R-axis IV area detector, and processed with DENZO and SCALEPACK (Z. Otwinowski and W. Minor). Mercury positions were located manually by Patterson methods with PHASES (20) and confirmed with HEAVY (21) and SHELX-90 (22). Additional heavy-atom sites were located in cross-phased difference Fouriers. Heavy-atom parameter refinement and solvent flattening were carried out with PHASES. By use of this initial electron density map, all four copies of domain 1 within the asymmetric unit were nearly completely modeled, whereas only partial polyalanine models of the four copies of domain 2 could be constructed. Three noncrystallographic symmetry (NCS) operators were determined mapping copies 2, 3, and 4 of domain 1 onto copy 1, and the partial models of domain 2 were similarly determined, for a total of six NCS operators. The map was improved by NCS averaging between the four copies of domain 1 and separately between the four copies of domain 2 in the asymmetric unit with DM (23). Map interpretation and model building were done with the program O (24). The map was improved by cycles of refinement with X-PLOR (25) with NCS constraints (six total constraints, as above), and phase combination with SIGMAA (26).
A final refinement was performed with relaxed NCS restraints. The final model contains residues 1 to 161 and 165 to 232 for molecule one, residues 1 to 159 and 165 to 235 for molecule two, residues 1 to 159 and 168 to 235 for molecule three, and residues 1 to 160 and 165 to 235 for molecule four. Water molecules were added if they had at least one hydrogen bond with a protein atom or with other waters, and they were kept after refinement if the B factor remained below 40 Å2. A total of 250 water molecules were added in the final refinement. Stereochemical values are all within or better than the expected range for a 2.5 Å structure, as determined with PROCHECK (27). The coordinates have been submitted to the Brookhaven Protein Data Bank.
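
The modeled residue ranges listed in the legend can be tallied with a short script. The following is only an illustrative sketch: the dictionary layout and the function names `count_modeled` and `missing_residues` are hypothetical, and the construct is assumed to span residues 1 to 235, as stated in the Fig. 1A legend.

```python
# Modeled residue ranges per molecule, copied from the Table 1 legend.
# The data layout and helper names are illustrative, not from the paper.
MODELED = {
    "molecule 1": [(1, 161), (165, 232)],
    "molecule 2": [(1, 159), (165, 235)],
    "molecule 3": [(1, 159), (168, 235)],
    "molecule 4": [(1, 160), (165, 235)],
}

# Assumption: the crystallized construct spans residues 1 to 235 (Fig. 1A legend).
CONSTRUCT = (1, 235)


def count_modeled(ranges):
    """Number of residues covered by a list of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


def missing_residues(ranges, construct=CONSTRUCT):
    """Residues of the construct that fall outside every modeled range."""
    modeled = {r for start, end in ranges for r in range(start, end + 1)}
    first, last = construct
    return [r for r in range(first, last + 1) if r not in modeled]


if __name__ == "__main__":
    for name, ranges in MODELED.items():
        print(name, count_modeled(ranges), "modeled; unmodeled:", missing_residues(ranges))
```

In every molecule the unmodeled internal stretch falls within residues 160 to 167, and only molecule one lacks residues at the COOH-terminus.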

The 26-kD αNTD monomer comprises two domains, each containing a distinct hydrophobic core (Fig. 1B). Domain 1 contains NH2- and COOH-terminal sequences (residues 1 to 52 and 180 to 235), and domain 2 contains the intervening sequence (residues 53 to 179) (Fig. 2). Each domain has an α/β fold. Domain 1 contains a four-stranded antiparallel β sheet (S1, S2, S10, and S11) and two nearly orthogonal α helices (H1 and H3), whereas domain 2 contains seven β strands in an antiparallel arrangement (S3 to S9) and one α helix (H2) (Fig. 1B).
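
Because domain 1 is built from two separate stretches of sequence, it can help to look up the domain of a given residue programmatically. The sketch below encodes only the residue boundaries given above; the names `DOMAINS`, `domain_of`, and `domain_size` are hypothetical and chosen for illustration.

```python
# Domain boundaries of the αNTD monomer as given in the text.
# Domain 1 is discontinuous: it combines the NH2- and COOH-terminal segments.
DOMAINS = {
    "domain 1": [(1, 52), (180, 235)],
    "domain 2": [(53, 179)],
}


def domain_of(residue):
    """Return the domain name containing the residue, or None if outside 1 to 235."""
    for name, segments in DOMAINS.items():
        if any(start <= residue <= end for start, end in segments):
            return name
    return None


def domain_size(name):
    """Total number of residues assigned to a domain."""
    return sum(end - start + 1 for start, end in DOMAINS[name])


if __name__ == "__main__":
    for pos in (45, 48, 80, 173, 200):
        print(pos, domain_of(pos))
    print(domain_size("domain 1"), domain_size("domain 2"))
```

With these boundaries, domain 1 holds 108 residues and domain 2 holds 127, which together account for all 235 residues of the construct.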

Figure 2

Partial results of a sequence alignment of α homologs from bacteria and chloroplasts, and eukaryotic Rpb3 and Rpb11 proteins. Numbers at the beginning of each line indicate amino acid positions relative to the start of each protein sequence. Numbers along the bar on top indicate the amino acid position in E. coli α. Amino acid identity >50% in the full alignment is indicated by a black background; similarity >50% is indicated by a yellow background. Gaps are indicated by dashed lines, insertions by boxed-out regions. The secondary structure of E. coli αNTD is indicated schematically above the α sequences: helices H1 to H3 are indicated by rectangles (α helices are labeled; some 3₁₀ helices are shown but not labeled), β strands S1 to S11 by arrows, and loops by a black line. The colored areas in the amino acid numbering bar above the α sequences denote regions of the E. coli α sequence protected from hydroxyl-radical cleavage by β (green) or β′ (magenta) (12), and a region that interacts with CAP at class II CAP-dependent promoters (red) (15). Green and magenta dots in the diagram of α secondary structure denote mutations that cause defects in β or β′ binding, respectively (10, 11). The black dots indicate residues participating in the hydrophobic core of the α dimer interface. Shown schematically above the Rpb3 and Rpb11 sequences is the predicted secondary structure (18). Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.

The αNTD dimer forms an elongated, flat structure with dimensions of about 120 Å by 60 Å by 25 Å (Fig. 1B). Almost all of the monomer-monomer interactions that form the dimer interface arise from H1 and H3 in domain 1 (Fig. 3). The unusual dimer interface can be described as two pairs of nearly orthogonal α helices (one pair from each monomer) that interlock like two V's intersecting through their open ends, resulting in two pairs of antiparallel α helices abutting each other with orthogonal orientations (Figs. 1B and 3). The primary monomer-monomer interaction occurs through an antiparallel coiled-coil–like interaction between H3 of monomer 1 (H3₁) and H3 of monomer 2 (H3₂). The other pair of antiparallel α helices (H1₁ and H1₂) are shifted slightly apart and do not make extensive interactions with each other, but they contribute to the dimer interface by making hydrophobic interactions with H3 of the opposite monomer. The extensive hydrophobic interface sandwiched between the orthogonal pairs of antiparallel α helices accounts for the high stability of the α dimer. The hydrophobic core of the dimer interface is made up of one residue from S2, six residues within or adjacent to H1, and seven residues from H3 (Figs. 2 and 3). Over these 14 positions, hydrophobic residues are nearly absolutely conserved among α homologs from eukaryotes, such as proteins related to Saccharomyces cerevisiae Rpb3 and Rpb11 (Fig. 2).

Figure 3

Dimer interface of αNTD. Ribbon model, viewed along the dimer twofold axis, showing the conserved residues that form the hydrophobic core of the dimer interface (labeled on one monomer only). One αNTD monomer is colored green, and the other is yellow. Helices H1 and H3 of each monomer are labeled. The figure was made with the program GRASP (29).

The asymmetric unit of the crystal contains four crystallographically independent αNTD monomers. The overall fold of each monomer is the same, but comparison of different monomers reveals that the linkage between domain 1 and 2 within a monomer is highly flexible. The flexibility can be described as a hinge motion with a range in all directions of at least 15° (14), with the hinge centered in the vicinity of Pro52/Gly53 (in the strand connecting domain 1 to domain 2) and Pro179 (in the connection from domain 2 back to domain 1). The high conservation of Pro and Gly residues at these positions (Fig. 2) suggests that the flexible linkage between domains 1 and 2 is an evolutionarily conserved feature, implying some functional consequence.
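
One generic way to quantify a hinge motion of this kind is to superimpose domain 1 of two monomers and then take the rotation needed to bring their domain 2 copies into register; the rotation angle follows from the trace of that rotation matrix, θ = arccos((tr R − 1)/2). The sketch below illustrates only this final step and is not necessarily the procedure used in (14). The matrix is synthetic, built for demonstration rather than derived from the αNTD coordinates, and the function names are hypothetical.

```python
import math


def rotation_angle_deg(R):
    """Rotation angle (degrees) of a 3x3 rotation matrix, from its trace."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg):
    """Synthetic rotation matrix about the z axis, used here only as test input."""
    a = math.radians(angle_deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


if __name__ == "__main__":
    # Demonstration with a made-up 15 degree rotation, not measured data.
    R = rotation_about_z(15.0)
    print(round(rotation_angle_deg(R), 6))
```

In practice the matrix would come from a least-squares superposition of the domain 2 Cα atoms, which this sketch does not perform.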

Mutagenesis and hydroxyl-radical protein footprinting studies have localized determinants in α that are important for interactions with β and β′, all within α motifs 1 and 2 of the NTD (Fig. 2) (2, 6, 8–12). These are mapped onto the αNTD structure in Fig. 4. It is not known at present how these interactions with β and β′ are distributed between the αNTD monomers. For clarity, the data are presented such that the determinants involved in β interactions are shown on one monomer and the determinants involved in β′ interactions are shown on the other. Of particular interest are two point substitutions, at positions 45 and 48, that cause defects in β binding (10). Both of these positions lie on the solvent-exposed face of H1 within the region most strongly protected from hydroxyl-radical cleavage by binding of β (Fig. 4). These observations together strongly support the conclusion that this exposed face of H1, and Arg45 and Leu48 in particular, directly interact with β. A two–amino acid insertion at position 80 also results in defective β binding (9), and this site lies immediately adjacent to a second region of the β footprint on α (Fig. 4).

Figure 4

Protein-protein interactions with αNTD. (Top) Backbone representation of the αNTD dimer viewed along the dimer twofold axis as in Fig. 1B. Backbone residues are color-coded according to the hydroxyl-radical footprinting data of (12), so that regions protected from hydroxyl-radical cleavage by β or β′ are colored green or magenta, respectively. Shown in yellow or light blue are the α-carbon positions of mutations that cause defects in β or β′ binding, respectively (10, 11). The region of αNTD found to interact with CAP AR2 at class II CAP sites (αNTD residues 162 to 165) is shown in red (15). (Bottom) View along the dimer twofold axis from the direction opposite to that of the top view. The COOH-termini of the two αNTD monomers are indicated. The figure was made with the program GRASP (29).

Small insertions at positions 108 and 200 cause defects in β′ binding without affecting assembly of α2β (11). These two sites fall within the regions of α protected from hydroxyl-radical cleavage by the binding of β′. Two point substitutions at positions 86 and 173 also interfere with β′ binding (10). Although far apart in the sequence, these two sites fall close to each other in the αNTD structure (Fig. 4). However, these sites are far from the regions footprinted by β′ binding, and one of the substitutions, V173A, replaces a highly conserved buried hydrophobic residue with a less bulky residue, which might be expected to cause a structural perturbation or destabilization. Interpretation of this pair of substitutions in terms of specific effects on β′ binding is therefore less certain.

Despite any caveats from the above considerations, it is clear that all of the regions of the α peptide backbone that are protected from hydroxyl-radical cleavage by the presence of β and β′, and all of the mutants that deleteriously affect the binding of β or β′, are exposed on one face of the αNTD dimer (Fig. 4). No sites known to interact with the other RNAP subunits are found on the opposite face, which instead carries the COOH-termini of the two αNTD monomers (Fig. 4). Thus, the αCTDs and the β and β′ subunits are located on opposite faces of the αNTD structure.

Although the αCTD has been identified as the target for a wide array of transcription activators (4), at least one interaction between an activator (catabolite activator protein, or CAP) and αNTD, which is essential for activation at class II CAP-dependent promoters, has been identified. The protein-protein interactions between CAP and αNTD occur between the basic activating region 2 of CAP and a stretch of four acidic residues, Glu162-Glu163-Asp164-Glu165, of αNTD (15). This region of the αNTD structure comprises a highly exposed loop (Fig. 4, shown on only one αNTD monomer). A short stretch of residues (160 to 163) in this region is disordered in the crystal structure.

The largest subunits of prokaryotic RNAPs (β′ and β) exhibit strong sequence conservation with homologs in eukaryotic RNAPs (16). Less obvious evolutionary relationships have been proposed between α and two families of eukaryotic RNAP subunits related to S. cerevisiae Rpb3 and Rpb11, and recent studies suggest that an Rpb3-Rpb11 heterodimer serves as the eukaryotic analog of the prokaryotic α2 homodimer (17). A sequence alignment of 5 Rpb3 homologs, 5 Rpb11 homologs, 17 prokaryotic α sequences, and 6 chloroplast α sequences was performed, taking into consideration the predicted secondary structure for the eukaryotic proteins (18) and the structure of E. coli αNTD. The sequence and predicted secondary structure of the Rpb3 and Rpb11 homologs align very well with sequences of α corresponding to domain 1. With one exception, gaps or insertions occur only in exposed loops between secondary structural elements and are expected to be compatible with the αNTD fold. The exception is one large gap in the Rpb11 homologs that corresponds to domain 2, which is completely lacking. This also seems to be compatible with the αNTD fold because the COOH-terminal end of the first domain 1 fragment (the COOH-terminal end of H1) is less than 6 Å from the NH2-terminal end of the second domain 1 fragment (near the NH2-terminus of S10). These considerations, combined with the observation already noted that all of the hydrophobic residues that comprise the hydrophobic core of the α dimer interface are nearly absolutely conserved in the Rpb3 and Rpb11 homologs, support the suggestion that an Rpb3-Rpb11 heterodimer plays the role of the α2 homodimer in prokaryotes. The Rpb3-Rpb11 dimer interface is predicted to be structurally very similar to the α2 dimer interface. Rpb3 is predicted to have a two-domain architecture like that of α, with domain 1 structurally related to α domain 1, whereas the structure of Rpb3 domain 2 diverges from that of α.
Rpb11 is predicted to be a single domain with a fold closely related to that of α domain 1.

  • * To whom correspondence should be addressed. E-mail: darst{at}

