Structure of the Amino-Terminal Protein Interaction Domain of STAT-4


Science  13 Feb 1998:
Vol. 279, Issue 5353, pp. 1048-1052
DOI: 10.1126/science.279.5353.1048


STATs (signal transducers and activators of transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. The crystal structure of an NH2-terminal conserved domain (N-domain) comprising the first 123 residues of STAT-4 was determined at 1.45 angstroms. The domain consists of eight helices that are assembled into a hook-like structure. The N-domain has been implicated in several protein-protein interactions affecting transcription, and it enables dimerized STAT molecules to polymerize and to bind DNA cooperatively. The structure shows that N-domains can interact through an extensive interface formed by polar interactions across one face of the hook. Mutagenesis of an invariant tryptophan residue at the heart of this interface abolished cooperative DNA binding by the full-length protein in vitro and reduced the transcriptional response after cytokine stimulation in vivo.

The STATs constitute a family of transcription factors that are necessary for the activation of distinct sets of target genes in response to cytokines and growth factors (1). The STAT proteins are activated in the cytoplasm by phosphorylation on a single tyrosine residue (2). Each STAT molecule contains an SH2 domain, and reciprocal SH2-phosphotyrosine interactions between two STAT molecules result in the formation of active dimers that translocate to the nucleus and activate gene expression (Fig. 1A) (2). The canonical recognition site for a STAT dimer encompasses 9 to 10 base pairs of DNA (TTCN3–4GAA, where N3–4 is a spacer of three or four arbitrary nucleotides) (3). However, analysis of the binding of activated STATs to DNA targets revealed that the STAT binding sites can extend over two or more adjacent canonical sites (4, 5).
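The canonical site described above can be written as a simple pattern. The following is a minimal sketch, not a method used in this work: the function name `find_stat_sites` and the example probe are hypothetical, and only the forward strand is scanned (the consensus is close to palindromic, but a real search would also consider the reverse complement).

```python
import re

# Canonical STAT dimer site from the text: TTC, then 3 or 4 arbitrary
# bases, then GAA (9 or 10 base pairs in total). Wrapping the pattern in
# a lookahead lets finditer report sites that overlap one another.
STAT_SITE = re.compile(r"(?=(TTC[ACGT]{3,4}GAA))")


def find_stat_sites(sequence):
    """Return (start, site) pairs, 0-based, for canonical sites in sequence.

    At most one site is reported per start position; when both spacer
    lengths would fit, the 4-base spacer is tried first.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in STAT_SITE.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical probe carrying two 9-bp sites; not a sequence from the paper.
    probe = "AGTTCCCGGAATTGACTTTCTAGGAAAC"
    for start, site in find_stat_sites(probe):
        print(start, site)
```

Reporting the start positions makes it straightforward to check the spacing between adjacent sites, which is the quantity relevant to cooperative binding.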

Figure 1

(A) Sequence alignment of the conserved N-domain of the STAT family and secondary structure of this domain of STAT-4 (27). Human (hSTAT), murine (mSTAT), and Drosophila (DSTAT) proteins are included. The numbering is according to STAT-4. α-Helices α1 to α8 are drawn as cylinders. The blackened part of helix α2 indicates a 3₁₀ helix. Invariant residues are highlighted with an asterisk below the alignment. Conserved residues in the hydrophobic core are marked with filled circles above the STAT-4 sequence. Residues in helices α6 and α7 that contribute to the packing of the coiled-coil are boxed, and their position in the helical repeats is indicated (a or d). (B) Schematic representation of two STAT dimers bound to adjacent target sites. Interactions between N-domains (N, circled) allow the dimers to bind to each other. Phosphotyrosines are indicated by Y attached to encircled P symbols. DBD, DNA binding domain; SH2, SH2 domain; TAD, transactivation domain.

Mammalian transcription factors activate transcription and achieve biological specificity through interactions with other transcription factors, trans-activators, or the general transcription machinery (6). Although the molecular basis for these phenomena is poorly understood, direct protein-protein interactions among multiple promoter-bound proteins appear to mediate this synergistic activation (7). In the case of the STATs, a small NH2-terminal domain mediates a number of important protein-protein interactions that influence STAT function (8). This domain allows cooperative interactions between STAT dimers bound to adjacent target sites on DNA, leading to a prolonged half-life of the protein-DNA complex (9). Functional assays exploring transcriptional regulation of the hepatic Spi 2.1 gene revealed the necessity for cooperative STAT binding to two adjacent recognition sites for a full growth hormone response (10). These cooperative contacts also affect the binding site selection of different STATs on a natural promoter that contains multiple potential STAT recognition sites (4). Deletion of the NH2-terminal ∼100 residues of STAT-1 or STAT-4 abolishes cooperative binding to DNA (4, 9). The truncated protein fully retains binding to a single target site as a dimer, suggesting that the N-domain is dispensable for dimer formation and DNA binding (9), but is necessary for interaction between STAT dimers and binding site discrimination (4). The N-domain of STAT-1 is also required for interaction between STAT-1 and the transcriptional coactivator protein CBP, a large (∼2500 amino acids) polypeptide with transacetylase activity (11). Additionally, the amino-terminal region of STAT-2 is involved in binding to the intracellular region of the interferon-α receptor (12).

The NH2-terminal 131 residues of STAT-1 form a stable domain that is readily cleaved off the intact molecule, indicating that it is an independently folded module (9). Sequence alignments show that the N-domain is highly conserved (Fig. 1B). The average sequence identity for this region between mammalian STAT proteins is 40%, and ranges from 51% between STAT-1 and STAT-4 to 20% between STAT-5 and STAT-6. Over the ∼750 amino acids that span the length of the common core of the STATs, only the SH2 domain is more highly conserved (13). The N-domain is also found in the Drosophila STAT (dStat92E) (Fig. 1B) (14) and in a recently discovered STAT in Dictyostelium discoideum (15). The first gene defect established in the DStat92E gene is a misspliced variant that produces both normal mRNA and an mRNA encoding only the NH2-terminal 41 residues. Expression of this fragment has a partial dominant negative effect on transcriptional activation by the wild-type protein in cell culture and in the fly is associated with a weak abnormal phenotype (16).
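Pairwise identities such as those quoted above are computed from aligned sequences. The sketch below shows one common convention, counting identical residues over the columns in which neither sequence has a gap. The convention used for the values in the text is not stated, so this choice is an assumption, and the function name and toy sequences are hypothetical rather than taken from the STAT alignment.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped. Other
    conventions (for example, dividing by the shorter sequence length)
    give somewhat different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped column: not counted
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned fragments for illustration only (not real STAT sequences).
    print(percent_identity("ACDE-F", "ACQEGF"))
```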

We solved the crystal structure of the N-domain of STAT-4 (17) by multiwavelength anomalous diffraction and have refined an atomic model to a resolution of 1.45 Å (R = 19.4%, Rfree = 22.3%) (Table 1). The STAT-4 N-domain is all helical, with an unusual architecture. Instead of the up-down connectivity of helix bundles or the box-like helical packing of the globin fold, the N-domain is constructed from three distinct structural elements that pack together. The NH2-terminal 40 residues encompass the first four helices (α1 to α4), which form a ring-shaped element (colored red in Fig. 2 and Fig. 3B). A small helix (α5) connects this ring to the next structural element, an antiparallel coiled-coil formed by helices α6 and α7. The heptad repeat of hydrophobic amino acids, characteristic of coiled-coils, is conserved across the STATs (Fig. 1B). Finally, the distal surface of the ring-shaped element forms a docking site for the last helix in the structure (α8). The overall appearance of the structure is that of a hook, with the inner surface of the hook being formed by the intersection of the proximal surfaces of the ring-shaped element and the coiled-coil. The N-domain of STAT-4 is dimeric in solution (18), and a twofold symmetry axis in the crystal generates a dimer with an extensive polar interface that involves one face of the hook.

Figure 2

Tertiary structure of the N-domain of STAT-4. (A) Overall representation of two monomers (green and gray) in the crystallographic dimer, viewed approximately orthogonal to the molecular twofold axis, which is vertical. The ring-shaped NH2-terminal element is colored red in one monomer. (B) Orthogonal view of one of the N-domains shown in (A), depicting details of the architecture of the ring-shaped element. Side chains that participate in a charge-stabilized hydrogen-bond network are shown in a ball-and-stick representation. The side chain and backbone carbonyl of buried R31 are shown in magenta. For clarity, the indole ring of the invariant residue W4 that seals off this arrangement on the proximal side is drawn with thinner bonds. The blue sphere denotes a buried water molecule. Hydrogen bonds are indicated by dotted lines. Oxygen, nitrogen, and carbon atoms are red, blue, and yellow, respectively. Q3-N marks the position of the backbone amide group of residue Q3. The light-red segment of helix α2 highlights its 3₁₀ helical conformation. Fig. 2 and Fig. 3, B and C were created with the program RIBBONS, version 2.0 (28).

Figure 3

Structure of the dimer of N-domains. (A) Surface representation of the N-domain dimer indicating the wedge-shaped groove and the dimerization interface. Shown are two monomers of a dimer with the left one rotated 90° around the vertical axis away from the original position in the dimer. Note the hook-like appearance of the monomer with the coiled-coil of helices α6 and α7 pointing out of the planar surface formed by the ring-shaped element comprising the NH2-terminal 40 residues. Residues from three separate regions of the N-domain make direct or water-mediated contacts in the dimer and are color-coded according to their position. Interface residues at the NH2-terminus are in green, those in helices α3 and α4 are in blue, and amino acids located in helix α6 are yellow. The position of the critical W37 is highlighted in red. The figure was created using GRASP (29). (B) A view at the dimerization interface with amino acids represented as ball-and-stick models and the Cα backbone as a ribbon. The monomer is in the same orientation as the one on the right side of (A). Side chains are colored as in (A); the backbone ribbon is colored as in Fig. 2B, with the first 40 residues highlighted in red. L33 makes a backbone carbonyl group contact, and its position is represented by the filled circle. In the STAT-4 recombinant N-domain used for crystallization, M1 was replaced with G plus four additional small amino acids, one of which (G1) is visible in the electron density map. In the crystals, the NH2-terminus of G1 is part of the dimer interface, possibly substituting for the native M1. (C) Close-up stereoview of the intermolecular hydrogen-bonding network in the dimer. Selected side chains surrounding the conserved W37 (magenta) in helices α4 and α6 of two monomers (green and gray) are shown. W37 makes direct (E66′) and water-mediated contacts (Q63′). Water molecules are depicted as blue spheres.

Table 1

Crystallographic analysis. The STAT-1 N-domain formed only small, needle-like crystals. However, hanging drops (1 μl) of STAT-4 N-domain were mixed with equal volumes of reservoir buffer containing 0.2 M Na+CH3COO−, 0.1 M tris HCl (pH 8.0), and 17% PEG4000, and hexagonal crystals (0.2 mm by 0.2 mm by 0.2 mm) were routinely grown overnight at 20°C. The crystals contain one molecule of the STAT-4 NH2-terminal domain in the asymmetric unit and are in space group P6₅22 (a = 79.51 Å, b = 79.51 Å, c = 84.68 Å). Crystals were cryoprotected in reservoir solution enriched in PEG to 20% and glycerol to 22.5% before flash-freezing. Heavy-atom derivatives were prepared by soaking crystals for 30 min in a 1:20 diluted (with cryoprotective solution) saturated solution of p-hydroxy-mercuribenzoic acid (PHMB). Data for the native crystal were collected at Brookhaven National Laboratory (BNL) at beam-line X25, using a Mar imaging-plate detector system (Mar, Norderstedt, Germany). A multiwavelength anomalous diffraction (MAD) experiment on a PHMB-derivatized crystal was performed at BNL on beam-line X4A, using Fuji imaging plates. Data processing and reduction were done with HKL, DENZO, and SCALEPACK (Z. Otwinowski and W. Minor). Model building was performed with O (30). Bulk solvent correction and anisotropic B-factor scaling were applied during refinement, using X-PLOR (31). Of the five heterologous residues at the NH2-terminus, the first three residues (GSG), as well as the COOH-terminal residue Q124, are not visible in the electron density map. No amino acids occupy disallowed regions of the Ramachandran plot, and 95% fall into the most favored region.
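As a worked check on the cell parameters above, the volume of a hexagonal unit cell follows from V = a²c·sin(120°), and dividing the volume per asymmetric unit by the protein mass gives the Matthews coefficient. The sketch below is illustrative only. The function names are hypothetical, and the molecular mass of about 14.5 kDa is an assumed round figure for a domain of roughly 125 residues, not a value reported here.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (cubic angstroms) of a hexagonal cell with a = b, gamma = 120 deg."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, n_asym_units, molecules_per_asu, mass_da):
    """Crystal volume per unit of protein mass (cubic angstroms per dalton)."""
    return cell_volume / (n_asym_units * molecules_per_asu * mass_da)


# Cell edges quoted in the Table 1 legend.
A_EDGE = 79.51
C_EDGE = 84.68
# P6(5)22 has 12 general positions, hence 12 asymmetric units per cell.
N_ASYM_UNITS = 12
# ASSUMPTION: approximate mass of the crystallized fragment; not from the paper.
ASSUMED_MASS_DA = 14500.0

if __name__ == "__main__":
    volume = hexagonal_cell_volume(A_EDGE, C_EDGE)
    vm = matthews_coefficient(volume, N_ASYM_UNITS, 1, ASSUMED_MASS_DA)
    print(f"cell volume: {volume:.0f} cubic angstroms")
    print(f"Matthews coefficient (assumed mass): {vm:.2f}")
```

Under the assumed mass, the coefficient falls within the range commonly seen for protein crystals, which is consistent with one molecule per asymmetric unit.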


The N-domain has a well-defined hydrophobic core that is conserved across the STATs, consistent with a stable and defined fold (Fig. 1B and Fig. 2). However, the NH2-terminal ring-shaped element is stabilized by polar interactions involving buried charges. The ring is closed off by α helix-dipole interaction between the NH2-terminal region of helix α1 and the carboxylate group of Glu39, presented by the COOH-terminal region of helix α4 (Fig. 2B). Glu39 forms a hydrogen bond with the amide nitrogen of residue Gln3 and is oriented correctly for this charge dipole interaction by the side chain of Arg31, which in turn forms a buried ion pair with Glu112. Glu112 is positioned by interactions with Tyr22 and a buried water molecule. Each of the side chains involved is invariant in all STATs (Fig. 1B), indicating that the ring-shaped element is conserved in architecture.

A consequence of the use of these polar groups in the ring-shaped element is the formation of a compact and potentially specific interaction surface. This structural element forms a relatively flat molecular surface that packs at an angle against another surface presented by the coiled-coil formed by helices α6 and α7. The juxtaposing of the surface of the ring-shaped element with that of the coiled-coil results in a wedge-shaped groove. This groove is lined with hydrophobic residues, with polar residues at the center, and appears as though it could be a site of interaction with other proteins. A possible function for this groove is suggested by the fact that replacement of Arg31 or Glu39 in STAT-1 by Ala results in a molecule that is more slowly dephosphorylated after interferon-γ induction than the wild-type protein (19). Thus, a phosphatase that controls STAT dephosphorylation might bind to the groove in the N-domain.

There is one molecule in the asymmetric unit of the STAT-4 crystal, and it is related to another by a twofold symmetry axis (Fig. 2A and Fig. 3A). There is an extensive interface between the two monomers of the dimer that buries 1714 Å² of surface area (Fig. 3A). An extended intermolecular hydrogen bonding network is formed at the interface that includes 15 amino-acid side chains and 12 water molecules per monomer (Fig. 3B). In addition, five backbone contacts are observed in each monomer. Eleven of the 15 residues at the interface make direct hydrogen-bonding contacts to the other monomer. The water molecules at the interface are very well defined in the electron density map, and many of them have low temperature factors (<10 Å²) (Fig. 3C).

The interface between N-domain monomers is almost entirely polar. In contrast to the leucine zipper, wherein hydrophobic residues are used to generate the intermolecular interface by the formation of a coiled-coil across the dimer interface (20), the coiled-coil in the N-domain is firmly anchored within the domain and its role is to serve as an architectural support for the presentation of a number of interacting side chains at the interface between N-domains and at the potential interaction groove. Whereas hydrophobic interactions are associated with stabilization of folded protein structures and are often found at the core of tight interfaces, polar interactions can provide both stability and specificity in protein-protein interactions (21). In contrast to the residues that constitute the buried core of the N-domain, which are conserved across STATs, the majority of the residues at the dimer interface are not conserved (Fig. 1B and Fig. 3B). This variation may provide specificity in STAT dimer-dimer interactions on DNA. Only two of the residues at the interface are invariant in all STATs: Trp37, a central anchor residue at the interface, and Glu39, which also participates in the formation of the ring-shaped element.

To test the physiological relevance of the dimer of N-domains that is observed in the crystal structure, we determined whether mutation of the critical residue Trp37 to Ala (W37A) at the interface would disrupt or reduce oligomerization in vitro and transcriptional activation in vivo. These experiments were done with STAT-1 (22). Because of the close similarity between the N-domains of STAT-1 and STAT-4 (51% amino acid identity), we expect structural information derived from the STAT-4 crystal structure to represent the STAT-1 architecture as well.

We used an oligonucleotide bearing tandem binding sites that binds two STAT-1 dimers. This site contains two weak binding sites, spaced 10 base pairs apart (23). Competition experiments show that the off time is long for wild-type STAT-1 (greater than 15 to 30 min), indicating the formation of a stable tetrameric complex. In contrast, if an oligonucleotide containing only a single weak site is used instead, the off time is very short (<30 s) (9). The stabilization of STAT-1 on oligonucleotides containing tandem binding sites is not observed if the N-domain is deleted (9). The W37A mutant protein bound to the DNA probe with tandem sites as a dimer, but the tetrameric interaction was completely displaced by the addition of unlabeled oligonucleotides (Fig. 4A). In contrast, the wild-type protein was resistant to displacement for more than 15 min. We used the same two tandem weak binding sites to drive transcription from a reporter gene in an interferon-dependent transcriptional assay (24) using U3A cells, which lack endogenous STAT-1 (25). In U3A cells transfected with the reporter gene and with either wild-type or W37A mutant STAT-1, the rather weak transcriptional induction of about twofold by interferon γ was abolished by the mutation (Fig. 4B).
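The contrast between the two off times can be made concrete with a simple decay model. The sketch below assumes first-order dissociation in the presence of excess competitor and treats the quoted off times as half-lives. Both are simplifying assumptions made for illustration rather than the analysis used in these experiments, and the function name is hypothetical.

```python
def fraction_bound(elapsed_s, half_life_s):
    """Fraction of complexes still intact after elapsed_s seconds.

    Assumes first-order dissociation with no rebinding, which is the
    situation approximated by adding excess unlabeled competitor.
    """
    if half_life_s <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_s / half_life_s)


if __name__ == "__main__":
    chase_s = 15 * 60  # a 15-min competitor chase
    # Bounds from the text, used as illustrative half-lives: <30 s for a
    # single weak site, >15 min for tandem sites with wild-type protein.
    print("single site:", fraction_bound(chase_s, 30.0))
    print("tandem sites:", fraction_bound(chase_s, 15 * 60.0))
```

On these assumptions, essentially no single-site complex would survive a 15-min chase, whereas at least half of the tandem-site complex would remain, which matches the qualitative difference seen in the competition experiments.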

Figure 4

Importance of the invariant residue W37 for STAT-1 oligomerization (tetramerization) and mediation of gene activation. (A) A gel mobility shift. Comparison of tetramer stability between wild-type (WT) STAT-1 (lanes 3 and 4) and the W37A mutant (lanes 1 and 2). Radiolabeled DNA containing a tandem binding site was incubated with equal amounts of active protein of either Tyr-phosphorylated WT-STAT or the mutant protein and then with excess (30-fold) unlabeled oligonucleotide for the indicated amount of time. Positions of tetrameric [2×(dimer)] and dimeric (Dimer) complexes are indicated. Samples loaded at the later time point (15 min) were subjected to electrophoresis for a shorter time and therefore ran higher on the gel. The position of unbound oligonucleotide (Free) is marked. (B) Effect of STAT-1 W37A mutation on interferon-γ (IFN-γ)–stimulated gene activation in vivo. U3A cells were transfected with expression clones containing either wild-type or mutant STAT-1 along with a luciferase reporter containing a tandem STAT-binding site as an enhancer. After stimulation with IFN-γ for 10 hours, luciferase expression was determined spectroscopically. Each bar represents the average and standard deviation of 10 individual parallel experiments.

Activated and dimeric STAT proteins do not form detectable tetramers in solution in the absence of DNA (18). It is not known whether this is a consequence of limited binding affinity between N-domains or whether the conformation of the STAT molecule in the absence of DNA impedes further oligomerization. In any case, the presentation of highly polar and unique interaction surfaces by the N-domains of the STATs provides a ready means for generating very specific interactions between adjacent STAT dimers on DNA, because the hydrogen bonding constraints of the interacting groups place stereochemical constraints on potential partners. Whereas each N-domain dimer is closed, the fact that each STAT dimer presents two N-domains for interaction makes possible the generation of open-ended STAT-STAT interactions that are limited only by the nature and number of the adjacent DNA binding sites.

  • * Present address: Max-Planck-Institut für Biochemie, Abteilung für Zelluläre Biochemie, Am Klopferspitz 18, 82152 Martinsried, Germany.

  • To whom correspondence should be addressed. E-mail: kuriyan{at}

