Synthetic transcription elongation factors license transcription across repressive chromatin

Science  22 Dec 2017:
Vol. 358, Issue 6370, pp. 1617-1622
DOI: 10.1126/science.aan6414

Chemical control of transcription

Friedreich's ataxia, a devastating neurodegenerative disease with no effective therapy, is caused by an expansion of intronic repeats and hence a reduced expression of the FXN gene. Erwin et al. synthesized a molecule that specifically targets the expanded repressive repeats. This molecule thereby licenses productive transcription elongation and restores FXN expression to normal levels. In the future, similar interventions may be effective in a diverse array of diseases caused by unstable expansions in microsatellite repeats.

Science, this issue p. 1617


The release of paused RNA polymerase II into productive elongation is highly regulated, especially at genes that affect human development and disease. To exert control over this rate-limiting step, we designed sequence-specific synthetic transcription elongation factors (Syn-TEFs). These molecules are composed of programmable DNA-binding ligands flexibly tethered to a small molecule that engages the transcription elongation machinery. By limiting activity to targeted loci, Syn-TEFs convert constituent modules from broad-spectrum inhibitors of transcription into gene-specific stimulators. Here we present Syn-TEF1, a molecule that actively enables transcription across repressive GAA repeats that silence frataxin expression in Friedreich’s ataxia, a terminal neurodegenerative disease with no effective therapy. The modular design of Syn-TEF1 defines a general framework for developing a class of molecules that license transcription elongation at targeted genomic loci.

A long-standing challenge at the interface of chemistry, biology, and precision medicine is to develop molecules that can be programmed to regulate the expression of targeted genes (1). It is increasingly evident that RNA polymerase II (Pol II) pauses during transcription (2, 3). Regulated release from the paused state into productive elongation is emerging as a critical step in gene expression. The number of diseases associated with proteins that play a role in implementing the pause or subsequent release into productive elongation is rapidly growing (4–6). Given this context, we focused on creating molecules that enable Pol II to surmount barriers to productive elongation at targeted genomic loci. At their core, these synthetic transcription elongation factors (Syn-TEFs) incorporate two distinct chemical moieties: (i) programmable DNA binders that target desired genomic loci and (ii) ligands that engage the transcription elongation machinery.

Pyrrole- and imidazole-based polyamides have emerged as a class of synthetic molecules that can be programmed to bind specific DNA sequences by using well-defined molecular recognition rules (7, 8). Recent examination of the genome-wide distribution of two polyamides designed to target different sequences revealed that these molecules are primarily enriched at genomic loci bearing clusters of binding sites (9). A “summation of sites” (SOS) model integrating the affinity of a given polyamide for all potential binding sites that occur within a ~400–base pair window best encapsulated the genome-wide binding preferences (9). Consistent with the SOS model, a polyamide previously designed to target a AAGAAGAAG site is enriched at repressive GAA microsatellite repeats within the first intron of frataxin (FXN) in a cell line derived from a Friedreich’s ataxia (FRDA) patient (10).
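The summation-of-sites idea described above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the scoring procedure of reference 9: the function name `sos_profile`, the single-strand scan, and the toy affinity table (one 9-mer with an arbitrary affinity of 1.0) are hypothetical choices made for the example. Each window score is the sum of the affinities of every k-mer site that lies entirely inside that window.

```python
from typing import Dict, List


def sos_profile(seq: str, affinity: Dict[str, float], k: int,
                window: int = 400) -> List[float]:
    """Sliding-window sum of per-site affinities (toy SOS-style score).

    Assumptions (not from the source): `affinity` maps k-mers to relative
    affinities, k-mers missing from the table score 0.0, and only the
    given strand is scanned.
    """
    seq = seq.upper()
    n_sites = len(seq) - k + 1
    if n_sites <= 0:
        return []

    # Affinity of the candidate site starting at each position.
    site_scores = [affinity.get(seq[i:i + k], 0.0) for i in range(n_sites)]

    # Prefix sums make each window sum an O(1) lookup.
    prefix = [0.0]
    for score in site_scores:
        prefix.append(prefix[-1] + score)

    # Number of k-mer start positions that fit fully inside one window.
    sites_per_window = max(window - k + 1, 1)
    n_windows = max(n_sites - sites_per_window + 1, 1)

    return [
        prefix[min(start + sites_per_window, n_sites)] - prefix[start]
        for start in range(n_windows)
    ]


# Toy input: 600 bp of unrelated sequence followed by 200 GAA repeats.
toy_affinity = {"AAGAAGAAG": 1.0}  # hypothetical affinity value
toy_sequence = "ACGT" * 150 + "GAA" * 200
profile = sos_profile(toy_sequence, toy_affinity, k=9, window=400)
```

In this toy input, windows that fall in the unrelated flank score zero, whereas windows inside the repeat tract accumulate many overlapping matches. This mirrors why a clustered repeat expansion can outscore an isolated high-affinity site under a summation model.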

In FRDA cells, expanded GAA repeats are enriched in repressive histone marks and can also adopt uncommon DNA conformations that impede transcription (11, 12). The number of repeats positively correlates with both the extent of repression and the severity of disease (13, 14). The prevailing models in the field are that repressive chromatin and/or unusual DNA conformations present a barrier to the productive elongation of the FXN transcripts (11, 12, 15–18). Efforts to reverse repressive chromatin marks with freely diffusing histone deacetylase inhibitors or the use of a polyamide intended to drive uncommon structures toward canonical B-form DNA conformation did not elicit sufficient FXN expression (10, 19). Therefore, we reasoned that a synthetic molecule capable of binding repressive GAA repeats and actively assisting productive elongation would restore FXN expression to levels observed in normal cells. A pivotal step in the transition of a paused Pol II into productive elongation is the recruitment of the positive transcription elongation factor b (P-TEFb). This complex contains the cyclin-dependent kinase 9 (CDK9), which phosphorylates multiple proteins, including Pol II, to facilitate transcription elongation (2, 5, 20).

To avoid perturbing CDK9 kinase activity, we focused on ligands of BRD4, a protein that binds acetylated histones and engages active P-TEFb at transcribed genes (20). Among BRD4 ligands, JQ1 has been extensively characterized and shown to competitively displace BRD4 from regulatory regions of the genome (21). JQ1 therefore functions as a broad-spectrum inhibitor of oncogene-stimulated transcription, and a chemical derivative is now in clinical trials (21). On the basis of its mechanism of action, we reasoned that tethering JQ1 to specific genomic loci would mitigate its global inhibitory properties and convert this molecule into a locus-specific stimulator of transcription. Moreover, rather than stimulating transcription initiation, we reasoned that JQ1-dependent recruitment of the elongation machinery across the length of the repressive GAA repeats would enable Pol II to actively overcome the barrier to transcriptional elongation across the silenced FXN gene.

To design bifunctional Syn-TEFs, we examined the crystal structures of the polyamide-nucleosome complex and the JQ1-BRD4 bromodomain complex and identified optimal sites for chemical conjugation (Fig. 1A) (21, 22). Polyamides PA1 and PA2 were conjugated to JQ1 to generate Syn-TEF1 and Syn-TEF2, respectively (Fig. 1B, figs. S1 and S2, and table S1) (10, 23). Genome-wide binding profiles confirm that the linear polyamide PA1 binds GAA repeats, whereas the hairpin polyamide PA2 targets an unrelated sequence (9). The ability of Syn-TEFs to stimulate expression of endogenous FXN was examined in GM15850 cells, a FRDA patient-derived cell line (Fig. 1C). In this lymphoblastoid cell line, FXN levels are reduced by ~90% compared with GM15851 cells from the patient’s healthy sibling (Fig. 1C and fig. S4). In a dose-dependent manner, Syn-TEF1 restored FXN expression in FRDA cells to the levels observed in healthy cells (Fig. 1C and fig. S5B). Syn-TEF2, which does not target GAA repeats, did not activate FXN expression in either cell line, demonstrating the requirement for sequence-specific DNA targeting (Fig. 1C). FXN transcripts that are induced by Syn-TEF1 function were spliced and translated to levels comparable to those in healthy cells (Fig. 1D).

Fig. 1 Synthetic transcription elongation factors (Syn-TEFs) selectively activate FXN expression.

(A) Co-crystal structures of JQ1 bound to BRD4 [Protein Data Bank (PDB) ID, 3MXF] and polyamide bound to nucleosomal DNA (PDB ID, 1M1A). The distance allowing interaction of these complexes is estimated. (B) Linear PA1 and Syn-TEF1 target the DNA sequence 5′-AAGAAGAAG-3′. Hairpin PA2 and Syn-TEF2 target 5′-WTACGTW-3′, where W is A or T. The structures of N-methylpyrrole (open circles), N-methylimidazole (filled circles), 3-chlorothiophene (squares), and β-alanine (diamonds) are shown. N-methylimidazole is bolded for clarity. The structure of JQ1 linked to polyethylene glycol (PEG6) is represented as a blue circle. The asterisk indicates the site where the R group attaches to the polyamide. (C) Relative expression of FXN mRNA in GM15850 (left) and GM15851 (right) cell lines by quantitative reverse transcription polymerase chain reaction. Results are means ± SEM (n = 4), normalized to the relative expression of FXN in GM15851 cells (fig. S4). All treatments were 24 hours with 1 μM of the indicated molecule, except dimethyl sulfoxide (DMSO, represented by the dash; 0.1%) and Syn-TEF1 (0.1, 0.5, or 1 μM). *P < 0.05; **P < 0.01. (D) Immunoblot of FXN and α-tubulin (TUB) with treated GM15850 (left) and GM15851 (right) cells. Cells were treated as in (C). (E) Volcano plots of RNA-seq data display the change in global gene expression after 24 hours of treatment of GM15850 (left) and GM15851 (right) cells with 1 μM Syn-TEF1 (n = 4). Values represent the posterior probability of equal expression (PPEE) versus fold change in expression normalized to DMSO-treated samples (n = 4). FXN and c-MYC are labeled in red and blue, respectively.

Fig. 2 Syn-TEF1 recruits BRD4 to its target sites and licenses productive Pol II elongation at FXN.

All data are from GM15850 cells treated for 24 hours with the indicated molecules. Signal traces are in reference-adjusted reads per million reads per base pair (rrpm/bp). (A) Summation of sites (SOS) profile of PA1 and Syn-TEF1 across the FXN locus. (B) BRD4 occupancy at the FXN locus. (C) Occupancy by phosphorylated serine 2 (phospho-Ser2) of the C-terminal domain of RNA Pol II at the FXN locus. (D) Occupancy of RNA Pol II at the 5′ region of FXN. The gray bar identifies the location of the repressive GAA repeats, and cyan regions highlight unannotated reads but do not have defined quantitative values. (E) Occupancy of H3K4me3 and H3K36me3 measured at several locations within the FXN locus. Results are means ± SEM (n = 3). (F) A model of the cascade of interactions and reactions initiated by Syn-TEF1 at FXN. CTD, C-terminal domain; S2P, phospho-Ser2.

To determine the specificity of action in the context of the entire transcriptome, we performed RNA sequencing (RNA-seq) in GM15850 and GM15851 cells (Fig. 1E, fig. S6, and tables S9 to S16). FXN was the transcript most significantly stimulated by Syn-TEF1 in the diseased cell line. The global transcriptome was minimally perturbed, with 11 genes differentially expressed by more than twofold (Fig. 1E and tables S9 and S23). Milder Syn-TEF1–dependent perturbations occurred for an additional 225 genes, of which as many as 29 coincide with known FRDA expression networks (table S24) (24). Parallel treatment of the control GM15851 cells did not alter FXN expression and elicited a comparably muted effect on the global transcriptome (Fig. 1E and table S10). This result starkly contrasts with the finding of 4091 genes whose levels were significantly perturbed by freely diffusing JQ1 (fig. S6E and table S11). Consistent with its antiproliferative properties (21), freely diffusing JQ1 down-regulates expression of the oncogene c-MYC. Because Syn-TEF1 targets JQ1 to specific genomic loci away from c-MYC, no change in c-MYC expression was observed in cells treated with this bifunctional molecule (Fig. 1E).

To determine whether Syn-TEF1 stimulates FXN by engaging the endogenous elongation machinery, we performed genome-wide chromatin immunoprecipitation analysis of BRD4, Pol II, and elongation-specific phosphorylated serine 2 (phospho-Ser2) marks on the largest subunit of Pol II. Given that polyamides bind to clustered sites in heterochromatin (9), we expected that Syn-TEF1 would localize to the first intron of FXN in disease cells that contain the GAA-repeat expansion, but not in healthy cells (Fig. 2A and fig. S7). Consistent with the SOS profile of Syn-TEF1 binding, BRD4 levels were markedly increased across the GAA-repeat expansion in FRDA cells (Fig. 2B). Because current algorithms remove sequencing reads that map to identical GAA repeats, the unannotated region (650 GAA repeats shown) is represented by a gap colored in blue in Fig. 2, A to D. Perhaps more striking is the profile of phospho-Ser2 marks placed on the productively elongating RNA Pol II (Fig. 2C and table S7). The peak of phospho-Ser2 enrichment is offset downstream of the BRD4 peaks, consistent with sequential action of P-TEFb and subsequent licensing of Pol II for productive transcription elongation. Owing to the mechanistic coupling of transcriptional processes, phospho-Ser2 marks were retained until termination, well beyond the point of BRD4 recruitment by Syn-TEF1 (Fig. 2, B and C). Unexpectedly, a promoter-proximal BRD4 peak overlapped with the paused Pol II upstream of the GAA repeats (Fig. 2D and fig. S8). Upon treatment with Syn-TEF1, the decrease in paused Pol II coincided with the increase in Pol II levels within the body of FXN, thus furnishing evidence for licensing of productive elongation (Fig. 2D and table S6). Furthermore, trimethylation of lysine 36 of histone H3 (H3K36me3), a signature of productive elongation, increased downstream of the GAA repeats in Syn-TEF1–treated cells (Fig. 2E). 
In support of enhanced elongation, a downstream shift in trimethylation of lysine 4 of histone H3 (H3K4me3), a promoter-proximal chromatin mark, was also observed (Fig. 2E). In agreement with previous reports, we did not observe a dramatic increase in H3K4me3 marks at the promoter upon Syn-TEF1 treatment (11, 16). Consistent with a barrier to elongation, tethering proteins that stimulate transcriptional initiation failed to enhance FXN expression, whereas tethering VP16 or derivatives that can stimulate elongation elicited modest FXN expression (25). As our results demonstrate, targeted recruitment of an elongation factor across the GAA repeats restores FXN expression in FRDA cells.

To further investigate the specificity of Syn-TEF1, we examined the enrichment of BRD4, Pol II, and phospho-Ser2 at Syn-TEF1–targeted genomic loci. These loci were rank-ordered by their affinity for Syn-TEF1 (Fig. 3A and table S22). Whereas the BRD4 enrichment profile correlated with the polyamide binding profile (compare Fig. 3, A and B), neither phospho-Ser2 nor Pol II showed any enrichment over a 10,000–base pair window centered on the polyamide-targeted genomic loci (Fig. 3B and fig. S10). Moreover, genome-wide binding of BRD4 was not perturbed by Syn-TEF1, whereas freely diffusing JQ1 markedly reduced BRD4 occupancy en masse (Fig. 3C and fig. S10B). This observation is congruent with the minimal impact of Syn-TEF1 on global transcriptome profiles in healthy and diseased cells (Fig. 1E). Thus, Syn-TEF1 displays regulatory properties distinct from those of inhibitors of BRD4 that globally disrupt the transition to transcription elongation and elicit cell cycle arrest in cancer cells (21). We next mapped Syn-TEF1–targeted genomic loci to transcription start sites of proximal genes and compared SOS scores with fold change in mRNA expression (Fig. 3D). In addition to the importance of a high SOS score proximal to a TSS, the results suggest that genes with substantial pausing of RNA Pol II [low licensing ratios (LRs)] respond to Syn-TEF1 (Fig. 3E). Taken together, the results demonstrate that in the absence of paused or stalled Pol II, simply recruiting BRD4 to a genomic locus does not elicit transcriptional initiation. This conclusion is supported by a recent study that delivered JQ1 to two endogenous promoters and enriched BRD4 at those loci but did not find an increase in targeted gene expression from either locus (26). The dependence on paused Pol II imposes mechanistic constraints on the function of DNA-tethered JQ1, and it serves as a valuable specificity filter to limit Syn-TEF1 function to FXN, with minimal perturbation of the global transcriptome.

Fig. 3 Syn-TEF1 recruits BRD4 to its target sites and selectively activates FXN.

All data are from GM15850 cells treated for 24 hours with the indicated molecules (1 μM). (A) Heatmap of the SOS profile of Syn-TEF1 across the top 250 predicted binding sites of PA1 (9, 31). (B) Heatmaps of BRD4, Pol II, and phospho-Ser2 occupancy across the same genomic loci as in (A). kb, kilobases. (C) Occupancy of BRD4 at BRD4 binding sites across the genome after treatment with Syn-TEF1 or PA1 and JQ1. ChIP, chromatin immunoprecipitation. (D) Scatterplot of SOS score versus distance of the predicted Syn-TEF1 binding site to the transcription start site (TSS) for the top 500 genes predicted to be targeted by Syn-TEF1. Each gene is shaded according to the change in gene expression after Syn-TEF1 treatment. (E) Scatterplot of the SOS score, change in gene expression (after Syn-TEF1 treatment), and licensing ratio (LR) of RNA Pol II for the top 500 genes predicted to be targeted by Syn-TEF1. FC, fold change.

Next, we examined the impact of Syn-TEF1 on primary cells and cell lines derived from more than 20 FRDA patients with different genetic backgrounds and different GAA-repeat expansions. In lymphoblastoid cells, fibroblasts, and induced pluripotent stem cells (iPSCs) derived from FRDA patients, Syn-TEF1 stimulated FXN expression, whereas the control molecules or treatments did not (Fig. 4, A to E). To examine the ability of Syn-TEF1 to stimulate FXN expression in disease-relevant cell types, we differentiated GM23913 pluripotent cells to cardiomyocytes (27). Upon differentiation, cardiomyocytes expressed cardiac-specific markers and displayed rhythmic beating in culture (Fig. 4, B and D, figs. S11 and S12, and movie S1). Syn-TEF1 robustly stimulated FXN expression in these cells, whereas JQ1, with or without PA1, led to cytotoxicity (Fig. 4C). Like cardiomyocytes, neurons are particularly vulnerable to a reduction in FXN expression (28). Sensory neurons derived from three iPSC lines (Fig. 4F and fig. S13) displayed evidence of Syn-TEF1–responsive production of processed mature FXN protein (Fig. 4E). In addition to cultured cells, primary peripheral blood mononuclear cells obtained from 11 FRDA patients were genotyped and treated in parallel with Syn-TEF1. FXN expression was stimulated by Syn-TEF1 in all but one sample from the 11 FRDA patients (Fig. 4G).

Fig. 4 Syn-TEFs activate FXN expression in primary patient cells and patient-derived fibroblasts, iPSCs, cardiomyocytes, sensory neurons, and mouse xenografts.

All treatments were 24 hours, except where specified. (A) Relative expression of FXN mRNA, normalized to GAPDH in three lymphoblastoid cell lines derived from three different FRDA patients. All treatments were 1 μM (means ± SEM; n = 3). *P <0.05; **P < 0.01. (B) Expression of cell type–specific markers in GM23913 iPSCs or iPSC-derived cardiomyocytes after treatment with 0.1% DMSO (means; n = 2). (C) Syn-TEF1–dependent induction of FXN mRNA in GM23913-derived cardiomyocytes (60 hours of treatment; means; n = 2). (D) Immunohistochemistry of GM23913 iPSCs and iPSC-derived cardiomyocytes. iPSCs were fixed and stained with OCT4 and SOX2. iPSC-derived cardiomyocytes were fixed and stained with TNNT2 and MYL2. Scale bars, 100 μm. (E) Immunoblot of FXN and β-actin (β-ACT) after treatment of three different primary FRDA fibroblasts, fibroblast-derived iPSCs, and sensory neurons with the indicated molecules. Fibroblasts were collected from patients UAB4259 (550/1000), UAB4230 (1000/1200), and UAB66 (90/1025). Cells were treated for 72 to 96 hours. (F) Immunohistochemistry of FRDA patient–derived iPSCs and iPSC-derived sensory neurons. iPSCs were fixed and stained with OCT4 and SSEA-4. Sensory neurons were fixed and stained with neuronal markers CGRP and MAP-2. Scale bars, 100 μm. (G) (i) Genotyped repeats and (ii) relative expression of FXN mRNA normalized to GAPDH in peripheral blood mononuclear cells (PBMCs) from 11 patients after 24 hours of treatment with 1 μM Syn-TEF1. (H) Bioluminescent images of two representative mice harboring xenografts (HEK293 FXN-Luc with six and ~310 GAA repeats in the left and right flanks, respectively) (29). Mice were treated with either vehicle (DMSO) or 0.5 nmol Syn-TEF1 administered subcutaneously into each tumor (1 nmol total per mouse). Mice were imaged 22 hours after treatment. (I) Relative expression of FXN-Luc with six or ~310 GAA repeats after Syn-TEF1 treatment of mice as described in (H). 
Results are means ± SEM (n = 4 and 3 Syn-TEF1–treated and DMSO-treated mice, respectively). *P < 0.05. (J) Aconitase activity in GM16214 lymphoblastoid cells after 72 hours of treatment with DMSO (0.1%), PA1 (125 nM), or Syn-TEF1 (62.5 or 125 nM). Aconitase activity was normalized to GM16215 cells. Results are means ± SEM (n = 3).

To examine the utility of Syn-TEF1 in restoring FXN levels in vivo, we transplanted human cells bearing a luciferase reporter fused in frame within the fifth exon of FXN into the flanks of immunocompromised mice (Fig. 4, H and I). The first reporter cell line contained only six GAA repeats, and the second reporter cell line contained ~310 GAA repeats (29). Consistent with cell culture results (fig. S16) (29), we observed reduced FXN-Luc levels in transplanted cells bearing ~310 GAA repeats (Fig. 4, H and I). Upon Syn-TEF1 treatment, luciferase expression was stimulated in the cells with ~310 GAA repeats, nearly restoring levels to those seen in the reporter cell line with six repeats (Fig. 4, H and I). As evidence of recovery of mitochondrial function, we observed ~90% recovery of aconitase activity in patient-derived lymphoblastoid cells that were treated with Syn-TEF1 (Fig. 4J). Taken together, our results showed that in multiple cell types from 20 FRDA patients with a broad range of repeat expansions and diverse genetic backgrounds, the prototype synthetic transcription elongation factor stimulated FXN expression and restored biological function.

Syn-TEF1 meets key design criteria, including the ability to (i) target desired loci in the genome, (ii) access cognate sites in repressive chromatin, (iii) engage the endogenous elongation machinery, and (iv) license productive transcriptional elongation by a paused Pol II. In essence, our prototype Syn-TEF defines a general framework for the design of a class of molecules that could act on a diverse array of diseases caused by different stages of transcriptional dysfunction, especially at unstable microsatellite repeats (30). The growing mechanistic understanding of gene regulation, and how it goes awry in disease, offers new opportunities for intervention and remediation with precision-tailored synthetic molecules.

Supplementary Materials

Materials and Methods

Figs. S1 to S16

Tables S1 to S24

References (31–44)

Movie S1

References and Notes

Acknowledgments: We thank members of the Ansari laboratory, especially D. Bhimsaria, and members of the Raines laboratory, especially T. Smith, for helpful discussions; L. Vanderploeg and E. N. Korkmaz for help with figures; J. Thomson, R. Stewart, J. Bolin, and S. Swanson for help with RNA-seq; A. Kumar, S. Mohapatra, and D. Waseem for early experiments; R. Dolmetsch and A. Kaykas for facilitating the collaboration with the Neuroscience group at NIBR; and S. Brahmachari and A. Agarwal for facilitating the collaboration with the Ataxia group at IGIB. The NIBR team thanks M. Napierala [University of Alabama at Birmingham (UAB)], D. Lynch [Children’s Hospital of Philadelphia (CHOP)], and FARA (Friedreich’s Ataxia Research Alliance) for making the UAB Friedreich’s ataxia patient lines available for use. Friedreich’s ataxia patient fibroblasts and iPSC-derived sensory neurons were obtained under a material transfer agreement with CHOP. We thank R. Wade-Martins (University of Oxford) for the luciferase reporter cell lines. This work was supported by NIH grants CA133508, GM117362, and HL099773 and a W. M. Keck Medical Research Award to A.Z.A. and by NIH grant P30 AR066524 to D.N.S. G.S.E. was supported by NIH grant GM07215 and a Peterson Fellowship. M.P.G. was supported by a Hilldale scholarship. A.A. was supported by an Indo-US Postdoctoral Fellowship from the Science and Engineering Research Board of the Government of India. RNA-seq and ChIP-seq data are deposited in the National Center for Biotechnology Information Gene Expression Omnibus under GSE99403. A.Z.A., G.S.E., and M.P.G. have filed patent applications relating to the work in this manuscript, including U.S. provisional patent applications 62/478291 and 15/472852, filed 30 March 2016 by the Wisconsin Alumni Research Foundation.