Report

A DNA methylation reader complex that enhances gene transcription


Science  07 Dec 2018:
Vol. 362, Issue 6419, pp. 1182-1186
DOI: 10.1126/science.aar7854

DNA methylation promotes transcription

DNA methylation generally represses transcription, but in some instances, it has also been implicated in transcription activation. Harris et al. identified a protein complex in Arabidopsis that is recruited to chromatin by DNA methylation. This complex specifically activated the transcription of genes that are already mildly transcribed but had no effect on transcriptionally silent genes such as transposable elements. The complex thereby counteracts the repression effect caused by transposon insertion in neighboring genes while leaving transposons silent. Thus, by balancing both repressive and activating transcriptional effects, DNA methylation can act to fine-tune gene expression.

Science, this issue p. 1182

Abstract

DNA methylation generally functions as a repressive transcriptional signal, but it is also known to activate gene expression. In either case, the downstream factors remain largely unknown. By using comparative interactomics, we isolated proteins in Arabidopsis thaliana that associate with methylated DNA. Two SU(VAR)3-9 homologs, the transcriptional antisilencing factor SUVH1, and SUVH3, were among the methyl reader candidates. SUVH1 and SUVH3 bound methylated DNA in vitro, were associated with euchromatic methylation in vivo, and formed a complex with two DNAJ domain-containing homologs, DNAJ1 and DNAJ2. Ectopic recruitment of DNAJ1 enhanced gene transcription in plants, yeast, and mammals. Thus, the SUVH proteins bind to methylated DNA and recruit the DNAJ proteins to enhance proximal gene expression, thereby counteracting the repressive effects of transposon insertion near genes.

DNA methylation frequently marks transposable elements (TEs) in eukaryotic genomes (1–3). In plants, the RNA-directed DNA methylation (RdDM) pathway is responsible for the initial establishment of methylation in CG, CHG, and CHH contexts (4). TE insertions can exert a transcriptional effect on neighboring genes (5–8), and promoter methylation is typically associated with gene repression (9). However, exceptions exist where promoter methylation is required for gene expression (10–14). The downstream factors that perceive methylation to mediate these divergent transcriptional effects are still poorly characterized, and little is known of how methylation can stimulate gene transcription.

To identify proteins in Arabidopsis thaliana that recognize methylated DNA, we incubated nuclear extract from floral bud tissue with either methylated or unmethylated biotinylated double-stranded DNA oligonucleotides, affinity purified the DNA, and subjected the associated proteins to high-resolution mass spectrometry followed by label-free comparative analysis (15) (fig. S1). We used DNA sequences that are naturally methylated in vivo and two distinct DNA sequences for each of the CG, CHG, and CHH methylation contexts (fig. S2). A total of 41 proteins were significantly methyl enriched in at least one pull-down assay, including many candidates with known or predicted methyl-binding activity involved in gene silencing and methylation control (fig. S3). By requiring that candidates be significantly enriched in both DNA sequences for each of CG, CHG, and CHH, we obtained a stringent list of 10 candidates (Fig. 1A). Of these, relatively little is known about the role of the highly related SUVH1 and SUVH3 proteins (16) or the DNAJ proteins.
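The stringent filter described above amounts to a set intersection across all six pull-downs (two sequences for each of three contexts). The following is a minimal sketch under stated assumptions, not the authors' pipeline: the probe labels `seq1` and `seq2`, the placeholder protein names `protA` to `protC`, the function names, and the enrichment calls are all hypothetical toy values.

```python
CONTEXTS = ("CG", "CHG", "CHH")
PROBES = ("seq1", "seq2")  # hypothetical labels for the two sequences per context


def any_enriched(enriched):
    """Proteins called methyl enriched in at least one pull-down."""
    return set().union(*enriched.values())


def stringent_candidates(enriched):
    """Proteins called methyl enriched in every (context, probe) pull-down.

    `enriched` maps (context, probe) -> set of protein names that passed
    the significance threshold in that pull-down.
    """
    required = [(context, probe) for context in CONTEXTS for probe in PROBES]
    per_assay = [enriched.get(key, set()) for key in required]
    return set.intersection(*per_assay)


# Toy enrichment calls (hypothetical, for illustration only).
toy_calls = {
    ("CG", "seq1"): {"SUVH1", "protA", "protB"},
    ("CG", "seq2"): {"SUVH1", "protA"},
    ("CHG", "seq1"): {"SUVH1", "protA", "protC"},
    ("CHG", "seq2"): {"SUVH1", "protA"},
    ("CHH", "seq1"): {"SUVH1", "protC"},
    ("CHH", "seq2"): {"SUVH1"},
}

print(sorted(any_enriched(toy_calls)))
print(sorted(stringent_candidates(toy_calls)))
```

In this toy input, four proteins pass the permissive "at least one pull-down" criterion, but only one survives the requirement of enrichment in both sequences for every context.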

Fig. 1 Comparative interactomics identifies methyl reader proteins.

(A) Heatmap of methyl-binding preferences for proteins identified as significantly enriched in two different underlying DNA sequences per methyl-cytosine (mC) context (mCG, mCHG, mCHH). NA, the protein was not detected. FWA, MEA, SDC, and SUP represent four in vivo-methylated loci. Probes are listed in fig. S2. (B) FP binding assays to quantify the interaction of SUVH1 with methylated or unmethylated probes in CG, CHG, and CHH contexts (left) or of an amino acid substitution variant, SUVH1Y277A, predicted to abrogate methyl binding (18) (right). Binding affinities are indicated by dissociation constant (Kd) values. Error bars represent SEM of technical replicates. The data are representative of two independent experiments.

Recently, SUVH1 was isolated from an antisilencing screen and was shown to promote the expression of promoter methylated genes (17). As SUVH1 and SUVH3 contain a SET- and RING-associated (SRA) domain (18), they are predicted to bind methylated DNA directly. Using fluorescence polarization (FP) and microscale thermophoresis (MST), we confirmed an SRA-dependent methyl-binding preference for recombinant SUVH1 and SUVH3 proteins, respectively, in CG, CHG, and CHH contexts (Fig. 1B and fig. S4). Chromatin immunoprecipitation sequencing (ChIP-seq) of transgenic lines expressing FLAG-tagged SUVH1 or SUVH3 showed that their localization was essentially identical (fig. S5A) and that they colocalized with CHH methylation deposited by the RdDM pathway (Fig. 2A and fig. S5B). SUVH1 and SUVH3 displayed enrichment directly over NRPE1 sites (19) [the largest subunit of the RdDM component RNA polymerase V (Pol V)] (Fig. 2B and fig. S5C) and showed preferential localization over short TEs and at the edges of long TEs (Fig. 2C and fig. S5D), which are hallmarks of RdDM localization (20, 21). There was a positive correlation between SUVH1 and SUVH3 enrichment and RdDM-deposited CHH methylation (mCHH) at both local and genome-wide scales (fig. S5, E to H). Using random forest regression, we observed that mCHH was the strongest predictor for SUVH1 binding in vivo (Fig. 2, D and E).

Fig. 2 SUVH1 is recruited by RdDM-associated mCHH.

(A) SUVH1 enrichment at loci defined by loss of methylation (hypomethylation). Differentially methylated regions (DMRs) in mutant genotypes are indicated. The DRM1 and DRM2 methyltransferases are responsible for mCHH at RdDM target sites, while mCG, mCHG, and heterochromatic mCHH are maintained by MET1, CMT3, and CMT2, respectively (1). *, met1 hypo CG DMRs that overlap with drm1/2 hypo CHH DMRs were removed. (B) SUVH1 enrichment at NRPE1 peaks. (C) SUVH1 enrichment at NRPE1-associated short (<500-bp) vs. long (>5-kb) TEs. (D) Relative importance of genomic features in predicting SUVH1 binding, based on the random forest regressor algorithm. Error bars represent SEM from five random permutations of the training set. (E) Area under receiver–operating characteristic curves (AUC) model accuracy using all features (left) vs. accuracy using mCHH alone (right). (F) Boxplot of SUVH1 enrichment in suvh1, nrpe1, nrpd1, and drm1 drm2 mutant backgrounds at SUVH1 peaks. (G) Scatterplot of SUVH1 over SUVH1Y277A enrichment vs. mCHH methylation percentage at SUVH1 peaks. Line of best fit is shown in blue, with adjusted R² and P values indicated. Data in the lower panel indicate kernel density for mCHH. Average methylation levels and enrichment are calculated from the 200-bp regions surrounding the peak summits.

The nearly perfect colocalization of SUVH1 with RdDM sites predicts that RdDM pathway mutants might reduce SUVH1 occupancy. ChIP-seq of SUVH1 in nrpd1, nrpe1, or drm1/2 RdDM mutant backgrounds (4) showed that SUVH1 enrichment was essentially eliminated (Fig. 2F and fig. S6). To exclude the possibility that interaction with RdDM proteins, rather than DNA methylation itself, was responsible for SUVH1 recruitment, we compared ChIP-seq results for an SRA domain amino acid change mutant [with tyrosine-277 mutated to alanine (Y277A)] that abrogated methyl binding, SUVH1Y277A (Fig. 1B). Indeed, SUVH1Y277A showed highly reduced recruitment and association with CHH methylation (Fig. 2G and fig. S7).

Whole-genome bisulfite sequencing (WGBS) revealed that SUVH1 ChIP-seq peaks were characterized by local CHH methylation maxima and that in suvh1, suvh3, and double mutant suvh1 suvh3 plants, methylation levels were unperturbed (17) (fig. S8A). This indicated that SUVH1 and SUVH3 are not required for methylation maintenance and act strictly as methyl readers. RNA sequencing (RNA-seq) of suvh1, suvh3, and suvh1 suvh3 confirmed many of the previously identified (17) promoter methylated genes that require SUVH1 for expression (fig. S8B) and showed reduced expression at genes proximal to RdDM sites (22) (fig. S8C).

SUVH1 and SUVH3 might enhance transcription by directly impacting chromatin (18), as both encode SET domains of the SU(VAR)3-9 family that typically methylate lysine-9 of histone H3 (23). However, we were unable to detect histone methyltransferase (HMT) activity in vitro (fig. S9) or changes in the levels of histone H3 lysine-9 dimethylation (H3K9me2) in suvh1 suvh3 mutants in vivo (17) (fig. S10). Furthermore, the predicted HMT catalytic mutants SUVH1Y524F and SUVH1Y638F (18), but not the SUVH1Y277A methyl-binding mutant, were able to complement suvh1, indicating that HMT activity is nonessential for function in vivo (fig. S11). Chromatin accessibility, as profiled by ATAC-seq (a sequencing technique based on an assay for transposase-accessible chromatin), was also unchanged in suvh1 suvh3 mutants (fig. S12).

Next, we assessed whether SUVH1 and SUVH3 might enhance transcription by acting as a recruitment platform (24). Immunoprecipitation followed by mass spectrometry (IP-MS) of SUVH1 and SUVH3 identified that each pulled down the other and also DNAJ1 and DNAJ2 (Figs. 1A and 3A and fig. S13). IP-MS of DNAJ1 and DNAJ2 showed that each of these pulled down the other and also SUVH1 and SUVH3 (Fig. 3A and fig. S13), indicating that SUVH1, SUVH3, DNAJ1, and DNAJ2 interact in vivo. We confirmed the interactions of SUVH1 and SUVH3 with DNAJ1 and DNAJ2 by coimmunoprecipitation in Nicotiana benthamiana and in yeast two-hybrid assays (figs. S14 and S15). To assess the strength of the interaction, we expressed all four proteins in the same bacterial cell and performed affinity purification of either SUVH1 or SUVH3, finding that both DNAJ1 and DNAJ2 remained associated even under 500 mM NaCl conditions (fig. S16).

Fig. 3 SUVH1, SUVH3, DNAJ1, and DNAJ2 interact, colocalize, and are required for the expression of proximal genes.

(A) IP-MS results for tagged lines. Only proteins present in each of the four transgenic [but not wild-type (WT)] pulldowns are presented. NSAF, normalized spectral abundance factor, averaged from two biological replicates. (B) Representative browser track showing ChIP-seq of SUVH1, SUVH3, DNAJ1, and DNAJ2 (normalized reads, FLAG-tagged versions minus WT) (top four lines) and methylation fraction (bottom three lines) at a methylated locus. (C) Pearson’s correlation of genome-wide ChIP-seq profiles at 1-kb resolution. H3K23ac from (20) was used as an outgroup control. (D) Scatterplot of FPKM fold change over WT of dnaj1 dnaj2 double vs. suvh1 suvh3 double at genes that were differentially expressed in suvh1 suvh3. Line of best fit is shown in red, with adjusted R² and P values indicated. (E) Boxplot of expression change for genes proximal to SUVH1 binding sites. n, number of genes. *P < 0.05 (Mann-Whitney test).

DNAJ1 and DNAJ2 lack any discernible methyl-binding domain, but they are robustly associated with SUVH1 and SUVH3, suggesting that SUVH1 and SUVH3 may be responsible for recruiting DNAJ1 and DNAJ2 to methylated DNA (Fig. 1A). We repeated a CHH context pulldown experiment with suvh1 suvh3 and dnaj1 dnaj2 double mutant plants. DNAJ1 and DNAJ2 were no longer associated with methyl-DNA in suvh1 suvh3, while SUVH1 and SUVH3 methyl-DNA binding was unaffected in dnaj1 dnaj2 (fig. S17). Thus, SUVH1 and SUVH3 are required to recruit DNAJ1 and DNAJ2 to methylated DNA. We performed ChIP-seq of DNAJ1 and DNAJ2 and found a tight genome-wide correlation with SUVH1 and SUVH3 (Fig. 3, B and C, and fig. S18, A and B). As with suvh1 suvh3, there was no effect on DNA methylation levels in dnaj1 dnaj2 mutants, consistent with a downstream reader function (fig. S18C). To assess whether DNAJ1 and DNAJ2 are required for the transcriptional enhancement activity of SUVH1 and SUVH3, we performed RNA-seq on dnaj1, dnaj2, and double mutant dnaj1 dnaj2 plants. The dnaj1 dnaj2 transcriptome was strongly positively correlated with that of suvh1 suvh3 (Fig. 3D and fig. S19), and RdDM proximal genes showed reduced expression in both suvh1 suvh3 and dnaj1 dnaj2 double mutants (fig. S20). ROS1 is one of the few loci known to require methylation for expression (11, 12), and indeed we observed reduced expression of ROS1 in both the suvh1 suvh3 and dnaj1 dnaj2 backgrounds, despite methylation levels being maintained (fig. S21). Furthermore, genes with promoters proximal to SUVH1 peaks generally showed reduced expression in both the suvh1 suvh3 and dnaj1 dnaj2 double mutants (Fig. 3E). Together, these data indicate that DNAJ1 and DNAJ2 interact with SUVH1 and SUVH3, are recruited to sites of RdDM, and promote the expression of proximal genes.
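The genome-wide correlation between ChIP-seq profiles, computed at 1-kb resolution (Fig. 3C), can be illustrated by binning each signal into fixed-width windows and taking Pearson's correlation of the binned vectors. This is a minimal sketch, not the authors' code: the function names, the 0-based position convention, and the input format (a list of position and signal pairs) are assumptions made for illustration.

```python
import math


def bin_signal(position_values, chrom_length, bin_size=1000):
    """Sum per-position signal into fixed-width bins (0-based positions)."""
    n_bins = math.ceil(chrom_length / bin_size)
    bins = [0.0] * n_bins
    for position, value in position_values:
        bins[position // bin_size] += value
    return bins


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy profiles on a hypothetical 4-kb region (values are illustrative only).
profile_a = bin_signal([(10, 1.0), (999, 2.0), (1000, 5.0), (3500, 1.0)], 4000)
profile_b = bin_signal([(500, 2.0), (1500, 4.0), (3900, 1.0)], 4000)
print(profile_a, profile_b, pearson(profile_a, profile_b))
```

In practice the input would be normalized coverage over the whole genome, and an unrelated mark would serve as an outgroup control, as H3K23ac does in Fig. 3C.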

The yeast two-hybrid experiments revealed that binding domain (BD)-fused DNAJ1 induced expression of the reporter even when cotransformed with an unfused activation domain construct (fig. S15). This suggested that DNAJ1 alone may be sufficient to stimulate expression of the reporter, which we confirmed in a yeast one-hybrid assay (fig. S22A). We fused DNAJ1 to a zinc finger protein (ZF108) (24) behind the UBQ10 promoter and cotransformed it into N. benthamiana with a reporter construct containing either the ZF108 target site or a scrambled target site in the promoter region. Expression of the ZF108 target reporter was increased by approximately threefold above that of the scrambled promoter (fig. S22B). To assess whether DNAJ1 can function in a mammalian context (25), we transfected N2a cells and found that Gal4 DNA-binding domain (Gal4BD)–fused DNAJ1 was able to stimulate transcription of the reporter by 5- to 10-fold (fig. S22C).

Next, we generated stable transgenic A. thaliana lines using the UBQ10::ZF108-DNAJ1 construct. The first-generation independent transgenic lines displayed severe morphological defects (fig. S23). RNA-seq and ChIP-seq (Fig. 4A) on these UBQ10::ZF108-DNAJ1 lines showed that up- but not down-regulated genes were significantly enriched for overlap with ZF108-DNAJ1 ChIP-seq peaks (observed over expected = 2.26, hypergeometric test P = 7.7e−71) (fig. S24). As controls, we generated UBQ10::ZF108-YPET and UBQ10::DNAJ1 (without ZF108) transgenic plants and found no morphological defects or transcriptional changes associated with ZF108 peaks, indicating that neither ZF108 binding nor DNAJ1 overexpression was sufficient to cause the transcriptional defects observed (fig. S24). In addition, bulk levels of RNA were increased over ZF108-DNAJ1 peaks (n = 4951), and there was a clear promoter proximal effect on transcription (Fig. 4, B to D). In contrast, neither up- nor down-regulated gene sets showed an association with TEs or RdDM sites, indicating that ZF108-DNAJ1 acts primarily at ectopic locations driven by ZF108 binding (fig. S25). Together, these data showed that recruitment of DNAJ1 increases the expression of proximal neighboring genes.
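The overlap statistic reported above (an observed-over-expected ratio with a hypergeometric P value) can be illustrated with a short calculation. This is a sketch of the general technique only: the function name and the toy counts are hypothetical, the underlying gene counts are not given in the text, and the authors' exact procedure may differ.

```python
from math import comb


def overlap_enrichment(n_genes, n_bound, n_up, n_overlap):
    """Observed/expected ratio and upper-tail hypergeometric P value.

    n_genes   -- total genes considered (the background)
    n_bound   -- genes overlapping a binding peak
    n_up      -- up-regulated genes
    n_overlap -- genes that are both bound and up-regulated
    """
    expected = n_bound * n_up / n_genes
    ratio = n_overlap / expected
    # P(X >= n_overlap) when drawing n_up genes without replacement
    # from a background containing n_bound bound genes.
    total = comb(n_genes, n_up)
    tail = sum(
        comb(n_bound, k) * comb(n_genes - n_bound, n_up - k)
        for k in range(n_overlap, min(n_bound, n_up) + 1)
    )
    return ratio, tail / total


# Toy counts (hypothetical): 20 genes, 5 bound, 4 up-regulated, 3 in both.
ratio, p_value = overlap_enrichment(20, 5, 4, 3)
print(ratio, p_value)
```

Exact integer arithmetic with `math.comb` avoids the floating-point underflow that can otherwise occur for very small P values at genome scale.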

Fig. 4 ZF108-DNAJ1 transcriptionally activates mildly expressed proximal loci.

(A) Browser track showing the ZF108-DNAJ1 ChIP-seq profile at FWA. The red arrow indicates the genomic location of the designed ZF108 target binding site. (B) Metaplot of expression change, centered on ZF108-DNAJ1 vs. random peaks. (C) Boxplot of expression changes for genes with promoters proximal to ZF108-DNAJ1 binding sites. n, number of genes. *P < 0.05 (Mann-Whitney test). (D) Observed over expected ratio for overlap of ZF108-DNAJ1 sites with up- or down-regulated ZF108-DNAJ1 gene promoters. (E) Boxplot of expression change for genes that overlap with ZF108-DNAJ1 peaks (upper panel), arranged by ascending WT expression decile (lower panel). Genes that lacked expression in both genotypes were removed. *P < 0.05 (Mann-Whitney test).

Given that SUVH1, SUVH3, DNAJ1, and DNAJ2 are localized at RdDM sites, including many TE sequences, an interesting paradox is what prevents TEs themselves from being reactivated. FWA, the gene that ZF108 was designed to target (24), is stably silent in wild-type plants and experienced no transcriptional up-regulation in transgenic plants, despite clear localization of ZF108-DNAJ1 to FWA (Fig. 4A and fig. S26). We reasoned that the transcriptional enhancement effect of DNAJ1 may be limited to genes that are already expressed, as opposed to traditional transcriptional activator proteins, such as VP16, that can activate transcription of stably silent genes (26). Parsing the ZF108-DNAJ1 overlapping genes into expression deciles revealed that only genes with moderate expression in the wild type, but not those in the lowest or higher expression deciles, experienced transcriptional assistance (Fig. 4E). This provides a simple explanation for the paradox, as only proximal expressed genes would be affected, leaving TEs silent.
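The decile analysis described above (Fig. 4E) can be sketched as ranking genes by wild-type expression, splitting the ranks into ten equal groups, and summarizing the expression change within each group. This is an illustrative sketch under assumptions: the function names, the use of a log2 fold change as the measure of expression change, the median as the summary statistic, and the toy values are all hypothetical. Genes lacking expression in both genotypes are assumed to have been removed beforehand.

```python
import statistics


def assign_deciles(wt_expression):
    """Map gene -> decile by rank (1 = lowest WT expression, 10 = highest)."""
    ranked = sorted(wt_expression, key=wt_expression.get)
    n = len(ranked)
    return {gene: (rank * 10) // n + 1 for rank, gene in enumerate(ranked)}


def median_change_by_decile(wt_expression, log2_fold_change):
    """Median expression change within each WT expression decile."""
    grouped = {}
    for gene, decile in assign_deciles(wt_expression).items():
        grouped.setdefault(decile, []).append(log2_fold_change[gene])
    return {d: statistics.median(v) for d, v in sorted(grouped.items())}


# Toy data (hypothetical): 20 genes; only mid-expressed genes change.
wt = {f"g{i}": float(i + 1) for i in range(20)}
lfc = {f"g{i}": (1.0 if 6 <= i < 12 else 0.0) for i in range(20)}
summary = median_change_by_decile(wt, lfc)
print(summary)
```

In the toy input only deciles 4 to 6 show a nonzero median change, mirroring the qualitative pattern in which moderately expressed genes, but not silent or highly expressed ones, respond.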

We propose that SUVH1 and SUVH3 in complex with DNAJ1 and DNAJ2 evolved to counteract the repressive effect of TE insertion near genes (8, 27, 28), thereby facilitating access to the gene regulatory diversity provided by TE proliferation (29–31). This is consistent with SUVH1, SUVH3, DNAJ1, and DNAJ2 being recruited downstream of the RdDM pathway, which is known to target evolutionarily young TEs and to cause mild repression of genes near TEs (22). The complex of SUVH1, SUVH3, DNAJ1, and DNAJ2 also reveals a potential mechanism to explain examples of methylation-dependent gene expression (11–13). Overall, these findings shed light on how methylation can act to fine-tune gene expression by balancing both repressive and activating transcriptional effects.

Supplementary Materials

www.sciencemag.org/content/362/6419/1182/suppl/DC1

Materials and Methods

Figs. S1 to S26

References (32–60)

References and Notes

Acknowledgments: We thank S. Feng and M. Akhavan for the high-throughput sequencing performed at the UCLA Broad Stem Cell Research Center BioSequencing Core Facility. We thank J. A. Long for advice on nuclear isolation, M. F. Carey for providing plasmids for N2a transfections, and Y. Ma, J. Appell, J. Zhao, A. Thai, J. Nail, G. Anigol, R. Sahu, and A. Desouza for technical assistance. Funding: This work was supported by grants NIH R01 GM60398 (to S.E.J.), NIH R01 GM089778 (to J.A.W.), and NIH R35 GM124736 (to S.B.R.), by an EMBO Long-Term Fellowship (ALTF 1138-2014) (to C.J.H.), and by a Ruth L. Kirschstein National Research Service Award (GM007185) (to L.Y.). S.E.J. is an investigator of the Howard Hughes Medical Institute. Author contributions: F.B. and S.E.J. conceived the study; C.J.H., M.S., J.A.W., J.D., S.B.R., F.B., and S.E.J. designed the research; C.J.H., S.P.W., Y.X., L.Y., J.G.B., and M.G. performed the experiments; M.S. performed the comparative interactomics; W.L. performed the random forest regression analysis; E.M.C. and R.M.V. performed the FP assays; X.L. and W.C. performed MST assays; W.D.B. and S.R. performed the mass spectrometry from immunoprecipitated samples; C.J.H., W.L., and Z.Z. performed bioinformatic analysis; C.J.H. and S.E.J. wrote the paper. Competing interests: The authors declare no competing interests. Data and materials availability: The high-throughput sequencing data generated in this paper have been deposited in the Gene Expression Omnibus (GEO) database (GSE108414).