Perspective: Molecular Biology

Epigenetic Islands in a Genetic Ocean

Science  09 Nov 2012:
Vol. 338, Issue 6108, pp. 756-757
DOI: 10.1126/science.1227243

DNA methylation denotes the addition of a methyl group to DNA, which in eukaryotes occurs predominantly at cytosines that are adjacent to guanine (CG). Because methylation does not alter the DNA sequence, it is referred to as an epigenetic mark. The sequence symmetry of the CG dinucleotide enables propagation of the methyl mark through cell division in a process that is mechanistically well understood (1). This inheritability makes DNA methylation highly attractive as a potential means to store information in a form of epigenetic memory that regulates genes over developmental processes or in response to environmental conditions. However, it has proven difficult to substantiate this function because it requires showing not only that a DNA methylation pattern coincides with a particular transcriptional state, but more importantly, that it controls it.

DNA methylation appears to be obligatory in plants and vertebrates, but it is present in only a subset of fungi and insects, and its deposition pattern varies considerably among clades. Plants and lower chordates show DNA methylation primarily at genes and repetitive sequences, whereas methylation in vertebrate genomes can be almost anywhere with the exception of certain regulatory regions (2, 3). A further complication is the nonuniform distribution of CG dinucleotides in mammalian genomes (2). CG-rich regulatory regions called CG islands are mostly unmethylated, but are efficiently repressed when methylated. Yet, CG islands make up only ∼1% of our genome; for the remaining 99% that contains fewer CGs, the consequence of DNA methylation for gene regulation remains largely unclear.

Prominent examples of gene repression by DNA methylation are the CG island promoters of imprinted genes, where alleles are differentially methylated, bound by different factors, and expressed depending on the parental origin (4). At methylated CG islands, the sheer density of CGs could mediate repression by, for example, recruiting locally large amounts of methyl-CG–binding domain–containing proteins, which can recognize methylcytosine regardless of the surrounding sequence (2). Further experimental support for methylated CG islands causing gene repression comes from genetic deletions of de novo methyl-transferases, which showed that a subset of genes become reactivated in the absence of methylation at linked islands (5, 6).

It is also possible that certain transcription factors are sensitive to DNA methylation within their binding sites (see the figure), as has been suggested for MYC and YY1 (7, 8). In these cases, methylation of gene regulatory regions such as promoters and enhancers could repress gene expression by inhibiting transcription factor binding and stabilizing the off state even if these sequences are within a CG-poor region.

Recent advances in DNA sequencing have made it possible to measure cytosine methylation at base-pair resolution across entire genomes (9, 10). The resulting methylomes of mouse and human cells have revealed new details on the genomic location of DNA methylation, particularly at CG-poor regions. Intriguingly, CG-poor distal regulatory regions show reduced amounts of DNA methylation, but only when active and occupied by transcription factors (see the figure). Although this links DNA methylation with enhancer activity, the limited experimental data available to date indicate that transcription factor binding occurs before changes in DNA methylation, thus arguing against a regulatory role for methylation in enhancer activity (11–14). While it is unclear if these findings can be generalized, it is nevertheless likely that many changes in DNA methylation at CG-poor sequences are a consequence of gene regulation rather than its cause. These observations further predict that DNA sequence variation between individuals translates into changes in methylation when it occurs within regulatory regions (see the figure). Indeed, a substantial amount of the DNA methylation differences between individuals might be caused by such underlying genetic variation and thus, by definition, would not be epigenetic.

The analysis of high-resolution methylomes also exposed differences in methylation between exons and introns, suggesting that DNA methylation could be involved in regulating splicing (15). Although differential splicing is clearly linked to the elongation speed of the RNA polymerase (16), there is very little experimental evidence that methylation could modulate this (3). Furthermore, the observed differences in DNA methylation between exons and introns cannot readily be detected at individual genes but only become evident when the average methylation of thousands of exons and introns is calculated. It remains unclear how such a subtle difference would have an effect at individual genes.
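The statistical point about averaging can be illustrated with a small simulation. The sketch below is purely hypothetical: the methylation means (0.60 and 0.58), the spread (0.15), the number of regions (10,000), and the function name `simulate` are assumptions chosen for illustration and are not taken from any measured methylome. It shows how a small mean difference can be clear in aggregate while being close to a coin flip for any single exon-intron comparison.

```python
import random
import statistics


def simulate(n_regions, mean, sd, rng):
    """Draw hypothetical per-region methylation fractions, clipped to [0, 1]."""
    return [min(1.0, max(0.0, rng.gauss(mean, sd))) for _ in range(n_regions)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumed values for illustration only, not measured data.
n = 10_000
exons = simulate(n, 0.60, 0.15, rng)
introns = simulate(n, 0.58, 0.15, rng)

# Aggregate view: difference of the two means and its standard error.
diff = statistics.mean(exons) - statistics.mean(introns)
se = (statistics.variance(exons) / n + statistics.variance(introns) / n) ** 0.5

# Single-region view: how often does one exon exceed one paired intron?
frac_exon_higher = sum(e > i for e, i in zip(exons, introns)) / n

print(f"mean difference: {diff:.4f} (standard error {se:.4f})")
print(f"fraction of pairs with exon > intron: {frac_exon_higher:.3f}")
```

Under these assumed parameters, the mean difference sits many standard errors away from zero, whereas an individual exon is more methylated than its paired intron only slightly more often than half the time.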

Available base-pair–resolution methylomes are limited by the static picture of DNA methylation that they provide and are uninformative with respect to the stability of the marks over time. Waves of active demethylation occur in the germ line and early embryo (17), and demethylation of specific loci has been reported in association with differentiation of somatic cells. However, active demethylation remains an unresolved issue because many mechanisms are theoretically possible, and because demethylation can also occur passively by lack of maintenance over cell division (18).

The recent discovery of the ten-eleven translocation (TET) family of enzymes (19, 20) that oxidize 5-methylcytosine, however, provides a convincing mechanism for active demethylation and an opportunity to identify sites of action, because the product of the enzymatic process (5-hydroxymethylcytosine) can be detected in several ways, even at the level of individual bases (21, 22). TET proteins are present in nondividing somatic cells, as is hydroxymethylcytosine, suggesting that active turnover of DNA methylation could happen in every cell. This active demethylation does not seem to occur at random. For example, TET1 localizes preferentially to unmethylated CG islands (23). It is unclear whether TET proteins actively generate these unmethylated states or if they safeguard them by removing any aberrantly set mark. It is further unclear whether specific proteins recognize hydroxymethylation or other demethylation products and thus if these modifications function in gene regulation. Elucidating the kinetics and sites of DNA methylation turnover is also crucial for models of epigenetic memory. Indeed, how can DNA methylation store information at a genomic site where it is constantly turned over?

Genetics and epigenetics of transcription factor binding.

(A) Binding of a transcription factor in CG-poor regions leads to a local unmethylated state. (B) Mutations in the binding site prevent binding and result in increased methylation. (C) Some transcription factors could be sensitive to methylation even in CG-poor regions. (D) Transcription factor binding in a CG-rich area (CG island) requires the region to be unmethylated, and (E) can be blocked if methylated. (Hexagon) Transcription factors, (black circles) methylated CG, (white circles) unmethylated CG.

Large epigenomics initiatives are currently generating base-pair methylomes in multiple tissues (24), which should create the needed quantitative framework to better link genotype and epigenotype. A large part of methylation is likely hard-wired within the DNA sequence, for the reasons pointed out above (11, 13, 14, 25). Understanding this wiring will be essential to correctly interpret methylation maps and to further discriminate between sites where methylation changes are a consequence of transcription factor binding and sites where methylation is instructive for gene regulation.


