Review

Molecular Signals of Epigenetic States

Science  29 Oct 2010:
Vol. 330, Issue 6004, pp. 612-616
DOI: 10.1126/science.1191078

Abstract

Epigenetic signals are responsible for the establishment, maintenance, and reversal of metastable transcriptional states that are fundamental for the cell’s ability to “remember” past events, such as changes in the external environment or developmental cues. Complex epigenetic states are orchestrated by several converging and reinforcing signals, including transcription factors, noncoding RNAs, DNA methylation, and histone modifications. Although all of these pathways modulate transcription from chromatin in vivo, the mechanisms by which epigenetic information is transmitted through cell division remain unclear. Because epigenetic states are metastable and change in response to the appropriate signals, a deeper understanding of their molecular framework will allow us to tackle the dysregulation of epigenetics in disease.

Adaptation to environmental changes and cell specialization in multicellular organisms require a complex orchestration of the transcriptional output of the genome. From the simplest prokaryote to the most sophisticated human neuron, cells have evolved forms of molecular memory of past stimuli that can often be transmitted through cell division. The maintenance of cell identity in multicellular organisms constitutes a classic example of such inheritable cellular memory: Starting from the same zygotic genome, subsets of progeny cells become engaged in distinct programs of gene expression that dictate their developmental trajectory and specific functions. Typically, cell identities are maintained for a lifetime, even when the differentiation signal was experienced only once, during embryonic development (1). This is no trivial achievement, as a complex pattern of gene expression must be faithfully transmitted to each progeny cell upon division.

We use the term epigenetics to classify those processes that ensure the inheritance of variation (“-genetic”) above and beyond (“epi-”) changes in the DNA sequence (Box 1). Unlike genetic alleles, epi-alleles do not differ in their DNA sequence; the epigenetic information resides in self-propagating molecular signatures that provide a memory of previously experienced stimuli, without irreversible changes in the genetic information. The nature of these molecular signatures and the manner by which they initiate, maintain, and reverse epigenetic states are the subject of this Review.

Epigenetic Signals in cis and trans

As long as a transcriptional response is self-sustaining in the absence of the originating stimulus, it can be categorized as epigenetic. This can be achieved by self-propagating, trans-acting mechanisms or by cis-acting molecular signatures physically associated with the DNA sequence that they regulate (Fig. 1).

Fig. 1

Cis and trans epigenetic signals. (A) Trans epigenetic signals (yellow circles) are transmitted by partitioning of the cytosol during cell division and maintained by feedback loops. As an example, a simple regulatory loop in which the epigenetic signal induces its own expression is shown here. (B) Cis signals (yellow flags) are molecular signatures physically associated with the DNA and inherited via chromosome segregation during cell division.

Box 1

Epigenetics, What’s in a Name?

The term “epigenetics” was coined by C. H. Waddington in the 1940s, fusing the word “genetics” with “epigenesis,” the latter indicating the theory by which the adult form develops from the embryo through gradual steps, as opposed to being fully pre-formed in the zygote. Waddington intended to found a discipline to study the genetic control of developmental processes, merging the fields of embryology and genetics (82). More than a decade later, D. L. Nanney defined “epigenetic systems” as “auxiliary mechanisms … involved in determining which specificities [genes] are to be expressed in any particular cell.” Nanney also warned that “[c]ellular memory is not an absolute attribute” and that using inheritance to define epigenetics might undermine the full understanding of the molecular pathways involved, which may also stabilize transcription patterns in nondividing cells (83). With the discovery of inheritable patterns of DNA methylation, the idea that epigenetic traits were inherited as regulatory signals in addition to genetic information quickly took hold, and the definition of epigenetics became “the study of mitotically and/or meiotically heritable changes in gene function that cannot be explained by changes in DNA sequence” (84).

The discovery of the regulatory role of histone posttranslational modifications and their correlation with transcriptional states promoted a looser use of epigenetics, to mean any molecular signature found on chromosomes (especially histone marks) and broadly defined as “the structural adaptation of chromosomal regions so as to register, signal or perpetuate altered activity states” (85). At the same time, a more narrow definition has been proposed as “a stably heritable phenotype resulting from changes in a chromosome without alterations in the DNA sequence” (86).

Here, we intend epigenetics to signify “the inheritance of variation (-genetics) above and beyond (epi-) changes in the DNA sequence.” Rather than altering this basic definition, we prefer to append to it appropriate modifiers. The chromosome-dependent definition given in (86) corresponds to what we call cis epigenetics, as opposed to self-propagating patterns that operate in trans. We convey the distinction between transgenerational inheritance and cellular memory by referring to meiotic versus mitotic epigenetics, both cases of vertical epigenetics, as opposed to the RNA-mediated horizontal transmission of epigenetic states observed in plants (53).

Self-propagating transcriptional states that are maintained through feedback loops and networks of transcription factors (TFs) (2) are the most common type of trans epigenetic states (Fig. 1A). These are often the system of choice for cellular memory in simple organisms, such as prokaryotes and single-cell eukaryotes. If a TF activates its own transcription (or represses antagonistic networks), it yields an epigenetic state that is self-sustaining after the originating stimulus is removed. After each cell division, inherited TFs resume their trans function on regulatory DNA sequences. Some small RNAs (sRNAs) can also act as trans epigenetic signals (3–5).
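The self-sustaining behavior of a TF that activates its own transcription can be illustrated with a toy model. The sketch below is not taken from the cited work: it assumes a single TF whose synthesis follows a cooperative Hill function, and all names and parameter values (`basal`, `vmax`, `k`, `n`, `deg`, the stimulus strength) are hypothetical choices made only to produce two stable states.

```python
def hill(x, k=0.5, n=4):
    """Cooperative self-activation term (Hill function); k and n are assumed values."""
    return x**n / (k**n + x**n)


def simulate(x0, stimulus, duration, dt=0.01, basal=0.02, vmax=1.0, deg=1.0):
    """Euler-integrate dx/dt = basal + stimulus + vmax*hill(x) - deg*x.

    x is the TF concentration in arbitrary units; every rate is hypothetical.
    """
    x = x0
    for _ in range(int(duration / dt)):
        x += dt * (basal + stimulus + vmax * hill(x) - deg * x)
    return x


def run_memory_experiment():
    """Return TF levels before, during, and long after a transient stimulus."""
    naive = simulate(0.0, stimulus=0.0, duration=20)          # never stimulated
    induced = simulate(naive, stimulus=0.5, duration=20)      # stimulus present
    remembered = simulate(induced, stimulus=0.0, duration=50)  # stimulus removed
    return naive, induced, remembered


if __name__ == "__main__":
    naive, induced, remembered = run_memory_experiment()
    print(f"naive={naive:.3f} induced={induced:.3f} remembered={remembered:.3f}")
```

In this sketch the TF level stays high after the stimulus is withdrawn, whereas a cell that never saw the stimulus stays low, which is the sense in which the loop "remembers" a past event. Whether a real network behaves this way depends on its actual kinetics.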

In contrast, cis epigenetic signals are physically associated with, and inherited along with, the chromosome on which they act (Fig. 1B), for example, as a covalent modification of the DNA itself, such as DNA methylation, or as changes in histones, which constitute the protein backbone of chromatin. Histones can carry information in their primary sequence (histone variants), in posttranslational modifications often present on their N- and C-terminal tails, or in their position (remodeling) relative to the DNA sequence (6–8). Cis epigenetic information might also be encoded in chromatin through stable association of nonhistone proteins, higher-order chromatin structure, and nuclear localization.

It is often difficult to distinguish experimentally between trans and cis epigenetic signals. For example, initial observations implicated SWI/SNF chromatin remodelers in transcriptional memory at the Saccharomyces cerevisiae GAL1 locus (9), but cell-fusion experiments rigorously demonstrated that the site of memory was the cytosol (10), a case of trans epigenetics. However, if two identical DNA sequences are differentially regulated in the same nucleus, cis epigenetic mechanisms must be responsible. This is observed for mono-allelic gene expression in diploid cells, imprinting, and X inactivation in mammals, wherein large portions of one X chromosome are inheritably silenced while its homolog continues to transcribe in the same nucleus (11). In fact, X chromosome inactivation involves many putative epigenetic signals and provides an excellent experimental and didactical model to study epigenetics.

If trans-acting transcriptional memory systems were readily available during evolution, why did the appearance of multicellularity expand the repertoire of cis epigenetic signals? One possibility is that trans mechanisms were simply inadequate for tackling the increased complexity and number of transcriptional networks in a large multicellular organism. Epigenetic states that are encoded in cis need to be set only once, and many transcriptional patterns can be maintained by a relatively small number of common molecular pathways, without having to deploy trans-acting feedback loops for each gene network.

Establishment

Most epigenetic states are established by transiently expressed or transiently activated factors that respond to environmental stimuli, developmental cues, or internal events (e.g., the reactivation of a transposon). These establishment signals converge on chromatin to shape the transcriptional landscape and are then converted into cis epigenetic signatures.

TFs orchestrate lineage-specification programs and are leading candidates as establishment signals (12). In addition to recruiting factors that modulate transcription transiently, TFs also influence cis epigenetic states. In Drosophila, TFs encoded by the segmentation genes establish cellular fate during early embryo development, and, although they disappear after only a few hours, cell identities are maintained into adulthood by the Polycomb group (PcG) of proteins that are associated with transcriptional repression, and the trithorax group (trxG) of proteins that are associated with transcriptional activation. Many members of both groups possess chromatin-modifying activities (13), and, given that Drosophila contains little DNA methylation, the PcG/trxG system is likely to be directly involved in the transmission (maintenance) of epigenetic states.

It is not clear how the transition from TF-driven regulation to a cis epigenetic state takes place. One possibility is that TFs directly recruit chromatin-modifying enzymes to their genomic targets. For example, sequence-specific DNA-binding proteins are required for PcG-mediated memory in Drosophila and recruit PcG complexes to regulatory regions called Polycomb response elements (PREs) (14), but recruitment does not always translate into repression of the downstream locus (15), suggesting that PRE localization of PcG and trxG complexes is necessary but not sufficient for the maintenance of transcriptional memory. Similarly, several TFs physically interact with DNA methyltransferases and recruit them to target genes (16).

In addition, TFs might establish epigenetic states through the process of transcription itself. Transcription of Nesp is required for the establishment of DNA methylation at the imprinted Gnas locus downstream, suggesting a cis mechanism (17), and transcription of the noncoding PREs induces an inheritable active state (18). Transcription also guides the deposition of histone posttranslational modifications, including H3K4me3, H3K36me3, H3K79me, and H2BK123ub (19), but these modifications have yet to pass the maintenance test for bona fide epigenetic signals (see below).

The transcription process affects chromatin structure, but it is often difficult to ascribe this effect to the physical passage of RNA polymerase II (RNAPII) or to the synthesis of noncoding RNAs (ncRNAs). Noncoding regions of the genome are heavily transcribed, giving rise to a constellation of ncRNAs that often have regulatory functions (20). Although early investigations focused on posttranscriptional gene silencing by microRNAs and other sRNAs, pioneering work in Schizosaccharomyces pombe and Arabidopsis thaliana established that sRNAs also affect epigenetic states (3). In particular, plants use sRNAs to repress transposable elements and regulate gene expression through the process of RNA-dependent DNA methylation (21). Other sRNAs with proposed effects on cis epigenetic states are PIWI-interacting RNAs (piRNAs) (22) and sRNAs that bind to (and perhaps mediate the recruitment of) PcG proteins (23). Even more classes of sRNAs may be involved in chromatin regulation, although a functional link has not yet been demonstrated; these include promoter- and termini-associated RNAs, tiny RNAs, and endo–small interfering RNAs (20).

Small ncRNAs are well suited for a role in bridging chromatin modifiers with the genome (5), but to fulfill this function they must interact in sequence-specific fashion with chromatin. We envision three modes of sequence recognition: (i) RNA:RNA interactions with nascent transcripts (Fig. 2A) (3), (ii) RNA:single-stranded DNA (ssDNA) heteroduplex (Fig. 2B), and (iii) RNA:double-stranded DNA (dsDNA) triplex (Fig. 2C) (24). These recognition modes are not mutually exclusive: piRNAs guide PIWI to Drosophila chromatin by both RNA:RNA and RNA:DNA interactions (25). Recruitment of chromatin modifiers might occur directly or via adaptor proteins, as in the case of Stc1, an S. pombe LIM domain–containing protein, that bridges Ago1 and its associated sRNAs to the H3K9 methyltransferase complex CLRC (26).

Fig. 2

RNA-directed deposition of epigenetic signals. (A to C) Sequence-specific recognition models for both sRNAs (shown here) and lncRNAs. (A) Interaction of an sRNA with complementary sequences on a nascent transcript. (B) Interaction of an sRNA with ssDNA to form a heteroduplex. (C) Recognition of sequence motifs by sRNA in a closed dsDNA duplex. (D and E) Additional recruitment models for lncRNAs. (D) A folded lncRNA recognizes a DNA sequence via complex surface-mediated interactions. (E) A lncRNA acting locally in a cotranscriptional fashion by being anchored to chromatin by RNAPII. RBP, RNA binding protein; A, hypothetical adaptor protein; CMC, chromatin-modifying complex.

Long ncRNAs (lncRNAs) may also function as establishment signals for epigenetic states (27). Because of their size, these molecules can fold into complex structures with molecular surfaces dedicated to protein binding, while retaining the ability to recognize nucleotide sequences by base-pair interactions as described above for sRNAs (28), although their tertiary structure may also allow for recognition modes that do not involve direct base-pair interactions (Fig. 2D). Some lncRNAs, such as HOTAIR, appear to act globally to recruit chromatin modifiers to target loci (29). Other lncRNAs act locally and establish cis epigenetic states on the genomic region from which they are transcribed. This second category includes lncRNAs involved in the classic epigenetic phenomena of parental imprinting and X chromosome inactivation, and their localized effect may be explained by two mechanisms: (i) The 5′ terminus may fold and interact with chromatin-modifying complexes, while the 3′ terminus is still tethered to the encoding locus by a transcribing RNAPII (Fig. 2E), and (ii) antisense transcription of a lncRNA might interfere with transcription of a neighboring gene or create double-stranded RNA species that are processed locally and establish cis epigenetic signal(s) to silence the locus.

Reinforcement and Spreading

Cis epigenetic states, such as those presumably encoded by histone posttranslational modifications and DNA methylation, can be reinforced locally or spread to adjacent areas to form larger chromatin domains. Feedback loops exist, in which enzymes responsible for the installation of a histone modification also interact with factors that bind to it. Examples of such histone modifier/binder pairs are SUV39H1 and HP1 for H3K9me, PR-SET7 and L3MBTL1 for H4K20me1, and EZH2 and EED for H3K27me (30–32). This local reinforcement may be necessary, because histone modifications are not permanent and may be removed by dedicated enzymes or histone turnover.

On the other hand, spreading in cis may be required to extend the reach of epigenetic regulation beyond the confined area in which establishment took place. Spreading of chromatin domains is the basis of classical epigenetic phenomena such as position-effect variegation in Drosophila and formation of silent domains in S. cerevisiae (11), and it was also observed for artificially established H3K27me3 domains in human cells (33).

Epigenetic states can also be reinforced by cross talk among histone modifications and DNA methylation (34). De novo DNA methyltransferases and associated factors bind to unmethylated H3K4 (35), whereas the H3K4 methyltransferase MLL preferentially binds to unmethylated DNA (36), providing a molecular explanation for the anti-correlation between H3K4me and DNA methylation levels (37). H3K9me is a prerequisite for all DNA methylation in Neurospora crassa and CHG DNA methylation in A. thaliana (38). In mice, the H3K9 methyltransferases SUV39H1/2 and EHMT2 (G9A) can direct de novo DNA methylation to certain loci, although the catalytic activity of G9A is not always necessary (39–41). This interplay suggests that, when present, DNA methylation may serve as a reinforcing signal for preexisting but less stable epigenetic signatures such as histone modifications.

Transmission

Three independent criteria should determine whether a given molecular signal is indeed epigenetic: (i) a mechanism for propagation, that is, pathways that explain how the molecular signature is faithfully reproduced after DNA replication/cell division; (ii) evidence of transmission, that is, the demonstration of self-sustaining transmission to the progeny; and (iii) an effect on gene expression, that is, a bona fide epigenetic signal should be sufficient to cause a transcriptional outcome reminiscent of that caused by the establishing stimulus (42).

DNA methylation satisfies all three requirements: (i) Because of the semiconservative nature of DNA replication, a DNA sequence carrying symmetrical methylation marks on both strands gives rise to two hemi-methylated double strands, which can be restored to fully methylated status by maintenance methyltransferases (38) (Fig. 3A); (ii) in vitro methylated DNA remains methylated after several rounds of DNA replication in vivo (43); and (iii) methylation regulates transcription.
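The propagation mechanism in point (i) can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not a model from the cited work: each CpG site is reduced to a pair of booleans (one per strand), and the helper names `replicate`, `maintain`, and `fully_methylated`, as well as the `fidelity` parameter, are hypothetical.

```python
import random


def replicate(duplex):
    """Semiconservative replication of a list of (top, bottom) methylation flags.

    Each daughter duplex keeps one parental strand and pairs it with a newly
    synthesized strand that carries no methylation.
    """
    daughter_a = [(top, False) for top, _ in duplex]
    daughter_b = [(False, bottom) for _, bottom in duplex]
    return daughter_a, daughter_b


def maintain(duplex, fidelity=1.0, rng=random):
    """Toy maintenance methyltransferase.

    Hemi-methylated sites are restored to full methylation with probability
    `fidelity` (an assumed parameter); unmethylated sites are left untouched.
    """
    restored = []
    for top, bottom in duplex:
        hemimethylated = top != bottom
        if hemimethylated and rng.random() < fidelity:
            top = bottom = True
        restored.append((top, bottom))
    return restored


def fully_methylated(duplex):
    """Return, per site, whether both strands carry the mark."""
    return [top and bottom for top, bottom in duplex]


# A hypothetical four-site pattern: methylated, unmethylated, methylated, unmethylated.
parent = [(True, True), (False, False), (True, True), (False, False)]
daughter_a, daughter_b = replicate(parent)

if __name__ == "__main__":
    print("after replication:", fully_methylated(daughter_a))
    print("after maintenance:", fully_methylated(maintain(daughter_a)))
```

With perfect fidelity both daughters recover the parental pattern, while skipping the maintenance step lets the pattern dilute out over successive replications. Setting `fidelity` below 1.0 gives a rough picture of how errors could accumulate.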

Fig. 3

Transmission of epigenetic states. (A) Transmission of DNA methylation patterns after DNA replication. (B) Hypothetical model for the maintenance of a histone-associated epigenetic signal, using H3K27me3 (yellow flags) as an example. H3K27me3 is diluted during DNA replication by the deposition of unmodified octamers. Binding of EED to H3K27me3 stimulates the enzymatic activity of EZH2, which places more H3K27me3 marks on neighboring nucleosomes, thus restoring a full epigenetic signature on both chromatids (32). (C) Maintenance of a chromatin domain via a secondary signal. S-phase transcription of heterochromatic repeats in S. pombe generates sRNA species that recruit chromatin-modifying complexes to reestablish heterochromatic signatures at the target loci.

The case for histone posttranslational modifications is less clear, and each mark should be considered separately. Some modifications exhibit strong correlation with transcriptional states (7); however, correlation does not imply causation, and experimental evidence for the epigenetic inheritance of histone modifications remains scarce.

Propagation mechanisms (criterion i) have been proposed for several histone modifications in the form of the same histone modifier/binder interactions involved in signal reinforcement and spreading (Fig. 3B). This model assumes that the information to re-establish chromatin domains is transferred from the parental nucleosomes containing such modifications to those deposited on the two daughter strands. However, it remains unclear whether and how parental histones (and their associated modifications) are reassembled after DNA replication in vivo. Alternatively, domains of histone modifications could be propagated via an intermediary (secondary) epigenetic signal. This appears to be the case for H3K9me in S. pombe heterochromatin, where S-phase–restricted transcription of repetitive sequences generates sRNAs that direct the re-establishment of H3K9me after replication (Fig. 3C) (44).
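The hypothetical modifier/binder model of Fig. 3B can be sketched in a few lines. The code below is only an illustration under strong assumptions that the text itself flags as unresolved: it assumes that each parental nucleosome of a marked domain is retained on a given daughter strand with probability 0.5, and that a marked nucleosome allows its immediate neighbors to be marked in each restoration round. The function names `dilute` and `restore` are hypothetical.

```python
import random


def dilute(domain, rng):
    """Replication step: each marked parental nucleosome is kept with
    probability 0.5 (an assumption); all other positions are filled with
    unmodified nucleosomes."""
    return [marked and rng.random() < 0.5 for marked in domain]


def restore(domain, rounds):
    """Reader/writer feedback: in each round, an unmarked nucleosome becomes
    marked if at least one immediate neighbor was marked in the previous round."""
    state = list(domain)
    for _ in range(rounds):
        prev = state[:]
        for i, marked in enumerate(prev):
            left = prev[i - 1] if i > 0 else False
            right = prev[i + 1] if i < len(prev) - 1 else False
            state[i] = marked or left or right
    return state


rng = random.Random(0)           # fixed seed so the sketch is reproducible
domain = [True] * 20             # a fully marked domain of 20 nucleosomes
after_replication = dilute(domain, rng)
after_restoration = restore(after_replication, rounds=len(domain))

if __name__ == "__main__":
    print("marked after replication:", sum(after_replication))
    print("marked after restoration:", sum(after_restoration))
```

In this toy model the domain is fully restored as long as at least one marked nucleosome survives replication, and it is lost for good if none does. That all-or-nothing dependence on parental histone retention is exactly the point the text identifies as unclear in vivo.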

Whether histone posttranslational modifications are transmitted (criterion ii) remains largely unknown. This question can be addressed by artificially recruiting a histone modifier to chromatin using the GAL4/upstream activation sequence (UAS) system and then measuring the persistence of the resultant histone modification through cell division after terminating the expression of the histone modifier. The GAL4/UAS system can also be used to demonstrate that histone modifications cause (not only correlate with) a transcriptional response (criterion iii) (31). To date, only short-term (4 days) transmission of H3K27me3 in cultured human cells has been observed (33, 45), but doubts remain regarding incomplete repression of the GAL4-fused histone modifier. In addition, Polycomb repressive complex 1 remains bound to chromatin (independently of histone modifications) during DNA replication in vitro (46), and MLL (a trxG protein) appears to associate with mitotic chromosomes (47), suggesting that some chromatin modifiers may also function directly as cis epigenetic signals.

Some epigenetic information is transmitted through meiosis and gamete formation in multicellular organisms, giving rise to transgenerational inheritance. Many epigenetic signals appear capable of meiotic transmission, including maternally deposited TFs and piRNAs (48), RNAs involved in paramutation in mice (49), histone modifications in sperm chromatin (50), and DNA methylation in plants (51).

Finally, epigenetic signals are also transmitted from cell to cell in a horizontal fashion (5). This phenomenon is at the basis of the inheritance of RNA interference in Caenorhabditis elegans and occurs in plants, both in the germ line and in the soma (52, 53), two cases in which sequence information is transmitted across cells to silence transposable elements. As no genomic DNA is exchanged between these cells, the epigenetic information must be transmitted in trans. In fact, it takes the form of sRNAs that direct DNA methylation of genomic sequences, converting a trans epigenetic signal into a cis epigenetic state (53).

Reversal

Epigenetic regulatory mechanisms are conservative in that no information is lost, and, given the appropriate signal, an epigenetic state can transition to a different one, as exemplified by the generation of induced pluripotent stem (iPS) cells from mouse embryonic fibroblasts (MEFs) by transient overexpression of cocktails of TFs (54). Although embryonic stem (ES) cells and MEFs have very different gene expression profiles, H3K4me and H3K27me distribution, and DNA methylation patterns, MEF-derived iPS cells closely resemble ES cells, both at the transcriptional and epigenetic levels (55). These observations not only demonstrate the plasticity of epigenetic signals, but also confirm the interdependence between cell identity and epigenetic states.

Reversible chromatin changes and antagonism between TFs provide the basis for cellular plasticity. Opposing TF networks reinforced by feedback loops direct the specification of hematopoietic and embryonic lineages (12, 56). Histone modification profiles are also the result of a delicate balance between antagonistic pairs of histone-modifying enzymes—for example, histone acetyl-transferases versus deacetylases and histone methyl-transferases versus demethylases (7).

Nonetheless, the forced transition between two metastable epigenetic states requires a considerable “activation energy,” as evidenced by the poor efficiency of epigenetic reprogramming by nuclear transfer or overexpression of pluripotency factors. Cells that fail to fully overcome this barrier are trapped in an intermediate state, probably because of a failure in resetting epigenetic signatures (55). For example, improper silencing by DNA hypermethylation and histone hypoacetylation of the imprinted Dlk1-Dio3 cluster correlates with the failure of many iPS cell lines to generate chimeras (57). Thus, small-molecule inhibitors of histone modifiers and DNA methyl-transferases that stimulate reprogramming may do so by facilitating the creation of an “epigenetic tabula rasa” (58).

The most dramatic epigenetic plasticity is observed during early embryo development and germline specification in mammals. At fertilization, the condensed paternal genome is hypermethylated, until a wave of DNA demethylation restores it to an active state (59). Notably, some imprinted genes are not affected, suggesting that they must be marked with a different epigenetic signal (60). During development, germ cells undergo a second wave of DNA demethylation (61), possibly in response to environmental cues in the gonads, demonstrating that even the most stable epigenetic signal can be reversed by the appropriate stimuli.

These two cases in which genome-wide DNA methylation is rapidly lost likely involve the most elusive mode of epigenetic plasticity: DNA demethylation (62). Two mechanisms for DNA demethylation appear possible: (i) active demethylation through oxidation and (ii) base excision repair (BER). Hydroxylation of 5-methylcytosine by TET1-3 (63, 64) argues in favor of the first mechanism, although the resulting 5-hydroxymethylcytosine must be further converted to unmethylated cytosine via presently unknown mechanisms. In plants, genetic evidence supports DNA demethylation via BER (65), but the importance of this pathway in animals is unclear. Activation-induced deaminase is involved in DNA demethylation in mammalian cells (66, 67), yet its inability to act on dsDNA (68) and the lack of developmental defects in Aid−/− mice argue against a central role in DNA demethylation and development (69, 70).

Thus, epigenetic states are stable enough to maintain cell identity, but they are also reversible enough to allow, when the proper signals are delivered, transitions among states. For example, the epigenetic plasticity of germ cells allows for cell specialization during gamete development, while retaining the potential for returning to the ground state of totipotency after fertilization.

Future Directions

Epigenetic signals have traditionally been studied in the context of development and transposon control, but we believe that epigenetic mechanisms will be the crux of many other biological processes. For example, brain function and memory formation may involve epigenetic signals (71, 72). Although most neurons do not divide, they are very long-lived and must maintain a defined transcriptional state for an extremely long time. By our own definition, this phenomenon is not truly epigenetic, yet signals that evolved to maintain epigenetic states through cell divisions might have been co-opted by neurons to stabilize transcriptional profiles after the establishing stimulus has disappeared (72).

Another area in which a better understanding of epigenetic signals may benefit research is that of stem cell function. Stem cells divide asymmetrically, giving rise to one self-renewing cell and one cell committed to differentiate (73). This is usually attributed to external cues and asymmetric partitioning of trans-acting factors, but evidence suggests that chromatids (and any cis epigenetic signal they may carry) can also be segregated asymmetrically (74). Thus, asymmetric partitioning of epigenetic signals may participate in cell-fate decisions, as proposed in the “silent sister” hypothesis (75).

Epigenetics is also likely to affect aging, given that epi-mutations, especially in DNA methylation, may contribute to the aging phenotype (76). Epigenetic transmission is more error prone than DNA replication (43) and probably accumulates large amounts of inheritable errors over a lifetime. Pathways that impinge on chromatin states are also implicated in aging (77), but we are far from understanding the molecular mechanisms involved in this process.

Conserved molecular machineries manage genetic information in all organisms, yet different epigenetic signals predominate in different species. For this reason, epigenetic research has been conducted on an unusually broad spectrum of model organisms. If we had studied only Drosophila, we would know little about DNA methylation. Extending our studies to additional model organisms will undoubtedly be beneficial. Two emerging models appear to be very promising: (i) the planarian Schmidtea mediterranea, which displays incredible epigenetic flexibility in regenerating a complete adult from small patches of dissected tissue (78), and (ii) social insects, in which a single genome gives rise to epigenetically distinct castes that display dramatic physiological, morphological, and behavioral differences (79, 80).

References and Notes

  1. More generally, any phenotypic change may be transmitted epigenetically. For the sake of simplicity, we only discuss epigenetic control of gene expression in this Review. For an example of epigenetic transmission of altered protein function, see (81).
  2. We thank L. Vales for critical reading of this manuscript. Work in the Reinberg laboratory is supported by HHMI and NIH grants GM064844 and GM037120. R.B. is a fellow of the Helen Hay Whitney Foundation.