Epigenetic balance of gene expression by Polycomb and COMPASS families


Science  03 Jun 2016:
Vol. 352, Issue 6290, aad9780
DOI: 10.1126/science.aad9780

A balancing act in modifying chromatin

Chromatin modifiers add chemical groups to histones, the proteins that package DNA. These modifications are central to cellular development, and mutations in their molecular machinery are linked to a variety of human diseases. Piunti and Shilatifard review the balance between the prototypic chromatin modifiers Polycomb and COMPASS complexes and their role in gene regulation and normal development. Although these complexes were originally identified as indispensable regulators of fruit fly development, related roles have been identified in other organisms. Furthermore, mutations in human homologs have been implicated in various cancers. As such, these complexes may serve as effective targets for epigenetic therapies.

Science, this issue p. 10.1126/science.aad9780

Structured Abstract


Multicellular organisms depend on the precise orchestration of gene expression to direct embryonic development and to maintain tissue homeostasis throughout their life spans. Exactly how such cell type–specific patterns of gene expression are established, maintained, and passed on to the next generation is one of the most fundamental questions of biology. Eukaryotes package their DNA into nucleosomes to form chromatin fibers. Chromatin plays a central role in regulating accessibility to DNA in many different DNA-templated processes, including the transcription of DNA into RNA. Transcriptional control through sequence-specific DNA binding factors (transcription factors) is well established; however, proteins that change chromatin structure (chromatin modifiers and remodelers) provide an additional layer of regulation and are considered major epigenetic determinants of cell identity and function. Among the numerous chromatin modifiers, the members of the Polycomb group (PcG) and the Trithorax group (TrxG) of proteins, in particular, have been scrutinized genetically and biochemically for decades. However, new and unexpected functions for these complexes are constantly emerging because of the intense interest in the critical role these proteins play in maintaining a balanced state of gene expression.


In the classical view, PcG and TrxG proteins regulate the repressed and activated states of gene expression, respectively. Both PcG and TrxG are organized in multiprotein complexes, which include the Polycomb repressive complex 1 and 2 (PRC1 and PRC2, respectively), and the complex of proteins associated with Set1 (COMPASS) family. Polycomb and COMPASS families are well known for their opposing roles in balancing gene expression, a phenomenon initially characterized using classical Drosophila melanogaster genetic approaches at a time when their biochemical functions were still unknown. Later studies demonstrated that Polycomb and COMPASS complexes have enzymatic activities modifying different sites on a common target, the nucleosome. Nucleosomes can be posttranslationally modified in a variety of ways, many of which strongly correlate with different states of gene expression. Through their ability to regulate gene expression, several components of both the Polycomb and COMPASS complexes are involved in a plethora of crucial biological processes ranging from the regulation of embryonic development to widespread involvement in neoplastic pathogenesis.


Recent genome-wide studies have demonstrated that a large number of the components of the Polycomb and COMPASS families are often mutated in different forms of cancer. Some are loss-of-function (LOF) mutations, which result in gene deletion or early termination, whereas gain-of-function (GOF) mutations increase or change the normal activities of these proteins. Although it cannot be excluded that some of those are passenger rather than driver mutations, they suggest a relevant function of these proteins in tumorigenesis. Moreover, animal models have provided convincing evidence supporting a role for these complexes in tumor progression. However, even after decades of study, how Polycomb and COMPASS control normal or aberrant gene regulatory networks is not yet fully understood. From the perspective of their catalytic activities, the degree to which catalytic versus noncatalytic functions contribute to their roles in development and cancer has just begun to emerge. Concurrently, the possibility that PcG and TrxG enzymatic activities modify non-nucleosome substrates remains a fascinating, although largely unexplored, hypothesis. Ongoing efforts to decipher how mutations affecting members of these complexes disturb transcriptional balances and promote oncogenesis could provide critically needed new strategies for cancer therapeutics.

The balanced state of gene expression. The scale symbolizes the transcriptional status of a gene. Each dish contains a nucleosome that is either lysine 4 trimethylated (green light) or lysine 27 tri-methylated (red light) on histone H3 (one tail is depicted for simplicity). These two histone marks strongly correlate, respectively, with transcriptional activation induced by COMPASS and transcriptional repression induced by PcG. [Figure by Mark Miller. Nucleosomes are adapted from a custom model from 3D Molecular Designs]


Epigenetic regulation of gene expression in metazoans is central for establishing cellular diversity, and its deregulation can result in pathological conditions. Although transcription factors are essential for implementing gene expression programs, they do not function in isolation and require the recruitment of various chromatin-modifying and -remodeling machineries. A classic example of developmental chromatin regulation is the balanced activities of the Polycomb group (PcG) proteins within the PRC1 and PRC2 complexes, and the Trithorax group (TrxG) proteins within the COMPASS family, which are frequently mutated in a large number of human diseases. In this review, we will discuss the latest findings regarding the properties of the PcG and COMPASS families and the insight they provide into the epigenetic control of transcription under physiological and pathological settings.

Any given cell is constantly challenged by different endogenous or exogenous perturbations or signals. The ability of cells to integrate these stimuli and respond to them largely involves precise and complex changes in the cellular transcriptome that can either be transient (e.g., stress response) or permanent (e.g., cellular differentiation). Although transcription factors are essential for driving the transcriptional response as one of the immediate steps of gene expression, chromatin modifiers and remodelers, collectively referred to as “epigenetic factors,” are indispensable for shaping and maintaining the transcriptional response during biological processes, such as development (1). Indeed, once transcription factors instruct the transcriptional output, in some cases, the resulting change is meant to be maintained and inherited during cell divisions by epigenetic factors. These factors generally function as part of a multiprotein complex to modify nucleosomes, the main subunit of chromatin. Nucleosomes are mainly composed of DNA and histones, both of which can be extensively modified. DNA changes—such as methylation, hydroxymethylation, and further oxidations—play important roles in gene expression. Histones can undergo a plethora of different posttranslational modifications including methylation, acetylation, phosphorylation, and ubiquitination, and these modifications strongly correlate with different transcriptional outcomes, depending on the context of the modified residues (2, 3). Some histone modifications are associated with a specific transcriptional status independently of the specific modified residue. Indeed, lysine acetylation on histones is generally regarded as a positive regulator of transcription. However, other modifications, such as histone methylation, correlate with different transcriptional contexts depending on the specific residue modified.
A classic example is given by the opposite activity of histone methylation implemented by the Polycomb group (PcG) and Trithorax group (TrxG) of proteins. These groups of proteins include additional chromatin modifiers and, during the past three decades, have been among the most extensively studied epigenetic factors. Proteins belonging to the PcG and TrxG families were originally described in Drosophila melanogaster to respectively repress or activate the same developmental target genes (4, 5). Thus, the antagonistic activities of the PcG and TrxG proteins establish a balanced and regulated state of gene expression. This review provides an overview of the biochemical composition and enzymatic properties of the Polycomb and Trithorax groups of protein complexes and examines recent findings on the role of these factors in regulation of gene expression in the embryonic stem cells and throughout development. Further, we discuss how the misregulation of the Trithorax and Polycomb group enzymatic activities results in the pathogenesis of human disease, including cancer.

Trithorax and Polycomb groups of proteins

Trithorax group proteins (TrxGs) are a heterogeneous group of factors with varied activities mainly related to chromatin modification and remodeling to activate transcription (5). As genetically defined in D. melanogaster, genes classified to the TrxG family are involved in the activation of developmental genes, such as homeotic genes, and in counteracting the silencing activity of the Polycomb group proteins (PcGs) (6). The Trithorax (Trx) gene was serendipitously discovered in D. melanogaster because of the homeotic phenotype of the partial transformation of a haltere to a wing as the result of a spontaneous mutation (7). Evidence of its activity as a transcriptional activator goes back >30 years, when it was demonstrated that mutations in Trx prevented the derepression of the bithorax complex (Hox genes) in a PcG mutant background (8). Other Drosophila TrxG proteins include the adenosine triphosphate–dependent nucleosome remodeler Brahma (Brm), related to mammalian SWI2 (9), and the Drosophila protein female sterile homeotic (Fsh), related to mammalian BRD4. These factors were initially discovered as homeotic modulators in D. melanogaster, and both contain bromodomains that bind acetylated lysine localized at active chromatin (10).

Like TrxGs, Polycomb group proteins (PcGs) were initially identified genetically in Drosophila as a heterogeneous group of factors whose mutations resulted in homeotic defects during fly development (11). The first Polycomb phenotype was described in 1947 by Pamela Lewis as an x-ray–induced Drosophila mutant displaying an abnormal number of sex combs and an Antennapedia-like phenotype of antenna-to-leg transformation (12). Thirty years later, the first Polycomb gene (Pc) locus was mapped (13), and the connection between Pc and repressive activity on the Bithorax complex (BX-C) locus was realized (14). The antagonistic roles of PcG in transcriptional repression and TrxG in transcriptional activation of homeotic genes in Drosophila led to the paradigm of the tight regulation of transcription by PcG and TrxG for the spatiotemporal control of developmental genes (8, 15).

After the identification of Pc, many other Polycomb group genes were identified in Drosophila on the basis of their similarity to the Pc homeotic phenotype (16). Subsequent work demonstrated that many PcG and TrxG proteins are organized in multiprotein complexes that have chromatin-modifying activities (17). In contrast to the TrxG, no PcG proteins exist in Saccharomyces cerevisiae, although homologs are present in the filamentous fungus Neurospora crassa and in the budding yeast Cryptococcus neoformans (18, 19). This difference in conservation suggests that TrxG proteins are likely to have functions that are independent of PcG and go beyond the classical counteraction of PcG activity. Here, we provide a current analysis of the balanced state of gene expression orchestrated by the interaction of Trx and related proteins within the COMPASS family and PcG family proteins of the PRC1 and PRC2 complexes. We will also discuss how the misregulation of their activities tips the balanced state of gene expression, which results in disease pathogenesis, including cancer (Tables 1 to 3).

Table 1 The COMPASS family in development and cancer.

The COMPASS family components in S. cerevisiae, D. melanogaster, and Homo sapiens are listed. Developmental role of each component is reported as a phenotypical outcome arising from knockout mice for each of them as described in the literature. Representative examples of each component’s role in cancer are indicated. Additional abbreviations: E#, embryonic day; ND, not determined; AML, acute myeloid leukemia; LOF, loss of function; NPCs, neural precursor cells; HR, homologous recombination.

Table 2 PRC2 in development and cancer.

PRC2 components in D. melanogaster and H. sapiens are listed. Developmental role of each component is reported as phenotypical outcome arising from knockout mice for each of them as described in the literature. Representative examples of each component’s role in cancer are mentioned. Additional abbreviations: GOF, gain of function; MDS, myelodysplastic syndrome; ESS, endometrial stromal sarcoma; NSCLC, non–small cell lung cancer.


TrxG family members and COMPASS-like complexes are histone H3K4 methylases

The cloning of the mixed-lineage leukemia (MLL) gene revealed that this proto-oncogene shares extensive homology with Drosophila Trx (5, 17, 20). In addition, the yeast protein Set1 has strong sequence similarity to the C terminus of Trx and MLL (21). Endeavoring to understand the function of MLL in leukemia, yeast Set1 was purified biochemically to homogeneity and found within a multiprotein complex named COMPASS (complex of proteins associated with Set1) (21). Set1 within COMPASS was the first histone H3 lysine 4 (H3K4) methylase purified and biochemically characterized (5, 21, 22).

After its discovery in S. cerevisiae (21), COMPASS homologs that methylate histone H3K4 were readily identified in fruit flies, mice, and humans, which suggests the fundamental importance of these multiprotein complexes during evolution (23). Although yeast COMPASS (yCOMPASS) is present as a single H3K4 methyltransferase complex in S. cerevisiae, there are three different COMPASS-like complexes in D. melanogaster, and six in mammals that represent duplications of each of the three Drosophila complexes (24, 25) (Table 1 and Fig. 1). These three types of complexes can be divided into Set1-like, Trx-like, and trithorax-related (Trr)–like (Fig. 1). Set1 and/or COMPASS is responsible for global di- or trimethylated histone H3 at lysine 4 (H3K4me2/3) deposition in Drosophila (24, 26). Although Trx mutations result in homeotic and other developmental defects and Trx’s role in the positive regulation of the Ubx gene is well known (27), the overall requirement for Trx’s histone methyltransferase (HMT) activity in vivo is not yet completely defined. Down-regulation of Trx in imaginal discs or in Drosophila S2 cells does not dramatically reduce global histone H3K4 methylation levels (24, 26). This suggests that Trx may control histone H3K4 methylation deposition at a specific subset of genes. Indeed, Trx binds to and is required for the transcription of genes containing promoter-associated TrxG- and PcG-responsive elements (TRE and PRE) (28). TRE and PRE are specifically found in Drosophila and are defined as the minimal genomic region capable of recruiting PcG proteins to chromatin (6). These sequences create a highly dynamic platform for PcG and TrxG proteins that, in response to different stimuli, can maintain, respectively, the repressive or active transcriptional status of genes that were initially induced by specific transcription factors and thus generate what is called “epigenetic memory” (4, 29).

Fig. 1 The COMPASS family of histone H3K4 methylases.

(Left) mCOMPASS and its activity exerted by SET1A/B on mono-, di-, and trimethylation of lysine 4 on histone H3. (Center) mCOMPASS-like activity at enhancers exerted by the monomethyltransferase activity of MLL3/4-COMPASS on lysine 4 of histone H3. Highlighted is the putative demethylase function of UTX (member of MLL3/4 COMPASS–like) that can counteract H3K27me2/3 repressive marks deposited by PRC2 and thus possibly favor H3K27ac deposition by the histone acetyltransferases (HATs) CBP/p300. (Right) mCOMPASS-like activity at bivalent promoters exerted by MLL2/COMPASS as a trimethyltransferase toward lysine 4 on histone H3. This is suggested to positively regulate, when stimulated, the expression of genes transcriptionally poised because of the restrictive presence of PRC2 activity. MLL1/COMPASS is also a trimethylase that functions in the regulation of developmental gene expression. In the figure, only one N-terminal tail of one histone H3 is depicted. For a detailed description on the composition of the COMPASS families and the regulation of their enzymatic and catalytic properties, see (5). TSS, transcription start site.

Drosophila Trr/COMPASS and MLL3/4 COMPASS in enhancer function

The specific spatiotemporal transcriptional regulation of developmental genes is controlled by cis-regulatory elements known as enhancers (30–33). Given that only 2% of the human genome encodes proteins, vast expanses of noncoding DNA may serve as regulatory elements and enhancers. Enhancers communicate with promoters through DNA looping (Figs. 1 and 2) and can be classified as poised or active, depending on their potential or active support to transcription. Typically, poised enhancers can be found close to developmental genes ready to be activated by environmental stimuli, whereas active enhancers support the transcription network characteristic of the specific cell type, which ensures proper cell identity. Epigenetic and chromatin studies have demonstrated that the enhancer regions can be characterized by containing high levels of histone H3K4me1 and histone H3K27me3 for poised enhancers or H3K27 acetylation for active enhancers (25). In Drosophila, H3K4me1 is largely dependent on the COMPASS family member Trr (Trithorax-related), a specific histone H3K4 monomethyltransferase that acts in a dCOMPASS-related multiprotein complex (25) (Fig. 1). On active enhancers, the implementation of histone H3K27 acetylation (H3K27ac) is thought to be accomplished by cAMP response element–binding protein (CREB)–binding protein (CBP) or p300 (34). Histone H3K27ac is also deposited at active promoters and positively correlates with histone H3K4me3 (35). Together, these studies demonstrate functional specification among the COMPASS family in Drosophila development.
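
The chromatin signatures described above can be read as a simple decision rule. The Python sketch below is only illustrative: the function name `classify_enhancer` and the representation of marks as a set of strings are assumptions introduced here, not something taken from the cited studies, and real enhancer annotation relies on quantitative signal rather than presence or absence calls.

```python
def classify_enhancer(marks):
    """Classify an enhancer region from the histone marks detected on it.

    Hypothetical helper following the rule of thumb in the text:
    H3K4me1 together with H3K27me3 suggests a poised enhancer,
    whereas H3K27ac suggests an active enhancer.
    """
    marks = set(marks)
    has_ac = "H3K27ac" in marks
    has_me3 = "H3K27me3" in marks

    # Acetylation and trimethylation of H3K27 are opposing signals;
    # a region reporting both is left unclassified in this sketch.
    if has_ac and not has_me3:
        return "active"
    if has_me3 and not has_ac and "H3K4me1" in marks:
        return "poised"
    return "unclassified"
```

For example, `classify_enhancer({"H3K4me1", "H3K27me3"})` returns `"poised"`, and `classify_enhancer({"H3K4me1", "H3K27ac"})` returns `"active"`.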

Fig. 2 PRC2.

(Top) PRC2, composed of the core components EED, SUZ12, RBAP46/48, JARID2, and EZH2, can deposit all three levels of methylation (mono, di, and tri) on lysine 27 of histone H3. PRC2 di- and trimethylates enhancer regions, which prevents CBP/p300 from accessing and acetylating those regions, a modification that could render them active enhancers. Bivalent regions, marked with H3K4me3 by MLL2/COMPASS and H3K27me3 by PRC2, are also represented. (Bottom) PRC2 has a recently described activity in gene bodies where it can monomethylate H3K27 and favor gene expression. (Top right corner) Schematic of the misregulation of the implementation of different methylation states by PRC2 in cancer. The EZH2 Y641 oncogenic mutant–containing PRC2 has enhanced enzymatic activity on an H3K27me2 substrate; therefore, it makes more H3K27me3, in contrast to wild-type EZH2–containing PRC2, whose primary enzymatic activity is toward unmodified or H3K27me1 substrates. In the figure, only one N-terminal tail of one histone H3 is depicted.

Mammalian COMPASS family

In mammals, all of the genes encoding the central HMT proteins that form the different COMPASS and COMPASS-related complexes are duplicated (5, 36) (Table 1). The central core of all of the complexes is called WARD (or WRAD), because it is composed of WDR5, ASH2, RBBP5, and DPY30 factors, and it is shared and functionally required by all of the COMPASS and COMPASS-related complexes (37, 38) (Fig. 1). Set1A/B are the central H3K4 trimethylases for the mammalian COMPASS (mCOMPASS), which has been shown to provide the default global histone H3K4me3 activity in mammalian cells (5, 39). However, the extent to which the overall biological function of mCOMPASS is related to its HMT activity remains poorly characterized in mammals.

MLL1 and MLL2 are the two mammalian HMT homologs of Drosophila Trx, and like Trx, their proteins are found within similar macromolecular complexes called COMPASS-related complexes (36) (Fig. 1). Although they share the same domain architecture, MLL (KMT2A or MLL1) and MLL2 (KMT2B) have nonredundant functions, as demonstrated by the frequent leukemogenic chromosomal translocations that involve MLL1, but never the MLL2 gene (40). Both MLL1 and MLL2 can be part of a highly similar COMPASS-related complex, although they are mutually exclusive, with no MLL2 being found in MLL1/COMPASS and vice versa (41, 42). Consistent with their nonredundant functions, MLL2 has been demonstrated to be mainly responsible for the histone H3K4me3 deposition at bivalent clusters in mouse embryonic stem cells (mESCs), with MLL1 having little or no major role in histone H3K4me3 deposition in these cells despite its being expressed (36, 43). Histone H3K4me3 at bivalent promoters was proposed to be essential for rapid transcriptional induction from prodifferentiation stimuli; however, despite the almost complete absence of this histone mark at bivalent promoters upon MLL2 depletion or deletion, rapid transcription induced by retinoic acid in mESCs is not altered (36, 43).

Another major source of histone H3K4 methylation comes from the COMPASS-related complexes containing MLL3 (KMT2C) or MLL4 (KMT2D) (36) (Fig. 1 and Table 1). These two HMTs are the mammalian homologs of the Drosophila HMT Trr (25). Like Trr in Drosophila, MLL3 and MLL4 COMPASS-related complexes control the H3K4me1 deposition at enhancer elements in mammalian cells (44) (Fig. 1). However, MLL3/4-dependent H3K4me1 deposition can also be found at the promoters of myoblast genes when they are repressed; depletion of MLL3/4 induces the transcription of these genes, which coincides with the recruitment of both mCOMPASS and MLL1/mCOMPASS (45). These findings suggest that multiple mCOMPASS family members can coregulate a subset of genes during development, albeit at different regulatory steps (Fig. 1).
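
The homology relationships among the COMPASS family members described above can be summarized in a small lookup table. The Python sketch below is a hypothetical illustration: the dictionary layout and the names `COMPASS_FAMILY`, `count_complexes`, and `mammalian_homologs` are assumptions made here for clarity, and only the catalytic subunits named in the text are listed.

```python
# Catalytic subunits of COMPASS and COMPASS-like complexes, grouped by type.
# Illustrative table built from the text; not an exhaustive annotation.
COMPASS_FAMILY = {
    "Set1-like": {"yeast": ["Set1"], "fly": ["Set1"], "mammal": ["SET1A", "SET1B"]},
    "Trx-like": {"yeast": [], "fly": ["Trx"], "mammal": ["MLL1", "MLL2"]},
    "Trr-like": {"yeast": [], "fly": ["Trr"], "mammal": ["MLL3", "MLL4"]},
}


def count_complexes(species):
    """Count the COMPASS-type methyltransferases listed for a species."""
    return sum(len(members[species]) for members in COMPASS_FAMILY.values())


def mammalian_homologs(fly_protein):
    """Return the mammalian homologs of a Drosophila COMPASS methyltransferase."""
    for members in COMPASS_FAMILY.values():
        if fly_protein in members["fly"]:
            return list(members["mammal"])
    raise KeyError(fly_protein)
```

With this table, `count_complexes` gives one complex for yeast, three for the fly, and six for mammals, matching the duplication pattern described in the text, and `mammalian_homologs("Trr")` returns `["MLL3", "MLL4"]`.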

PcG proteins: Transcriptional repressive complexes

The PcG proteins have been extensively studied in mammals in relation to development and cancer (Tables 2 and 3); however, a unifying theory of PcG’s role in mammals is still lacking (46). The PcG proteins are organized into two types of macromolecular complexes: Polycomb repressive complex 1 (PRC1) and Polycomb repressive complex 2 (PRC2). As their names suggest, these complexes are generally regarded as transcriptional repressors; however, a growing body of evidence suggests that they may also play a role in transcriptional activation in a context-specific manner (46, 47).

Table 3 PRC1 in development and cancer.

PRC1 components in D. melanogaster and H. sapiens are listed. Developmental role of each component is reported as a phenotypical outcome arising from knockout mice for each of them as described in the literature. Representative examples of each component’s role in cancer are mentioned.


PRC2 is the smaller of the two complexes and has a well-defined core composition. This complex is characterized by the presence of EZH2 (or its homolog EZH1), a SET domain–containing protein that has HMT-specific activity toward histone H3 lysine 27 (H3K27) (48). The other three core PRC2 components are EED, SUZ12, and RBAP46/48. EED and SUZ12 are essential for PRC2 complex integrity and for EZH2/1 catalytic activity, whereas RBAP46/48 are most likely involved in nucleosome recognition, but dispensable for PRC2 catalytic activity (49–53).

Along with its core components, PRC2 is also characterized by several other components, which have been reported to regulate its molecular and biological functions (Fig. 2). JARID2 has been demonstrated to globally regulate PRC2 binding to chromatin in ESCs (54–56). Recently, it was also shown that JARID2 is a nonhistone substrate of PRC2, and methylated JARID2 is a binding partner for the EED component of PRC2 (57). Another component of PRC2, the zinc finger protein AEBP2, can function with JARID2 to potentiate PRC2 catalytic activity toward unmodified nucleosome substrates in vitro, but their precise roles in modulating PRC2 enzymatic activity in vivo are still debated (58). Other interesting noncore PRC2 components are the Polycomb-like proteins (PCL1-3), also known respectively as PHF1, MTF2, and PHF19 (47). All the PCL proteins have been shown to regulate PRC2 catalytic function and recruitment (53, 59, 60). Of particular interest, PCL3/PHF19 was initially suggested to inhibit PRC2 catalytic activity by the recognition of and binding to H3K36me3 through its Tudor domain (61). This was consistent with the inhibitory activity exerted on PRC2 by activating histone marks in vitro (62). However, other studies have shown that PHF19 is crucial for PRC2 recruitment and functions at several promoters in mESCs through the recognition of H3K36me3. This modification is then removed by certain H3K36me3 demethylases, such as NO66 and KDM2B, which allows PRC2-dependent deposition of H3K27me3 and gene silencing (63–65). The H3K36me3/PRC2 duality is another example of how the balanced state of gene transcription through histone-modifying machineries can be achieved.

PRC2 and transcriptional repression

Despite years of intense investigation, precisely how PRC2 localizes to specific regions of chromatin in mammalian cells is unclear. In Drosophila, the PREs mediate PRC2 recruitment. PREs can be a few hundred base pairs in length and likely encompass binding sites for multiple transcription factors that help to recruit PRC2, including Zeste, GAGA factor, and PHO (66). PHO is most closely related to mammalian YY1; however, YY1 is not part of mammalian PRC2 and does not recruit PRC2 to chromatin (67). Accordingly, no mammalian counterpart for a PRE has been identified (68). The CpG-rich domains (CpG islands) have been shown to be overrepresented in PRC2 binding regions and are thought to contribute to PRC2 recruitment (69); however, it is not clear how such a low percentage of the CpG islands is selected as PREs in mammalian cells. Recently, it was proposed that transcriptionally silent unmethylated CpG islands recruit PRC2 (70, 71). Forced recruitment of PRC2 to transcriptionally active chromatin can induce transcriptional silencing, and this takes place independently of its catalytic activity (72). Endogenous PRC2, which has wild-type catalytic activity, is required to maintain the repressed state (72). However, recently, it was shown that PRC2 is, in general, dispensable for transcriptional repression in mESCs but is required for the proper maintenance of repression during differentiation (71).

Histone H3K27 methylation by PRC2 and developmental regulation

The PRC2 complex controls the three different methylation states of histone H3K27 (H3K27me1/2/3) in mESCs, and the distribution of these marks is mutually exclusive. Whereas H3K27me3 is mainly deposited at promoters, especially at bivalent genes (Fig. 2), H3K27me2 localizes to the intergenic regions, and H3K27me1 is deposited in the gene bodies of actively transcribed genes (73) (Fig. 2). Although H3K27me3 is required for the maintenance of transcriptional repression during mESC differentiation (71), H3K27me2 can serve to protect against spurious H3K27ac deposition and the firing of unscheduled enhancers, whereas H3K27me1 can facilitate the transcriptional activation of genes required for mESC differentiation (73) (Fig. 2). Even though the H3K27me1 deposition in gene bodies is dependent on PRC2, residual levels of this modification are still detectable in PRC2-null mESCs (73), which suggests that one or more other enzymes are responsible for its deposition; however, this issue remains unexplored.
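
The mutually exclusive distribution of the three H3K27 methylation states in mESCs, as summarized above, maps naturally onto a lookup table. The Python sketch below is a hypothetical illustration: the names `H3K27_STATES` and `describe_h3k27` and the wording of each entry are assumptions introduced here to restate the text, not data from the cited work.

```python
# Genomic location and proposed role of each PRC2-dependent H3K27
# methylation state in mESCs, paraphrased from the text above.
H3K27_STATES = {
    "H3K27me1": {
        "location": "gene bodies of actively transcribed genes",
        "role": "facilitates activation of genes required for differentiation",
    },
    "H3K27me2": {
        "location": "intergenic regions",
        "role": "protects against spurious H3K27ac and unscheduled enhancer firing",
    },
    "H3K27me3": {
        "location": "promoters, especially at bivalent genes",
        "role": "maintains transcriptional repression during differentiation",
    },
}


def describe_h3k27(mark):
    """Return the location and proposed role recorded for an H3K27 methylation state."""
    try:
        return H3K27_STATES[mark]
    except KeyError:
        raise ValueError(f"unknown H3K27 methylation state: {mark!r}") from None
```

For instance, `describe_h3k27("H3K27me2")["location"]` returns `"intergenic regions"`.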


PRC1 is larger than PRC2 and is more heterogeneous in composition, particularly in mammalian cells (Table 3 and Fig. 3). RING1A and RING1B (abbreviated to RING1A/B) are the central PRC1 components, and since their identification, they have been reported to be part of different multiprotein complexes (74–77). Only recently, a detailed and comprehensive biochemical characterization identified PRC1 as a family of biochemically distinct complexes (78). RING1A/B are E3 ubiquitin ligases that monoubiquitinate lysine 119 of histone H2A (H2AK119ub) (79). H2AK119ub is proposed to facilitate chromatin compaction and transcriptional silencing, although RING1B ubiquitin ligase activity is dispensable for the repression of Hox loci in mESCs (80). Very recently, however, it was demonstrated that the E3 enzymatic activity of RING1B is dispensable for early mouse development and for global target gene repression in mESCs and in Drosophila embryos, which raises the question of the precise function of this posttranslational modification (81, 82).

Fig. 3 PRC1.

(Left) The canonical PRC1 contains CBXs, which mediate the recognition of the H3K27me3 mark deposited by PRC2. PCGF2/4 proteins assist the E3 ubiquitin ligases RING1B/A in mediating the monoubiquitination of H2AK119. (Right) The noncanonical PRC1 contains RYBP, which is a common component of all of the noncanonical PRC1 complexes, and can also contain any of the different PCGF proteins (PCGF1/3/5/6). The binding of noncanonical PRC1 to chromatin is PRC2-independent but has been implicated in the recruitment of PRC2 through the deposition of monoubiquitinated H2AK119. In the figure, only one N-terminal tail of one histone H2A or H3 is depicted.

PRC1 complexes can be classified as being canonical or noncanonical (Fig. 3), with canonical PRC1 containing one of the Polycomb-like chromobox homolog (CBX) proteins that recognizes the H3K27me3 mark implemented by PRC2 (78). Canonical PRC1 complexes also contain BMI1 (PCGF4), which is required for the correct formation of canonical PRC1 and which is also required for RING1A/B E3 ligase activity (83). BMI1’s closest homolog, MEL18 (PCGF2), can substitute for BMI1 to form a stable PRC1 complex but fails to enhance RING1A/B enzymatic activity in vitro (83). Recognition of H3K27me3 by the CBX proteins could allow PRC1 recruitment, subsequent H2AK119ub deposition, chromatin compaction, and transcriptional silencing (84). This is generally regarded as the canonical sequential PRC2-dependent PRC1 recruitment model (Fig. 3).

The existence of noncanonical PRC1 was first suggested when it was demonstrated that PRC2 depletion does not cause a dramatic effect on the global H2AK119ub deposition in mESCs, which, in turn, suggests alternative mechanisms of PRC1 recruitment on chromatin (85) (Fig. 3). Indeed, recruitment of RYBP containing PRC1 was demonstrated to be PRC2-independent (85) (Fig. 3). Although canonical (PRC2-dependent and CBX-containing PRC1 subcomplexes) and noncanonical (PRC2-independent and RYBP/YAF2-containing PRC1 subcomplexes) are two distinct biochemical entities, they can both colocalize in mESCs on several developmental loci characterized by full repression and high PRC2 occupancy (78, 86).

The coexistence of canonical and noncanonical PRC1 can be explained by the recent finding that noncanonical PRC1 can recruit PRC2 on chromatin through the H2AK119ub mark (87) (Fig. 3). This is consistent with the findings demonstrating that PRC2 binds H2AK119ub-modified nucleosomes, which enhances PRC2’s H3K27me3 catalytic activity through AEBP2 and JARID2 in vitro (88). Although canonical and noncanonical PRC1 biochemical compositions have been clarified, their diverse biological and molecular activities remain to be fully investigated, along with a possible cooperation at specific genomic loci (86).
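
The canonical versus noncanonical distinction drawn above depends on which accessory subunits accompany RING1A/B. The Python sketch below is a simplified, hypothetical illustration: the function name `classify_prc1` and the use of subunit name strings are assumptions made here, and the rule encodes only the CBX versus RYBP/YAF2 criterion stated in the text.

```python
def classify_prc1(subunits):
    """Classify a PRC1 subcomplex from its list of subunit names.

    Hypothetical rule of thumb from the text: CBX-containing complexes
    are canonical (PRC2-dependent recruitment), whereas RYBP- or
    YAF2-containing complexes are noncanonical (PRC2-independent).
    """
    subunits = {name.upper() for name in subunits}

    # RING1A/B are the central catalytic components of every PRC1 complex.
    if not subunits & {"RING1A", "RING1B"}:
        return "not PRC1"

    has_cbx = any(name.startswith("CBX") for name in subunits)
    has_rybp_or_yaf2 = bool(subunits & {"RYBP", "YAF2"})

    if has_cbx and not has_rybp_or_yaf2:
        return "canonical"
    if has_rybp_or_yaf2 and not has_cbx:
        return "noncanonical"
    return "ambiguous"
```

For example, `classify_prc1(["RING1B", "CBX7", "PCGF4"])` returns `"canonical"`, and `classify_prc1(["RING1B", "RYBP", "PCGF1"])` returns `"noncanonical"`.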

Landscape of PRC1 and PRC2 in mESCs and cellular differentiation

Both TrxG and PcG have been implicated as central factors required in development and cellular differentiation from Drosophila to human (89) (Tables 1 and 2). Initial studies in Drosophila highlighted the importance of these two classes of genes in regulating the spatiotemporal transcriptional outcome at important homeotic loci, for example, Bithorax, through the integration of external stimuli and a dynamic competition at the regulatory elements (PRE/TRE) (90). In mammalian systems, several TrxG and PcG loss-of-function mouse models show severe developmental abnormalities, which further asserts the importance of these factors during mammalian development (91). One of the most important cellular models used to characterize the molecular functions of both TrxG and PcG proteins during cellular differentiation is the mESC. Several conditional or straight PcG and TrxG knockout mESCs have been generated and studied at both the transcriptional and phenotypical levels, which has uncovered the fundamental role of the epigenetic control of cellular differentiation and mouse development (92) (Tables 1 and 2). The importance of these proteins in ensuring correct lineage commitment was also highlighted by the strict requirement for them during cell reprogramming (93).

The PcG has been extensively studied in mESCs; however, many aspects of their function in these cells still remain to be elucidated (46). Strong molecular evidence that PcG directly contributes to the transcriptional program of mouse embryonic development comes from studies that mapped PcG target genes using chromatin immunoprecipitation coupled to promoter array (ChIP-chip) in mESCs (94). Both PRC1 and PRC2 proteins were demonstrated to be bound to key genes implicated in cellular differentiation and embryonic development in mESCs (94). As expected by their very early requirement during embryonic development, the PRC1 catalytic components RING1A/B are indispensable to maintaining mESCs’ pluripotency. Indeed, mESCs acutely deprived of Ring1b/a reveal an acute proliferation defect accompanied by a loss of pluripotency and the up-regulation of a set of PRC1 target genes involved in cellular differentiation programs or tissue development (95). RING1B and OCT3/4 co-occupancy was demonstrated at some loci that are essential for mESCs differentiation, with RING1B recruitment depending on OCT3/4 occupancy, and its loss correlating with gene derepression upon differentiation (95). However, a more thorough genome-wide analysis of co-occupied sites could better reveal an overall interaction between PRC1 and the core pluripotency network.

It is not surprising, given the reciprocal recruitment of PRC2 and PRC1 on common genomic loci in mESCs (Fig. 3), that the vast majority of PRC1-binding sites are also bound by PRC2, and this is especially evident at bivalent genes important for cellular differentiation and development (96). Even though PRC2 overlaps with PRC1 at many loci, PRC2 seems to have a less crucial role in mESCs. Indeed, mESCs without the core components of PRC2 maintain pluripotency and demonstrate minimal transcriptional alterations compared with wild-type mESCs (71). In contrast, the general differentiation potential of PRC2 null mESCs, as measured by the embryoid body formation assay, is severely altered (73). This potentially reiterates PRC2’s role during embryogenesis, that is, dispensable at very early stages (i.e., inner cell mass formation) but required when massive tissue specification and cellular differentiation occur (i.e., gastrulation). An intriguing and emerging area of PcG control of mESCs’ pluripotency is their ability to shape the spatial organization of the genome (97). This important activity was initially observed in Drosophila, mainly by using chromosome conformation–capturing techniques that revealed how PRE and, in general, PcG targets are organized in higher chromatin structures that are important for PcG-mediated gene silencing (98, 99). This regulation is conserved in mammals. Indeed, EED is indispensable for maintaining the interaction between PRC2-occupied promoters in mESCs without altering the overall chromatin structure organization (100). Only recently, RING1A and B were reported to control genome organization in mESCs (101). These main components of PRC1 spatially constrain four Hox gene clusters by maintaining promoter-promoter interactions. The acute genetic deletion of both of the main components of PRC1 induces the release of these interactions, without altering enhancer-promoter interactions, which leads to the subsequent transcriptional activation of Hox genes present in the clusters (101). An interesting and still open question in this area is whether or not the catalytic activities of both PRC1 and PRC2 are required for their roles in regulating higher chromatin architecture.

PcG recruitment to chromatin: A complex issue

An important issue, still not completely understood, concerns both PRC1 and PRC2 recruitment to chromatin in mESCs. Recent studies show that, although transcriptional silencing is sufficient to recruit PRC2 and trigger H3K27me3 deposition (70, 71), GC-rich DNA, including an exogenous bacterial-derived region, can recruit PRC2 in mESCs (69). Despite these observations, global transcriptional silencing is not sufficient to recruit PRC2 to all CpG islands present in the mouse genome (71, 102), which suggests alternative pathways for PRC2 recruitment other than silent CpG islands. Global genome-wide PRC2 binding to chromatin in mESCs is dependent on its component JARID2 (54–56). However, although a positive role of JARID2 in modulating PRC2 catalytic activity in vitro has been clearly demonstrated (58), this role in vivo is still unclear. JARID2 is proposed to be the DNA binding protein that mediates PRC2 recruitment on chromatin; however, the ARID domain (AT-rich interacting domain) of JARID2 has no preference toward a CG-rich–containing sequence in vitro (103), and in Drosophila, Jarid2 demonstrates both PcG-dependent and -independent functions (104). Considering that PRC2 binding on chromatin in mESCs is highly overlapping with CpG islands (71), this opens up the possibility that other DNA binding proteins besides JARID2 could be responsible for targeting PRC2 to CpG islands.

It has been suggested that PRC1-dependent H2AK119ub contributes to PRC2 recruitment in mESCs (87). One of the noncanonical PRC1 complexes contains the histone demethylase KDM2B/FBXL10, which contains a ZF-CxxC domain that specifically recognizes unmethylated CpG islands (105). Indeed, KDM2B depletion (knockdown) in mESCs impairs RING1B recruitment to chromatin, implementation of H2AK119ub, and proper transcriptional repression in mESCs (106–108). Furthermore, deletion of Kdm2b’s ZF-CxxC domain in mESCs leads to a reduction of SUZ12, RING1B, H3K27me3, and H2AK119ub, which results in transcriptional derepression. Consistently, Kdm2b ΔZF-CxxC heterozygous mice show an axial posterior transformation, a common PcG-depletion homeotic phenotype (87). This noncanonical PRC1-dependent recruitment of PRC2 opens up the possibility that the canonical PRC1 (CBX-containing PRC1) (Fig. 3) is indirectly recruited by the noncanonical (RYBP-containing PRC1) (Fig. 3), which explains the overlapping profiles of CBX7 and RYBP at developmental gene promoters (86). CBX7 is the main CBX component of the canonical PRC1 in mESCs, and although its relevance in the maintenance of mESCs pluripotency is debated, it plays an essential role in regulating differentiation during embryoid body formation. For example, CBX7 loss of function in this regard can lead to the deregulation of lineage marker expression and derepression of Hox genes, with some defects already evident at the pluripotent state (109, 110). Although all of the currently published evidence clearly supports an overall role for PRC1 in the maintenance of mESC pluripotency, further studies are needed to investigate what the specific determinants for both canonical and noncanonical PRC1 are that play a role in mESCs’ pluripotency maintenance.

Chromatin bivalency: A functional balance between PcG and COMPASS?

Although histone H3K4me3 marks actively transcribed genes and histone H3K27me3 marks the repressed state of gene expression, subsets of H3K4me3-marked gene promoters are cooccupied by the repressive histone mark H3K27me3. These dually active/repressed marked promoters are referred to as “bivalent domains” (111). It has been suggested that these bivalent domains are enriched at the developmental genes in pluripotent cells that are kept in a poised transcriptional state, ready to be activated or repressed upon stimulation by prodifferentiation signals (111). The presence of both the H3K4me3 and H3K27me3 marks on poised genes suggests that they are under the control of the TrxG and PcG complexes similar to what is seen at the Hox genes in Drosophila (111). Although intriguing, this model has been challenged by several studies. From such a model, bivalency should be rapidly resolved to an activated or repressed state upon differentiation stimuli. However, nonpluripotent stem cells, such as hematopoietic stem cells (HSCs), retain thousands of bivalent marks, and some of them resolve upon erythrocyte differentiation (112). This suggests that bivalent marks can play a general role in poising genes required for differentiation. However, cells with limited or absent differentiation potential, such as mouse embryonic fibroblasts (MEFs) and T cells, contain a large number of bivalent genes, which indicates that the presence of these opposing marks on gene promoters is not restricted to pluripotent or multipotent cells (113–115). Although the H3K27me3 at bivalent and nonbivalent promoters is deposited by PRC2 (73), the H3K4me3 at bivalent genes is specifically deposited by MLL2/COMPASS (36, 43). Even though bivalency was suggested to be important during rapid transcriptional induction, neither MLL2/COMPASS depletion nor the consequent specific loss of H3K4me3 at bivalent genes affects pluripotency or the rapid transcriptional induction of those genes upon retinoic acid stimulation (36, 43).

Although dispensable for rapid transcriptional activation of bivalent genes, the loss of Mll2 affects mESCs embryoid body formation with delayed ectodermal and mesodermal differentiation (116). Single deletion of Mll1, Mll2’s closest homolog, has little impact on H3K4me3 deposition and gene transcription in mESCs (43). Double Mll1 and Mll2 acute deletion shows that MLL1/COMPASS can potentially substitute for MLL2-dependent H3K4me3 deposition at some bivalent genes during mESCs differentiation induced by retinoic acid treatment (43). However, whether double Mll1/2 knockout affects mESCs pluripotency has not yet been clarified. Both Mll2–/– and Setd1a–/– mESCs were reported to have a reduced cell proliferation and an increased apoptotic rate (116, 117). However, only Setd1a–/– mESCs show a global decrease of all three methylation states of H3K4 (H3K4me1/2/3) (117). This demonstrates SET1A/COMPASS’s default activity toward that of H3K4 trimethylation (Fig. 1).

Balance of Polycomb and COMPASS in cancer

Various components of the PcG and COMPASS family of complexes are involved in tumor progression and cancer (84, 118), including the well-known and highly frequent rearrangements of the methyltransferase MLL1 that generate oncogenic fusions, as well as the BMI1 deregulation that is crucial for leukemia pathogenesis and maintenance (119, 120) (Tables 1 to 3). Note that a growing body of evidence suggests that chromatin modifiers are major targets for mutations in cancer (118, 121). Members of the COMPASS family and several PRC2 components are indeed found frequently mutated in different types of cancer (118, 122). Whether all of the translocations and mutations reported so far are driving, cooperating, or passenger mutations remains unknown. Several such examples are described in detail below.

MLL/COMPASS translocations and transcriptional elongation checkpoint defects in leukemogenesis

MLL1 was the first TrxG protein reported to be directly involved in tumor formation, as it was cloned from a genomic region frequently translocated in leukemia (5, 20, 123). In all of the MLL translocations, MLL1’s SET domain is lost; however, the chimeric proteins still have the MLL1-CxxC domain (124). This domain is essential for MLL fusion proteins to bind unmethylated DNA, and it is strictly required to maintain their oncogenic activity; indeed, point mutations that disrupt the CxxC domain abolish the tumorigenic potential of MLL chimeras (124, 125). The central role for this domain in recruitment of the MLL chimeric protein to chromatin independently of the fusion partner suggests a possible common mechanism for the leukemogenic potential based on the CxxC-dependent recruitment. The ELL protein was the first MLL fusion partner to be assigned a biochemical function and to be well characterized biochemically (126). ELL was demonstrated to play a positive role in transcriptional elongation control by increasing the catalytic properties of elongating RNA polymerase II (126). On the basis of these very early observations, we proposed more than 20 years ago that transcriptional elongation control could play a central role in leukemic pathogenesis through MLL translocations. Similar to ELL, other MLL translocation partners have been shown to have positive roles in transcriptional control, such as ENL (127), AFF1 (AF4), and AF9 (128).

Detailed biochemical purifications of several of the MLL chimeras found in childhood leukemia led to the identification of AFF4 (a known rare MLL partner in leukemia) as a common factor found in many of the purified chimeras (129). When AFF4 was purified to homogeneity from nonleukemic cells, it was demonstrated that AFF4 is a central component of a transcriptional elongation complex containing many of the known MLL partners found in childhood leukemia including AFF1, AFF4, ENL, and AF9, as well as well-known elongation factors, such as ELL, ELL2, ELL3, and P-TEFb (129). The identification of this complex, named the super elongation complex (SEC), indicates that it is the translocation of MLL1 into SEC that is involved in the misrecruitment of SEC to MLL target genes, which perturbs transcription elongation checkpoints at MLL1 target loci and results in leukemic pathogenesis (129). Accordingly, deletion of the AFF4-binding domain from MLL-AFF4 strongly impairs its oncogenic activity and the transactivation of the canonical MLL1 downstream target HOXA9 (129, 130). It is now known that not only is SEC involved in the regulation of transcription of the MLL1 target genes in leukemia but also that this elongation complex is the central factor regulating transcription elongation control and rapid transcriptional response in many of the cellular models tested (118). This observation, and subsequent published studies, confirm our model proposed more than 20 years ago that misregulation of transcriptional elongation control is central to leukemic pathogenesis through MLL translocations.

MLL fusions and the mammalian COMPASS

Some other members of the mammalian COMPASS family have been shown to be essential for MLL fusion protein oncogenic activity. MEN1 (Menin) depletion causes proliferation arrest of MLL-AF9–driven leukemia, and this is partially driven by the positive direct transcriptional control of HOXA9 (131). It has been suggested that MEN1 is capable of recruiting MLL1 to HOXA9 and mediating its expression; however, in global studies, MLL1 recruitment in the absence of MEN1 has not been demonstrated. Furthermore, the endogenous MLL1 is essential for MLL-AF9–driven leukemogenesis, likely through maintaining the active transcription of essential oncogenes, such as HOXA9 (132). Notably, a small molecule inhibiting WDR5 binding to MLL1 has been recently reported to reduce the leukemogenic potential of MLL fusion proteins (133); however, the molecular mechanism for this process and its specificity through WDR5, which is a shared component of all of the COMPASS family, are missing. Another small molecule targeting the MLL1-WDR5 interaction was also reported to specifically affect the leukemogenic potential of p30/CEBPα-driven leukemia (134). However, its efficacy in MLL fusion protein–driven leukemia was not tested, nor was the specificity for MLL1/COMPASS versus other family members determined. Nonetheless, these findings suggest the possibility that targeting the mammalian COMPASS family protein interactions in different types of leukemia could be a successful strategy for the treatment of these cancers.

Mutations in enhancer-associated COMPASS-like MLL3/4 in cancer

Mutations in epigenetic factors, both loss and gain of function, are becoming more evident as hallmarks of cancer (118, 135). Genes encoding for the mammalian COMPASS family are among the most mutated chromatin modifiers in several types of cancer (118). MLL4 (KMT2D) mutations were initially found in patients affected by the Kabuki syndrome, a rare genetic disease that causes multiple malformations (136). Two comprehensive analyses of genetic lesions in the two most common non-Hodgkin’s lymphomas (NHLs), diffuse large B cell lymphomas (DLBCLs) and follicular lymphomas (FLs), identified MLL4 as a highly frequently mutated gene (around 23 to 32% of primary DLBCL and 89% of primary FL) (137, 138). However, almost all of the mutations found were monoallelic, which suggests that MLL4 haploinsufficiency might play a role in DLBCL and FL pathogenesis (137, 138). Notably, two independent groups demonstrated that Mll4 deficiency in mice accelerates lymphomagenesis, which confirms the crucial role of MLL4/COMPASS activity at enhancers in tumor suppression in the germinal center (139, 140). High rates of missense, nonsense, and frameshift indel mutations in the MLL4 gene have also been reported in medulloblastomas (MBs) (141). Although with less frequency, this study also identified MLL3 mutations in MBs. MLL3 gene mutations are also found in other tumors, such as kidney and bladder tumors (142–144). Almost half of homozygous Mll3 ΔSet mice are characterized by an unscheduled hyperproliferation of urothelial cells and urothelial neoplasia. MLL3 serves as a p53 transcriptional coactivator in these tumors and, accordingly, all Mll3 ΔSet; p53 +/− compound mice develop urothelial tumors with an earlier incidence compared with Mll3 ΔSet mice (145).

Mutations of the MLL3/4 COMPASS components and UTX and enhancer malfunction as a common driver of cancer

In addition to MLL3 and MLL4, the methyltransferase components of COMPASS-like complexes, UTX, a component common to both complexes, is frequently mutated in bladder cancer and in a variety of other cancers (Fig. 1) (146). In particular, UTX is frequently mutated in T cell acute lymphoblastic leukemia (T-ALL), and accordingly, murine models of Notch-driven T-ALL are more aggressive when Utx is genetically deleted (147, 148). UTX could conceivably demethylate H3K27me2/3 present at enhancers to facilitate the conversion from poised to active enhancers by means of CBP/p300 acetylation at the same residue, which leads to activation of the transcription network regulated by those enhancers (Fig. 2) (25). This suggests a tumor suppressive role of UTX through ensuring the maintenance of proper transcriptional network activation by erasing the putative oncogenic silencing induced by PRC2.

Although an antagonism between UTX-mediated demethylation and PRC2 methylation to regulate acetylation at enhancers represents an interesting hypothesis that could also extend to normal cellular differentiation, it remains largely unexplored (149). Along with mammalian COMPASS-like members, EP300 and CREBBP, which encode two major histone acetyltransferases involved in enhancer specification (150), are also mutated in the same tumors (137, 138, 143). The frequent rate of mutations to these enhancers’ regulatory proteins strongly suggests that tumor initiation and/or maintenance needs to evade tissue-specific enhancer transcriptional networks, which leads to the hypothesis that mutations at enhancer regions or at enhancers’ chromatin modifiers may be a driving force during oncogenesis, and so enhancer malfunction is identified as a new hallmark of cancer (30, 31, 118, 151).

PRC2: Oncogene or tumor suppressor?

EZH2 is the most studied component of PRC2 in cancer (46, 84). EZH2 was initially identified as a putative proto-oncogene because it was found up-regulated in a variety of different tumors, and its down-regulation or overexpression was correlated, respectively, with a loss or gain of proliferation and transformation potential (152–154). After these first studies, and with the discovery of the PRC2 methyltransferase activity (51), the role of PRC2 methylation in cancer has garnered much interest and has been widely investigated (84).

Recently, a frequent point mutation in the EZH2 SET domain [Tyr641 (Y641)] was discovered in NHL (155). Even though it was initially identified as an inactivating mutation, the frequent EZH2 overexpression in lymphomas (156) and the heterozygous nature of the mutation (155) strongly suggested an alternative explanation. Indeed, it was subsequently demonstrated that Y641 mutations are catalytic gain-of-function mutations that lead to an increased conversion of H3K27me2 to H3K27me3 (157, 158) (Fig. 2). Small molecules that are specifically designed to inhibit EZH2 Y641 mutants selectively impair the oncogenic potential of lymphoma cells both in vitro and in vivo (159, 160). A mouse model that expresses an EZH2 Y641 mutant allele in germinal centers was also generated. This mutant leads to germinal center hyperplasia, with additional genetic alterations, e.g., Bcl2 overexpression resulting in lymphoma (161).

Although several studies strongly indicate that PRC2 components have oncogenic functions, some other studies indicate that PRC2 can serve as a tumor suppressor in specific tumors. Ezh2 deletion in the hematopoietic system results in the pathogenesis of T-ALL in mice (162). This was further supported by alterations in EZH2 and SUZ12 in human T-ALL (163). It has also been demonstrated that, even though Ezh2 deletion can lead to T-ALL, the genetic deletion of its homolog Ezh1 in Ezh2 null T-ALL results in leukemia regression in mice (164). This study suggests that, although Ezh2 mutations can lead to T-ALL, overall PRC2 function is still oncogenic in these tumors. Nonetheless, highly frequent homozygous PRC2 component mutations were found in malignant peripheral nerve sheath tumors (MPNSTs) (165, 166). Notably, reexpression of SUZ12 in MPNST cells characterized by SUZ12 homozygous deletion restores H3K27me3 and impairs cell proliferation (165). Together, the data published so far indicate that different components of PRC2 can function as tumor suppressors or oncogenes depending on the chromatin context and tissue of origin of the cancer. Certainly, further molecular characterization of the cancers depending on the PRC2 complex and its components should shed further light on this matter.

Another interesting example of the role of histone H3K27 methylation in cancer concerns the discovery of a recurrent single-nucleotide substitution in diffuse intrinsic pontine gliomas (DIPGs). This mutation results in a lysine-to-methionine conversion at position 27 in the histone variant H3.3 (H3.3K27M) (167, 168). Although this mutation only affects a single histone H3.3 allele, a global loss of H3K27me3 and reduced PRC2 catalytic activity were observed (169, 170). This indicates that reduced global PRC2 activity can be required for tumor formation. However, genome-wide analyses also demonstrated that H3K27M-positive DIPG cells still have loci marked by H3K27me3 and have PRC2 loaded on chromatin, which raises the possibility that selective inhibition of PRC2 at specific loci can be tumorigenic (170, 171). The mechanism through which H3.3K27M impairs PRC2 function is still debated. Although it was initially shown that H3.3K27M can sequester EZH2 by direct binding (169), a recent study from our laboratory failed to show preferential binding of PRC2 toward H3.3K27M compared with wild-type H3.3 (172). We demonstrated that cells expressing H3K27M contain higher levels of H3K27 acetylated nucleosomes and also identified BRD4 to preferentially copurify with H3.3K27M-containing nucleosomes compared with wild-type H3.3. BRD4 is a well-known therapeutic target in cancer (173). Moreover, PRC2 mutations sensitize MPNSTs to BRD4 inhibition (174), possibly because of an increase of H3K27ac that occurs in cells deficient in PRC2 activity (172, 175) (Fig. 2). Therefore, H3.3K27M-positive DIPG tumors could be promising candidates for BRD4 inhibition therapy.

Whether the H3.3K27M mutation is an oncogenic driving force in DIPGs is still not completely understood. Histone H3.3K27M can synergize with overexpression of the gene for platelet-derived growth factor receptor α (PDGFRA) and p53 loss to enhance proliferation and transformation of human ESC–derived neural stem cells (NSCs) (176). This suggests that H3K27M could participate in tumor formation. Whether H3K27M is required for tumor maintenance and the detailed molecular mechanism of its action remain open questions. Note that pharmacological inhibition of JMJD3, a known H3K27me3 demethylase, can selectively impair H3.3K27M DIPG cells’ tumorigenic potential by restoring normal levels of H3K27me3 (177), which supports restoration of full PRC2 activity as a potential treatment for H3.3K27M-positive DIPGs.

Future direction of studies on Polycomb and COMPASS families in development and during disease pathogenesis

PcG and COMPASS are fundamental, evolutionarily conserved families of enzymes and are central regulators of gene expression. Accordingly, perturbation of their composition and activities can substantially alter normal biological processes in development, including cellular proliferation and differentiation. Although, in Drosophila, the interplay between the two groups is well established (6), the interconnection between PcG and TrxG in mammals is less clear, given the fact that there are six COMPASS families in mammals as opposed to three in Drosophila. Whether the different phenotypes presented in this review are caused by an overactive role of PcG or TrxG because of the missing or mutated counterparts in mammals is an interesting and largely unexplored area. The COMPASS family of proteins and PRC1/PRC2 each use different enzymatic activities to posttranslationally modify nucleosomes. Although all of these nucleosome posttranslational modifications are frequently associated with differential transcriptional outcomes, the formal proof that they are directly involved in transcriptional regulation is still missing for all marks (178). In Drosophila, changing lysine 27 to arginine (K27R) in all of the genes coding for histone H3 leads to derepression of canonical PcG targets and homeotic transformation (179). Furthermore, H3K27M-expressing flies display derepression of homeotic genes and developmental defects similar to those of PRC2-deficient flies (172). This strongly suggests that, at least in Drosophila, the PRC2 catalytic activity toward the H3K27 residue is required for some of its biological functions. In contrast, dRing (Ring1a/b homolog in Drosophila) activity toward H2AK119 was shown to be largely dispensable for gene repression and canonical Polycomb-associated phenotypes (82). Similarly, mESCs expressing catalytically dead RING1B still achieve chromatin compaction and transcriptional repression at the Hox clusters (180). These studies, along with the in vitro evidence that PRC1 can compact nucleosome arrays independently of the presence of histone tails (181), support an important catalytic-independent role for PRC1 in transcriptional regulation. Examples of PRC2 catalytic-independent functions were also reported in a prostate cancer model in which PRC2 acts as a coactivator for the androgen receptor (182).

For the COMPASS family of proteins, to date, there is no convincing evidence for catalytic-dependent versus catalytic-independent functions in cancer. Meanwhile, the precise role(s) of H3K4 methylations, besides their ability to predict transcription or chromatin states, are still elusive (150). It is not known at this time how different COMPASS-related complexes deposit specific states of methylation (H3K4me1/2/3) and how such intricate small differences in the pattern of histone modifications (H3K4me1/2/3) are deciphered by cells. For example, Trx proteins, such as SNF5, were demonstrated to directly counteract PcG (e.g., complex eviction) at commonly regulated loci, similar to what happens in Drosophila (183). These are exciting and still unexplored areas that can help us understand the missing details of the regulation of gene expression exerted by the balance between the PcG and COMPASS family proteins, and the extent to which the balance or imbalance of their activities contributes to development and disease.

References and Notes