Report

Evolutionary changes in promoter and enhancer activity during human corticogenesis

Science  06 Mar 2015:
Vol. 347, Issue 6226, pp. 1155-1159
DOI: 10.1126/science.1260943

Of mice, men, and macaque brains

The human brain represents a unique evolutionary trajectory. To better understand how the human brain came to be, Reilly et al. sought to identify changes in gene expression between mice, macaques, and humans. They compared epigenetic marks in the embryonic cortex, which revealed changes in gene regulation in biological pathways associated with cortical development.

Science, this issue p. 1155

Abstract

Human higher cognition is attributed to the evolutionary expansion and elaboration of the human cerebral cortex. However, the genetic mechanisms contributing to these developmental changes are poorly understood. We used comparative epigenetic profiling of human, rhesus macaque, and mouse corticogenesis to identify promoters and enhancers that have gained activity in humans. These gains are significantly enriched in modules of coexpressed genes in the cortex that function in neuronal proliferation, migration, and cortical-map organization. Gain-enriched modules also showed correlated gene expression patterns and similar transcription factor binding site enrichments in promoters and enhancers, suggesting that they are connected by common regulatory mechanisms. Our results reveal coordinated patterns of potential regulatory changes associated with conserved developmental processes during corticogenesis, providing insight into human cortical evolution.

The massive expansion and functional elaboration of the neocortex underlies the advanced cognitive abilities of humans (1). Although the overall process of corticogenesis is broadly conserved across mammals, humans exhibit differences that emerge within the first 12 weeks of gestation. Among these are an increased duration of neurogenesis, increases in the number and diversity of progenitors, modification of neuronal migration, and introduction of new connections among functional areas (2, 3). The genetic changes responsible for these evolutionary novelties are largely unknown.

Changes in gene regulation are hypothesized to be a major source of evolutionary innovation during development (1, 3, 4). Critical events in corticogenesis, including the specification of cortical areas and differentiation of cortical layers, rely on the precise control of gene expression (4). The evolution of distinctly human cortical features required changes in many of these early developmental processes, which may have been driven by modifications in the gene regulatory programs that govern them. However, identifying such regulatory changes and linking them to relevant biological processes has proven to be challenging. Previous efforts have relied on comparative genomics or on gene expression comparisons at later developmental and adult stages (5–7). Further progress has been hindered by the lack of genome-wide maps of regulatory function during corticogenesis.

Genome-wide profiling of posttranslational histone modifications associated with regulatory functions has been used to compare regulatory element activities across species (8–12). In this work, we profiled H3K27ac and H3K4me2 to map active promoters and enhancers during human, rhesus macaque, and mouse corticogenesis, as well as to identify increases in their activity in humans. We examined biological replicates of whole human cortex at 7 postconception weeks (p.c.w.) and 8.5 p.c.w. and primitive frontal and occipital tissues from 12 p.c.w. (Fig. 1A). These stages span the appearance of the transient embryonic zones that generate cortical neurons from the deep to the superficial layers, when distinctly human features of the cortex begin to emerge (13–15). Homologous rhesus and mouse time points were selected on the basis of cross-species studies of cortical development (13–16). The mouse cortex develops over the course of 1 week [embryonic day 11.5 (E11.5) to E17.5], adhering to the same general developmental processes observed in primates during this homologous time frame (16).

Fig. 1 Comparative epigenetic analysis of corticogenesis in human, rhesus, and mouse genomes.

(A) (Top) Stages of human cortical development from 7 to 12 p.c.w. The location of the cross section shown below each whole-cortex illustration is indicated by a box. (Bottom) Schematized cross sections of the developing cortex. Progenitors in the ventricular zone (VZ) produce neurons that are amplified in number in the growing subventricular zone (SVZ) and then migrate through the intermediate zone (IZ) to their final destination in the cortical plate (CP). Cortical layers (e.g., L5, L6) present at each time point are shown. (B) Number of promoters and enhancers at each human time point that are reproducibly marked by H3K27ac, H3K4me2, or both, with the number of human gains shown in bold. (C) (Left) Human lineage epigenetic gain at a known human forebrain enhancer. The levels of H3K27ac (blue) and H3K4me2 (teal) at the orthologous locations in human, rhesus, and mouse genomes are shown. The asterisks indicate BH-corrected P value ≤ 0.001 and log2 fold increase ≥ 1.5. (Right) LacZ reporter gene expression driven by the human (top) and orthologous rhesus (bottom) enhancers in E11.5 transgenic mouse embryos. The ventral expression domain specific to humans is indicated by an arrow (see also fig. S7).

We identified 22,139 promoters (34% of genes in Gencode version 10) and 52,317 enhancers active in the human cortex during at least one developmental stage (Fig. 1B). H3K27ac and H3K4me2 are highly concordant at promoters, with 85% of sites marked by both histone modifications. Histone modification signatures were less concordant at enhancers, with 45% of all sites marked by both H3K27ac and H3K4me2. This is consistent with studies suggesting that H3K27ac and H3K4me2 identify both overlapping and distinct sets of enhancers (11). We identified 16,473 enhancers most strongly marked by H3K27ac in the cortex relative to seven other human tissues (fig. S1A) (17). These enhancers are significantly enriched near genes associated with cortical development, such as positive regulation of neurogenesis (binomial test, P ≤ 1 × 10^−53) and neural precursor cell proliferation (binomial test, P ≤ 1 × 10^−29) (fig. S1B) (18). Both marks also significantly enrich for enhancers active in the developing cortex versus other tissues (Fisher’s exact test, P ≤ 1 × 10^−15), identifying more than 80% of known forebrain enhancers (fig. S2, A to F) (17, 19, 20). We also identified 74,189 promoters and enhancers active in the rhesus macaque genome and 74,809 in the mouse genome, generating a dense map of regulatory function during corticogenesis across species.

In principal component analysis (PCA), H3K27ac signals clustered first by embryonic tissue type, then by evolutionary distance (fig. S3A) (17). H3K27ac signals in human and mouse cortex were also more similar by PCA than signatures from homologous human and mouse adult tissues or embryonic stem cells (fig. S3B) (17). Spearman correlation analysis of H3K27ac and H3K4me2 cortex signals supported strong replicate reproducibility in all data sets, as well as higher correlations between rhesus and human cortex compared with mouse cortex (fig. S4, A and B).
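As an illustration of the replicate comparison described above, the following minimal Python sketch computes a Spearman correlation between two vectors of signal values measured at the same sites. The function names are hypothetical and are not taken from the study's pipeline (17); the sketch shows only the rank-based calculation, with tied values given their average rank.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because the statistic depends only on ranks, two replicates whose signals order the sites identically correlate perfectly even when their absolute signal levels differ, which is why a rank-based measure suits comparisons across samples sequenced to different depths.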

To identify promoters and enhancers showing quantitative epigenetic gains in the human genome versus both the rhesus and mouse genomes, we compared the level of H3K27ac or H3K4me2 signal in replicating human peaks to the signals at corresponding orthologous sites in the other two species (9, 17) (fig. S5). Human gains were called on the basis of an increase in H3K27ac or H3K4me2 signal compared with all rhesus and mouse data sets for each mark (17). It is possible that we may be overestimating gains at 7 p.c.w., due to the lack of an early developmental stage in rhesus. However, this concern is mitigated by our inclusion of a comparable mouse time point and our requirement that each human site exhibit an epigenetic gain compared with all mouse and rhesus time points and tissues. In total, 8996 nonoverlapping enhancers and 2855 promoters show epigenetic gains in humans (Fig. 1B). To assess the robustness of these gains, we compared epigenetic signals at the orthologous human and mouse genomic locations for 77 human gains by chromatin immunoprecipitation (ChIP)–quantitative polymerase chain reaction using additional biological replicates (fig. S6, A and B). Sixty-seven of these sites (87%) showed a gain in humans, supporting the reproducibility of the epigenetic gain calls from our genome-wide analysis (17). We then explored this high-confidence set of gains to obtain insight into their origins and relevance to human cortical evolution.
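The requirement that a human site exceed every rhesus and mouse data set can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (17): the thresholds are borrowed from the Fig. 1C legend (BH-corrected P ≤ 0.001, log2 fold increase ≥ 1.5), the raw P values are assumed to be supplied by whatever test was used upstream, and the pseudocount and both function names are hypothetical.

```python
from math import log2


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_human_gain(human_signal, comparison_signals, adjusted_pvalues,
                  min_log2_fold=1.5, max_adjusted_p=0.001, pseudocount=1.0):
    """True only if the human site passes both thresholds against EVERY
    comparison data set (all rhesus and mouse time points and tissues).

    pseudocount is an assumption added here to avoid division by zero.
    """
    if not comparison_signals or len(comparison_signals) != len(adjusted_pvalues):
        raise ValueError("need one adjusted P value per comparison data set")
    for signal, p in zip(comparison_signals, adjusted_pvalues):
        fold = log2((human_signal + pseudocount) / (signal + pseudocount))
        if fold < min_log2_fold or p > max_adjusted_p:
            return False
    return True
```

Requiring every comparison to pass, rather than a majority, is what makes the call conservative: a single rhesus or mouse data set with comparable signal is enough to reject the site.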

We first considered whether epigenetic gains could be attributed to human-specific sequence changes. Forty-eight highly conserved noncoding regions displaying accelerated evolution in humans exhibit increased H3K27ac or H3K4me2 in the human cortex (table S1) (5, 6). However, gains in general do not show increased rates of human-specific sequence change, suggesting that the majority of our gains cannot be identified by sequence acceleration alone (table S1).

In light of this result, we examined epigenetic gains at known human enhancers active in the embryonic forebrain to determine whether gains reveal changes in regulatory function (19) (table S1). In a proof-of-principle experiment, we compared the activities of a human forebrain enhancer exhibiting a gain and its rhesus ortholog using a mouse embryonic transgenic enhancer assay (20). The human enhancer drove reproducible reporter gene expression in two telencephalon domains: a wide caudal-dorsal domain and a caudal-ventral stripe (Fig. 1C). The rhesus ortholog drove qualitatively weaker reporter gene expression in a similar caudal-dorsal domain but did not drive reproducible activity in the human caudal-ventral domain. Upon sectioning, we determined that the dorsal domain was restricted to the neocortex, whereas the human ventral domain corresponded to the caudal ganglionic eminence (fig. S7C).

We also searched for genomic regions with a high density of enhancers or promoters exhibiting gains. We used previously defined maps of long-range genomic interactions to demarcate putative regulatory domains maintained across tissues and species (17, 21). This analysis revealed genes within topologically delimited domains that are hotspots of epigenetic gains (fig. S8, A to D, and table S2). We identified 301 genes within a gain-enriched hotspot that included at least one gene with a promoter gain, notably TGFβR3, COL13A1, EPHA2, and LMX1B.

To obtain global insights into biological pathways associated with human lineage epigenetic gains, we integrated gains with gene coexpression network analyses (22). We generated a coexpression network using public RNA sequencing data from multiple neocortical areas spanning 8 to 15 p.c.w., which includes the periods of corticogenesis in which we mapped H3K27ac and H3K4me2 signatures (Fig. 2A, fig. S9A, and table S3) (23). This network consists of 96 modules, each of which is a set of genes showing highly correlated expression across multiple neocortical regions and developmental stages. Genes in each module may be co-regulated and may participate in related biological processes. Hub genes are defined as genes with connectivity values in the top 5% for each module, suggesting that they include important regulators that drive correlated gene expression. Epigenetic gains at promoters were directly assigned to their target genes, whereas gains at enhancers were assigned on the basis of their proximity to annotated genes (17, 18).
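The hub-gene definition above (connectivity in the top 5% of a module) can be sketched in Python as shown here. The function name, the input format (a mapping from gene name to a precomputed connectivity value), and the rule of keeping at least one hub per module are assumptions made for illustration; how connectivity itself was computed is described in the supplementary methods (17).

```python
def hub_genes(connectivity, top_percent=5):
    """Return the genes whose connectivity falls in the top `top_percent`
    of one module, ordered from most to least connected.

    connectivity: dict mapping gene name -> connectivity value for a
    single module. At least one gene is always returned (an assumption
    for very small modules).
    """
    if not connectivity:
        return []
    # Integer arithmetic avoids floating-point surprises at the cutoff.
    n_hubs = max(1, len(connectivity) * top_percent // 100)
    # Sort by descending connectivity; break ties by name for determinism.
    ranked = sorted(connectivity, key=lambda g: (-connectivity[g], g))
    return ranked[:n_hubs]
```

For a hypothetical module of 40 genes, this returns the 2 most connected genes.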

Fig. 2 Identifying modules of coexpressed genes enriched for epigenetic gains in human corticogenesis.

(A) Schematic illustrating integration of epigenetic gains into coexpression networks. (B) Coexpression module enriched for H3K27ac enhancer gains. Genes associated with gains are highlighted, and genes representative of the biological enrichments associated with the module are labeled. The module was rendered using multidimensional scaling (17). (C) Fold enrichment of H3K27ac enhancer gains at each human time point in this module. *P < 0.01 (BH-corrected permutation). (D) Gene ontology enrichments for genes associated with gains in this module. P values were calculated using a binomial test in DAVID (the Database for Annotation, Visualization and Integrated Discovery) (17).

We used permutation analysis to identify modules significantly enriched in human lineage gains at enhancers or promoters (fig. S9, B and C) (17). Seventeen modules are enriched for H3K27ac or H3K4me2 gains in at least one human developmental stage. Overall, gains are consistently enriched in modules containing genes associated with biological processes crucial for cortical development (table S4). For example, module 3 (Fig. 2B) is enriched for human lineage H3K27ac enhancer gains that are associated with genes implicated in neuronal progenitor proliferation. Gene ontology categories showing significant enrichment include neuronal differentiation (binomial test, P = 2.13 × 10^−4) and neuron fate commitment (binomial test, P = 3.67 × 10^−4) (Fig. 2D). Epigenetic gains in this module are associated with genes critical for cortical development, including PAX6, GLI3, and FGFR1. Each of these is a hub gene, consistent with their known contributions to fundamental processes in corticogenesis. Notably, PAX6 controls cortical cell number by regulating cell cycle exit of neural progenitor cells, and Pax6-null mice have a depleted progenitor pool and a reduced cortical neuron number (24). Heightened signaling through FGFR1 during rat corticogenesis increases the neuron number by more than 80% (25).
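The general idea of a permutation test for module enrichment can be sketched in Python as follows. This is a generic illustration, not the scheme used in the study, whose details are given in the supplementary methods (17): here gain-associated genes are simply redrawn at random from the network's genes, and the function name, arguments, and default number of permutations are all hypothetical.

```python
import random


def module_enrichment(module_genes, gain_genes, background_genes,
                      n_permutations=1000, seed=0):
    """Return (observed overlap, fold enrichment, empirical P value) for
    gain-associated genes falling in one coexpression module.

    The null distribution is built by drawing, without replacement, as
    many random background genes as there are gain-associated genes.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    module = set(module_genes)
    background = sorted(set(background_genes))
    gains = set(gain_genes) & set(background)
    observed = len(gains & module)

    null_counts = []
    for _ in range(n_permutations):
        sample = rng.sample(background, len(gains))
        null_counts.append(sum(1 for gene in sample if gene in module))

    # Add-one correction so the empirical P value is never exactly zero.
    as_extreme = sum(1 for count in null_counts if count >= observed)
    p_value = (as_extreme + 1) / (n_permutations + 1)

    expected = sum(null_counts) / n_permutations
    fold = observed / expected if expected > 0 else float("inf")
    return observed, fold, p_value
```

When many modules are tested, as with the 96 modules here, the resulting P values would then need a multiple-testing correction such as the Benjamini-Hochberg procedure before modules are declared enriched.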

Module 15 is enriched in human H3K27ac and H3K4me2 promoter gains [Benjamini-Hochberg (BH)–corrected permutation, P = 0.003] (fig. S10A) associated with cortical-patterning ontologies such as regionalization (binomial test, P = 7.53 × 10^−4) and forebrain development (binomial test, P = 1.63 × 10^−5) (fig. S10B). Homeobox genes are notably enriched among genes associated with gains in this module (binomial test, P = 1.18 × 10^−8).

Module 10 shows the strongest enrichment of human lineage H3K27ac and H3K4me2 promoter and enhancer gains in the network (Fig. 3, A and B). Genes implicated in extracellular matrix (ECM) functions are significantly overrepresented among gain-associated genes in this module (binomial test, P = 2.26 × 10^−7) (table S5). The ECM contributes to the maintenance of human progenitor cell self-renewal and neuronal migration (26). Module 10 gain-associated genes are also enriched for transforming growth factor–β (TGFβ) and fibroblast growth factor (FGF) pathway members (binomial test, P = 2.42 × 10^−3). Notably, both module 10 and module 3 include gain-associated genes belonging to the TGFβ and FGF pathways (Fig. 3C). The association of gains with biologically related genes across multiple enriched modules suggests that there may be regulatory coordination and potential transcription factor (TF) cross-talk among these modules.

Fig. 3 Enrichment of epigenetic gains in module 10.

(A) Epigenetic gains mapped onto module 10; genes associated with gains are highlighted as in Fig. 2B. (B) Fold enrichment of H3K27ac promoter or enhancer gains at each human time point in this module. *P < 0.01 (BH-corrected permutation). (C) Genes in the related FGF, TGFβ, bone morphogenetic protein (BMP), and ECM signaling pathways are associated with gains from module 10 (yellow stars) and module 3 (red stars). Genes or gene families are highlighted in orange; associated biological processes are in green. The pathway shown is derived from KEGG pathway annotations.

Consistent with this hypothesis, gain-enriched modules exhibited significantly higher gene expression correlations with each other than with other modules in the network (Wilcoxon rank sum test, P < 1 × 10^−15) (Fig. 4A). Moreover, gain-associated genes in enriched modules converge on related biological functions (Fig. 4B). To identify regulatory signatures underlying the correlation of these modules, we predicted transcription factor binding sites in all active promoters and enhancers in our data set, including human lineage gains. We then identified enriched TF motifs in enhancers or promoters assigned to each module. Many motifs were enriched in promoters and enhancers assigned to the same module as the transcription factor itself. Surprisingly, we also identified TF motifs enriched across multiple modules. For example, SMAD binding motifs were enriched in active promoters in module 10, although SMAD transcription factors are not included in this module (BH permutation test, P = 7.92 × 10^−3) (table S5). The observed transcription factor binding site enrichment patterns suggest regulatory cross-talk among gain-enriched modules that may contribute to their highly correlated expression.

Fig. 4 Modules enriched for epigenetic gains converge on common biological processes.

(A) Gain-enriched modules exhibit significantly higher gene expression correlation values with each other than with modules not enriched for gains. *P < 1.1 × 10^−15 (Wilcoxon rank sum test). (B) Eigengene expression correlations among the top 35 modules (as ranked by number of genes). Modules enriched in gains are numbered. Ontologies associated with gains in each module are highlighted. Arrows connect modules that include each transcription factor shown with modules that are enriched for that factor’s binding motif.

Our results reveal a marked convergence of human lineage epigenetic gains on common biological processes and regulatory pathways in corticogenesis. Epigenetic gains are enriched in modules important for neuronal proliferation, cortical patterning, and the ECM. Moreover, gain-associated genes in each module are enriched for similar conserved biological functions as all genes in the entire module (table S4). These findings suggest that many human lineage regulatory changes operate within, and have potentially modified, older regulatory mechanisms and developmental processes essential for building the mammalian cortex.

The epigenetic changes associated with these conserved biological pathways also predominantly occur at sequences with ancestral regulatory activity. The majority of human lineage gains involve potential modification of promoters or enhancers marked by H3K27ac in rhesus or mouse cortex (fig. S11) (10). A smaller proportion of gains may arise from co-option of ancestral regulatory sequences active in noncortical tissues. Human gains not marked in any of the 2 rhesus or 20 mouse tissues we examined may include de novo regulatory functions arising on the human lineage. We note that epigenetic gains may be due to genetic changes in humans that directly altered regulatory functions, or they may reflect coordinated changes in cellular composition in the human cortex compared with rhesus and mouse cortex. Distinguishing between these two modalities of evolutionary change will require functional analysis of the sequences underlying epigenetic gains using mouse transgenic assays and humanized mouse models. Such studies would also provide insight into the biological relevance of the molecular changes described here.

The convergence of human regulatory innovations on developmentally related functions is also consistent with the biological complexity of the cortex. Neocortical development requires the orchestration of spatially and temporally distinct but biologically interconnected mechanisms. In the context of this interdependency, it has been postulated that human cortical evolution involved coordinated changes in multiple processes during corticogenesis (3). For example, changes in progenitor proliferation probably required concomitant changes in patterning and connectivity to generate novel cortical functions (1). The inventory of human lineage regulatory changes that we identified provides the means to evaluate this hypothesis and dissect the genetic mechanisms underlying the evolution of the human cortex.

Supplementary Materials

www.sciencemag.org/content/347/6226/1155/suppl/DC1

Materials and Methods

Figs. S1 to S12

Tables S1 to S5

References (27–44)

References and Notes

  1. See supplementary materials and methods on Science Online.
  2. Acknowledgments: This work was supported by NIH grants GM094780 (to J.P.N.), DA023999 (to P.R.), NS014841 (to P.R.), and F32 GM106628 (to D.E.); a Brown Coxe Fellowship in the Medical Sciences (to J.Y.); and an NSF Graduate Research Fellowship (to S.K.R.). Human tissue was provided by the Joint Medical Research Council (UK)/Wellcome Trust (grant 099175/Z/12/Z) Human Developmental Biology Resource (HDBR) (http://hdbr.org). The human tissues used in this study are covered by a material transfer agreement regarding their transfer, but tissues may be requested directly from the HDBR. We thank S. Mane, K. Bilguvar, S. Umlauf, and A. Lopez at the Yale Center for Genome Analysis for sequencing data; the members of the BrainSpan consortium for providing human brain transcriptome data to the research community; N. Carriero and R. Bjornson at the Yale University Biomedical Performance Computing Center for computing support; T. Nottoli and C. Pease at the Yale Animal Genomics Service for generating transgenic mice; and S. Wilson and M. Horn for veterinary care of nonhuman primates. All ChIP-seq data are available through the Gene Expression Omnibus under accession number GSE63649.