Research Article

Transcriptome and epigenome landscape of human cortical development modeled in organoids

See allHide authors and affiliations

Science  14 Dec 2018:
Vol. 362, Issue 6420, eaat6720
DOI: 10.1126/science.aat6720
  • Summary of the study, analyses, and main results.

    Data were generated for iPSC-derived human telencephalic organoids and isogenic fetal cortex. Organoids modeled embryonic and early fetal cortex and show a larger repertoire of enhancers. Enhancers could be divided into activators and repressors of gene expression. We derived networks of modules and supermodules with correlated gene and enhancer activities, some of which were implicated in autism spectrum disorders (ASD).

  • Fig. 1 Comparison of transcriptome and epigenome of organoid and isogenic fetal brain.

    (A) Dataset and sample annotation. Samples are from both our project (hiPSC lines, organoids, fetal brain samples), other PsychENCODE projects, and the Roadmap Epigenomics project. Colors correspond to datasets represented in (B) to (D). (B to D) Hierarchical clustering dendrograms of samples by transcriptomes (B) and ChIP-seq peaks of H3K27ac (C) and H3K4me3 (D). (E) Hierarchical clustering of organoids and isogenic postmortem cortexes by transcriptomes and gene-associated enhancer elements. Organoid and brain samples used for clustering are shown on top. Colors and shapes correspond to the datasets represented in the panels below. (F) Transcriptome-based classification of organoids and isogenic cortexes by age (8) against the tissues from the PsychENCODE developmental dataset (PCW, postconceptional week) from Li et al. (9). For each sample, red shading indicates the average of correlation coefficients above the cut-off as defined in (8) between our sample and those in Li et al. (9). White boxes indicate correlations below the cut-off. Correlations to brains older than 2 years of age were all below the cut-off and thus were not displayed. (G) Overlap of differentially expressed genes (DEGs) and differentially active enhancers (DAEs) between organoids at each differentiation time point and isogenic fetal cortex (CTX). (H) tSNE scatterplot of 17,837 nuclei, colored by cluster. Clusters arising predominantly from fetal cortex are circled. RG, radial glia; MGE, medial ganglionic eminence; IPC, intermediate progenitor cells; OPC, oligodendrocyte precursor cells. “Novel” means no correspondence to previous annotations. (I) Counts of DEGs and DAEs between organoids at different stages of development.

  • Fig. 2 Modules of coexpressed genes and coactive enhancers during organoid differentiation.

    (A) Unsupervised hierarchical clustering of gene modules (1 through 54) by expression eigengenes. Rows and columns represent gene modules and samples, respectively. (B) Unsupervised hierarchical clustering of enhancer modules (1 through 29) by activity eigengenes. Rows and columns represent samples and enhancer modules, respectively. (C and D) Mean module eigengenes (lines) across differentiation times grouped by gene (C) and enhancer (D) supermodules, respectively. Dots represent values of eigengenes for individual modules. (E to H) Enrichment of gene (E and G) and enhancer (F and H) modules for DEGs and DAEs and for various enhancers and genes of interest from the literature, including HGE (human-gained enhancers) (26), TF (genes encoding transcription factors during human fetal brain development) (24), ASD (genes pertinent to autism spectrum disorder) (22), and DBD (genes pertinent to developing brain disorder) (23). (I) Correspondence between the gene and enhancer networks. The strongest A-reg (pink dots) and R-reg (cyan dots) for a subset of gene modules are overrepresented in a number of enhancer modules. Black circles emphasize converging genes and enhancer modules, both of which are ASD-associated [as shown in (G) and (H)]. Panels (E) to (I) are aligned by the gene and enhancer modules shown in (A) and (B).

  • Fig. 3 ASD-associated genes modules.

    (A) Overlap of ASD gene modules MG4, MG5, and MG51 from this study with transcript modules associated with ASD from postmortem brain studies or enriched in ASD de novo mutations (DNM) (green, violet) (4, 25) and from an ASD patient-derived organoid study (brown) (6). Rows are modules from this study and columns are modules from other studies. Red shading represents the degree of enrichment between pairs of modules. Corrected p values of significant overlaps (hypergeometric test) are numerically indicated as −log10(p value). (B) Bar plots of the top-scoring biological process terms for the ASD-associated modules shown in (A). (C) Graphical representation of the strongest interacting hub genes in the MG4 module network. Circles: genes; lines: topological overlap above 0.95. Colors in circles annotate each gene as hub (red), DEG (green), SFARI gene (blue), and enhancer target (yellow). Enhancer target: genes targeted by enhancers in the ME9, ME29, ME13, and ME2 ASD-associated enhancer modules (Fig. 2I). (D) Frequency plots within the MG4 module showing that enhancer targets, DEGs, and SFARI genes have higher intramodular connectivity. x axis shows the weighted gene connectivity, from low (peripheral genes) to high (central hub genes).

  • Fig. 4 Enrichment of variants in gene-associated enhancers.

    (A) Three subsets of enhancers were selected from all gene-associated enhancers. Early: enhancers active (denoted by +) in all organoid stages but inactive (denoted by -) in fetal brain (red); late: enhancers active in fetal brain but inactive in all organoid stages (blue); constant: enhancers active in all organoid stages and fetal brain (green). Variants in 540 families from the Simons Simplex Collection were analyzed for enrichment in these enhancer sets. (B) Comparison of inherited personal SNPs between ASD probands and normal siblings from the SSC revealed significant enrichment in probands versus siblings (p ≤ 0.05 by one-sample t test) of low–allele frequency SNPs (MAF 0.1 to 5%) in early enhancers (red) and enhancer modules ME2 and ME29 (black). Dashed line at value of 0 represents no difference between probands and siblings. *p < 0.05. (C) Fractions of DNMs in enhancers were compared in probands and siblings across the whole genome. P values (shown above the bars) were calculated by using the chi-square test. (D) Count of motif-breaking DNMs in all gene-associated enhancers were compared between probands and siblings. Circles represent TFs with counts of broken motifs in probands and siblings plotted on x- and y axis. The size of the circles is proportional to the number of TFs. Circles away from the diagonal represent TFs enriched with motif-breaking DNMs in probands or siblings. A few TFs in the probands (colored circles) but not in the siblings were significantly enriched (p < 0.05 by binomial test) with motif-breaking DNMs.

Supplementary Materials

  • Transcriptome and epigenome landscape of human cortical development modeled in organoids

    Anahita Amiri, Gianfilippo Coppola, Soraya Scuderi, Feinan Wu, Tanmoy Roychowdhury, Fuchen Liu, Sirisha Pochareddy, Yurae Shin, Alexias Safi, Lingyun Song, Ying Zhu, André M. M. Sousa, The PsychENCODE Consortium, Mark Gerstein, Gregory E. Crawford, Nenad Sestan, Alexej Abyzov, Flora M. Vaccarin

    Materials/Methods, Supplementary Text, Tables, Figures, and/or References

    Download Supplement
    • Materials and Methods
    • Figs. S1 to S18
    • Tables S7, S8, and S17
    • Caption for tables S1 to S17
    • PsychENCODE Consortium Authors and Affiliations
    • References
    Table S1
    List of all brain specimens, tissues collected, hiPSC lines and organoids generated, and assays performed.
    Table S2
    Differential gene expression (DGE) analysis between fetal- and adult-derived hiPSC lines. List of differentially expressed genes with fold-change (as log2), FDR corrected p-values, and corresponding functional annotation by Gene Ontology and Canonical Pathways. Tab a: DGE analysis based on gene expression estimates after polyA selection; Tab b: DGE analysis based on gene expression estimates after rRNA depletion (see Methods for details). The number of samples analyzed includes 6 fetal hiPSC lines (two per each of 3 biological specimens: 310#4, 310#6; 313#1, 313#10, 313#14; 320#7, 320#13, 320#21) and 3 adult fibroblastderived hiPSC lines (07-01; 1120-01; 1123-01).
    Table S3
    ChIP-seq data QC. Peak numbers are based on original peaks called from individual samples. NSC = normalized strand coefficient; RSC = relative strand correlation; FRiP = fraction of reads in peaks.
    Table S4
    Differential Expression analysis between ventricular (CTX1) vs pial (CTX2) cortical regions, and between fetal cortex (CTX) vs organoids at each stage of differentiation. Tab a: List of DEGs between ventricular vs pial cortical regions; Tab b: Functional enrichment for the DEGs in ventricular vs pial cortical regions. CP=canonical pathway; GO= gene ontology. Tab c: Summary counts of DEGs for all comparisons. Tab d: CTXvsORG: List of DEGs between CTX and organoids at each differentiation stage (TD0, TD11 and TD30). Tab e: CTXvsORG Annotation: functional annotation by Canonical Pathways (CP) and Gene Ontology (GO) for each list of DEGs between CTX and organoids, reported in tab d. Tab f: CTX Venn Sets: List of gene IDs from Venn Diagram sets of common and differentiation time point specific DEGs, for CTX vs ORG at each differentiation time point. Tab g: CTX Venn Sets Annotation: functional annotation by CP and GO for the Venn diagrams sets for CTX vs ORG DEGs at each differentiation time point. See Venn diagram in Fig. 1G. The number of samples analyzed includes organoids at each of 3 time points (TD0, TD11, TD30) from 2-3 separate hiPSC lines per brain specimen (310#4, 310#6; 313#1, 313#14; 320#7, 320#13) and 2 samples from frontal cortex from brains 310, 313, 320 (CTX1 and CTX2 from samples 310, 313, 320). In total, we have analyzed 23 samples.
    Table S5
    Cell type enrichment analysis for the DEGs between fetal cortex (CTX) and organoids at each stage of differentiation. Endothelial cells: ENDO, radial glia: RG, dividing radial glia: DvRG, intermediate progenitor cells: IPC, newborn neurons: NbN, maturing excitatory neurons: ExN, interneurons: INT, intermediate progenitor cells: IPC.
    Table S6
    single nuclei RNAseq. Summary statistics and data analysis. Tab a: Sequencing Summary Statistics; Tab b: Tables of absolute number of cells for each sample across clusters, total number of cells analyzed for each samples, fraction of cells in each sample across clusters; Tab c: List of cell type markers for each cluster, along with nominal pvalue, average log2 fold change of gene expression in one cluster versus all the other clusters combined, fraction of cells in a specific cluster, fraction of cells in all other clusters, corrected pvalues, cluster number, gene symbol; Tab d: Summary annotation by cell type for each cluster, considering overlap statistics with: Nowakowski et al 2017, Nowakowski et al 2017 cell specific markers, Liu et al 2016 cell specific markers; Tab e: Nowakowski et al 2017 overlap statistics for each cluster (see also Fig. S6D). Identified Cluster: cluster ID in present data set; ClusterAnnotation: cluster annotation according to Nowakowski et al 2017; CorrectedPvalues: corrected overlap pvalue; NumGenes: number of genes in overlap with cluster in Nowakowski et al 2017; ClusterInterpretation: cluster interpretation according to Nowakowski et al 2017;, Nowakowski2017_ClusterNumber : cluster ID in Nowakowski et al 2017; CategoryGenes: gene symbols in overlap with cluster in Nowakowski et al 2017; Tab f: Nowakowski et al 2017 cell type specific markers overlap statistics for each cluster. Identified Cluster: cluster ID in present data set; ClusterAnnotation: cluster annotation according to Nowakowski et al 2017 cell type specific markers; CorrectedPvalues: corrected overlap pvalue; NumGenes: number of genes in overlap with cluster in Nowakowski et al 2017 cell type specific markers; ClusterInterpretation: cluster interpretation according to Nowakowski et al 2017 cell type specific markers; Nowakowski2017_ClusterNumber : cluster ID in Nowakowski et al 2017 cell type specific markers; CategoryGenes: gene symbols in overlap with cluster in Nowakowski et 2017 cell type specific markers; Tab g: Liu et al 2016 cell type specific markers overlap statistics for each cluster. Identified Cluster: cluster ID in present data set; ClusterAnnotation: cluster annotation according to Liu et al 2016 cell type specific markers; CorrectedPvalues: corrected overlap pvalue; NumGenes: number of genes in overlap with cluster in Liu et al 2016 cell type specific markers; ClusterInterpretation: cluster interpretation according to Liu et al 2016 cell type specific markers; Liu2016_ClusterNumber : cluster ID in Liu et al 2016 cell type specific markers; CategoryGenes: gene symbols in overlap with cluster in Liu et 2016 cell type specific markers
    Table S9
    Putative enhancers identified by annotating H3K27ac CONPs using chromatin segmentation. Genomic locations of H3K27ac CONPs are listed in columns Chrom, Start, End; unique IDs are listed under CONP_ID; for each CONP, numbers of OPs in total are listed in All_OP_No and numbers at each stage in [stage]_OP_No; annotations at each stage are in [stage]_annotation (aEnh = active enhancer, iEnh = inactive enhancer with a chromatin state of repressed or low). Refer to "Identification of enhancers" in Supplementary Methods.
    Table S10
    Definition of gene-associated enhancers based upon intersection with Hi-C datasets and proximity (within 20 Kb to gene promoters). Genomic locations of H3K27ac CONPs are listed in columns chrom, start, end; unique IDs are listed under CONP_ID; for each CONP, numbers of OPs in total is listed in column All_OP_No and numbers at each stage in column [stage]_OP_No; annotations at each stage are listed in column [stage]_annotation (aEnh = active enhancer, iEnh = inactive enhancer with chromatin state of repressed or low); log2 fold change and FDR for differential enhancer analysis in column [reference stage]_[stage to compare]_log2FC/FDR; linked gene targets are grouped by supporting data in columns confident_set1, confident_set2 and proximity (multiple genes are separated by "$", NA for no gene); enhancer module and supermodule memberships in columns "enhancer_module" and "enhancer_supermodule".
    Table S11
    Time Course analysis: Differential gene expression analysis. Differential gene expression in organoids between pairs of consecutive developmental time points, for cellular RNA. Tab a: List of DEGs for cellular RNA at the first (TD0-to-TD11) and second (TD11-to-TD30) transitions. Shown are fold change (as log2) and FDR corrected p-values. Tab b: Sub-sets of DEGs (organized as up- and down-regulated) that are common, and specific to each transition, reported in Tab a (see Venn diagram in Fig. S7). Tab c: Functional annotation by Canonical Pathways (CP) and Gene Ontology (GO) for each list, from the Venn Diagrams sub-sets, reported in tab b.
    Table S12
    Integrative analyses of 22835 genes and their associated 96375 enhancers. Relevant information is listed for each gene-enhancer pair, including gene ID (columns "Gene_symbol" and "EMBL_gene_ID") and enhancer ID (column "CONP_ID"), Spearman's correlation coefficient (column "Spearman's_rho") and FDR used to define A-reg/R-reg (FDR < 0.05 and rho > 0 for A-reg while FDR < 0.05 and rho < 0 for R-reg), enhancer module and supermodule for the corresponding enhancer as well as gene module and supermodule for the corresponding gene. NA for enhancer module/supermodule indicates the enhancer was not clustered. NA for gene module/supermodule indicates the gene was not clustered or not used for WGCNA analysis.
    Table S13
    Primer sequences used for qPCR validation of RNA-seq data.
    Table S14
    Weighted gene co-expression network modules and functional annotation. List of network modules and respective gene members from WGCNA and FDR corrected p-values for robustness analysis (tabs a,b); module functional annotation using ToppGene (45) (tab c,d); module enrichment with upregulated (red) or downregulated (blue) DEGs (tab e); Spearmann correlation coefficients between each module eigengene and differentiation time, with positive correlation in red and negative correlation in blue; list of modules significantly enriched in genes from the SFARI collection (tab g); list of modules significantly enriched in genes from the human developmental brain disorders collection (23) (tab h); FDR corrected p-values for the significant overlaps between our gene modules and the gene modules from Parikshak et al. 2013 (4) (tab i) FDR corrected p-values for the significant overlaps between our gene modules and the gene modules from Parikshak et al. 2016 (25) (tab j); FDR corrected p-values for the significant overlaps with network modules from Mariani et al 2015 (6) (tab k); distribution of Hubs, Enhancer targets, DEGs at the 0-to-11 and 11-to-30 transitions and SFARI genes for the MG4, MG5 and MG51 modules (tab l); Supermodule Functional Enrichment using ConsensusPathDB (http://cpdb.molgen.mpg.de/) (see Methods) (tab m).
    Table S15
    Gene-associated enhancers that overlap with published human gained enhancers (HGEs). These enhancers are referred to as gene-associated HGEs. For each of the eight enhancer modules enriched with HGEs, all HGE-targeted genes were subject to pathway and GO analysis using ConsensusPathDB (44). The tab "gene list" contains all genes and their linked HGEs (CONP_ID are listed. See table S10 for genome coordinates).
    Table S16
    Integrative analysis of ASD-SFARI gene dataset (1007 genes) and linked enhancers in our dataset. List of SFARI genes-linked enhancers.

Navigate This Article