Shared molecular neuropathology across major psychiatric disorders parallels polygenic overlap


Science  09 Feb 2018:
Vol. 359, Issue 6376, pp. 693-697
DOI: 10.1126/science.aad6469

Genes overlap across psychiatric disease

Many genome-wide studies have examined genes associated with a range of neuropsychiatric disorders. However, the degree to which the genetic underpinnings of these diseases differ or overlap is unknown. Gandal et al. performed meta-analyses of transcriptomic studies covering five major psychiatric disorders and compared cases and controls to identify coexpressed gene modules. From this, they found that some psychiatric disorders share global gene expression patterns. This overlap in polygenic traits in neuropsychiatric disorders may allow for better diagnosis and treatment.

Science, this issue p. 693


The predisposition to neuropsychiatric disease involves a complex, polygenic, and pleiotropic genetic architecture. However, little is known about how genetic variants impart brain dysfunction or pathology. We used transcriptomic profiling as a quantitative readout of molecular brain-based phenotypes across five major psychiatric disorders—autism, schizophrenia, bipolar disorder, depression, and alcoholism—compared with matched controls. We identified patterns of shared and distinct gene-expression perturbations across these conditions. The degree of sharing of transcriptional dysregulation is related to polygenic (single-nucleotide polymorphism–based) overlap across disorders, suggesting a substantial causal genetic component. This comprehensive systems-level view of the neurobiological architecture of major neuropsychiatric illness demonstrates pathways of molecular convergence and specificity.

Despite remarkable success identifying genetic risk factors for major psychiatric disorders, it remains unknown how genetic variants interact with environmental and epigenetic risk factors in the brain to impart risk for clinically distinct disorders (1, 2). We reasoned that brain transcriptomes—a quantitative, genome-wide molecular phenotype (3)—would allow us to determine whether disease-related signatures are shared across major neuropsychiatric disorders with distinct symptoms and whether these patterns reflect genetic risk.

We first analyzed published gene-expression microarray studies of the cerebral cortex across five major neuropsychiatric disorders (3–11) using 700 cerebral cortical samples from subjects with autism (ASD) (n = 50 samples), schizophrenia (SCZ) (n = 159), bipolar disorder (BD) (n = 94), depression (MDD) (n = 87), alcoholism (AAD) (n = 17), and matched controls (n = 293) (12). These disorders are prevalent and disabling, contributing substantially to global disease burden. Inflammatory bowel disease (IBD) (n = 197) was included as a non-neural comparison.

Individual data sets underwent stringent quality control and normalization (Fig. 1) (12), including rebalancing to alleviate confounding between diagnosis and biological (such as age and sex) or technical (such as postmortem interval, pH, RNA integrity number, batch, and 3′ bias) covariates (figs. S1 and S2). Transcriptome summary statistics for each disorder were computed with a linear mixed-effects model to account for any sample overlap across studies (12). Comparison of differential gene expression (DGE) log2 fold change (log2FC) signatures revealed a significant overlap among ASD, SCZ, and BD, and among SCZ, BD, and MDD (all Spearman’s ρ ≥ 0.23, P < 0.05, 40,000 permutations) (Fig. 2A). The regression slopes between ASD, BD, and MDD log2FC effect sizes compared with SCZ (5.08, 0.99, and 0.37, respectively) indicate a gradient of transcriptomic severity with ASD > SCZ ≈ BD > MDD (Fig. 2B). To ensure robustness, we compared multiple methods for batch correction, probe summarization, and feature selection, including use of integrative correlations, none of which changed the qualitative observations (fig. S3) (12). Results were also unaltered after first regressing out gene-level RNA degradation metrics, suggesting that systematic sample quality issues were unlikely to drive these correlations (fig. S3). Further, the lack of (or negative) overlap between AAD and other disorders suggests that similarities are less likely due to comorbid substance abuse, poor overall general health, or general brain-related postmortem artifacts.
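As an illustration of the kind of comparison described above, the following is a minimal sketch of a Spearman correlation between two log2FC signatures with a permutation-based P value. It is not the authors' pipeline (their exact procedure is in the supplementary methods): the data are synthetic, the function names are hypothetical, and shuffling gene labels is a simplifying assumption about how the null distribution is built.

```python
import random

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P value (assumption: gene labels of y are shuffled)."""
    rng = random.Random(seed)
    rx, ry = ranks(x), ranks(y)
    observed = pearson(rx, ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ry)
        if abs(pearson(rx, ry)) >= abs(observed):
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic log2FC vectors for two hypothetical disorders sharing a signal.
rng = random.Random(1)
shared = [rng.gauss(0, 1) for _ in range(300)]
log2fc_a = [s + rng.gauss(0, 1) for s in shared]
log2fc_b = [0.5 * s + rng.gauss(0, 0.5) for s in shared]

rho, p_value = permutation_p(log2fc_a, log2fc_b)
print(f"rho = {rho:.2f}, permutation P = {p_value:.4f}")
```

With 2000 permutations the smallest attainable P value is about 0.0005, so resolving smaller values requires more permutations.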

Fig. 1 Experimental rationale and design.

(A) Model of psychiatric disease pathogenesis. (B) Flowchart of the cross-disorder transcriptome analysis pipeline (12). Cortical gene expression data sets were compiled from cases of ASD (n = 50 samples), SCZ (n = 159), BD (n = 94), MDD (n = 87), AAD (n = 17), and matched nonpsychiatric controls (n = 293) (table S1) (12).

Fig. 2 Cortical gene expression patterns overlap.

(A) Rank order of microarray transcriptome similarity for all disease pairs, as measured with Spearman’s correlation of differential expression log2FC values. (B) Comparison of the slopes among significantly associated disease pairs indicates a gradient of transcriptomic severity, with ASD > SCZ ≈ BD > MDD. (C) Overlapping gene expression patterns across diseases are correlated with shared common genetic variation, as measured with SNP coheritability (22). The y axis shows transcriptome correlations using microarray-based (discovery, red) and RNA-seq (replication, blue) data sets. (D) RNA-seq across all cortical lobes in ASD replicates microarray results and demonstrates a consistent transcriptomic pattern. Spearman’s ρ is shown for comparison between microarray and region-specific RNA-seq replication data sets (all P < 10^−14). Plots show mean ± SEM. *P < 0.05, **P < 0.01, ***P < 0.001.

Disease-specific DGE summary statistics (data table S1) provide human in vivo benchmarks for determining the relevance of model organisms, in vitro systems, or drug effects (13, 14). We identified a set of concordantly down- and up-regulated genes across disorders (fig. S4) as well as those with more specific effects. Complement component 4A (C4A), the top genome-wide association study (GWAS)–implicated SCZ disease gene (15), was significantly up-regulated in SCZ (log2FC = 0.23, P = 6.9 × 10^−6) and in ASD [RNA sequencing (RNA-seq); log2FC = 0.91, P = 0.014] (data table S1) but not in BD, MDD, or AAD. To investigate potential confounding by psychiatric medications, we compared disease signatures with those from nonhuman primates treated with acute or chronic dosing of antipsychotic medications. Significant negative overlap (fig. S5) (12) was observed, indicating that antipsychotics are unlikely to drive, but rather may partially normalize, these transcriptomic alterations, whereas the psychotomimetic phencyclidine partially recapitulates disease signatures.

To validate that these transcriptomic relationships are generalizable, we generated independent RNA-seq data sets for replication for three out of the five disorders (fig. S6) (12). We identified 1099 genes whose DGE is replicated in ASD [odds ratio (OR) 6.4, P = 3.3 × 10^−236, Fisher’s exact test] (table S2), 890 genes for SCZ (BrainGVEX; OR 4.5, P = 7.6 × 10^−155), and 112 genes for BD (BrainGVEX; OR 3.9, P = 4.6 × 10^−26); the lower count for BD is likely due to the relatively smaller RNA-seq sample size for BD (12). We observed similarly high levels of transcriptomic overlap among ASD, SCZ, and BD and a similar gradient of transcriptomic severity (Fig. 2C and fig. S7). The SCZ and BD patterns were further replicated in the CommonMind data set, although gene-level overlap was lower (fig. S7) (12, 16). The ASD signature was qualitatively consistent across the four major cortical lobes, indicating that this pattern is not caused by regional differences (Fig. 2D).
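The replication statistics above rest on a 2×2 overlap table tested with Fisher's exact test. The sketch below shows how such an odds ratio and P value can be computed from gene counts. The counts are invented for illustration, the function name is hypothetical, and a one-sided (enrichment) test is assumed because the text does not state the sidedness used.

```python
from math import comb

def fisher_overlap(n_both, n_only_a, n_only_b, n_neither):
    """One-sided Fisher's exact test for enrichment of overlap in a 2x2 table.

    n_both    : genes differentially expressed in both data sets
    n_only_a  : genes differentially expressed only in the discovery set
    n_only_b  : genes differentially expressed only in the replication set
    n_neither : background genes differentially expressed in neither
    """
    n = n_both + n_only_a + n_only_b + n_neither
    row = n_both + n_only_a          # discovery hits
    col = n_both + n_only_b          # replication hits
    # Hypergeometric tail: probability of an overlap at least as large as observed.
    tail = sum(comb(row, k) * comb(n - row, col - k)
               for k in range(n_both, min(row, col) + 1))
    p_value = tail / comb(n, col)
    # Sample (cross-product) odds ratio; some packages report a conditional
    # maximum-likelihood estimate instead, which differs slightly.
    if n_only_a * n_only_b == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (n_both * n_neither) / (n_only_a * n_only_b)
    return odds_ratio, p_value

# Hypothetical counts: 100 hits per data set, 40 shared, 1000 genes tested.
odds_ratio, p_value = fisher_overlap(40, 60, 60, 840)
print(f"OR = {odds_ratio:.2f}, P = {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` avoids underflow in the intermediate terms, which matters when P values are as small as those reported here.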

To more specifically characterize the biological pathways involved, we performed robust weighted gene coexpression network analysis (rWGCNA) (12, 17), identifying several shared and disorder-specific coexpression modules (Fig. 3). Modules were stable (fig. S8), showed greater association with disease than other biological or technical covariates (fig. S9), and were not dependent on corrections for covariates or batch effects (fig. S10). Moreover, each module was enriched for protein-protein interactions (fig. S8) and brain enhancer-RNA co-regulation (fig. S11) derived from independent data, which provides anchors for dissecting protein complexes and regulatory relationships.

Fig. 3 Network analysis identifies modules of coexpressed genes across disease.

(A) Network dendrogram from coexpression topological overlap of genes across disorders. Color bars show correlation of gene expression with disease status, biological, and technical covariates. (B) Multidimensional scaling plot demonstrates relationship between modules and clustering by cell-type relationship. (C) Module-level differential expression is perturbed across disease states. Plots show β values from linear mixed-effect model of module eigengene association with disease status (FDR-corrected #P < 0.1, *P < 0.05, **P < 0.01, ***P < 0.001). (D) The top 20 hub genes are plotted for modules most disrupted in disease. A complete list of genes’ module membership (kME) is provided in data table S2. Edges are weighted by the strength of correlation between genes. (E and F) Modules are characterized by (E) Gene Ontology enrichment (top two pathways shown for each module) and (F) cell-type specificity, on the basis of RNA-seq of purified cell populations from healthy human brain samples (25).
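A module eigengene, as used in Fig. 3C, is conventionally the first principal component of the standardized expression of a module's genes across samples. The sketch below computes one with plain power iteration on synthetic data. It is a simplified stand-in under stated assumptions rather than the rWGCNA implementation used in the study: the function names, the sample layout (6 controls, 6 cases), and the final group-mean comparison (in place of the linear mixed-effects model) are all illustrative.

```python
import random

def standardize(row):
    """Center a gene's expression values and scale them to unit variance."""
    n = len(row)
    mean = sum(row) / n
    sd = (sum((v - mean) ** 2 for v in row) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in row]

def module_eigengene(expr, n_iter=200, seed=0):
    """First principal component across samples for a genes-by-samples matrix.

    expr: list of genes, each a list of per-sample expression values.
    Returns a unit-length vector with one value per sample.
    """
    z = [standardize(g) for g in expr]
    n_samples = len(z[0])
    rng = random.Random(seed)
    # Random start: a constant vector would be orthogonal to the centered data.
    v = [rng.gauss(0, 1) for _ in range(n_samples)]
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(g, v)) for g in z]        # Z v
        v = [sum(s * g[j] for s, g in zip(scores, z))                 # Z^T (Z v)
             for j in range(n_samples)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    # Orient the eigengene so it correlates positively with mean expression.
    mean_profile = [sum(g[j] for g in z) / len(z) for j in range(n_samples)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-x for x in v]
    return v

# Synthetic module: 20 genes, all shifted upward in cases.
rng = random.Random(2)
status = [0] * 6 + [1] * 6
expr = [[1.0 * s + rng.gauss(0, 0.3) for s in status] for _ in range(20)]

eigengene = module_eigengene(expr)
control_mean = sum(eigengene[:6]) / 6
case_mean = sum(eigengene[6:]) / 6
print(f"control mean = {control_mean:.2f}, case mean = {case_mean:.2f}")
```

Summarizing each module by a single per-sample value is what allows module-level association with diagnosis to be tested with the same kind of model used for individual genes.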

An astrocyte-related module (CD4; hubs GJA1 and SOX9) was broadly up-regulated in ASD, BD, and SCZ [false discovery rate (FDR)–corrected P < 0.05] (Fig. 3C and data table S2) (12) and enriched for glial cell differentiation and fatty-acid metabolism pathways. By contrast, a module strongly enriched for microglial markers (CD11) was up-regulated specifically in ASD (two-sided t test, FDR-corrected P = 4 × 10^−9). Hubs include canonical microglial markers (HLA-DRA and AIF1), major components of the complement system (C1QA and C1QB), and TYROBP, a microglial signaling adapter protein (18). Results fit with convergent evidence for microglial up-regulation in ASD and an emerging understanding that microglia play a critical role in regulating synaptic function during neurodevelopment (19).

One module, CD2, was up-regulated specifically in MDD (FDR-corrected P = 0.009) (data table S2) and was enriched for G protein–coupled receptors, cytokine-cytokine interactions, and hormone activity pathways, suggesting a link between inflammation and dysregulation of the hypothalamic-pituitary-adrenal (HPA) axis, which is consistent with current models of MDD pathophysiology (20). Several modules annotated as neuronal/mitochondrial were down-regulated across ASD, SCZ, and BD (CD1, CD10, and CD13) (Fig. 3C and data table S2) (12). The overlap of CD10 with a mitochondrial gene-enriched module previously associated with neuronal firing rate (21) links energetic balance, synaptic transmission, and psychiatric disease (data table S2).

The transcriptome may reflect the cause or the consequence of a disorder. To refine potential causal links, we compared single-nucleotide polymorphism (SNP)–based genetic correlations between disease pairs (22) with their corresponding transcriptome overlap. SNP coheritability was significantly correlated with transcriptome overlap across the same disease pairs (Spearman’s ρ = 0.79, 95% confidence interval 0.43 to 0.93, P = 0.0013) (Fig. 2C), suggesting that a major component of these gene-expression patterns reflects biological processes coupled to underlying genetic variation.

To determine how disease-associated variants may influence specific biological processes, we investigated whether any modules harbor genetic susceptibility for specific disorders or for relevant cognitive or behavioral traits (12). We identified significant enrichment among several of the down-regulated, neuronal coexpression modules (CD1, CD10, and CD13) for GWAS signal from SCZ and BD, as well as for educational attainment and neuroticism (FDR-corrected P < 0.05, Spearman) (Fig. 4A) (12). We also observed enrichment for the three down-regulated neuronal coexpression modules in the iPSYCH Consortium (23) ASD GWAS cohort (Fig. 4A and table S3) (12). By contrast, these modules showed no enrichment for MDD, AAD, or IBD. Further, none of the microglial- or astrocyte-specific modules showed psychiatric GWAS enrichment. Extending this analysis to disease-associated rare variants (data table S3) (2, 12), we found that the CD1 neuronal module was enriched for genes harboring rare, nonsynonymous de novo mutations identified in ASD (OR 1.36, FDR-corrected P = 0.03, logistic regression) and SCZ cases (OR 1.82, FDR-corrected P = 0.014) but not unaffected controls (Fig. 4B). A similar CD1 enrichment was observed for genes affected by rare, recurrent copy-number variation (CNV) in ASD (OR 2.52, FDR-corrected P = 0.008) and SCZ (OR 2.46, FDR-corrected P = 0.014). These results suggest convergence of common and rare genetic variation acting to down-regulate synaptic function in ASD and SCZ.
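The rare-variant odds ratios above come from logistic regression with a correction for gene length (see the Fig. 4 legend). One plausible parameterization, shown here as a sketch and not as the authors' exact model, regresses a per-gene variant indicator on module membership and log gene length, so that the exponentiated membership coefficient is the length-adjusted odds ratio. All data are simulated, and the function and variable names are hypothetical.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def solve(a, b):
    """Solve a small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_logistic(rows, y, n_iter=25):
    """Newton-Raphson fit; each row starts with 1.0 for the intercept."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, yi in zip(rows, y):
            mu = sigmoid(sum(b * v for b, v in zip(beta, x)))
            w = mu * (1.0 - mu)
            for i in range(p):
                grad[i] += (yi - mu) * x[i]
                for j in range(p):
                    hess[i][j] += w * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Simulated genes: longer genes are more likely both to sit in the module
# and to carry a variant, which is the confounding the covariate addresses.
rng = random.Random(3)
rows, has_variant = [], []
for _ in range(3000):
    log_len = rng.gauss(0, 1)                      # centered log gene length
    in_module = 1.0 if rng.random() < sigmoid(-2 + 0.5 * log_len) else 0.0
    p_hit = sigmoid(-2 + 0.8 * log_len + 1.0 * in_module)
    has_variant.append(1.0 if rng.random() < p_hit else 0.0)
    rows.append([1.0, in_module, log_len])

beta = fit_logistic(rows, has_variant)
odds_ratio = math.exp(beta[1])
print(f"length-adjusted OR for module membership = {odds_ratio:.2f}")
```

Without the length term, a module rich in long genes would look enriched for rare variants simply because long genes offer more sequence in which mutations can occur.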

Fig. 4 Down-regulated neuronal modules are enriched for common and rare genetic risk factors.

(A) Significant enrichment is observed for SCZ-, ASD-, and BD-associated common variants from GWAS among neuron/synapse and mitochondrial modules (12). GWAS data sets are listed in table S3. (B) The CD1 neuronal module shows significant enrichment for ASD- and SCZ-associated nonsynonymous de novo variants from whole-exome sequencing. The number of genes affected by different classes of rare variants is shown in parentheses. Significance was calculated by using logistic regression, correcting for gene length. P values are FDR-corrected. (C) Total SNP-based heritability (liability scale for psychiatric diagnoses) calculated from GWAS by using LD-score regression. (D) Proportion of heritability for each disorder or trait that can be attributed to individual coexpression modules. Significance (FDR-corrected *P < 0.05, **P < 0.01, ***P < 0.001) is from enrichment statistics comparing the proportion of SNP heritability within the module divided by the proportion of total SNPs represented. The CD1 module shows significant enrichment in SCZ, BD, and educational attainment.

We next used LD score regression (24) to partition GWAS heritability (Fig. 4C and data table S4) into the contribution from SNPs located within genes from each module (Fig. 4D) (12). CD1 again showed significant enrichment for SCZ (2.5-fold, FDR-corrected P = 8.9 × 10^−11), BD (3.9-fold, FDR-corrected P < 0.014), and educational attainment (1.9-fold, FDR-corrected P < 0.0008; χ² test) GWAS, accounting for ~10% of SNP-based heritability within each data set, despite containing only 3% of the SNPs. This illustrates how gene network analysis can begin to parse complex patterns of common variants, each of small effect size, to implicate specific biological roles for common variant risk across neuropsychiatric disorders.
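The enrichment statistic described in the Fig. 4 legend, the proportion of SNP heritability within a module divided by the proportion of SNPs it contains, can be written compactly. The symbols below are introduced here for illustration and are not taken from the paper.

```latex
\[
\mathrm{Enrichment}(m) \;=\;
\frac{h^2_{\mathrm{SNP}}(m) \,/\, h^2_{\mathrm{SNP}}}{M(m) \,/\, M}
\]
```

Here \(h^2_{\mathrm{SNP}}(m)\) is the SNP heritability attributed to SNPs within genes of module \(m\), \(h^2_{\mathrm{SNP}}\) is the total SNP heritability of the trait, \(M(m)\) is the number of SNPs assigned to the module, and \(M\) is the total number of SNPs. A value above 1 means the module carries more heritability than its share of SNPs would predict.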

These data provide a quantitative, genome-wide characterization of the cortical pathology across five major neuropsychiatric disorders and offer a framework for identifying the responsible molecular signaling pathways and interpreting genetic variants implicated in neuropsychiatric disease risk. We observed a gradient of synaptic gene down-regulation, with ASD > SCZ ≈ BD. BD and SCZ appear most similar in terms of synaptic dysfunction and astroglial gene up-regulation, which may represent astrocytosis, activation, or both. ASD, an early-onset disorder, shows a distinct up-regulated microglial signature, which may reflect the role for microglia in regulation of synaptic connectivity during neurodevelopment (19). MDD shows neither the synaptic nor astroglial pathology but does exhibit dysregulation of HPA-axis and hormonal signaling not observed in the other disorders.

Our data suggest that shared genetic factors underlie a substantial proportion of cross-disorder expression overlap. Given that a minority of these relationships represent expression quantitative trait loci (fig. S12), most of the genetic effects are likely acting indirectly, through a cascade of developmental and cell-cell signaling events rooted in genetic risk. Genetic variation is also not the only driver of expression variation; there is undoubtedly a contribution from environmental effects. Hidden confounders could introduce a correlation structure that matches SNP-level genetic correlations, but parsimony and hidden covariate correction suggest that this is unlikely. Diagnostic misclassification could artificially elevate shared signals, but the results are robust to disorder removal (fig. S13), and misclassification would not account for the substantial overlap we observed with ASD, which has a highly distinct phenotypic trajectory from later-onset disorders. Last, we have replicated broad transcriptomic and cell type–specific patterns independently for ASD, SCZ, and BD, providing an organizing pathological framework for future investigation of the mechanisms underlying specific gene- and isoform-level transcriptomic alterations in psychiatric disease.

Supplementary Materials

Materials and Methods

Supplemental Text

Figs. S1 to S13

Tables S1 to S3


References (26-98)

Data Tables S1 to S5

References and Notes

  12. Materials and methods are available as supplementary materials.
Acknowledgments: The work is funded by the U.S. National Institute of Mental Health (NIMH) (grants P50-MH106438, D.H.G.; R01-MH094714, D.H.G.; U01-MH103339, D.H.G.; R01-MH110927, D.H.G.; R01-MH100027, D.H.G.; R01-MH110920, C.L.; and U01-MH103340, C.L.), the Simons Foundation for Autism Research Initiative (SFARI) (SFARI grant 206733, D.H.G., and SFARI Bridge to Independence Award, M.J.G.), and the Stephen R. Mallory schizophrenia research award at the University of California, Los Angeles, (UCLA) (M.J.G.). The study was supported by The Lundbeck Foundation Initiative for Integrative Psychiatric Research (grant R102-A9118) and by the Novo Nordisk Foundation. Published microarray data sets analyzed in this study are available on Gene Expression Omnibus (accession nos. GSE28521, GSE28475, GSE35978, GSE53987, GSE17612, GSE12649, GSE21138, GSE54567, GSE54568, GSE54571, GSE54572, GSE29555, and GSE11223), ArrayExpress (accession no. E-MTAB-184), or directly from the study authors (4). New RNA-seq data (available on Synapse with accession numbers syn4590909 and syn4587609, with access governed by NIMH Repository and Genomics Resource) were generated as part of the PsychENCODE Consortium, supported by grants U01MH103339, U01MH103365, U01MH103392, U01MH103340, U01MH103346, R01MH105472, R01MH094714, R01MH105898, R21MH102791, R21MH105881, R21MH103877, and P50MH106934 awarded to Schahram Akbarian (Icahn School of Medicine at Mount Sinai), Gregory Crawford (Duke), Stella Dracheva (Icahn School of Medicine at Mount Sinai), Peggy Farnham (USC), Mark Gerstein (Yale), Daniel Geschwind (UCLA), Thomas M. Hyde (LIBD), Andrew Jaffe (LIBD), James A. Knowles (USC), Chunyu Liu (UIC), Dalila Pinto (Icahn School of Medicine at Mount Sinai), Nenad Sestan (Yale), Pamela Sklar (Icahn School of Medicine at Mount Sinai), Matthew State (UCSF), Patrick Sullivan (UNC), Flora Vaccarino (Yale), Sherman Weissman (Yale), Kevin White (UChicago), and Peter Zandi (JHU). 
RNA-seq data from the CommonMind Consortium used in this study (Synapse accession no. syn2759792) was supported by funding from Takeda Pharmaceuticals Company, F. Hoffman-La Roche and NIH grants R01MH085542, R01MH093725, P50MH066392, P50MH080405, R01MH097276, RO1-MH-075916, P50M096891, P50MH084053S1, R37MH057881, R37MH057881S1, HHSN271201300031C, AG02219, AG05138, and MH06692. Brain tissue for the study was obtained from the following brain bank collections: the Mount Sinai NIH Brain and Tissue Repository, the University of Pennsylvania Alzheimer’s Disease Core Center, the University of Pittsburgh NeuroBioBank and Brain and Tissue Repositories, the Harvard Brain Bank as part of the Autism Tissue Project (ATP), the Stanley Medical Research Institute, and the NIMH Human Brain Collection Core. Summary statistics for the ASD GWAS performed on data from the iPSYCH consortium are available in data table S5. Data and analysis code are available at The authors thank S. Parhami, H. Won, J. Stein, D. Poliodakis, J. Flint, R. Ophoff, and members of the D.H.G. laboratory for critical reading of earlier versions of this manuscript. The authors also thank B. Pasaniuc for his helpful comments and for assistance with module heritability analyses.