An anatomic transcriptional atlas of human glioblastoma

Science  11 May 2018:
Vol. 360, Issue 6389, pp. 660-663
DOI: 10.1126/science.aaf2666

Anatomically correct tumor genomics

Glioblastoma is the most lethal form of human brain cancer. The genomic alterations and gene expression profiles characterizing this tumor type have been widely studied. Puchalski et al. created the Ivy Glioblastoma Atlas, a freely available online resource for the research community. The atlas, a collaborative effort between bioinformaticians and pathologists, maps molecular features of glioblastomas, such as transcriptional signatures, to histologically defined anatomical regions of the tumors. The relationships identified in this atlas, in conjunction with associated databases of clinical and genomic information, could provide new insights into the pathogenesis, diagnosis, and treatment of glioblastoma.

Science, this issue p. 660


Glioblastoma is an aggressive brain tumor that carries a poor prognosis. The tumor’s molecular and cellular landscapes are complex, and their relationships to histologic features routinely used for diagnosis are unclear. We present the Ivy Glioblastoma Atlas, an anatomically based transcriptional atlas of human glioblastoma that aligns individual histologic features with genomic alterations and gene expression patterns, thus assigning molecular information to the most important morphologic hallmarks of the tumor. The atlas and its clinical and genomic database are freely accessible online data resources that will serve as a valuable platform for future investigations of glioblastoma pathogenesis, diagnosis, and treatment.

Glioblastoma is the most common and the most lethal malignant brain tumor (1). Even for patients receiving aggressive treatment, the median survival is 12 to 15 months (2). The tumors evolve rapidly as they acquire new mutations; the resultant increase in intratumor genomic heterogeneity leads to the development of drug resistance, which limits the long-term efficacy of therapies (3, 4). Two large-scale efforts aimed at characterizing the genomic alterations in human glioblastoma are The Cancer Genome Atlas (TCGA), which is a catalog of multi-omics data, including genomics, transcriptomics, DNA methylomics, proteomics, etc. (5, 6), and REMBRANDT (Repository for Molecular Brain Neoplasia Data), which also includes multiple data domains (7). These efforts helped to clarify the role of genomic alterations in the pathogenesis of glioblastoma but were not designed to address intratumor heterogeneity. Subsequent studies addressed heterogeneity spatially within bulk tumor or at the single-cell level (4, 8–12). Nonetheless, we still lack a systematic understanding of the tumor’s molecular heterogeneity as it relates to anatomical heterogeneity. By “anatomical heterogeneity,” we mean the variable combination of the classical histological features of glioblastoma, such as tumor infiltration, endothelial cell proliferation, and necrosis. Here, we describe the Ivy Glioblastoma Atlas, a comprehensive molecular pathology map of glioblastoma in which we have assigned key genomic alterations and gene expression profiles to the tumor’s anatomic features. The atlas will facilitate accurate deconvolution of anatomic features in new samples of glioblastoma, providing unique information for the comprehensive diagnostic characterization of the tumor’s heterogeneity.

To create the atlas, we surveyed the tumor’s anatomic features by in situ hybridization (ISH), analyzed these features’ transcriptomes by laser microdissection (LMD) and RNA sequencing (RNA-seq), and validated the feature-specific gene expression enrichment of newly identified markers by ISH (Fig. 1). We created a clinical and genomic database for the 41-patient cohort (table S1) whose tumors (n = 42) were evaluated to create the atlas. We describe gene sets whose expression is enriched in the anatomic features, measurements of intra- and intertumor heterogeneity, and a molecular subtype classification of transcriptomic samples from our atlas and from TCGA. Together, these two online resources constitute the Ivy Glioblastoma Atlas Project (Ivy GAP).

Fig. 1 Data generation, analysis, and presentation pipeline for the Ivy Glioblastoma Atlas Project.

(A) Clinical data were collected for the Ivy cohort of 41 patients. (B) Tissue preparation required en bloc resection and formation of tissue blocks with custom-made L bars. (C) Two studies, anatomic feature–based profiling and cancer stem cell marker–based profiling, provided a framework for the ISH surveys, LMD and RNA-seq experiments, and ISH validations. (D) Informatics included image registration, ontology development, and anatomic feature prediction based on a novel machine learning (ML) analysis of histological data. Search tools enable queries of the data set by tumor, tumor block, and gene expression filtered by anatomic feature, molecular subtype, and clinical information. Searchable manual labels delineating the laser microdissections for 270 RNA-seq samples from the two studies overlay the histology images. The atlas is equipped with image viewers that resolve the histology at 0.5 μm per pixel, a transcriptome browser, an application programming interface, and help documentation. The database has detailed longitudinal clinical information and MRI time courses (table S1). (E) This free resource is made available as part of the Ivy Glioblastoma Atlas Project (Ivy GAP) via the Allen Institute data portal, the Ivy GAP Clinical and Genomic Database via the Swedish Neuroscience Institute, and the Cancer Imaging Archive.

To identify gene sets with enriched expression in each anatomic feature (fig. S1), we used LMD to isolate RNA from the leading edge (LE), infiltrating tumor (IT), cellular tumor (CT), pseudo-palisading cells around necrosis (PAN), and microvascular proliferation (MVP). In total, we isolated 122 samples from three different blocks per tumor in 8 to 10 tumors. In consultation with a neuropathologist, we manually drew outlines (LMD guidelines) for each of the anatomic features on images of histologically stained tissue sections. Three additional neuropathologists independently validated the LMD guidelines, and the results showed excellent concordance (table S2). Differential gene expression analysis revealed a total of 3627 genes that had enriched expression in LE, CT, PAN, and MVP samples (Fig. 2A and table S3). Multidimensional scaling demonstrated that samples from these four features were largely distinct, whereas IT appeared to fall on a continuum between LE and CT (Fig. 2B). Gene Ontology enrichment analysis of gene sets with enriched expression in anatomic features (Fig. 2C) confirmed and extended previous reports (13, 14). In general, samples from the same anatomic feature, whether derived from the same or different tumors, were more similar to each other than to samples from other anatomic features of the same tumor (Fig. 2D). Within a given anatomic feature, intertumor heterogeneity exceeded intratumor heterogeneity (fig. S2).
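The within-tumor, between-tumor, and cross-feature comparison described above can be illustrated with a minimal sketch. It assumes each sample is an expression vector tagged with a tumor and an anatomic feature, and it averages pairwise Euclidean distances in three groups. The function name, the tuple layout, and the toy numbers are hypothetical; this is not the authors' pipeline, which is detailed in the supplementary methods.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_pairwise_distances(samples):
    """Average pairwise distances in three groups.

    samples: list of (tumor_id, feature, expression_vector) tuples.
    Returns a dict with the mean distance for pairs that share a feature
    and a tumor, share a feature but not a tumor, and differ in feature.
    A group with no pairs maps to None.
    """
    groups = {"within_tumor": [], "between_tumors": [], "cross_feature": []}
    for (t1, f1, v1), (t2, f2, v2) in combinations(samples, 2):
        if f1 != f2:
            key = "cross_feature"
        elif t1 == t2:
            key = "within_tumor"
        else:
            key = "between_tumors"
        groups[key].append(euclidean(v1, v2))
    return {k: (sum(d) / len(d) if d else None) for k, d in groups.items()}


# Toy two-gene expression vectors; tumor IDs and values are made up.
toy_samples = [
    ("T1", "CT", [0.0, 0.0]),
    ("T1", "CT", [0.0, 2.0]),
    ("T2", "CT", [3.0, 4.0]),
    ("T1", "MVP", [10.0, 0.0]),
]
print(mean_pairwise_distances(toy_samples))
```

In this toy input the cross-feature mean is the largest and the within-tumor mean the smallest, which mirrors the ordering reported in the text only because the numbers were chosen that way.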

Fig. 2 Gene expression in anatomic features.

(A) Differential expression matrix based on genes identified in the 122 anatomic feature RNA-seq samples isolated in triplicate from 8 to 10 tumors. Values are numbers of genes whose expression is enriched in the row feature relative to the column feature [false discovery rate < 0.01, relative (fold) change >2; P < 0.1, Benjamini-Hochberg corrected]. Values on the diagonal are numbers of genes with higher expression in one feature versus all other features (i.e., top marker genes). (B) Multidimensional scaling of all genes reflects anatomic specificity. (C) Gene Ontology enrichment analysis. LE and CT were enriched for Gene Ontology terms related to neuronal systems and glial cell differentiation, respectively, whereas PAN was associated with stress, hypoxia, and immune responses, and MVP was related to angiogenesis, immune regulation, and response to wounding. (D) Mean Euclidean distance within and between tumors based on hierarchical clustering of all genes in all 122 anatomic feature RNA-seq samples grouped by anatomic feature (figs. S1 and S2). Cross Feature measures variance between anatomic features. (E to I) Representative marker genes showing RNA-seq expression levels (in RPKM, reads per kilobase of transcript per million mapped reads) for features isolated by LMD, representative ISH, ML annotations for ISH and H&E (hematoxylin and eosin stain), and H&E adjacent to ISH. Color code: blue, LE; purple, IT; green, CT; light blue, PNZ; turquoise, PAN; orange, HBV; red/magenta, MVP; black, necrosis.
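The enrichment criteria in panel (A) combine a multiple-testing correction with a fold-change cutoff. The sketch below shows one common way to apply such a filter: Benjamini-Hochberg adjustment of raw P values followed by thresholds on the adjusted value and on fold change. The cutoffs of 0.01 and 2 are taken from the caption; the function names, gene labels, and input values are hypothetical, and the statistical test that produced the authors' P values is not reproduced here.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enriched_genes(genes, fold_changes, pvalues, fdr=0.01, min_fold=2.0):
    """Keep genes that pass both the FDR cutoff and the fold-change cutoff."""
    qvalues = benjamini_hochberg(pvalues)
    return [g for g, fc, q in zip(genes, fold_changes, qvalues)
            if q < fdr and fc > min_fold]


# Hypothetical input: gene labels, fold changes, and raw p-values are made up.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
fold_changes = [3.0, 4.0, 2.5, 1.5, 5.0]
pvalues = [0.001, 0.008, 0.039, 0.041, 0.042]
print(enriched_genes(genes, fold_changes, pvalues))
```

Applying the filter once per ordered pair of features (row versus column) would fill the off-diagonal cells of a matrix like the one in panel (A).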

We selected 31 genes with enriched expression in anatomic features for further analysis by ISH, and found that 27 showed at least partial agreement and 22 showed good agreement between RNA-seq and ISH assessments of enrichment in PAN, CT, or MVP (Fig. 2, E to I, and table S4). Assessing enrichment of gene expression by ISH required that we calculate the overlap between the expression pattern and our machine learning (ML) annotations for each anatomic feature, which we validated using (i) ML-determined rates of accuracy and precision (table S5); (ii) an inter-neuropathologist test to establish agreement on definitions of anatomic features (fig. S1 and tables S6 and S7); and (iii) neuropathology concordance analyses (tables S8 to S11).

To characterize intratumor genetic heterogeneity across anatomic features, we assessed RNA-seq–derived copy number changes in the features and compared them to the DNA-level copy number variations (CNVs) (12) from the corresponding bulk tumor (fig. S3 and table S12). The CT and PAN samples consistently showed gene expression changes corresponding to the CNVs, whereas LE samples did not, as LE samples by definition consist largely of non-neoplastic cells and hence would not harbor the CNVs. On the other hand, MVP samples showed some gene expression changes corresponding to the CNVs, indicating a mixture of tumor and non-neoplastic cells. To evaluate the distribution of somatic mutations targeting key glioblastoma genes within the different anatomic features of this tumor, we used RNA-seq to call single-nucleotide variants (SNVs) in eight genes (TP53, PTEN, EGFR, ATRX, IDH1, NF1, PIK3R1, PIK3CA) known to harbor recurrent and functionally important mutations in glioblastoma across anatomic features for tumors where there was at least one sample available from each of the LE, CT, PAN, and MVP features (fig. S4 and table S13). We detected somatically mutated alleles in RNA from CT, PAN, and MVP samples, whereas we found only the wild-type variants in LE samples (fig. S4A). The ratio of mutant to wild-type expression was least for MVP relative to CT and PAN samples (fig. S4B). Some of the SNVs occurring across anatomic features were corroborated by ISH data (table S1). Together, the copy number and mutation analyses indicated that LE samples largely consist of non-neoplastic cells, CT and PAN samples largely consist of tumor cells, and MVP samples contain a mixture of tumor and non-neoplastic cells. The observed intratumor heterogeneity in copy number and mutation profiles is consistent with previous studies (8, 9). 
Only three tumors from our 41-patient cohort harbored the Arg132 → His mutation in isocitrate dehydrogenase 1 (IDH1) (table S1); thus, there was insufficient statistical power for analysis of this mutation by anatomic feature. We did not identify any mutation associated with a particular anatomic feature that predicted overall survival better than the promoter methylation status of the MGMT gene in the bulk tumor (fig. S5, A and B) (15).
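The mutant-to-wild-type expression ratio used in the SNV analysis above can be expressed as a simple calculation on allele-specific read counts. The sketch below assumes that, for one variant site, the number of RNA-seq reads supporting each allele is already known per anatomic feature. The function name and the read counts are hypothetical; the counts were chosen only to mimic the qualitative pattern described in the text and are not data from the study.

```python
def mutant_to_wildtype_ratio(mutant_reads, wildtype_reads):
    """Ratio of reads supporting the mutant allele to reads supporting wild type.

    Returns None when there are no wild-type reads, because the ratio is
    undefined in that case.
    """
    if wildtype_reads == 0:
        return None
    return mutant_reads / wildtype_reads


# Made-up (mutant, wild-type) read counts at a single site, per feature.
toy_counts = {
    "LE": (0, 120),
    "CT": (90, 60),
    "PAN": (80, 80),
    "MVP": (20, 100),
}
toy_ratios = {feature: mutant_to_wildtype_ratio(mut, wt)
              for feature, (mut, wt) in toy_counts.items()}
print(toy_ratios)
```

A ratio of zero, as in the toy LE entry, corresponds to detecting only the wild-type variant.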

Finally, we developed an admixture model using a 293-gene signature matrix (table S14) for computational decomposition of bulk tumor samples into four anatomic features (LE, CT, PAN, and MVP) and classified the 122 anatomic feature RNA-seq samples on the basis of histology, admixture (table S14), molecular subtype (6), and cell type gene expression signature (table S15) enrichment (fig. S6, A to D, and table S16). Several genes exhibited differential expression across known molecular subtypes of glioblastoma within each anatomic feature (fig. S7, A to C). Enrichment of the cell type gene expression signatures in the anatomic features was consistent with Gene Ontology enrichment analyses (Fig. 2C). The correlation between the anatomic feature gene sets and molecular subtypes (table S16) is broadly consistent with results of previous studies (8, 9). When we applied our admixture model to 167 RNA-seq samples of the TCGA data, we observed similar patterns (fig. S8, A to C, and table S16).
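To make the idea of decomposing a bulk sample with a signature matrix concrete, the sketch below treats bulk expression as a non-negative mixture of per-feature signature profiles and estimates the mixing fractions by non-negative least squares with multiplicative updates. This is one generic deconvolution approach, swapped in for illustration; it is not claimed to be the authors' admixture model. The four-gene signature matrix, the function name, and the bulk profile are hypothetical stand-ins for the 293-gene matrix in table S14.

```python
FEATURES = ["LE", "CT", "PAN", "MVP"]

# Hypothetical signature matrix: one row per marker gene, one column per
# feature, holding a made-up typical expression level.
SIGNATURE = [
    [10.0, 1.0, 1.0, 1.0],   # marker high in LE
    [1.0, 10.0, 1.0, 1.0],   # marker high in CT
    [1.0, 1.0, 10.0, 1.0],   # marker high in PAN
    [1.0, 1.0, 1.0, 10.0],   # marker high in MVP
]


def decompose(bulk, signature, iterations=500):
    """Estimate non-negative feature fractions x with signature @ x ~ bulk.

    Uses multiplicative updates x_j <- x_j * (S^T b)_j / (S^T S x)_j, which
    keep x non-negative when signature and bulk are non-negative. The
    result is rescaled so that the fractions sum to 1.
    """
    n = len(signature[0])
    stb = [sum(row[j] * b for row, b in zip(signature, bulk)) for j in range(n)]
    sts = [[sum(row[j] * row[k] for row in signature) for k in range(n)]
           for j in range(n)]
    x = [1.0] * n  # strictly positive starting point
    for _ in range(iterations):
        denom = [sum(sts[j][k] * x[k] for k in range(n)) for j in range(n)]
        x = [x[j] * stb[j] / (denom[j] + 1e-12) for j in range(n)]
    total = sum(x)
    return [v / total for v in x]


# Bulk profile built from made-up fractions 0.1 LE, 0.6 CT, 0.2 PAN, 0.1 MVP.
toy_bulk = [1.9, 6.4, 2.8, 1.9]
toy_fractions = decompose(toy_bulk, SIGNATURE)
print(dict(zip(FEATURES, (round(f, 3) for f in toy_fractions))))
```

With real data the signature genes would number in the hundreds and the fit would be inexact, so the estimated fractions should be read as approximate proportions, not exact cell counts.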

This atlas and the associated database for clinical and genomic data will serve as a useful platform for developing and testing new hypotheses related to the pathogenesis, diagnosis, and treatment of glioblastoma. We note that investigators are already leveraging this resource (16–33). In one preclinical study, Miller et al. (22) used the atlas to prioritize potential druggable targets based on relationships to tumor microenvironment signatures. In another preclinical study, Yu et al. (31) used the Ivy GAP data set to identify anatomic regions of glioblastoma that are enriched in tumor-initiating cells. On the basis of this information, they delivered an experimental drug to these tumor regions as a way to maximize its therapeutic effect.

Supplementary Materials

Materials and Methods

Tables S1 to S16

Figs. S1 to S8

References (34–65)

References and Notes

Acknowledgments: We thank the Allen Institute founders, P. G. Allen and J. Allen, for their vision, encouragement, and support. We thank B. Aronow, B. Bernard, D. Ghosh, L. Hood, C. Hubert, J. Lathia, B. Lin, J. Olson, N. Sanai, I. Shmulevich, Q. Tian, and I. Ulasov for providing lists of genes for putative cancer stem cell markers; J. Rich for critical review of the manuscript and helpful comments; N. Hansen from Swedish Research Institute for help with patient consent and clinical data collection; T. Crossley for help with the website; P. Sonpatki for help with neuropathology evaluation forms; and B. Facer and N. Stewart for artistic and administrative assistance, respectively. Funding: Supported by The Ben and Catherine Ivy Foundation. R.C.R. was supported by NINDS R01 NS091251 and NCI R01 CA136808 grants (to R.C.R.), and A.I. was supported by NCI grants R01CA178546, U54CA193313, R01CA179044, R01CA190891, and R01NS061776 and The Chemotherapy Foundation (to A.I.). Author contributions: G.D.F. and R.B.P. conceived of and secured funding for the creation of the resource. R.B.P., G.D.F., and N.S. designed and (with M.H. and C.C.) supervised the creation of the resource. R.B.P., N.S., J.M., and M.J.H. wrote the manuscript. N.S., J.M., M.L., and C.M. conducted computational data analysis. N.S. and M.L. developed J.Y., G.D.F., S.W.R., and R.B.P. developed methods for tissue block generation. R.B.P., S.W.R., and R.D. identified anatomic features in tissue blocks. J.Y. and H.L. did genomic studies. P.H. generated cell lines. R.B.P., P.H., and N.S. prepared cancer stem cell marker gene list. R.B.P. compared several platforms for semi-automated annotation of images. X.F. and B.K. were responsible for MRI data collection and processing. C.D. and L.N. managed overall online product design. S.R.N., with assistance from S.W.R., R.D., N.S., and R.B.P., was responsible for semi-automated annotation of H&E images. N.S. used the Definiens platform to count nuclei in images of H&E-stained sections. S.J.W. and D.M. created the ML application for the semi-automated annotation of H&E images. C.R.S. and S.D. provided engineering support. S.W.R. was responsible for neuropathology throughout the creation of the atlas. R.B.P. and N.S. designed and supervised the neuropathology concordance analyses with input from the neuropathologists, C.D.K., P.J.C., and M.U., who performed the study, and the biostatisticians, J.S.B.-S., N.S., and H.R.G., as well as R.B.P. who analyzed the data. G.D.F., C.C., and F.R.F. performed surgeries. A.J., P.E.W., J.W.P., A.B., E.L., M.J.H., L.N., J.H., K.S., A.E., and J.H. provided valuable insight and oversight of their teams’ contributions to the project at the Allen Institute. R.C.R., J.L., E.L., and M.E.B. critically read the manuscript. A.I. critically read and revised the manuscript. J.L. and R.C.R. provided valuable insight regarding cancer stem cell markers. M.E.B. gave input on glioblastoma biology and invasion markers. R.B.P., N.D., and Z.R. were responsible for methods development. R.D., M.F., K.J., D.R., D.S., T.D., and J.H. were responsible for image quality control. J.G. and K.A.S. were responsible for probe design and development. M.C., T.D., D.F., G.G., L.K., C.L., F.L., N.H., F.L., A.So., A.Sz., and W.W. contributed to design, development, and testing of image visualization tools. D.B., T.L., E.M., K.N., E.O., M.R., A.F.B., S.B., N.D., N.M., K.B., N.D., K.B., S.C., A.E., L.G., G.G., J.G., B.W.G., R.H., A.K., N.S., K.A.S., and G.S. contributed to reagent preparation and tissue sample processing through sectioning, histology, ISH, laser microdissection, and imaging. Competing interests: The authors declare no competing interests.
Data and materials availability: A materials transfer agreement was executed on 20 May 2010 between the Allen Institute for Brain Science and Swedish Health Services to govern the transfer of human tissue between the two institutions, consistent with the approved IRB protocol and consent form. Requests for tissue should be addressed to R.B.P. Tissue accrued in the study will be shared with the scientific community depending on the availability, requested amount, and proposed study plan. Requests for tissue sent to Swedish Health Services will be reviewed for merit, IRB consent, and scientific value on a case-by-case basis. The RNA-seq and copy number data are publicly available at Gene Expression Omnibus through GEO series accession number GSE107560. The MRIs are available at the Cancer Imaging Archive. The atlas image and RNA-seq FPKM data are available as part of the Ivy Glioblastoma Atlas Project via the Allen Institute data portal. The detailed clinical data are available through the Ivy GAP Clinical and Genomic Database via the Swedish Neuroscience Institute. Dedication: This project is dedicated to G.D.F., a dedicated and talented neurosurgeon as well as a visionary in glioblastoma research, who passed away during the course of the study.
