Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma

See allHide authors and affiliations

Science  20 Jun 2014:
Vol. 344, Issue 6190, pp. 1396-1401
DOI: 10.1126/science.1254257

Cancer at single-cell resolution

Single-cell sequencing can illuminate the genetic properties of brain cancers and reveal heterogeneity within a tumor. Patel et al. examined the genome sequence of single cells isolated from brain glioblastomas. The findings revealed shared chromosomal changes but also extensive transcription variation, including genes related to signaling, which represent potential therapeutic targets. The authors suggest that the variation in tumor cells reflects neural development and that such variation among cancer cells may prove to have clinical significance.

Science, this issue p. 1396


Human cancers are complex ecosystems composed of cells with distinct phenotypes, genotypes, and epigenetic states, but current models do not adequately reflect tumor composition in patients. We used single-cell RNA sequencing (RNA-seq) to profile 430 cells from five primary glioblastomas, which we found to be inherently variable in their expression of diverse transcriptional programs related to oncogenic signaling, proliferation, complement/immune response, and hypoxia. We also observed a continuum of stemness-related expression states that enabled us to identify putative regulators of stemness in vivo. Finally, we show that established glioblastoma subtype classifiers are variably expressed across individual cells within a tumor and demonstrate the potential prognostic implications of such intratumoral heterogeneity. Thus, we reveal previously unappreciated heterogeneity in diverse regulatory programs central to glioblastoma biology, prognosis, and therapy.

Tumor heterogeneity poses a major challenge to cancer diagnosis and treatment. It can manifest as variability between tumors, wherein different stages, genetic lesions, or expression programs are associated with distinct outcomes or therapeutic responses (13). Alternatively, cells from the same tumor may harbor different mutations or exhibit distinct phenotypic or epigenetic states (47). Such intratumoral heterogeneity is increasingly appreciated as a determinant of treatment failure and disease recurrence (8).

Glioblastoma is an archetypal example of a heterogeneous cancer and one of the most lethal human malignancies (9, 10). Intratumoral heterogeneity and redundant signaling routes likely underlie the inability of conventional and targeted therapies to achieve long-term remissions (1113). These tumors contain cellular niches enriched for distinct phenotypic properties, including transient quiescence and self-renewal (1416), adaptation to hypoxia (17), and resistance to radiation-induced DNA damage (18, 19). DNA and RNA profiles of bulk tumors have enabled genetic and transcriptional classification of glioblastomas (20, 21). However, the relationships among different sources of intratumoral heterogeneity—genetic, transcriptional, and functional—remain obscure.

Single-cell transcriptome analysis by RNA sequencing (RNA-seq) (22, 23) should in principle enable functional characterization from landmark genes and annotated gene sets, relate in vivo states to in vitro models, inform transcriptional classifications based on bulk tumors, and even capture genetic information for expressed transcripts. To analyze intratumoral heterogeneity systematically, we isolated individual cells from five freshly resected and dissociated human glioblastomas and generated single-cell full-length transcriptomes using SMART-seq (96 to 192 cells per tumor, total 672 cells; Fig. 1A). Before sorting, the suspension was depleted for CD45+ cells to remove inflammatory infiltrate. As a control, we also generated population (bulk) RNA-seq profiles from the CD45-depleted tumor samples. All tumors were IDH1/2 wild-type primary glioblastomas (fig. S1), and three were EGFR amplified as determined by routine clinical tests (table S1). We excluded genes and cells with low coverage (24), retaining ~6000 genes quantified in 430 cells from five patient tumors and population controls (table S1). The population-level controls correlated with the average of the single cells in that tumor (fig. S2), supporting the accuracy of the single-cell data. Individual cells from the same tumor were more correlated to each other than were cells from different tumors (fig. S2). Nevertheless, correlations between individual cells from the same tumor showed a broad spread (correlation coefficient r ~0.2 to 0.7) (fig. S2), consistent with intratumoral heterogeneity.

Fig. 1 Intratumoral glioblastoma heterogeneity quantified by single-cell RNA-seq.

(A) Workflow depicts rapid dissociation and isolation of glioblastoma cells from primary tumors for generating single-cell and bulk RNA-seq profiles and deriving glioblastoma culture models. (B) Clustering of CNV profiles inferred from RNA-seq data for all single cells and a normal brain sample. Clusters (dendrogram) primarily reflect tumor-specific CNV [colored bar coded as in (D)]. Topmost cluster (red, arrow) contains the normal brain sample and 10 single cells, 9 of which correlate with normal oligodendrocyte expression profiles and 1 with normal monocytes (“Oligo” and “Mono,” black and white heatmap). (C) Heatmap of CNV signal normalized against the “normal” cluster defined in (B) shows CNV changes by chromosome (columns) for individual cells (rows). All cells outside the normal cluster exhibit chromosome 7 gain (red) and chromosome 10 loss (blue), which are characteristic of glioblastoma. (D) Multidimensional scaling illustrates the relative similarity between all 420 single tumor cells and population controls. The distance between any two cells reflects the similarity of their expression profiles. Cells group by tumor (color code), but each tumor also contains outliers that are more similar to cells in other tumors. (E) RNA-seq read densities (vertical scale of 10) over surface receptor genes are depicted for individual cells (rows) from MGH30. Cell-to-cell variability suggests a mosaic pattern of receptor expression, in contrast to constitutively expressed GAPDH.

Although our isolation procedures specifically targeted glioblastoma cells, we tested whether our sampling also included normal cells. To distinguish normal from malignant, we attempted to infer large-scale copy-number alterations for each cell by averaging relative expression levels over large genomic regions (24). This allowed us to suppress individual gene-specific expression patterns and emphasize the signal of large-scale copy-number variations (CNVs). As a control, we included RNA-seq profiles from (bulk) normal human brain (25). Hierarchical clustering of all single cells and normal brain samples identified seven groups with concordant CNV profiles (Fig. 1B). The normal brain sample clustered with 10 single cells that presumably have “normal” copy number. In parallel, unsupervised transcriptional analysis identified nine outlier cells with increased expression of mature oligodendrocyte genes and down-regulation of glioblastoma genes (figs. S3 and S4). Notably, all nine of these expression outliers clustered with the normal brain in the CNV analysis (Fig. 1B). The one additional “normal” cell inferred from this CNV cluster correlated with a monocytic expression signature (26) (Fig. 1B). None of the remaining 420 cells show similarity to the transcriptional programs of nonmalignant brain or immune cell types (fig. S5) (24). Although nonmalignant cells are critical components of the tumor microenvironment, the combination of dissociation methods, CD45+ depletion, flow cytometry gating, and computational filtering used in this study largely excluded nontumor cells.

Normalization of CNV profiles using signal from the “normal” cluster revealed coherent chromosomal aberrations in each tumor (Fig. 1C). Gain of chromosome 7 and loss of chromosome 10, the two most common genetic alterations in glioblastoma (20), were consistently inferred in every tumor cell. Chromosomal aberrations were relatively consistent within tumors, with the exception that MGH31 appears to contain two genetic clones with discordant copy-number changes on chromosomes 5, 13, and 14. Although these data suggest large-scale intratumoral genetic homogeneity, we recognize that heterogeneity generated by focal alterations and point mutations will be grossly underappreciated using this method. Nevertheless, such panoramic analysis of chromosomal landscapes effectively separated normal from malignant cells.

To analyze global transcriptional interrelationships, we used multidimensional scaling to represent the degree of similarity among the cells in the data set (Fig. 1D) (24). In contrast to the chromosome-scale analysis above, we observed extensive intratumoral heterogeneity at the transcriptional level. Although most cells grouped by tumor of origin, there were many examples of cells from one tumor crossing into the transcriptional space of another tumor. Moreover, the transcriptional diversity within each individual tumor was significantly greater than that observed for the normal oligodendrocytes (fig. S4) or for an in vitro model of stemlike tumor-propagating glioblastoma cells (GBM6 and GBM8) (27, 28) (fig. S2).

Cell-to-cell variability is also evident in the expression and splicing patterns of signaling molecules such as receptor tyrosine kinases (RTKs), which are important therapeutic targets (29). Mosaic RTK amplification and redundant signaling pathways contribute to targeted therapy resistance in glioblastoma (11, 12, 30). We found mosaic expression for EGFR, PDGFRA, PDGFA, FGFR1, FGF1, NOTCH2, JAG1, and other surface receptors and ligands in pathways pertinent to glioblastoma (Fig. 1E and figs. S6 and S7). Notably, the transcripts encoding such genes are highly expressed in individual cells and in the aggregate profiles, increasing our confidence that their absence reflects true negatives (23). Additionally, multiple EGFR truncations and in-frame deletions have been described, including an oncogenic mutant form, EGFRvIII, which lacks the extracellular domain (de2-7) and is a putative target for immunotherapy (31). Of the three tumors with significant EGFR expression in our data set (MGH26, MGH30, and MGH31), MGH30 expresses EGFRvIII. By examining junction-spanning “spliced” reads at the single-cell level, we identified cells in MGH30 expressing wild-type EGFR (7%) and EGFRvIII (19%), as well as a second oncogenic variant [de4 (32); 25%] (fig. S8). These variants were almost mutually exclusive, with just 1 to 2% of cells coexpressing wild-type EGFR and EGFRvIII. Moreover, several cells lack EGFR but express other tyrosine kinase receptors, suggesting potential alternative pathways for proliferative signaling (Fig. 1E and fig. S7). For example, EGFR expression is anticorrelated with PDGFRA and PDGFC expression in cells from MGH30 (fig. S9). These findings suggest that heterogeneous expression and/or mutational status of RTKs and other signaling molecules across individual glioblastoma tumor cells may compromise therapies targeting receptor immunogenicity or RTK signaling.

We next used hierarchical clustering and principal component analysis to define four meta-signatures, each composed of multiple related clusters that coherently vary across individual cells from a given tumor or the full data set (24) (Fig. 2A). These four meta-signatures were, respectively, enriched for genes related to cell cycle (Fig. 2B), hypoxia (Fig. 2C), complement/immune response (fig. S10), and oligodendrocyte function (demarcating the nine normal oligodendrocytes). We validated the coexpression of meta-signature genes by single-cell quantitative polymerase chain reaction (QPCR) on another 91 cells from MGH26 and 76 cells from MGH30 using primers for 24 genes (fig. S11). Although immune cells are an important component of the tumor microenvironment, expression of complement and immune genes by malignant cells was somewhat unexpected. We validated this result using two approaches. MGH29 harbors a previously described p53 mutation, R248L (C→T) (33) (table S1). Although coverage of this transcript was relatively low, we identified three cells from MGH29 that clearly contain this oncogenic mutation. All of these cells also expressed C3 and other genes from the complement/immune module (fig. S12). Moreover, direct examination of additional cells scoring for the complement/immune signature confirmed chromosomal aberrations characteristic of glioblastoma (Fig. 1C). Notably, the module may be more generally relevant, as robust coexpression of these genes was detected in multiple cell lines derived from brain tumors in the Cancer Cell Line Encyclopedia (CCLE) (fig. S13) (34).

Fig. 2 Unbiased analysis of intratumoral heterogeneity reveals coherent transcriptional modules.

(A) Gene sets that vary coherently between cells in specific tumors or across the global data set (colored boxes) were identified by principal component analysis or clustering (24). Hierarchical clustering of these gene sets across all cells (tree) reveals four meta-signatures related to hypoxia, complement/immune response, oligodendrocytes, and cell cycle. (B) Heatmap shows expression of the cell cycle meta-signature, selected cell cycle gene sets, and representative genes from the signature (rows) in individual glioblastoma cells (columns). Cells were grouped by tumor and ordered by meta-signature score (top). (C) Heatmap depicts hypoxia meta-signature as in (B).

Another notable feature of the module analysis is the activity of the cell cycle program in a relatively small proportion of cells from each tumor, ranging from just 1.4% in MGH31 to 21.9% in MGH30 (table S2 and Fig. 2B). These values contrast markedly with those of the in vitro glioblastoma models in which almost 100% of cells scored positively for the cell cycle module, but are relatively consistent with Ki67+ quantifications for these tumors (figs. S14 and S15). We investigated several markers previously linked to quiescence, including HES1, TSC22D1, and KDM5B (3537). Transcripts for HES1 were not well detected in our data owing to low expression, but TSC22D1 and KDM5B showed significantly higher expression in noncycling tumor cells (fig. S16A). KDM5B, which has been implicated in quiescence and therapeutic resistance in melanoma, was detected in 10 to 20% of individual cells across all tumors and confirmed to anticorrelate with cell cycle meta-signature by single-cell QPCR in MGH26 (fig. S16B).

Clustering of genes anticorrelated to the cell cycle meta-signature also revealed a group of 12 genes (fig. S17A), nine of which were in the hypoxia meta-signature (Fig. 2C). Indeed, these meta-signatures were diametrically opposed (fig. S17B). Although this meta-signature might be influenced by tissue-processing procedures, hypoxia has been studied extensively as a stimulus for angiogenesis (17) and transdifferentiation of glioblastoma stem cells into vascular endothelium (3840). Reordering of the cells by hypoxic module score, which was pronounced in MGH28 and MGH31, demonstrated clear gradients in each sample (Fig. 2C), potentially reflecting variations in the tumor microenvironment that affect oxygen tension, blood supply, or nutritional source (17, 3840). Further studies are needed to understand the spatial relationships of these transcriptional niches in vivo. Thus, in vivo microenvironment and genes linked to quiescence may affect dormant and possibly refractory compartments in glioblastoma.

Glioblastomas likely contain a primitive subpopulation of stemlike cells (GSCs) with preferential resistance to existing therapies (16, 18). GSCs can be modeled in vitro as spherogenic cultures that potently initiate tumors in mice (15, 27). Glioblastoma is also postulated to contain more differentiated cells (DGCs) that can be expanded as adherent monolayers in serum (27, 39). We established GSC and DGC cultures from three tumors in our study (MGH26, MGH28, and MGH31). As expected, the GSCs exhibit a stemlike phenotype, express the stemness marker CD133, and propagate tumors in xenotransplantation (Fig. 3A and fig. S18). To identify in situ tumor cells with stemlike or differentiated phenotypes, we derived a stemness signature from a consensus set of genes differentially expressed between three respective GSC and DGC culture models (Fig. 3B).

Fig. 3 Transcriptional signatures of a stemlike compartment in primary glioblastoma.

(A) Stemlike (GSC) and differentiated (DGC) culture models were derived from patient tumor MGH26. GSCs grow as spheres (left, top), initiate tumors in xenotransplantation (right, top), and express the stem cell marker CD133 (right, bottom). (B) Heatmap depicts expression of genes (rows) from a stemness signature in differentiated models (DGC, left columns), stemlike models (GSC, right columns) derived from 3 tumors, and in 70 individual cells from MGH31 (middle). (C) Bar plot depicts the Pearson correlation coefficient (y axis) between the stemness signature and selected transcriptional modules in each tumor cell cycle; transcriptional targets of POU3F2, SOX2, SALL2, OLIG2 (core TF) (42); NFI transcriptional targets (NFI) (41); and the proneural (PN), classical (CL), mesenchymal (MES), and neural (N) subtypes defined by the Cancer Genome Atlas (21). (D) Plot depicts stemness score (y axis) computed from stemness signature gene expression in individual cells from each tumor (x axis) ordered by score. Bar plots depict the overall variance (y axis, SD) in the stemness score (red) and the average variance of simulated control gene sets (blue), confirming the significance of the gradient.

Application of the stemness signature to the single-cell transcriptional profiles revealed stemness gradients in all five tumors (Fig. 3D). The stemness gradient is modestly anticorrelated to the cell cycle meta-signature (Fig. 3C), consistent with the notion that stemlike cells divide at lower overall rates (16). Notably, the stemness-differentiation axis was occupied continuously rather than discretely, consistent with the notion that the respective in vitro models emulate phenotypic extremes but do not capture the full spectrum of cellular states within a primary tumor.

Genes correlated to the in vivo gradient include expected classifier genes from the in vitro analysis, as well as additional candidates that may reflect aspects of stemness not evident in the culture model (fig. S19, red and blue, respectively). These include several transcription factors (TFs), such as POU3F2, NFIA, and NFIB, which have been implicated in tumor propagation, neural stem cell self-renewal, and quiescence (41, 42). The in vivo stemness gradient also significantly correlated with the average expression of target genes for these TFs, which we predicted from chromatin immunoprecipitation (ChIP)–seq data (Fig. 3C). Thus, expression signatures and regulatory circuits derived from GSC and neural stem cell models converge to a coherent gradient of cells within primary glioblastoma and identify TFs likely to promote stemlike regulatory programs in vivo.

We next considered the classification scheme established by The Cancer Genome Atlas (TCGA) (21) to distinguish four glioblastoma subtypes: proneural, neural, classical, and mesenchymal. Although these original definitions were established from bulk tumor profiles, we wanted to explore whether individual cells in a tumor vary in their classification. On the basis of population-level (bulk) expression data, the tumors in our study scored as proneural (MGH26), classical (MGH30), or mesenchymal (MGH28 and MGH29) subtypes (fig. S20). To examine the distribution of subtype signatures across individual cells, we calculated subtype scores for each cell using the classifier gene sets.

All five tumors consist of heterogeneous mixtures with individual cells corresponding to different glioblastoma subtypes (Fig. 4, A and B). All tumors had some cells conforming to a proneural subtype regardless of the dominant subtype of the tumor, whereas each of the other subtypes was below detection in at least one tumor. Single-cell QPCR of 30 classifier genes in 167 additional cells from MGH26 and MGH30 (fig. S21) confirmed the presence of multiple subtypes within these tumors in proportions similar to those identified by single-cell RNA-seq. Thus, although population-level data detect the dominant transcriptional program, they do not capture the true diversity of transcriptional subtypes within a tumor.

Fig. 4 Individual tumors contain a spectrum of glioblastoma subtypes and hybrid cellular states.

(A) Heatmap depicts average expression of classifier genes for each subtype (rows) across all classifiable cells grouped by tumor (columns). PN: proneural; CL: classical; MES: mesenchymal; N: neural. Each tumor contains a dominant subtype, but also has cells that conform to alternate subtypes. (B) Hexagonal plots depict bootstrapped classifier scores for all cells in each tumor. Each data point corresponds to a single cell and is positioned along three axes according to its relative scores for the indicated subtypes (supplementary methods). Cells corresponding to each subtype are indicated by solid color, whereas hybrid cells are depicted by two colors. (C) Kaplan-Meier survival curves are shown for proneural tumors from the Cancer Genome Atlas (21). Intratumoral heterogeneity was estimated on the basis of detected signal for alternative subtypes and used to partition the tumors into a pure proneural group and three groups with the indicated additional subtype (group size in parentheses). Tumors with mesenchymal signal had significantly worse outcome than pure proneural tumors (P < 0.05). (D) Kaplan-Meier survival curves shown for proneural tumors partitioned on the basis of the relative strength of alternative subtype signatures in aggregate (24). Tumors with high signal for alternative subtypes had significantly worse outcome than pure proneural tumors (P < 0.05).

Intratumoral subtype heterogeneity provides potentially important insights into tumor biology. The stemness signature is strongest in individual cells conforming to the proneural (r = 0.12 to 0.68, P < 0.01, Student’s t test) and classical (r = 0.26 to 0.64, P < 0.01, Student’s t test) subtypes, but underrepresented in cells of the mesenchymal subtype (Fig. 3C and fig. S22), which has been correlated with astrocytic differentiation (21). In contrast, cells of the neural subtype do not correspond to either in vitro model (Fig. 3C), but are more similar to normal oligodendrocytes (Fig. 4B). These findings highlight parallels between intratumoral cellular heterogeneity in glioblastoma and cellular diversity in the developing brain, with respective subsets of tumor cells resembling a progenitor compartment, an astrocytic lineage, or an oligodendrocytic lineage. This analysis also revealed “hybrid” states (Fig. 4B) in which a single cell scored highly for two subtypes, most commonly classical and proneural (progenitor states) or mesenchymal and neural (differentiated states). These hybrid states may reflect aberrant developmental programs and/or interconversion between phenotypic states.

Finally, we examined whether subtype heterogeneity is relevant to prognosis (24). We focused on tumors classified as proneural, controlling for IDH1 status (3, 43) and binning them into three groups: (i) pure proneural tumors without any transcriptional signal for other subtypes; (ii) low-heterogeneity tumors with modest signal for other subtypes (defined as average expression of the alternative subtype genes greater than the median value in the proneural group); and (iii) high-heterogeneity tumors with stronger signals for other subtypes (defined as greater than the 85th percentile in the proneural group). We also partitioned the proneural tumors according to the other detected subtype. We found that increased heterogeneity was associated with decreased survival (Fig. 4, C and D). This suggests that the clinical outcome of a proneural glioblastoma is influenced by the proportion of tumor cells of alternate subtypes and emphasizes the clinical importance of intratumoral heterogeneity.

We have leveraged single-cell transcriptomics to characterize heterogeneous gene expression programs within five glioblastoma tumors and interrelate their transcriptional, functional, and (to a limited extent) genetic diversity. These findings have fundamental implications for cancer biology and therapeutic strategies, as signaling molecules relevant to targeted therapy show cell-to-cell variability in expression and isoform selection. Moreover, in vivo tumor cells display a spectrum of stemness and differentiation states, variable proliferative capacity, and variable expression of quiescence markers, all of which may confound therapeutic strategies. Although population-level methods for glioblastoma classification have provided important prognostic insights, they do not recapitulate the diversity of transcriptional programs present in an individual tumor. Our analysis reveals that tumors contain multiple cell states with distinct transcriptional programs and provides inferential evidence for dynamic transitions. A better understanding of the spectrum and dynamics of cellular states in glioblastoma is thus critical for establishing faithful models and advancing therapeutic strategies that address the complexity of this disease.

Supplementary Materials

Materials and Methods

Tables S1 to S3

Figs. S1 to S23

References (4448)

References and Notes

  1. Materials and methods are available as supplementary materials on Science Online.
  2. Acknowledgments: We thank E. Shefler for project management, E. Rheinbay for graphical contributions, J. Kim and P. Santos for histology, and P. Bautista and Y. Yagi for slide scanning. A.P.P was supported by NIH– National Institute of Neurological Disorders and Stroke R25NS065743. I.T. was supported by a Human Frontier Science Program fellowship and a Rothschild fellowship. M.L.S was supported by a grant from Medic Foundation. This research was supported by funds from Howard Hughes Medical Institute, Burroughs Wellcome Fund, Harvard Stem Cell Institute, NIH (R01 NS032677 to R.L.M. and U24 CA180922 to A.R.), National Brain Tumor Society, and Klarman Family Foundation. A.R. is a scientific advisory board member for Syros Pharmaceuticals and a consultant for Cancer Therapeutics Innovation Group. B.E.B. is a scientific advisory board member for Syros Pharmaceuticals and a founder and scientific advisor for HiFiBio SAS. RNA-seq data are deposited in Gene Expression Omnibus with accession no. GSE57872.
View Abstract

Stay Connected to Science

Navigate This Article