Supplementary Materials

Single-cell transcriptional diversity is a hallmark of developmental potential

Gunsagar S. Gulati, Shaheen S. Sikandar, Daniel J. Wesche, Anoop Manjunath, Anjan Bharadwaj, Mark J. Berger, Francisco Ilagan, Angera H. Kuo, Robert W. Hsieh, Shang Cai, Maider Zabala, Ferenc A. Scheeren, Neethan A. Lobo, Dalong Qian, Feiqiao B. Yu, Frederick M. Dirbas, Michael F. Clarke, Aaron M. Newman

Download Supplement
  • Materials and Methods
  • Supplementary Text
  • Figs. S1 to S19
  • Captions for Tables S1 to S8
  • References
Table S1
Overview and characteristics of 42 scRNA-seq benchmarking datasets. This table lists details of each benchmarking dataset, including name, platform, number of cells, number of phenotypes, normalization approach, and associated publication. Dataset-specific filtration criteria are also provided (methods).
Table S2
Performance of RNA-based features for predicting known differentiation status across 42 scRNA-seq benchmarking datasets. This table summarizes predictive performance of RNA-based features in the training (n = 9 datasets) and validation (n = 33 datasets) cohorts (table S1), aggregated across datasets by mean performance (related to Fig. 2B) and median performance.
Table S3
Cellular ontogenetic status, developmental potential, and mean gene counts of cell phenotypes in murine plate-based datasets profiled in vivo. This table lists source data related to Fig. 1C.
Table S4
Predictive performance and raw values of gene counts, gene counts signature, CytoTRACE, lineage trajectory inference methods, stemness prediction tools, and top differentiation-associated gene sets for 42 benchmarking datasets. This table lists single-cell-level performance statistics (weighted Spearman correlation; methods) for predicting developmental potential in each benchmarking dataset (table S1). Values for each analyzed single-cell transcriptome are also provided.
Table S5
Characterization of the molecular themes prioritized by gene counts signature (GCS) in 41 scRNA-seq benchmarking datasets. This table summarizes the results of applying ssGSEA (single-sample gene set enrichment analysis) to identify molecular themes associated with GCS. To do this, we first ranked all expressed genes in each dataset by their Pearson correlations against GCS. We then performed ssGSEA to determine enrichment scores for 18,706 annotated gene sets in each ranked gene list (17,810 gene sets from MSigDB and 896 gene sets of TF binding sites from ENCODE/ChEA; methods). This analysis was repeated for all benchmarking datasets except 'Whole planaria (Drop-seq)' owing to the absence of S. mediterranea from NCBI HomoloGene (methods). Datasets were Census-normalized and log2-adjusted prior to analysis.
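The gene-ranking step described for Table S5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout of the expression matrix, and the toy values are all hypothetical, and the subsequent ssGSEA step is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


def rank_genes_by_correlation(expression, score):
    """Rank genes by Pearson correlation against a per-cell score.

    expression: dict mapping gene name -> list of per-cell expression values
                (hypothetical layout, assumed already normalized and log-adjusted)
    score:      list of per-cell values, e.g. a gene counts signature
    Returns a list of (gene, r) pairs sorted from highest to lowest r.
    Genes with undefined correlation (constant expression) are skipped.
    """
    ranked = []
    for gene, values in expression.items():
        r = pearson(values, score)
        if not math.isnan(r):
            ranked.append((gene, r))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Toy data: four genes measured in four cells (illustrative values only).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [4.0, 3.0, 2.0, 1.0],
    "geneC": [1.0, 3.0, 2.0, 4.0],
    "geneD": [2.0, 2.0, 2.0, 2.0],  # constant, so it is dropped from the ranking
}
gcs = [0.1, 0.2, 0.3, 0.4]
ranked = rank_genes_by_correlation(expression, gcs)
```

A ranked list of this form is the kind of input an enrichment method consumes; how ties and unexpressed genes are handled in the published analysis is described in the methods, not here.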
Table S6
Enrichment of developmentally associated genes in each benchmarking dataset. This table lists the top 100 (stemness-associated) and bottom 100 (differentiation-associated) genes associated with CytoTRACE in each benchmarking dataset in this study (n = 42; table S1).
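Selecting the top and bottom genes from a per-gene association score, as in Table S6, might look like the sketch below. The caption does not specify the association measure, so the input is assumed to be a precomputed score per gene; the function name, the toy scores, and the choice to cap n so the two lists never overlap are illustrative assumptions.

```python
def top_and_bottom_genes(scores, n=100):
    """Return the n highest- and n lowest-scoring genes.

    scores: dict mapping gene name -> association score with a per-cell
            quantity (assumed precomputed; the measure is not defined here)
    n:      number of genes to take from each end (100 in the caption above)
    Returns (top, bottom): top is ordered from highest score downward,
    bottom from lowest score upward.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    ordered = sorted(scores, key=scores.get, reverse=True)
    # Assumption: cap n at half the gene count so no gene lands in both lists.
    n = min(n, len(ordered) // 2)
    top = ordered[:n]
    bottom = ordered[len(ordered) - n:][::-1]
    return top, bottom


# Toy scores for five genes (illustrative values only).
scores = {"g1": 0.9, "g2": -0.7, "g3": 0.1, "g4": 0.5, "g5": -0.2}
top, bottom = top_and_bottom_genes(scores, n=2)
```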
Table S7
Patient-level and single-cell-level metadata for human breast tumors; antibodies and reagents for isolation of human breast epithelial cells. This table lists metadata for the breast tumor scRNA-seq data generated in this work, including patient identifier, hormone receptor status, clinical subtype, predicted epithelial subpopulation, source tissue (tumor or adjacent normal), and CytoTRACE values. The antibodies and reagents used for cell isolation and flow cytometry analysis are also provided.
Table S8
Genes associated with CytoTRACE in tumor and adjacent-normal epithelial cells from human breast cancer patients. This table lists (i) Pearson correlations between gene expression levels and CytoTRACE values in basal, luminal progenitor, and mature luminal cells from human breast tumors and adjacent normal tissues (related to Fig. 4, B to D, and fig. S18, B and C); (ii) a clonogenicity index calculated using microarray data from Shehata et al., 2012 (related to Fig. 4B; methods); and (iii) control genes from an RNAi dropout viability screen conducted by Hsieh et al., 2018 (related to fig. S18B).