Research Article

An interactive reference framework for modeling a dynamic immune system


Science  10 Jul 2015:
Vol. 349, Issue 6244, 1259425
DOI: 10.1126/science.1259425

Single-cell measurements map immunity

Multiple characteristics of individual cells define cell types and their physiological states. Spitzer et al. quantitated the abundance of 39 different cell surface proteins or transcription factors on individual cells of the mouse immune system. They used these measurements to create a map that clustered similar individual cells into groups corresponding to cell type and function. Their extensible experimental platform will allow the inclusion of other data types and data from independent laboratories.

Science, this issue 10.1126/science.1259425

Structured Abstract


Immune cells constitute an interacting hierarchy that coordinates its activities according to genetic and environmental contexts. This systemically mobile network of cells results in emergent properties that are derived from dynamic cellular interactions. Unlike many solid tissues, where cells of given functions are localized into substructures that can be readily defined, the distribution of phenotypically similar immune cells into various organs complicates discerning any modest differences between them. Over decades of investigation into immune functions during health and disease, research has necessarily focused on understanding the individual cell types within the immune system and, more recently, on identifying interacting cells and the messengers they use to communicate.


Methods of single-cell analysis, such as flow cytometry, have led the effort to enumerate and quantitatively characterize immune cell populations. As research has accelerated, our understanding of immune organization has surpassed the technical limitations of fluorescence-based flow cytometry. With the advent of mass cytometry, which enables measuring significantly more features of individual cells, most known immune cell types can now be identified from within a single experiment. Leveraging this capability, we set out to initiate an immune system reference framework to provide a working definition of immune organization and enable the integration of new data sets.


To build a reference framework from mass cytometry data, we developed a novel algorithm to transform the single-cell data into intuitive maps. These Scaffold maps provide a data-driven interpretation of immune organization while also integrating conventional immune cell populations as landmarks to orient the user. By applying Scaffold maps to data from the bone marrow of wild-type C57BL/6 mice, the method reconstructed the organization within this complex developmental organ. Using this sample as a reference point, the unique organization of immune cells within various organs across the body was revealed. The maps recapitulated canonical cellular phenotypes while revealing reproducible, tissue-specific deviations. The approach revealed influences of genetic variation and circadian rhythms on immune structure, permitted direct comparisons of murine and human blood cell phenotypes, and even enabled archival fluorescence-based flow cytometry data to be mapped onto the reference framework.


This foundational reference map provides a working definition of systemic immune organization to which new data can be integrated to reveal deviations driven by genetics, environment, or pathology. Beyond providing an analytical framework to understand immune organization from the unified data set generated here, the approaches we describe can serve as a data repository for collating experimental data from the research community, including gene expression and mutational analysis. Efforts that characterize cellular behavior in this open-source approach will continue to improve upon the initiating reference presented here to reveal the inherent structure in biological networks of immunity for clinical benefit.

Building a dynamic immune system reference framework.

By combining mass cytometry with the Scaffold maps algorithm, the cellular organization of any complex sample can be transformed into an intuitive and interactive map for further analysis. By first choosing one foundational sample as a reference (i.e., the bone marrow of wild-type mice), the effects of any perturbation can be readily identified in this framework.


Immune cells function in an interacting hierarchy that coordinates the activities of various cell types according to genetic and environmental contexts. We developed graphical approaches to construct an extensible immune reference map from mass cytometry data of cells from different organs, incorporating landmark cell populations as flags on the map to compare cells from distinct samples. The maps recapitulated canonical cellular phenotypes and revealed reproducible, tissue-specific deviations. The approach revealed influences of genetic variation and circadian rhythms on immune system structure, enabled direct comparisons of murine and human blood cell phenotypes, and even enabled archival fluorescence-based flow cytometry data to be mapped onto the reference framework. This foundational reference map provides a working definition of systemic immune organization to which new data can be integrated to reveal deviations driven by genetics, environment, or pathology.

The immune system is a systemically mobile network of cells with emergent properties derived from dynamic cellular interactions. Unlike many solid tissues, where cells of given functions are localized into substructures that can be readily defined, the distribution of phenotypically similar immune cells into various organs makes it difficult to discern differences between them. Much research has necessarily focused on understanding the individual cell types within the immune system and, more recently, on identifying interacting cells and the messengers they use to communicate. Methods of single-cell analysis, such as flow cytometry, have been at the heart of this effort to enumerate and quantitatively characterize immune cell populations (1–3). As research has accelerated, the number of markers required to identify cell types and explain detailed mechanisms has surpassed the technical limitations of fluorescence-based flow cytometry (1–4). Consequently, insights have often been limited because only a few cell subsets could be examined, independent of the immune system as a whole (5, 6).

Although individual immune cell populations have been examined extensively, no comprehensive or standardized reference map of the immune system has been developed, primarily because of the difficulty of data normalization and lack of coexpression measurements that would enable “merging” of results. In other analysis modalities, such as transcript profiling of cell populations, reference standards and minable databases have shown extraordinary utility (7–14). A comprehensive reference map defining the organization of the immune system at the single-cell level would similarly offer new opportunities for organized data analysis. For example, macrophages exhibit tissue-specific phenotypes (15), and adaptive immune responses are influenced by genetics (16), but discerning these properties of immune organization required integrating the results of many disparate studies. Even current analytical tools that do provide a systems-level view do not compare new samples to an existing reference framework, making them unsuitable for this objective (17, 18). In contrast, a reference map that is extensible could provide a biomedical foundation for a systematized, dynamic, community-collated resource to guide future analyses and mechanistic studies.

We leveraged mass cytometry, a platform that allows measurement of multiple parameters simultaneously at the single-cell level, to initiate a reference map of the immune system (19–21). By combining the throughput of flow cytometry with the resolution of mass spectrometry, this hybrid technology enables the simultaneous quantification of 40 parameters in single cells. The use of mass cytometry allows fluorophore reporters to be replaced with isotopically pure, stable heavy metal ions conjugated to antibodies or affinity reagents (22). These reporter ions are then quantified by time-of-flight mass spectrometry to provide single-cell measurements, enabling a more detailed characterization of complex cellular systems for a robust reference map.

An analytical framework for a reference map

A useful reference map should enable a data-driven organization of cells and should be flexible enough to accommodate different types of measurements. This would result in a map that has underlying consistency but is also robust enough to allow overlay of new data (or even of archival data from different measurement modalities) according to cell similarities. The approach is meant to provide templates for representing the system as a whole to enable systems-level comparisons, similar to other efforts to compare biological networks (23–28). Although we provide one template here, the framework is built to enable users to construct individualized or community-organized versions.

Building a reference map requires the ability to overlay data from multiple samples onto one or more foundational reference samples; this ability is not accommodated by algorithms such as SPADE and viSNE, which necessitate incorporating data from all samples at the onset (17, 18). Without this feature, the reference map would not be an extensible solution. Moreover, the reference map ought to incorporate information about millions of individual cells to comprehensively represent the numerous cell types within complex samples, which remains beyond the capacity of other approaches (18). The mapping procedure should also enable users to implement one of the many available clustering algorithms or their own subjective definitions to determine cell groupings (29). Perhaps most important, positions of landmark cell populations are marked as flags on the map to allow users to compare cells in new samples to cells described in the existing literature (30).

Force-directed graphs are a type of graphical model commonly used to spatially organize complex data in an intuitive and flexible manner (31). Force-directed graphs also enable a method for grouping cells with similar features in a space that is defined by the molecular features of the individual cells (32). Force-directed approaches are based on a set of “forces” that guide data organization into, usually, a two-dimensional (2D) plane (33, 34). Nodes (in this case, groups of cells) that are similar are connected by edges with a length proportional to their resemblance (in our implementation, cosine similarity). These nodes are then spatialized into a graph: All nodes repel one another as if they were the same poles of magnets, but edges pull similar nodes together, acting like springs. We adapted this concept to build a new method to visualize complex cellular samples, termed Scaffold (single-cell analysis by fixed force- and landmark-directed) maps.
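As a rough illustration of the similarity measure named above, the following Python sketch computes cosine similarity between node expression profiles and keeps an edge only for sufficiently similar pairs. The three-marker profiles, the node names, and the 0.8 cutoff are hypothetical values chosen for the example; this is not the published Scaffold implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two marker-expression vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_edges(profiles, threshold=0.8):
    """Connect each pair of nodes whose similarity reaches the cutoff.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    names = sorted(profiles)
    edges = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            weight = cosine_similarity(profiles[u], profiles[v])
            if weight >= threshold:
                edges.append((u, v, weight))
    return edges

# Toy median expression over three markers (CD4, CD8, B220); values are made up.
profiles = {
    "CD4 T": [5.0, 0.2, 0.1],      # landmark node
    "CD8 T": [0.2, 5.0, 0.1],      # landmark node
    "cluster_1": [4.5, 0.5, 0.2],  # unsupervised cluster
}
edges = build_edges(profiles)
```

In this toy example the only edge kept links cluster_1 to the CD4 T landmark, because the cluster's profile points in nearly the same direction as that landmark's profile.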

Scaffold maps enable a model to be built that incorporates prior knowledge from the literature but also allows the discovery and analysis of unanticipated cell types or behavioral states. Such an extensible map can allow for new data sets to be incorporated and linked to their mechanistic conclusions with references—as do transcriptomics or genomics databases (7, 11, 13, 14).

Systematic analysis for an immune reference map

We initiated a prototype high-resolution reference map of the murine immune system by characterizing the expression of 39 cell surface proteins and transcription factors (selected to delineate immune cell types) on more than 3 × 10⁷ single cells from 10 different anatomical locations (fig. S1A, table S1, and Materials and Methods). Single-cell suspensions from the bone marrow, blood, spleen, skin-draining (inguinal) lymph node (SLN), mesenteric lymph node (MLN), thymus, lungs, liver, small intestine, and colon of 12-week-old male C57BL/6, Balb/c, and 129S1/Sv mice were simultaneously processed in replicate. Measurements were done under conditions that limited measuring error (35, 36), and all antibodies were validated to bind target proteins by standard protocols. As such, one antibody cocktail was used for all samples, and cells were bar-coded and pooled by tissue before cell staining to minimize technical variability (Materials and Methods). Single-cell protein expression was quantified using a CyTOF mass cytometer (Fluidigm Corp., South San Francisco, CA). The data for these samples were normalized to account for variability in instrument sensitivity over time (36). Cells from each condition were subsequently identified by their bar code and written into a unique flow cytometry standard file for each sample (see acknowledgments for data distribution instructions).

Defining immune organization in the bone marrow

Because the bone marrow contains most developing and mature immune cell types, we used the cells therein to build a foundational map as a point of comparison (Fig. 1A). “Landmark” populations of immune cells commonly recognized in the literature were identified in the bone marrow data of all C57BL/6 replicates by conventional criteria (Fig. 1A and fig. S1B). These populations ranged from hematopoietic stem cells to terminally differentiated lymphocytes and myeloid cells and served as landmarks within the map (visualized by red nodes) to demarcate the location of cell populations of interest (Fig. 1A).

Fig. 1 Scaffold maps reveal immune organization of the bone marrow.

(A) Schematic of the Scaffold map algorithm. (i) Bone marrow from C57BL/6 mice was chosen as the reference sample. (ii) Leukocytes were grouped according to prior knowledge to define landmark cell populations as reference points on the map. The same leukocytes were subjected to unsupervised clustering to provide an objective view of the tissue composition and organization. An illustration is provided with the two major lineages of mature T cells, which express either CD4 or CD8. (iii, iv) Both landmark populations (red nodes) and unsupervised clusters (blue nodes) were used to generate a force-directed graph in which similar nodes are located close together according to the similarity of their protein expression. Thus, similar nodes fall in proximity to one another while disparate nodes segregate apart from one another. Size of unsupervised clusters denotes the relative number of cells in that grouping. (v) Landmark populations from the bone marrow were fixed in place for subsequent maps to provide points of reference for rapid human interpretation. (vi) Additional samples were each subjected to unsupervised clustering via the same clustering algorithm. (vii) The resulting clusters for each sample were overlaid onto the original landmark nodes to generate tissue-specific Scaffold maps. (B) Bone marrow Scaffold map for C57BL/6 mice. Red nodes denote landmark manually gated cell populations; blue nodes represent unsupervised cell clusters from the same data. Inset: median frequencies of cell populations defined by conventional criteria from the bone marrow of C57BL/6 mice, n = 14. (C) Scaffold map showing only the position of the landmark nodes with arrows annotating established maturation relationships in hematopoietic development.

We also took a data-driven approach to group similar cells into “clusters” according to their expression of the measured proteins. Grouping similar cells by clustering allows all of the data to be visualized at once. We therefore performed an unsupervised clustering of the C57BL/6 bone marrow leukocytes from all biological replicates with a modified PAM (partitioning around medoids) algorithm adapted for larger data sets (Fig. 1A and Materials and Methods) (37). We chose a number of clusters (200) that we expect exceeds the number of “true” cell populations present in the data. Therefore, we do not expect each cluster to represent a recognized functional cell subset, but rather to overpartition the data to ensure that two populations of distinct natures are not merged through underclustering. We believe this to be an appropriate tradeoff, as the proximity of clusters immediately reveals groups of highly similar cells and thereby provides clarity during visualization. This enables an intuitive browsing of the data rather than relying on clustering to define the “true” number of cell populations, which depends on evolving semantic conventions and understandings of cellular functions. Manual analysis of cell populations by traditional criteria, which we visualized by landmark nodes, remains the standard against which automated clustering algorithms are routinely compared (29).
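To make the idea of partitioning around medoids concrete, here is a minimal Python sketch of a plain alternating k-medoids procedure on toy data. It is a simplification and not the modified, large-data variant used in the study: the Euclidean distance, the naive choice of the first k points as initial medoids, and the use of k = 2 rather than 200 are all assumptions made for illustration.

```python
def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda j: euclidean(p, points[medoids[j]]))
        for p in points
    ]

def k_medoids(points, k, max_iter=100):
    """Alternate between assigning points and re-centering medoids."""
    medoids = list(range(k))  # assumption: start from the first k points
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[j])
                continue
            # The new medoid is the member with the lowest total distance
            # to all other members of its cluster.
            best = min(
                members,
                key=lambda c: sum(euclidean(points[c], points[m]) for m in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)

# Six toy "cells" measured on two hypothetical markers.
cells = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.0), (0.1, 0.0), (5.2, 5.0), (5.1, 5.0)]
medoids, labels = k_medoids(cells, k=2)
```

Because each medoid is an actual cell rather than an averaged centroid, every cluster is represented by a real measured profile. Deliberately choosing k larger than the expected number of populations, as described above, overpartitions the data instead of merging distinct populations.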

The reference map was built by combining these unsupervised cell clusters (blue nodes) with the manually identified cell populations (red nodes) (Fig. 1A). Cluster sizes were scaled to reflect the relative cell frequencies in these initial maps, although this option can be modified. A force-directed algorithm was applied to the data, attracting cell clusters with similar phenotypes while separating those with dissimilar phenotypes (Fig. 1A). When mapping C57BL/6 bone marrow cells (Fig. 1B), the landmark and unsupervised nodes were arranged (with no manual intervention or organization) into a structure that recapitulated most known developmental relations between these populations (Fig. 1C) (17, 20). For instance, the hematopoietic stem cell (HSC) landmark was situated at the top of the map and linked to progenitors and more mature populations below. Different granulocytes (including neutrophils, eosinophils, basophils, and mast cells) occupied nearby portions of the map. Macrophages and conventional dendritic cells (cDCs) fell adjacent, and the various T cell populations [CD4+, CD8+, NKT (natural killer T), and γδ] grouped together.

Because clusters serve as a means of partitioning the data in this map, the density of clusters also reflected the relative frequencies of immune cells in the bone marrow that correspond to cell types as defined by established criteria (Fig. 1B, inset). For instance, the map exhibited the densest concentration of unsupervised clusters (blue nodes) surrounding the neutrophil, monocyte, and B cell landmarks. Rarer populations, such as dendritic cells, eosinophils, and basophils, were more sparsely represented. The progenitor zone contained cell clusters proximal to every multipotent population identified by established criteria with cell clusters also falling in between them, revealing the transition states between classically defined progenitors. This graph represents the data from all C57BL/6 biological replicates combined, although the data from individual mice consistently demonstrated these trends (fig. S2).

The Scaffold map of the bone marrow thus reflected the expected biological relations between immune cell populations and enabled an unsupervised visualization of its composition and complexity. The profiles of cells in any cluster, or any group of clusters, can also be visualized by conventional histograms. We used this as the initiating reference template and mapped other organs onto this map for comparison.

Mapping immune organization across the body

After determining that Scaffold maps effectively convey the organization of the immune cells present in the bone marrow, we determined how immune cells from other lymphoid organs or the blood might map into this space. By fixing the identity and position of the landmark (red) nodes that represent canonical populations in the bone marrow, we retained a common reference across all samples (Fig. 1A). We performed unsupervised clustering of total leukocytes from each tissue independently, and then overlaid these cell clusters (blue nodes) onto the reference map by allowing them to find their location according to the attractive and repulsive forces described above (Fig. 1A and Fig. 2).
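The overlay step can be sketched as a small force-directed layout in which landmark nodes never move and only the new clusters are repositioned. The Python code below is a toy model under stated assumptions: the function name `overlay_clusters`, the force constants, the starting positions on a circle, and the example similarity weights are all hypothetical, and the layout used in the study may differ in its details.

```python
import math

def overlay_clusters(landmarks, edges, n_iter=300, step=0.05,
                     repulsion=0.01, max_move=0.5):
    """Place cluster nodes around fixed landmarks.

    landmarks: {name: (x, y)} positions that are never modified.
    edges: {cluster: {neighbor: weight}} similarity-weighted springs.
    All numeric defaults are illustrative assumptions.
    """
    cx = sum(x for x, _ in landmarks.values()) / len(landmarks)
    cy = sum(y for _, y in landmarks.values()) / len(landmarks)
    names = sorted(edges)
    pos = {}
    for i, name in enumerate(names):
        # Deterministic start: spread clusters on a unit circle at the centroid.
        angle = math.pi / 2 + 2 * math.pi * i / len(names)
        pos[name] = (cx + math.cos(angle), cy + math.sin(angle))

    for _ in range(n_iter):
        everything = {**landmarks, **pos}
        new_pos = {}
        for name, (x, y) in pos.items():
            fx = fy = 0.0
            # Springs: edges pull a cluster toward its similar nodes.
            for other, weight in edges[name].items():
                ox, oy = everything[other]
                fx += weight * (ox - x)
                fy += weight * (oy - y)
            # Repulsion: every other node pushes the cluster away.
            for other, (ox, oy) in everything.items():
                if other == name:
                    continue
                dx, dy = x - ox, y - oy
                d2 = dx * dx + dy * dy + 1e-9
                d = math.sqrt(d2)
                fx += repulsion * dx / (d2 * d)
                fy += repulsion * dy / (d2 * d)
            mx, my = step * fx, step * fy
            length = math.hypot(mx, my)
            if length > max_move:  # cap the move to keep the layout stable
                mx, my = mx * max_move / length, my * max_move / length
            new_pos[name] = (x + mx, y + my)
        pos = new_pos
    return pos

# Two fixed landmarks and two clusters with made-up similarity weights.
landmarks = {"B cells": (0.0, 0.0), "T cells": (10.0, 0.0)}
edges = {
    "c1": {"B cells": 0.95, "T cells": 0.1},
    "c2": {"T cells": 0.9},
}
positions = overlay_clusters(landmarks, edges)
```

Because the landmark coordinates are only read and never written, maps built from different samples share the same frame of reference, which is what allows them to be compared directly.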

Fig. 2 Mapping systemic immune organization by tissue.

Scaffold maps for lymphoid organs and peripheral solid organs from C57BL/6 mice, using bone marrow as the reference sample to define landmark nodes (red): (A) blood, (B) spleen, (C) skin-draining (inguinal) lymph node (SLN), (D) mesenteric lymph node (MLN), (E) thymus, (F) lungs, (G) liver; n = 14 for each organ. Insets, from top to bottom: Cells comprising B cell clusters from the spleen and SLN were visualized by 2D scatter plot. Immune cell circulation through and within the tissues was characterized by mass cytometry. Cells comprising a deviant thymic T cell population cluster were visualized by 2D scatterplot.

By inspecting the composition of the peripheral blood on the map, it was apparent that the cell populations overlapped with those found in the bone marrow, as was evident by the proximity of unsupervised clusters to the landmarks (Fig. 2A). As expected, the blood did not contain cells localized to the HSC/progenitor portion of the map. Rather, cell clusters associated with landmark nodes of mature cell populations known to predominate in circulating blood at steady state, including granulocytes, monocytes, B cells, T cells, and NKT cells (figs. S3 and S4 and table S2). Because unsupervised cell clusters from the blood were positioned close to landmark populations, there were no substantial unanticipated populations present in the circulation.

In comparison, maps for the secondary lymphoid organs (spleen, SLN, MLN) all exhibited an immune landscape dominated by mature lymphoid cells of the T and B cell lineages (Fig. 2, B to D). Indeed, these populations were also comparable when viewed by conventional 2D dot plots (Fig. 2, B and C, inset). Many of the myeloid cells in these tissues mapped more closely to the macrophage and dendritic cell zones and expressed major histocompatibility complex (MHC) class II, used to present antigens, consistent with the presence of mature antigen-presenting cells (APCs) in these organs (Fig. 2, B to D) (38). The clusters from the secondary lymphoid organs also largely mapped near a landmark population, indicating that most cells found in these tissues belong to well-characterized populations. The subtle differences in the cellular organization of these organs become evident through investigation of their maps, revealing enrichment in NKT cells, monocytes, macrophages, and cDCs in the spleen relative to frequencies of those cells in lymph nodes [P < 0.0001 for each by analysis of variance (ANOVA)]. A higher frequency of macrophages (P = 0.0006 by two-sided t test) and lower frequency of cDCs (P = 0.013 by two-sided t test) were present in the SLN than in the MLN. An appreciation for the distinct cellular composition of different secondary lymphoid organs provides an opportunity to examine how each cellular environment shapes the immune responses initiated in these locations.

Many of the cell clusters in the thymus radiated far away from the landmarks on the map. Inspection of these clusters indicated that many comprised CD4+CD8+ double-positive (DP) T cells that were absent from the bone marrow (Fig. 2E, red arrow). As the thymus largely contains developmental T cells, this was expected. However, the increased length of the lines connecting these ubiquitous DP T cell clusters to their nearest landmarks denotes cells that deviate from the characterized reference. We also observed these trends when cell populations from the spleen were used to define landmarks (fig. S5).

Immune cell subsets in peripheral solid organs were compared to the reference map of the bone marrow (Fig. 2, F and G, and fig. S6). The region of the maps representing myeloid cells was, in general, more densely filled (figs. S3 and S4). For instance, cells from the lungs exhibited many clusters distributed among the macrophage, cDC, and eosinophil landmarks, indicating that cells in this tissue were phenotypically distinct from those in bone marrow and even spleen. Alveolar macrophages in the lung expressed the proteins CD11c and Siglec-F, which are canonically markers of cDCs and eosinophils, respectively (Fig. 2F) (39). Similarly, the liver map exhibited many clusters connected to the macrophage landmark, although the length of the lines connecting them was longer than those for the macrophages in the bone marrow (P = 0.0004 by one-sided Wilcoxon rank sum test; see Materials and Methods), consistent with the unique characteristics of liver macrophages (Kupffer cells) (Fig. 2G) (40). Overall, these maps of peripheral solid organs, including the gut (fig. S6), exhibited less fidelity than those of lymphoid organs to the bone marrow reference, indicating that immune cells in these sites are likely distinct in their phenotypes and functions. Several previously uncharacterized cellular phenotypes are listed in table S3. For future studies, cell populations present in any tissue could also be used to define landmarks for organ-specific maps. Moreover, a comparative analysis of immune organization within the gut revealed site-specific characteristics, with significantly lower frequencies of CD4 and CD8 T cells and higher frequencies of macrophages and cDCs in the colon than in the small intestine (P = 2.8 × 10⁻¹⁵, P = 0.001, P = 9.4 × 10⁻⁷, and P = 1.0 × 10⁻⁵, respectively, by one-sided t test; fig. S6). This understanding will inform further investigations of immune responses and pathologies within regions of the gut.

Genetic variation affects immune cell composition and phenotype

We used the reference maps to reveal the impact of genetic diversity on immune cell phenotypes and organization. We generated Scaffold maps of immune cells from two common inbred mouse strains, 129S1/Sv and Balb/c (Fig. 3). Mapping cells from the bone marrow from these animals onto the C57BL/6 reference map revealed that the vast majority of clusters fell close to the C57BL/6 landmarks (Fig. 3, A and B). However, certain cell clusters were distinct from those in the C57BL/6 reference. This likely reflects genetic variability, such as the relative lack of T cells in Balb/c mice, which we confirmed by conventional analysis of T cell populations (CD4 T cells, P = 0.0007; CD8 T cells, P = 0.001; γδ T cells, P = 2.2 × 10⁻⁷; NKT cells, P = 6.2 × 10⁻⁸ by ANOVA).

Fig. 3 Immune organization across inbred mouse strains.

Scaffold maps for several tissues from 129S1/Sv and Balb/c mice, using C57BL/6 bone marrow as the reference sample to define landmark nodes (red): (A) bone marrow from 129S1/Sv mice, (B) bone marrow from Balb/c mice, (C) SLN from 129S1/Sv mice, (D) SLN from Balb/c mice, (E) liver from 129S1/Sv mice, (F) liver from Balb/c mice (n = 3 for each panel). Histograms of CD64 and MHC II expression on liver macrophages from representative mice of each strain.

Similarly, analysis of the maps for lymphoid organs from these strains demonstrated high fidelity between unsupervised clusters and landmarks, with enrichment for mature lymphocytes. Other cell types in these organs also reflected the underlying genetics, such as pDC and NKT cells, which were overrepresented in the SLN of Balb/c mice (P = 1.2 × 10⁻⁶ and P = 7.5 × 10⁻⁸, respectively, by ANOVA) (Fig. 3, C and D, fig. S2, and table S2). In contrast, the SLN in C57BL/6 mice contained significantly more cDCs and NKT cells but fewer CD4 T cells than did the SLN from the other strains (P = 5.0 × 10⁻⁵, P = 2.9 × 10⁻⁷, and P = 5.5 × 10⁻¹⁰, respectively, by ANOVA). Analysis of peripheral solid organs revealed other apparent impacts of genetic variation. In the liver, an unexpected shift in cell density from the macrophage to the cDC landmark was observed only in 129S1/Sv mice. Further investigation of these cells demonstrated differential expression of CD64 and MHC II in liver macrophages from these inbred strains, causing these cells to adopt a phenotype more similar to that of cDCs (Fig. 3, E and F, red arrows). The difference in CD64 staining could be attributable to a polymorphism in the gene expressed by 129S1/Sv mice (41). However, this difference in MHC II expression was not observed when comparing macrophages in other solid organs, suggesting that this disparity is specific to the liver.

These results illustrate the ability of Scaffold maps to highlight sample-specific differences in immune cell characteristics. These maps convey a common global structure of immune cell populations along with specific influences of genetic variance.

Circadian influences on immune organization

To investigate circadian immune fluctuations, which can powerfully regulate immune system behavior (42, 43), we obtained organs from C57BL/6 mice in four batches, either in the morning (8 to 9 a.m.; Zeitgeber time 1 to 2) or afternoon (1 to 2 p.m.; Zeitgeber time 6 to 7) of two consecutive days.

Analysis of the maps revealed a number of cell populations that fluctuated according to the time of day. Unexpectedly, these were significantly more pronounced in the peripheral solid organs than in the lymphoid tissues. The lungs displayed clear circadian patterns with remodeling of the ratios for several immune cell populations (Fig. 4A). To validate these findings, we used fluorescence-based flow cytometry to investigate the composition of the lungs in a new cohort of animals. In both analyses, the frequencies of CD8 T cells and B cells were significantly higher in the afternoon than in the morning (Fig. 4B). In contrast, the frequency of macrophages increased in the morning, revealing a compensatory shift in composition from myeloid to lymphoid cells (Fig. 4B). Scaffold maps in which cell populations from the lungs were used as the landmarks additionally recapitulated these results (fig. S7). Further investigation of the macrophage compartment by generating a population-specific, force-directed map revealed differential remodeling of alveolar and interstitial macrophages in a circadian manner (fig. S8A). Validation by conventional criteria corroborated that alveolar macrophages were more prevalent in the morning, whereas interstitial macrophages were increased in frequency in the afternoon (fig. S8, B and C). Thus, reference map analysis revealed a previously undetected influence of circadian rhythms on immune organization of peripheral organs that was particularly prominent in pulmonary lymphocytes and macrophages. The symptom severity of patients diagnosed with infectious or atopic lung pathologies (i.e., allergies, asthma, and viral pneumonias) fluctuates in a circadian manner (44, 45). These results provide a potential explanation for these trends, as the lung-resident immune compartment undergoes circadian reorganization. This suggests that certain modes of antigen presentation could become exacerbated during different times of the day, or could indicate that nasally applied vaccines or therapeutics might have differing influences on immune function depending on the time of application.

Fig. 4 Mapping circadian changes in the lungs.

(A) Scaffold maps of lungs of representative animals collected in the morning (8 to 9 a.m.) and afternoon (1 to 2 p.m.). (B) Population frequencies in the lungs between morning and afternoon, as defined by traditional criteria from both the original mass cytometry data set (n = 7 morning and afternoon) and a follow-up fluorescence experiment (n = 7 morning, n = 8 afternoon). Bars represent means ± SEM; P values result from one-sided t test.

Integrating human data into the reference map

Because immune cell types are well conserved between mice and humans, we analyzed human data overlaid onto the murine reference map (46). Mass cytometry data from whole peripheral blood from four healthy human donors were passed through the Scaffold map algorithm. We calculated distance between clusters on the basis of 15 cell surface markers that have similar cell subset expression patterns between humans and mice (Fig. 5, A to C). Differences between the species were apparent, such as the increased frequency of neutrophils and relative scarcity of B cells in human peripheral blood (47). However, the similar overlay pattern confirmed a common global structure of immunity. We also generated a map of murine blood using only the same 15 proteins to measure distance from the established landmarks (Fig. 5C). This similarity is not surprising. Gene expression networks in species as widely separated as humans and mice have strong similarities, even to the point of enabling drug screening based on gene network similarities (48). The human data were not normalized or differentially transformed in any manner, underscoring the robustness of the mapping approach. Efforts to generate a human-centric reference map may enable more detailed mapping of human immune organization, but these results demonstrate the feasibility of comparing cellular features across the species barrier.
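Restricting the distance calculation to markers shared between two panels can be sketched in Python as follows. The sketch assumes that marker names have already been harmonized between the data sets (for example, human markers renamed to their murine correlates); the function name and all expression values are hypothetical.

```python
import math

def shared_marker_similarity(profile_a, profile_b):
    """Cosine similarity computed only over markers present in both profiles.

    Profiles are dicts of marker name -> median expression. Marker names
    are assumed to be harmonized across data sets beforehand.
    """
    shared = sorted(set(profile_a) & set(profile_b))
    if not shared:
        raise ValueError("profiles have no markers in common")
    a = [profile_a[m] for m in shared]
    b = [profile_b[m] for m in shared]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# A reference landmark measured on four markers, and a cluster from a
# smaller panel that lacks CD150. All values are made up for illustration.
landmark = {"CD4": 5.0, "CD8": 0.1, "B220": 0.2, "CD150": 3.0}
cluster = {"CD4": 4.0, "CD8": 0.3, "B220": 0.1}
similarity = shared_marker_similarity(landmark, cluster)
```

Markers missing from the smaller panel simply drop out of the comparison, so the cluster can still be positioned relative to landmarks that were defined with the full panel.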

Fig. 5 Mapping human and archival data onto the reference map.

(A) Original mass cytometry whole-blood Scaffold map from C57BL/6 mice, n = 14. (B) Scaffold map of human whole blood interrogated by 15-parameter mass cytometry with distance measured using only those 15 dimensions for layout of unsupervised clusters onto the reference. Human parameters were assigned to murine correlate markers with similar cellular distribution, including canonical surface markers used for identification of cell populations by conventional criteria as well as several orthologous proteins, n = 4. (C) Scaffold map of original murine blood mass cytometry data with distance measured using only the same 15 dimensions for layout of unsupervised clusters onto the reference. (D) Original mass cytometry bone marrow Scaffold map from C57BL/6 mice. (E) Scaffold map of C57BL/6 bone marrow interrogated by eight-color fluorescence-based flow cytometry from a previously published data set (17) with distance measured using only those eight dimensions (B220, CD11b, TCRβ, CD4, CD8, c-Kit, Sca-1, CD150) for layout of unsupervised clusters onto the reference. (F) Scaffold map of original mass cytometry data with distance measured using only the same eight dimensions for layout of unsupervised clusters onto the reference.

Mapping archival data

The ability to map data from independent experiments would increase the utility of a reference map, creating a dynamic resource in which knowledge could accrue over time. Therefore, we mapped archival fluorescence-based flow cytometry data onto the reference map (Fig. 5, D to F). We used a previously published data set of bone marrow cells from C57BL/6 mice obtained with eight-color flow cytometry including lineage-specific markers [B220 for B cells, CD11b for myeloid cells, T cell receptor β chain (TCRβ) for T cells, and CD4 and CD8 to distinguish the major types of mature T cells] as well as stem cell/progenitor markers [stem cell growth factor receptor (c-Kit), stem cell antigen 1 (Sca-1), and CD150] (17). We used only the information contained in these eight dimensions to calculate similarity (Fig. 5E). As a point of reference, we also generated a Scaffold map from the original mass cytometry data of the C57BL/6 bone marrow using these same eight dimensions (Fig. 5F).

Cells from the fluorescence data occupied the major regions of the Scaffold map with frequencies similar to those in the original reference. Moreover, the maps generated from both fluorescence and mass cytometry data using the same eight dimensions exhibited strong similarity, suggesting that the underlying structure of the system remained the primary driver of the layout organization. Cell populations for which no unique markers exist and for which complex combinations of markers define cell types (such as the different myeloid cell subsets) exhibited lower resolution on the map, and as such, they are grouped in the center of several landmark nodes. Thus, although the specific selection of measured features affects the ability to discriminate between similar cell populations, even a few key parameters can drive cell clusters toward cognate known reference cell subsets within the map.
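The loss of resolution with a reduced panel can be illustrated with a small sketch. It assumes cosine similarity as the comparison measure (not stated in this section) and uses hypothetical landmark profiles; the "margin" helper is an illustrative construct, not part of the published method. When two landmarks differ only in markers that were not measured, a cluster scores equally against both and its placement becomes ambiguous.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two expression vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def landmark_margin(cluster, landmarks, markers):
    """Gap between the best and second-best landmark similarity.

    A gap near zero means the chosen markers cannot separate the two
    closest landmarks for this cluster.
    """
    cluster_vec = [cluster[m] for m in markers]
    sims = sorted(
        (cosine_similarity(cluster_vec, [p[m] for m in markers]) for p in landmarks.values()),
        reverse=True,
    )
    return sims[0] - sims[1]


# Hypothetical myeloid landmarks that differ only in F4/80 and CD64.
landmarks = {
    "Monocytes": {"CD11b": 5.0, "B220": 0.1, "F4/80": 0.5, "CD64": 0.5},
    "Macrophages": {"CD11b": 5.0, "B220": 0.1, "F4/80": 4.0, "CD64": 4.0},
}
cluster = {"CD11b": 4.8, "B220": 0.1, "F4/80": 3.8, "CD64": 4.1}

margin_full = landmark_margin(cluster, landmarks, ["CD11b", "B220", "F4/80", "CD64"])
margin_reduced = landmark_margin(cluster, landmarks, ["CD11b", "B220"])
print(margin_full > margin_reduced)
```

In a layout, a cluster with a near-zero margin is pulled about equally toward both landmarks, which is consistent with the observation above that poorly resolved populations sit between several landmark nodes.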

A cross-sectional view of cellular compartments

It would be useful to reveal in detail the local structure of cell subsets that lack preexisting landmarks, so as to enable characterization of similarities and deviations. Having identified distinctions within given cell subsets across anatomical locations, we used unsupervised force-directed graphs (lacking landmark populations) to organize cells of a given cell type (T cells or dendritic cells, for instance) defined by traditional criteria such that differences between them would become apparent (Fig. 6). Each major cell population from every tissue was clustered and mapped together into force-directed graphs, resulting in a phenotypic landscape for that given cell type. As noted, manually defined landmarks were omitted, although they could be defined in subsequent analyses as desired by the user. Cell clusters were colored according to their tissue of origin to reveal how each tissue is represented within the global similarity map for each cell type. Scaling each cluster proportionally to the percentage of total leukocytes represented the relative frequency of cells in each cluster.
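A force-directed layout of this kind can be sketched in a few lines. The specific layout algorithm used in the study is not described in this section, so the code below substitutes a simple scheme as a stand-in: every pair of nodes repels with magnitude 1/d², and each weighted edge pulls its endpoints together with magnitude weight × d. The function name, parameters, and the three-cluster graph are hypothetical.

```python
import math


def force_layout(n_nodes, edges, iterations=500, step=0.05):
    """Toy force-directed layout: every pair of nodes repels, weighted edges attract.

    edges is a list of (i, j, weight); a larger weight means more similar clusters.
    """
    # Deterministic start: nodes evenly spaced on the unit circle.
    pos = [
        [math.cos(2 * math.pi * i / n_nodes), math.sin(2 * math.pi * i / n_nodes)]
        for i in range(n_nodes)
    ]
    for _ in range(iterations):
        force = [[0.0, 0.0] for _ in range(n_nodes)]
        # Repulsion between all pairs, magnitude 1 / distance^2.
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                dist = math.hypot(dx, dy) or 1e-9
                fx = dx / dist ** 3
                fy = dy / dist ** 3
                force[i][0] += fx
                force[i][1] += fy
                force[j][0] -= fx
                force[j][1] -= fy
        # Spring attraction along edges, magnitude weight * distance.
        for i, j, weight in edges:
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            force[i][0] += weight * dx
            force[i][1] += weight * dy
            force[j][0] -= weight * dx
            force[j][1] -= weight * dy
        # Move each node along its net force, capping the step length for stability.
        for i in range(n_nodes):
            fx, fy = force[i]
            mag = math.hypot(fx, fy)
            if mag > 0:
                scale = step * min(mag, 1.0) / mag
                pos[i][0] += fx * scale
                pos[i][1] += fy * scale
    return pos


# Hypothetical similarity graph: clusters 0 and 1 are alike, cluster 2 is distinct.
edges = [(0, 1, 1.0), (0, 2, 0.1), (1, 2, 0.1)]
pos = force_layout(3, edges)
d01 = math.dist(pos[0], pos[1])
d02 = math.dist(pos[0], pos[2])
print(d01 < d02)
```

In this toy model a connected pair balances where weight × d = 1/d², so strongly similar clusters settle close together and weakly similar ones settle farther apart. No landmark nodes are fixed in place, which mirrors the unsupervised setting described above.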

Fig. 6 Defining the landscape of immune cell populations.

Population-specific landscapes were generated as follows: Cell populations were manually gated, subjected to unsupervised clustering, and laid out in an unsupervised force-directed graph. Clusters are colored according to tissue of origin and sized by the number of cells in each cluster as a percentage of the total number of leukocytes in the tissue of origin. Each plot is scaled independently. (A) T cell landscape including CD3+ cells. Cells comprising T cell clusters from the colon and small intestine falling within the red box are visualized by 2D scatterplot, n = 14. (B) B cell landscape including B220+ and CD138+ cells, n = 14. (C) NKT cell landscape including CD49b+ cells, n = 14. (D) cDC landscape including CD11c(hi) MHC II(hi) cells, n = 14. (E) Macrophage cell landscape including CD64+ F4/80+ cells, n = 14. Lineage markers are defined in Materials and Methods.

We began by examining the landscape of T cells across the body, as T cells are well known to exhibit organ-specific properties. The mapping shows that a large group of cell clusters was exclusively located in the thymus and expressed both CD4 and CD8, characteristic of developmental double-positive (DP) T cells (Fig. 6A, fig. S9, and table S4). The T cell map then showed two predominant branches characterized by CD4 (left) or CD8 expression (right), which were bridged by smaller clusters lacking high expression of either. Some of these cell clusters expressed the γδ TCR (Fig. 6A, inset). Others expressing TCRβ were localized to the gut and lungs, likely representing recently described mucosa-associated invariant T (MAIT) cells (fig. S9 and table S4) (49). Among the CD4+ and CD8+ T cells expressing the αβ TCR, further divisions were driven by CCR7, CD27, and CD44, which are common markers that distinguish differentiation states (fig. S9 and table S4) (50). The tissue distribution of these subsets appeared skewed, with enrichment of effector and memory T cells in the peripheral solid organs. A group of CD4+ αβ T cell clusters expressed CD25 and forkhead box P3 (Foxp3), characteristic of regulatory T cells, and were overrepresented in the gut (fig. S9 and table S4).

Whereas T cells demonstrate a largely bifurcated set of phenotypes with “bridging” cell subsets, the B cell landscape was markedly different, exhibiting a continuum of phenotypes in tissues distributed across the body (Fig. 6B). Although B cells in the bone marrow exhibited a wide range of phenotypes reflecting developmental stages, those in the secondary lymphoid organs expressed higher amounts of B220 and CD19 (a cell surface co-receptor expressed by most mature B cells) with variable expression of the B cell receptor isotypes IgM and IgD, as well as CD23 [the low-affinity immunoglobulin E (IgE) receptor] (fig. S10 and table S4). The majority in peripheral solid organs exhibited reduced amounts of IgD and CD23 with increased MHC II (fig. S10 and table S4) (51). Many thymic B cells exhibited a unique phenotype, characterized by the extracellular matrix receptor CD44 and Sca-1, and mapped near the plasma cells, which express CD138 (fig. S10 and table S4). Thus, the B cell landscape was characterized by a phenotypic continuum with enrichment of specific phenotypes according to tissue of residence.

The NKT cell landscape was predominantly organized by expression of CD11b and CD27, which delineate NKT cell maturation stages (Fig. 6C, fig. S11, and table S4) (52). A discrete population of NKT cells expressing higher levels of the developmental markers CD34 and c-Kit (CD117) was found in the bone marrow (fig. S11 and table S4). In the peripheral solid organs, large populations of NKT cells were present in the liver and lung with fewer in the gut. A group of NKT cells with broad tissue distribution expressed Ly6C, which has been associated with NKT cell memory (fig. S11 and table S4) (53). These results recapitulate the known landscape of lymphoid cell biology and provide new insights regarding immune organization across the body according to the tissues in which the immune cells reside (table S4).

Definitive statements regarding myeloid phenotypes and their functions remain a matter of interest (54, 55) and occasional contention (56). For instance, examining the cDC landscape revealed several subgroups, some of which expressed CD4 or CD8; their expression was mutually exclusive, and these cell types were overrepresented in the secondary lymphoid organs (Fig. 6D). Several of the thymic cDC clusters expressed CD8, a feature characteristic of cross-presenting DCs, which may reflect their need to present intracellular antigens in the context of both MHC I and II to promote T cell tolerance (fig. S12 and table S4) (57). Many cDCs in peripheral solid organs and the bone marrow were CD11b+ and expressed higher levels of Fcγ receptors (CD16/CD32), which suggests that they may be more sensitive to antibody-mediated activation (fig. S12 and table S4).

The macrophage landscape exhibited distinct segregation by location, consistent with their tissue-specific homeostatic functions and self-renewal (Fig. 6E) (15). Relative to macrophages present in the SLN and MLN, which exhibited high expression of the CD11b integrin and MHC II, red-pulp macrophages in the spleen expressed significantly less CD11b (fig. S13 and table S4). The macrophages in the gut exhibited the highest expression of MHC II and Fcγ receptors (CD16/CD32), which might reflect a greater capacity to present antigen to CD4 T cells or sensitivity to activation via antibodies (fig. S13 and table S4). Macrophages in the liver (Kupffer cells) expressed the highest levels of F4/80 and CD64, whereas alveolar macrophages in the lung segregated far away, as judged by their high expression of the CD11c integrin, the Siglec-F lectin, and CD44 (fig. S13 and table S4).

Thus, the force-directed graphical landscapes enabled rapid identification of the features that distinguish each population across the samples of interest, providing a model for characterizing the predominant differences among multiple conditions.


We exploited the increased parameterization afforded by mass cytometry to generate a consolidated, extensible reference map of the murine immune system with single-cell resolution. By assessing the composition and characteristics of immune populations throughout the body, this map provides the basis for a systematic model of immune organization. Such an objective necessitated new analytical methods for comparing groups of complex cellular samples. Our visualization algorithm combines unsupervised clustering with cellular landmarks defined by prior knowledge. The resulting Scaffold maps enabled global characterization of the steady-state immune structure across different anatomical locations, genetic backgrounds, circadian time points, and species. When compared to an unsupervised graph across the organismal immune system (fig. S14), the advantages of such a framework become apparent. The incorporation of landmarks assists in the interpretation of the graphical organization. The landmarks also provide the reference points for comparing data, enabling the unique features of new, uncharacterized samples to stand out by comparison to a characterized baseline sample. A reference map of this nature will be useful in additional iterations when merged with immunological perturbations such as infection, autoimmune disease, or cancer to identify how altered immune states deviate from the steady state.

Beyond providing an analytical framework to understand immune organization from the unified data set generated here, the approaches we describe can serve as a data repository for collating experimental data from the research community (fig. S15). This would provide several distinct benefits. First, users could mine the data included in these studies to investigate the characteristics and distribution of cell types of interest in a dynamic way. Second, user modification of defined parameters (such as the definition of landmark populations) could provide analyses of immune structure not biased by prior strictures.

Perhaps more urgent to the community at large, mapping of newly created data sets onto a reference structure will assist in global comparisons of archival animal experiments with clinical human data. Investigators can merge newly mapped data to compare cellular features across previously mapped features in the reference landscape. With the implementation of standard regression analysis, the presence or absence of given clinical outcomes due to certain immune configurations might be discerned, much as has been the case with accessible archival gene expression data sets (9). In one such analysis, the expression of a newly discovered regulatory molecule from ongoing forward genetics efforts (58, 59) could be defined in all immune cell types during health and disease. This could be achieved by measuring such a molecular feature by mass cytometry in addition to the proteins included here and mapping the resulting data. Alternatively, changes in metabolism or cell death programs within the global immune system during chronic inflammation or aging could be revealed, providing knowledge to inform the design of precise therapeutic strategies. Moreover, as the number of measurable parameters on a single-cell basis increases, the framework could easily be updated to reflect more detailed data sets.

Scaffold maps demonstrate the capacity to align data from distinct analysis platforms, including fluorescence-based flow cytometry, or across species of interest, such as the demonstration of mapping human immune data onto a murine framework. As the throughput of other single-cell analysis modalities, such as single-cell RNA sequencing (60, 61), continues to develop, these data could also be incorporated into the map along with other metadata types such as publication records, clinical phenotypes, and other relevant assays analogous to other strategies for data integration (62, 63). Therefore, this core infrastructure forms the basis for a centralized repository in which single-cell data can accrue over time, providing a unified reference map for understanding the organization and behavior of complex cellular systems. Efforts that characterize cellular behavior in this open-source approach will continue to improve upon the initiating reference presented here to reveal the inherent structure in biological networks of immunity for clinical benefit.

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S20

Tables S1 to S4

References (64–68)

References and Notes

  1. Acknowledgments: All mass cytometry data are accessible at and the R package is available at R. Finck, S. Bendall, and G.P.N. have served as paid consultants of Fluidigm Sciences, the maker of the mass cytometry instrumentation and reagents used for the data collection in this study. We thank J. Kenkel, B. Burt, D.-H. Wang, and M. Ch’ng for their assistance in tissue processing; A. Trejo and A. Jager for mass cytometry quality control and maintenance; B. Gaudilliere and M. Angst for access to human whole blood data; and M. Angelo, C. Loh, N. Reticker-Flynn, and L. Sanman for constructive feedback. Supported by a George D. Smith Stanford graduate fellowship and NIH grant F31CA189331 (M.H.S.); a Stanford Bio-X graduate fellowship and NIH grant T32GM007276 (G.K.F.); CIRM Basic Biology II RB2-01592 and NIH grant NRSA F32 GM093508-01 (E.R.Z.); Damon Runyon Cancer Research Foundation fellowship DRG-2017-09 and NIH grant K99GM104148-01 (S.C.B.); NIH grants 1U19AI100627, 1R01GM109836, and 7500108142, NIAID Bioinformatics Support Contract HHSN272201200028C, PN2EY018228 0158 G KB065, 1R01CA130826, 5U54CA143907NIH, HHSN272200700038C, N01-HV-00242, 41000411217, 5-24927, P01 CA034233-22A1, RFA CA 09-009, RFA CA 09-011, U19 AI057229, U54CA149145, 5R01AI073724, R01CA184968, R33 CA183654, R33 CA183692, 1R01NS089533, 201303028; DOD grants OC110674 and 11491122; Bill and Melinda Gates Foundation grant OPP1113682; CIRM grants DR1-01477 and RB2-01592; FDA grant HHSF223201210194C BAA-12-00118; European Commission grant HEALTH.2010.1.2-1; and the Rachford and Carlota A. Harris endowed professorship (G.P.N.). P.F.G. is a Howard Hughes Medical Institute Fellow of the Life Sciences Research Foundation.