A Stem Cell Molecular Signature


Science  18 Oct 2002:
Vol. 298, Issue 5593, pp. 601-604
DOI: 10.1126/science.1073823


Mechanisms regulating self-renewal and cell fate decisions in mammalian stem cells are poorly understood. We determined global gene expression profiles for mouse and human hematopoietic stem cells and other stages of the hematopoietic hierarchy. Murine and human hematopoietic stem cells share a number of expressed gene products, which define key conserved regulatory pathways in this developmental system. Moreover, in the mouse, a portion of the genetic program of hematopoietic stem cells is shared with embryonic and neural stem cells. This overlapping set of gene products represents a molecular signature of stem cells.

Adult and embryonic stem cells (SCs) hold great promise for regenerative medicine, tissue repair, and gene therapy (1). Hematopoietic stem cells (HSCs) have been the most extensively studied and serve as a prototype model to define the general biological properties of mammalian SCs. Distinct developmental stages of the hematopoietic system can be identified and arranged in a hierarchical tree that begins with the long-term (LT) functional HSC. A single LT-HSC is both necessary and sufficient for life-long sustenance of the entire hematopoietic system (2, 3). LT-HSCs produce less potent short-term (ST) functional HSCs, and these, in turn, give rise to lineage-committed progenitor (LCP) cells. The LCP cells are directly responsible for the generation of at least 10 mature blood cell (MBC) populations. Many nonhematopoietic tissues also depend on tissue-resident SCs for their maintenance and regeneration (4). Totipotent embryonic stem cells (ESCs), derived from blastocysts, and neural stem cells (NSCs), derived from the germinal zones of the nervous system, are two examples of SCs that can be propagated in vitro (5). Because all SCs share fundamental biological properties, they may share a core set of molecular regulatory pathways. It is likely that at least some components of these regulatory pathways are preferentially expressed by SCs. We therefore attempted to define a general gene expression profile of the SC “state.”

We have adopted the approach outlined in Fig. 1 that first separately identifies gene expression profiles for murine fetal and adult HSCs. These profiles are then compared to derive a shared HSC profile. This profile should include gene products that are necessary for LT hematopoietic function. We also generated gene expression profiles for human HSCs and for two murine nonhematopoietic SC populations, NSCs and ESCs. The comparison of murine with human HSCs defines evolutionarily conserved components in HSCs, whereas the comparison of hematopoietic with nonhematopoietic SCs identifies the gene products expressed in multiple SC types. The samples were processed as shown in fig. S1. Tissue or cell replicates were isolated and functionally evaluated to measure the purity of SC-containing fractions. In vitro amplified RNA probes were hybridized to Affymetrix oligonucleotide arrays. We estimate that these arrays currently allow for the monitoring of approximately 80% of HSC-related gene products (fig. S2). Arrays were scanned and processed using Affymetrix MAS 4.0 software. Genes were assigned to distinct clusters according to their expression patterns within the hematopoietic hierarchy. NSC and ESC enrichment scores were calculated to define the expression of the transcripts in these two SC populations. Bioinformatics analyses were performed for the SC-specific gene products. Details of the SC purification procedures, biological assays, and data analyses are available in supporting online material (6).

Figure 1

Stem cell phenotypes profiled. Cells at key stages of the murine and human hematopoietic hierarchy were isolated as shown, and include LT-HSCs, ST-HSCs, LCPs, and MBCs. Nonhematopoietic SCs were cultured (ESCs) or purified (NSCs). This approach identifies three groups: genes specific for both fetal and adult murine HSCs (blue boxes), genes specific for murine and human HSCs (red box), and genes enriched in diverse SCs (green box).

To translate the biological phenotypes of key hematopoietic populations into the language of gene expression, we used a series of hypothetical expression patterns that correlate with distinct, quantitatively measured biological activities present in the hematopoietic hierarchy (Fig. 2, A to C). A total of 4289 informative genes were assigned to seven clusters (Fig. 2D), characteristic of key stages of hematopoiesis, progressing from stem through progenitor to terminally differentiated cells. HSC-related clusters i to iii include many known HSC markers such as c-Kit, Tie1, Ly-6E/Sca-1, Tek, Mpl, Meis1, Gata2, and Abcb1b/MDR1. At least 72% of the above-defined HSC-related genes are also up-regulated in CD45+ c-Kit+ Sca-1+ Hoechst 33342 side population cells (7). These cells have been shown to contain LT-HSCs (8). Furthermore, 54% of genes assigned to these clusters were previously identified through a global subtractive hybridization screen for HSC-specific gene products (7, 9) (fig. S2). This demonstrates a strong correlation between HSC-specific gene sets identified by different strategies. The expression specificity of 22 HSC-related genes was confirmed by quantitative reverse transcription–polymerase chain reaction (RT-PCR) (fig. S5). Gene products were grouped into categories according to their function as reported in the literature or as predicted on the basis of the presence of diagnostic protein motifs. Regulatory molecules, such as transcription factors, proteins involved in intracellular signaling, cell-surface receptors, and ligands, account for 45% of the HSC-related gene products (Fig. 2E).
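The idea of matching genes to hypothetical expression patterns can be illustrated with a small sketch. This is not the procedure used in the study (that is described in the supporting online material); it assumes, purely for illustration, that each gene has one expression value per population (LT-HSC, ST-HSC, LCP, MBC), that the idealized patterns are simple 0/1 templates, and that a gene is assigned to the template with the highest Pearson correlation above a cutoff. The template names, the cutoff `min_r`, and the example values are all hypothetical.

```python
import math

# Hypothetical idealized patterns over four populations, in the order
# LT-HSC, ST-HSC, LCP, MBC (1 = expressed, 0 = not expressed).
TEMPLATES = {
    "LT-HSC only": [1, 0, 0, 0],
    "LT- and ST-HSC": [1, 1, 0, 0],
    "HSC and LCP": [1, 1, 1, 0],
    "MBC only": [0, 0, 0, 1],
}


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def assign_cluster(profile, templates=TEMPLATES, min_r=0.8):
    """Return the name of the best-correlated template, or None if none reaches min_r."""
    best_name, best_r = None, min_r
    for name, pattern in templates.items():
        r = pearson(profile, pattern)
        if r >= best_r:
            best_name, best_r = name, r
    return best_name
```

With made-up signal values, a gene measured at `[900, 120, 100, 95]` would fall into the "LT-HSC only" template, while a uniformly expressed gene would remain unassigned.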

Figure 2

Correlating biological function and gene expression. (A) Competitive repopulating activity of the isolated hematopoietic populations was determined (23). Mice were transplanted with graded doses of purified Ly5.2 fetal liver (FL) or bone marrow (BM) SCs, mixed with 2 × 10⁵ Ly5.1 whole BM cells. Ly5.2 peripheral blood content at 6 months is shown. The repopulating stem cell frequency in these purified populations is 1 in 10 to 20 cells for both FL and BM SCs. (B) The number of colony-forming cells (CFCs) in the isolated stem and progenitor cell populations was determined. Colonies were scored as high proliferative potential–granulocyte macrophage (HPP-GM), GM, MIX (three or more lineages: GM, megakaryocyte, erythrocyte), and HPP-MIX. (C) The hematopoietic hierarchy subgrouped into different stem and progenitor populations and (D) their corresponding expression clusters (i to vii). Individual genes were assigned to expression clusters as described (6). Relative expression levels are displayed by red (highest) to green (lowest) coloration. Predicted cellular roles of identified HSC-specific gene products: (E) distribution within the HSC profile for gene products with known or putative functions, (F) distribution of the annotated gene products between HSC subtypes, and (G) between fetal and adult HSCs.

We have defined genomewide transcriptional changes during early stages of hematopoietic differentiation by comparing four distinct sets of genes: those up-regulated in LT-HSCs (i), in both LT- and ST-HSCs but not in LCPs (ii), in both HSCs and LCPs (iii), and in ST-HSCs and the early progenitor population (iv). The distribution of genes within these four sets across functional categories is shown in Fig. 2F. Molecules thought to be involved in cell-cell communication, such as signaling ligands, receptors, extracellular matrix, and adhesion molecules, tend to be overrepresented in the HSC-specific gene set. LT-HSC–specific ligands include Bmp8a, Wnt10A, EGF-family members Ereg and Hegfl, the angiogenesis-promoting factor Agpt, a ligand for the ROBO receptor family Slit2, and the ephrin receptor ligand EfnB2. These molecules may be involved in signaling between HSCs and their microenvironment. It is interesting that HSCs coexpress several ligand-receptor pairs, such as Wnt10A/Frizzled and Agpt/Tek, which suggests that HSC regulation may be partly autocrine. The complete set of HSC-related genes is presented in table S2.

ST-HSCs and early progenitors express molecules associated with the initiation of the cell cycle, such as Wee1 kinase, Cdk4, replication licensing factor Mcmd, and the critical hematopoietic proliferation protein Myb. Genes involved in DNA repair and protein synthesis are also up-regulated in these compartments. This is consistent with the exit from G0 arrest at the onset of differentiation. ST-HSCs also express a set of gene products with RNA-binding domains, which is suggestive of posttranscriptional regulation.

Hox genes are likely to play a role in HSC regulation. Four HoxA genes are expressed in different subsets of HSCs. Hoxa5 and Hoxa10 are specific for the LT-HSCs, Hoxa2 is expressed in both LT- and ST-HSCs, and Hoxa9 is expressed both in HSCs and LCPs. It is noteworthy that overexpression of Hoxa9 in murine HSCs induced stem cell expansion (10), whereas Hoxa5 and Hoxa10 perturbed their differentiation activity (11, 12). In addition, Hoxb4, which is detected both in HSCs and LCPs, has been shown to promote specification and expansion of definitive HSCs (13, 14). Fetal and adult HSCs share the key stem cell properties of self-renewal and multilineage differentiation potential. In agreement with this, comparing the gene expression profiles of fetal and adult HSCs reveals broad molecular similarities (Fig. 2G). More than 70% of all HSC-related gene products are expressed in both fetal and adult HSCs.

We next asked whether the HSC genetic program is conserved between mouse and human. Human fetal liver Lin−CD34+CD38− cells provide long-term engraftment of nonobese diabetic immunodeficient (NOD-SCID) mice and, therefore, are functionally similar to murine LT-HSCs (15). Human gene products with an increase in expression of at least twofold in HSCs compared with MBCs were defined as HSC-enriched. Mouse-human homologous pairs were identified by direct sequence comparison of expressed sequence tag (EST) assemblies as described (6). We found 822 human homologs for murine HSC-related genes that are expressed in fetal liver (Database S3). Of these, 322 (39%) were enriched in human fetal HSCs. The probability of observing such an overlap by chance, as estimated using the hypergeometric distribution (6, 16), is extremely low (P = 10⁻¹¹). These genes likely represent the conserved molecular components expressed in HSCs. Homologous gene products expressed in the LT-HSC subset are listed in Table 1. The remaining homologous pairs did not show coordinate expression. This may reflect technical difficulties in purifying homogeneous HSC fractions. Alternatively, related but not identical populations may function as HSCs in different organisms.
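The overlap probability above comes from a hypergeometric model; the sketch below shows how such an upper-tail probability can be computed. Only the counts 822 (homologs examined) and 322 (homologs enriched in human HSCs) come from the text. The background size `N` and the total number of HSC-enriched human genes `K` are not given here, so any values supplied for them are assumptions, and the sketch is not expected to reproduce the reported P value.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: size of the background gene set (assumed, not given in the text)
    K: number of background genes that are HSC-enriched (assumed)
    n: number of genes examined (822 homologs in the text)
    k: number of examined genes that are enriched (322 in the text)
    """
    total = comb(N, n)
    upper = min(K, n)
    # comb(N - K, n - i) is 0 whenever n - i > N - K, so impossible
    # outcomes contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total
```

A call such as `hypergeom_tail(N=6000, K=1500, n=822, k=322)` uses placeholder values for `N` and `K`; the actual background used by the authors is described in the supporting online material.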

Table 1

Select known mouse-human homologs expressed in LT-HSC subset. The complete list of homologous pairs is presented in Database S3. FC, fetal cells; GPCR, G protein–coupled receptor; LDL, low density lipoprotein; MHC, major histocompatibility complex; TF, transcription factor; UTR, untranslated region.


To establish the gene expression profile common to diverse types of SCs, we performed analyses of ESCs and NSCs. Gene products with an increase in expression of at least twofold compared with both fetal and adult MBCs were defined as ESC/NSC-enriched. Correct detection of ESC- and NSC-enriched genes was verified by comparison with published data sets (17, 18); the enriched genes are presented in tables S3 and S4.
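The twofold criterion can be written as a simple filter. The sketch below assumes linear-scale, positive expression values and treats a gene as enriched only if its level in the stem cell population is at least twofold higher than in every reference population (here, fetal and adult MBCs). The function name and the example values are hypothetical; the actual enrichment scores are described in the supporting online material.

```python
def is_enriched(sc_level, reference_levels, min_fold=2.0):
    """True if sc_level is at least min_fold times each reference level.

    sc_level: expression in the stem cell population (linear scale, assumed)
    reference_levels: expression in each reference population, e.g.
        [fetal MBC, adult MBC]
    """
    return all(sc_level >= min_fold * ref for ref in reference_levels)


def enriched_genes(sc_levels, reference_table, min_fold=2.0):
    """Return the set of gene names that pass the fold-change filter.

    sc_levels: dict mapping gene name -> stem cell expression level
    reference_table: dict mapping gene name -> list of reference levels
    """
    return {
        gene
        for gene, level in sc_levels.items()
        if is_enriched(level, reference_table[gene], min_fold)
    }
```

For example, a gene at 400 units in ESCs with MBC levels of 150 and 180 passes the filter, whereas the same gene with MBC levels of 150 and 250 does not, because it fails the comparison against one of the two references.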

ESC/NSC-enriched gene sets were compared with each hematopoietic cluster. These results are summarized in Fig. 3. Gene products enriched in all three SC types belong to a variety of functional categories. Several identified gene products have been previously implicated in the regulation of different types of SCs. Transcription factors Edr1 and Tcf3 have been shown to sustain the activity of HSCs (19) and epidermal SCs (20), respectively, whereas EfnB2 and Hes1 have been implicated in control of NSC proliferation (21, 22). Analyses of EST collections indicate that many of the HSC-, ESC-, and NSC-enriched genes are also expressed in other tissues (7). This may suggest more general functional roles in a broader array of SC populations.

Figure 3

Overlapping gene expression in diverse murine SCs. (A) Venn diagram detailing shared and distinct gene expression among NSCs, ESCs, and HSCs. (B) A summary of the number of different genes expressed in diverse stem cell compartments in relation to each other and compared with the above defined hematopoietic clusters. A complete list of HSC-related genes also enriched in NSCs and/or ESCs is presented in Database S4.
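The kind of comparison summarized in a three-way Venn diagram reduces to set operations on the enriched gene lists. The sketch below is illustrative only: it assumes each SC type is represented by a set of gene identifiers, and the identifiers in the usage note are placeholders, not genes from the study.

```python
def venn_regions(hsc, esc, nsc):
    """Split three gene sets into the seven disjoint regions of a Venn diagram."""
    return {
        "HSC only": hsc - esc - nsc,
        "ESC only": esc - hsc - nsc,
        "NSC only": nsc - hsc - esc,
        "HSC+ESC": (hsc & esc) - nsc,
        "HSC+NSC": (hsc & nsc) - esc,
        "ESC+NSC": (esc & nsc) - hsc,
        "all three": hsc & esc & nsc,
    }


def region_counts(hsc, esc, nsc):
    """Number of genes in each Venn region."""
    return {name: len(genes) for name, genes in venn_regions(hsc, esc, nsc).items()}
```

Calling `region_counts({"a", "b", "c"}, {"b", "c", "d"}, {"c", "e"})` with these placeholder identifiers puts gene `c` in the "all three" region, which is the region corresponding to the shared SC signature.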

In summary, we have determined the molecular similarities and differences among five distinct SC populations, specifically, human fetal HSCs, murine fetal and adult HSCs, NSCs, and ESCs. The similarities define a common SC genetic program or SC molecular signature. It is likely that hallmark properties shared by all SCs, such as the ability to balance self-renewal and differentiation, will be governed by shared molecular mechanisms. As such, numerous components of these molecular mechanisms are likely to be contained within the SC molecular signature presented here.

Supporting Online Material

Materials and Methods

Figs. S1 to S5

Tables S1 to S4

Databases (Excel files) 1 to 4

* To whom correspondence should be addressed. E-mail: ilemischka{at}

