Research Article

Stem cell differentiation trajectories in Hydra resolved at single-cell resolution

See allHide authors and affiliations

Science  26 Jul 2019:
Vol. 365, Issue 6451, eaav9314
DOI: 10.1126/science.aav9314

Mapping Hydra development cell by cell

Hydra continually renews all cells in its body using three stem cell populations. This feature of Hydra allowed Siebert et al. to identify the transcriptional signatures of stem cells, progenitors, and terminally differentiated cells using single-cell RNA sequencing of adult Hydra (see the Perspective by Reddien). From these data, they built differentiation trajectories for all cell lineages, identified gene modules expressed along these trajectories, and identified putative regulators of genes within these modules. In addition, they identified candidate markers for elusive cell populations (such as multipotent stem cells and germline stem cells) and built a molecular map of the nervous system.

Science, this issue p. eaav9314; see also p. 314

Structured Abstract


The cnidarian polyp Hydra undergoes continual self-renewal and is capable of whole-body regeneration from a small piece of tissue. The stem cell populations, morphological cell types, and lineage relationships of Hydra are well characterized, but the molecular definition of its cell states has remained elusive. Capturing the molecular diversity of Hydra cells and the transcriptional programs underlying homeostatic development would greatly advance the utility of this organism for tackling questions in developmental and regenerative biology as well as neurobiology, and would help to elucidate ancestral mechanisms by means of comparative approaches.


Recent advances in single-cell RNA sequencing permit identification of the complete molecular diversity of cell states in an animal. This includes capturing the transcriptomes of cells in the process of differentiation. Ordering such cells into differentiation trajectories can uncover the genetic cascades that accompany cell fate specification. Here, we applied these approaches to Hydra to uncover the molecular cell complement, characterize the nervous system, and capture cell differentiation in the homeostatic adult.


We generated ~25,000 single-cell transcriptomes from Hydra using Drop-seq, clustered the cells, and annotated cell states. These data identified candidate molecular markers for elusive cell populations such as multipotent interstitial stem cells and germline stem cells. We then constructed differentiation trajectories for cells from each of the three cell lineages found in Hydra using the software URD. This revealed the dynamics of gene expression that occur during cell specification and differentiation in the adult Hydra, including the spatial and temporal expression of transcription factors and gene modules. To identify potential cell state regulators, we used ATAC-seq to reveal regulatory regions of coexpressed genes, identified enriched transcription factor binding motifs within these regulatory regions, and matched these motifs to coexpressed candidate transcriptional regulators. Our trajectory reconstruction also identified similarities between the neurogenesis and gland cell differentiation pathways that suggest a shared or similar progenitor state. We propose a model in which interstitial stem cells give rise to a bipotential progenitor in the ectodermal layer, which crosses the extracellular matrix to supply the endodermal layer with neurons and gland cells. Additionally, trajectory reconstructions of individual cell types (including gland cells and epithelial cells of the ectoderm and endoderm) uncovered gene expression changes that occur as they are translocated along the body column; these results suggest candidate genes and pathways involved in spatial patterning along the oral-aboral axis. Finally, we profiled neurons and neuronal progenitors that were enriched using fluorescence-activated cell sorting and used the data to build a molecular map of the nervous system. We found 12 distinct neuronal subtypes and determined their location using differential gene expression analysis, in situ hybridization, and transgenic approaches. Access to neuronal transcriptional signatures, including the first molecular markers specific to endodermal neurons, creates opportunities for precise manipulations of the nervous system.


We provide a molecular map of Hydra cell states, including differentiation trajectories for each lineage, identification of candidate regulators of cell states, and a spatial/molecular map of the nervous system. This resource identifies numerous candidates for functional testing, and we therefore anticipate that it will accelerate the discovery of developmental mechanisms in this highly regenerative animal. Hydra has diverse cell specification pathways that can be captured in one life stage by a relatively small number of sequenced single cells, which paves the way to study organism-wide changes at a single-cell level in response to perturbations.

Uncovering Hydra transcriptional cell states and cell differentiation trajectories.

Single-cell RNA sequencing of homeostatic Hydra reveals a molecular map of Hydra cell states. We built differentiation trajectories, identified cell state–specific gene modules, and combined single-cell data with ATAC-seq to uncover putative regulators of cell states.



The adult Hydra polyp continually renews all of its cells using three separate stem cell populations, but the genetic pathways enabling this homeostatic tissue maintenance are not well understood. We sequenced 24,985 Hydra single-cell transcriptomes and identified the molecular signatures of a broad spectrum of cell states, from stem cells to terminally differentiated cells. We constructed differentiation trajectories for each cell lineage and identified gene modules and putative regulators expressed along these trajectories, thus creating a comprehensive molecular map of all developmental lineages in the adult animal. In addition, we built a gene expression map of the Hydra nervous system. Our work constitutes a resource for addressing questions regarding the evolution of metazoan developmental processes and nervous system function.

Hydrozoans have been at the center of fundamental discoveries in developmental biology, including animal regeneration and early observations of stem cells (1, 2). Among hydrozoans, the cell populations and lineage relationships are best characterized in the freshwater polyp Hydra (Fig. 1, A to D) (37). Homeostatic somatic maintenance of the adult Hydra polyp depends on the activity of the differentiation pathway for all cells, which are replaced approximately every 20 days (8). Hydra has three cell lineages—endodermal epithelial, ectodermal epithelial, and interstitial—with each lineage supported by its own stem cell population (Fig. 1, A to D) (9). All epithelial cells in the body column are mitotic unipotent stem cells, resulting in continual displacement of cells toward the extremities. Epithelial stem cells differentiate to build the foot at the aboral end and the hypostome and tentacles at the oral end (Fig. 1, A and C); differentiated cells are eventually shed from the extremities (10). Multipotent interstitial stem cells (ISCs) give rise to the three somatic cell types of the interstitial lineage—nematocytes, neurons, and gland cells (Fig. 1D)—and can also replace germline stem cells (GSCs) if they are experimentally depleted (6, 11, 12) (Fig. 1D). The cnidarian-specific stinging cells, the nematocytes, can fire once and are then discarded; neurons and gland cells are closely associated with epithelial cells and thus are continually displaced and lost (13). Interstitial cells are maintained by three mechanisms: (i) mitotic divisions of ISCs, progenitors, and gland cells (12); (ii) ISC differentiation into neurons, nematocytes, and gland cells (5, 7); and (iii) change in the expression and function of neurons and gland cells with position (14, 15). Thus, cell identity in Hydra depends on coordinating stem cell differentiation and gene expression programs in a manner dependent on cell location. Understanding the molecular mechanisms that underlie cellular differentiation and patterning in Hydra would be greatly facilitated by the creation of a spatial and temporal map of gene expression.

Fig. 1 Hydra tissue composition and single-cell RNA sequencing of 24,985 Hydra cells.

(A) The Hydra body is a hollow tube with an adhesive foot at the aboral end (bd, basal disk; ped, peduncle) and a head with a mouth and a ring of tentacles at the oral end. The mouth opening is at the tip of a cone-shaped protrusion, the hypostome. (B) Enlargement of box in (A). The body column consists of two epithelial layers (endoderm and ectoderm) separated by an extracellular matrix, the mesoglea. Cells of the interstitial cell lineage (red) reside in the interstitial spaces between epithelial cells, except for gland cells, which are integrated into the endodermal epithelium. Ectodermal cells can enclose nerve cells or nematocytes, forming biological doublets. (C) Epithelial cells of the body column are mitotic, have stem cell properties, and give rise to terminally differentiated cells of the hypostome (hyp), tentacles, and foot. (D) Schematic of the interstitial stem cell lineage. The lineage is supported by a multipotent interstitial stem cell (ISC) that gives rise to neurons, gland cells, and nematocytes; ISCs are also capable of replenishing germline stem cells if they are lost. (E) t-SNE representation of clustered cells colored by cell lineage. (F) t-SNE representation of clustered cells annotated with cell state. ec, ectodermal; en, endodermal; Ep, epithelial cell; gc, gland cell; id, integration doublet; mp, multiplet; nb, nematoblast; nem, differentiated nematocyte; pd, suspected phagocytosis doublet; prog, progenitor. id, mp, and pd are categories of biological doublets. Arrows indicate suggested transitions from stem cell populations to differentiated cells. [(A) to (D) adapted from (50)]

We used single-cell RNA sequencing (scRNA-seq) to complement this extensive knowledge of Hydra developmental processes. We collected ~25,000 Hydra single-cell transcriptomes covering a wide range of differentiation states and built differentiation trajectories for each lineage. These trajectories allowed us to identify putative regulatory modules that drive cell state specification, find evidence for a shared progenitor state in the gland cell and neural differentiation pathways, and explore gene expression changes along the oral-aboral axis. Finally, we generated a molecular map of the nervous system with spatial resolution, which provides opportunities to study mechanisms of neural network plasticity and nervous system evolution. We have made the single-cell data available at the Broad Institute’s Single Cell Portal. We anticipate that providing a comprehensive molecular map as a resource to the developmental biology and neuroscience communities will rapidly advance the ability of researchers to make discoveries using Hydra. Cnidarians such as Hydra hold an informative position on the phylogenetic tree as the sister group to bilaterians (16) and largely have the same complement of gene families found in vertebrates (1719). Thus, this dataset, in combination with the existing cnidarian single-cell dataset for Nematostella (20), offers the opportunity to identify conserved developmental mechanisms.

Single-cell RNA sequencing of whole Hydra reveals cell state transitions

Thirteen droplet-based single-cell RNA-seq (Drop-seq) libraries were prepared from dissociated whole adult Hydra polyps, and two neuron-enriched libraries were prepared using fluorescence-activated cell sorting (FACS)–enriched, green fluorescent protein (GFP)–positive neurons from transgenic Hydra (figs. S1 and S2 and tables S1 and S2). We mapped sequencing reads to a reference transcriptome and filtered for cells with 300 to 7000 detected genes and 500 to 50,000 unique molecular identifiers (UMIs), resulting in a dataset with a detected median of 1936 genes and 5672 UMIs per cell (table S3). We clustered the cells, annotated cluster identity using published gene expression patterns (Fig. 1, E and F, and fig. S3), and further validated identities by performing RNA in situ hybridization experiments (fig. S4). In the clustering, cells separated according to cell lineage (Fig. 1E), and we observed the expected cell populations within each lineage (Fig. 1F). We captured cells in a wide range of differentiation states.

Several differentiation trajectories are evident even in the t-distributed stochastic neighbor embedding (t-SNE) representation, similar to findings in scRNA-seq studies performed in planarians (21, 22). For example, clusters that correspond to differentiated head and foot epithelial cells are connected to their respective body column stem cell clusters (Fig. 1F). Additionally, the interstitial stem cell clusters are connected to both neuronal and nematocyte progenitors (nematoblasts). We also identified distinct clusters for differentiated cells of the interstitial lineage—neurons, gland cells, nematocytes, and germ cells (Fig. 1F). We applied non-negative matrix factorization (NMF) to the full dataset and subsequently to all lineage subsets to identify modules of genes that are coexpressed within cell populations (fig. S5) (23, 24). As described below and in the supplementary materials, the recovered gene modules were used for doublet identification (see supplementary methods for discussion of doublet categories, figs. S2 and S6 to S9), trajectory characterization, and the identification of transcription factor binding sites enriched in the cis-regulatory elements of co-regulated genes.

Trajectory reconstruction of epithelial cells reveals position-dependent gene expression

Epithelial cells constantly adjust their gene expression relative to their position as they divide in the body column and are displaced toward the extremities (Fig. 1A). To identify these position-dependent gene expression patterns, we performed trajectory analyses on subsets of endodermal and ectodermal epithelial cells (Fig. 2, A and B, and fig. S10, A to C). We ordered cells along the oral-aboral axis by using the R package URD to generate branching trajectories for each lineage spanning from the foot (aboral) to the hypostome and tentacle (oral) as two separate endpoints (24). URD connects cells with similar gene expression and uses simulated random walks to find gene expression trajectories between terminal cell populations and a starting progenitor cell population. This required removing biological and technical doublets from the epithelial cell subsets, which we accomplished by using NMF module coexpression to identify doublet signatures (see methods and fig. S7). To validate these differentiation trajectories, we visualized the spatial expression of several previously characterized genes and validated the expression of several uncharacterized genes by RNA in situ hybridization (Fig. 2, C to M, and figs. S11 to S13).

Fig. 2 Identification of genes with differential expression along the oral-aboral axis.

(A and B) t-SNE representation of subclustered endodermal epithelial cells (A) and subclustered ectodermal epithelial cells (B). (C and D) Epithelial cells were ordered using URD to reconstruct a trajectory where pseudotime represents spatial position. Scaled and log-transformed expression is visualized. (C) Trajectory plots for previously uncharacterized putative signaling genes expressed in ectodermal epithelial cells of foot and tentacles. Genes: BMP antagonist CHRD (t35005), FGF1 (t12060), and Wnt antagonists DKK3 (t10953), SFRP3 (t19036), and APCD1 (t11061). (D) Trajectory plots for genes expressed in a graded manner in endodermal epithelial cells. Genes: BMP antagonist “DAN domain–containing gene” t2758, secreted Wnt antagonist FZD8 (t15331), FGF receptor FGRL1 (t14481), homeobox protein HXB1 (t1602). (E to M) Epithelial expression patterns obtained using RNA in situ hybridization consistent with predicted patterns. Whole mounts and selected close-ups are shown. Arrowheads indicate ectodermal signal. t, tentacle; bd, basal disk. Scale bars: whole mounts [including (G)], 500 μm; close-ups, 100 μm.

We identified epithelial genes with variable expression along the oral-aboral axis, including differentially expressed gene modules identified by NMF (figs. S14 to S17). These spatially and temporally resolved gene expression profiles for body column epithelial cells provide access to putative regulators of epithelial cell terminal differentiation at the oral and aboral ends, such as transcription factors and signaling molecules (Fig. 2 and figs. S12 and S15). For example, we find differential expression along the body axis of previously uncharacterized genes in the Wnt, BMP (bone morphogenetic protein), and FGF (fibroblast growth factor) signaling pathways (Fig. 2). Therefore, these data suggest candidate genes and pathways for functional testing to better understand oral-aboral patterning in Hydra.

Identification of multipotent interstitial stem cells and trajectory reconstruction of the interstitial lineage

We extracted 12,470 interstitial cells from the whole dataset (Fig. 1E and fig. S10D), performed subclustering, and annotated the clusters through the expression of known and new markers (Fig. 3A and fig. S18). The t-SNE representation of interstitial cells showed evidence for ISC differentiation (Fig. 3A and fig. S18, A to H). NMF analysis was used to identify gene modules associated with interstitial lineage differentiation pathways (fig. S19). We identified a population of cells that largely lack expression of differentiation gene modules (i.e., the putative multipotent ISCs) and used this cell population as the root in an URD trajectory reconstruction (fig. S19). HvSoxC (25) was found to be expressed in transition states between candidate ISCs and differentiated neurons and nematoblasts, which suggests that the expression of this gene marks cells undergoing differentiation (Fig. 3B and fig. S20). We attempted to identify transcripts specific to the putative ISC population and found only a single marker with no shared similarities to known proteins (Fig. 3C and fig. S21A). ISCs may therefore be largely defined by an absence of cell type–specific markers, similar to planarian cNeoblasts (21). The URD reconstruction recovered a branching tree of interstitial stem cell differentiation that resolves neurogenesis, nematogenesis, and gland cell differentiation (Fig. 3D). We performed double fluorescence in situ hybridization (FISH) to validate predicted transition states (Fig. 3, E and F, and figs. S20 to S22).

Fig. 3 Trajectory reconstruction for cells of the interstitial lineage suggests a cell state common to neurogenesis and gland cell differentiation.

(A) t-SNE representation of interstitial cells with clusters labeled by cell state. Solid arrow, neurogenesis/gland cell differentiation; dashed arrow, nematogenesis. (B) HvSoxC expression in progenitor cells. Arrow indicates putative ISC population, which is negative for HvSoxC. (C) The same putative ISC population as in (B) is positive for biomarker Hy-icell1 expression. (D) URD differentiation tree of the interstitial lineage. Colors represent URD segments and do not correspond to the colors in the t-SNE (see fig. S19). (E) Myb (green) is expressed in the neuron/gland cell progenitor state and during early neurogenesis/gland cell differentiation. Expression of Myb (green, >0) partially overlaps with high expression of the neuronal gene NDA-1 (magenta, >3) and the gland cell gene COMA (t2163) (magenta, >0); COMA is also expressed in a subset of endodermal neurons. Coexpressing cells are black. Star and close-up highlights cell states with coexpression. (F) Double labeling using fluorescent RNA in situ hybridization is consistent with neuron differentiation in the endodermal and ectodermal epithelial layers and demonstrates the existence of transition states observed in the trajectory analysis. Additionally, endodermal gland cell differentiation transition states were observed in the endodermal epithelial layer (see also fig. S22). gc, gland cell; gp, gland cell progenitor; n, neuron; np, neuron progenitor; p, Myb-positive progenitor. (G) Model for progenitor specification. Ectodermal ISCs give rise to a progenitor that can give rise to ectodermal neurons. Progenitors that translocate to the endoderm are able to give rise to gland cells or neurons. ec, ectoderm; en, endoderm; gmgc, granular mucous gland cell; gc, gland cell; hyp, hypostome; ISC, interstitial multipotent stem cell; mgc, mucous gland cell; nb, nematoblast; smgc, spumous mucous gland cell; prog, progenitor; zmg, zymogen gland cell.

The trajectory analysis of the interstitial lineage suggests that neuron and gland cell differentiation transit through a previously undescribed shared cell state (Fig. 3D), whereas nematogenesis is distinct. To test this result, we identified genes that are expressed in the progenitor state common to neural and gland cell differentiation, including Myc3 (t18095) (26) and Myb (t27424) (Fig. 3, D and E, and fig. S21). We identified the spatial location of Myb-positive cells using FISH and found positive cells in both the endodermal and ectodermal layers (Fig. 3F and fig. S22). A subset of Myb-positive cells coexpress the neuronal marker NDA-1 (27), consistent with Myb-positive cells giving rise to neurons in both epithelial layers (Fig. 3, E and F, and fig. S22). Furthermore, we found endodermal Myb-positive cells that coexpress COMA (t2163), a gene expressed during gland cell differentiation and in all gland cell states (Fig. 3, E and F, and figs. S21E and S22). ISCs reside in the ectodermal layer but are the source of both new gland cells and neurons in the endodermal layer (12, 28). The data suggest the existence of two Myb-positive progenitor populations: one that stays in the ectoderm to give rise to neurons, and another that crosses the mesoglea to the endodermal layer and subsequently gives rise to endodermal neurons and gland cells (Fig. 3G). Finally, we find that many of the gene modules identified by NMF analysis were specific to each differentiation pathway with ordered expression in pseudotime (fig. S23), thus revealing gene expression changes that underlie differentiation in the interstitial lineage.

Subtrajectory analyses of interstitial cell types

We next explored the specification of different cell types within the interstitial lineage (Fig. 1D). First, we examined nematocytes, which contain one of the most complex eukaryotic organelles, nematocysts (29); these are used to sting and immobilize prey. Hydra nematocytes each have one of four types of nematocysts: desmonemes, holotrichous or atrichous isorhizas, and stenoteles (30, 31). We identified one cluster of differentiated nematocytes, which contains nematocytes that harbor either desmonemes or stenoteles (Fig. 3A, cluster “nematocyte,” and fig. S24, A to F). In addition, we annotated the differentiation trajectories of nematoblasts and identified gene modules that are expressed as they produce these two types of nematocytes (Fig. 3, A and D, Fig. 4A, and figs. S23 to S25). Although extensive work on nematocyst diversity has been facilitated by their extreme morphological and functional differentiation, little is known about nematocyte molecular diversity. The identification of genes that are differentially expressed between nematocytes harboring different nematocyst types (Fig. 4A and figs. S23 to S25) provides a basis for understanding the specification and construction of these extraordinary organelles, which are the defining feature of Cnidaria.

Fig. 4 Subtrajectory analyses of interstitial cell types.

(A) Interstitial gene modules successively expressed in nematocytes forming a stenotele (ic, interstitial gene module). (B) Model for gland cell (ZMG/gMGC) location-dependent changes. Gland cells integrated into the endodermal epithelium get displaced toward the extremities and undergo changes in expression and morphology. Bars show known expression domains for genes depicted in (C). gmgc, granular mucous gland cell; hyp, hypostome; tent, tentacle; zmg, zymogen gland cell. (C) URD linear ZMG/gMGC trajectory recapitulates known position-dependent gene expression in gland cells along the body column. Genes: HyTSR1 (15), HyDkk1/2/4A and HyDkk1/2/4C (51, 52), matrilysin-like (t32151), and CHIA (t18356) (fig. S26, B to E). (D) URD linear sMGC trajectory plot for HyWnt1 (53), HyWnt3 (54), HyBra1 (55), HyBra2 (56), ETV1 (t22116), and NDF1 (t21810) showing expression changes in pseudotime that correlate to position along the oral-aboral axis. Cells are ordered according to pseudotime, with putative hypostomal cell states to the left and putative lower head cell states to the right. (E) Plot showing HyFem-2 expression in a subset of cells in the early female cluster. (F to H) HyFem-2 is expressed in single cells or pairs scattered within the body column.

Second, we analyzed gland cells, which are interspersed between endodermal epithelial cells. Gland cell numbers are maintained both by specification of new gland cells from ISCs and by mitotic divisions of differentiated gland cells (7). We were able to capture ISC differentiation into gland cells in the trajectory analysis (Fig. 3D and fig. S26). Zymogen gland cells (ZMGs) are found throughout the body and transdifferentiate into granular mucous gland cells (gMGCs) when they are displaced into the head (Fig. 4B). Both of these cell types exhibit location-dependent changes in gene expression, and we captured these by building linear trajectories along the oral-aboral axis (Fig. 4C and figs. S14 and S27). We hypothesized that spumous mucous gland cells (sMGCs), a separate type of gland cells present in the head, may exhibit similar location-dependent gene expression profiles that were previously unappreciated. Indeed, reconstruction of a linear trajectory uncovered oral-aboral organization of gene expression in this cell type, including several oral organizer genes (such as HyWnt1, HyWnt3, HyBra1, and HyBra2) in the orally located sMGCs (Fig. 4D and figs. S14 and S28). This raises the possibility that these cells participate in patterning the head. Overall, our analysis reveals a broad range of gland cell states in Hydra that can be achieved through multiple paths.

Finally, we explored the germ cell clusters recovered in the dataset. We excluded germline cells from the interstitial lineage tree reconstruction because differentiation of GSCs from ISCs does not typically occur in a homeostatic animal (11); thus, we did not expect to observe transition states linking ISCs to GSCs. However, we did elucidate the spermatogenesis trajectory by analyzing the progression of cell states found in the two male germline clusters that were recovered in the subclustering of interstitial cells and used these data to identify and confirm several new male germline genes (figs. S14, S29, and S30). We identified two female germ cell clusters, which likely correspond to early and late female germ cell development (Fig. 3A). We performed in situ hybridizations for two genes (HyFem-1, HyFem-2) expressed in a subset of cells found in the early female germline cluster and found positive cells scattered throughout the body column, which we hypothesize are GSCs (Fig. 4, E to H, and fig. S30, G to N). If so, this would be the first report of gene expression in Hydra that is specific to GSCs and would allow for the study of GSCs in Hydra through the construction of GSC reporter transgenic lines.

Identification of putative transcriptional regulators of cell state–specific regulatory modules

The construction of differentiation trajectories allows us to determine the spatial and temporal expression patterns of transcription factors, and thus gain insight into the gene regulatory networks that control cell type specification. We aimed to identify the transcription factor binding sites shared by coexpressed genes and candidate transcription factors that may bind these sites. To identify coexpressed genes, we used NMF to interrogate a genome-mapped dataset and found 58 metagenes (i.e., sets of coexpressed genes) (figs. S31 and S32). To identify the putative regulatory regions of these coexpressed gene sets, we performed ATAC-seq (assay for transposase-accessible chromatin using sequencing) on whole Hydra (fig. S33). We identified regions of locally enriched ATAC-seq read density (peaks), which signify regions of open chromatin, and restricted the analysis to peaks within 5 kb upstream of the start codon of the genes in each NMF metagene (figs. S32 to S34). We then performed motif enrichment analysis to identify transcription factor binding sites that may control the expression of genes belonging to a metagene. We found at least one significantly enriched motif for each of 39 metagenes.

These metagenes had distinct sets of enriched motifs, suggesting differences in the transcription factor classes underlying various cell states (Fig. 5A and fig. S35). For example, the paired box (Pax) motif is enriched in regulatory regions of genes expressed during early and mid-stages of nematogenesis, the forkhead (Fox) motif is enriched at mid- and late stages, and the POU motif is enriched only in late stages. The B cell factor (EBF) motif is enriched in the female germline and the TCF motif is enriched in neurons and gland cells. Among epithelial cell states, motif enrichment is less tightly restricted to particular cell states. However, the ETS domain binding motif is enriched in metagenes expressed in endodermal and ectodermal epithelial cells in the extremities (tentacles and foot). Additionally, homeodomain (Otx and Arx) and bZip motifs are enriched throughout both epithelial lineages, and forkhead motifs appeared to be associated with genes expressed in endodermal epithelial cells (Fig. 5A). The enrichment of forkhead motifs in Hydra endoderm and Nematostella digestive filaments is consistent with a conserved function for forkhead transcription factors in cnidarian endodermal fate specification that is also found across bilaterians (20, 32).

Fig. 5 Motif enrichment analysis for gene modules and identification of candidate regulators.

(A) Selected enriched motifs (columns) found in open chromatin of putative 5′ cis-regulatory regions of coexpressed gene sets (metagenes) for listed cell states (rows). (B to D) Metagene scores visualized on the t-SNE representation (left in each panel), a significantly enriched motif found in putative 5′ cis-regulatory regions (bottom), and candidate regulators likely to bind the identified motif with correlated expression (right). (B) Metagene expressed during nematogenesis and putative PAX regulator. (C) Metagene expressed in gland cells and putative RFX regulator. (D) Metagene expressed in ectodermal epithelial cells of the foot and putative homeobox regulator.

To determine the regulatory factors that may be coordinating gene coexpression modules, we identified transcription factors within each metagene that are predicted to interact with the binding site(s) enriched in that metagene using a combination of Pfam domain annotation and profile inference (JASPAR) (fig. S34). For 25 of the 39 metagenes with enriched binding motifs, we found one or multiple candidate transcription factors with putative function in cell fate specification (table S5). For example, we found a metagene (wg32) that consists of 73 genes coexpressed during nematogenesis. A Pax transcription factor binding motif was significantly enriched in the potential regulatory regions near those genes, and the Pax-A transcription factor (t9974) is part of the metagene (Fig. 5, A and B, and fig. S35). The results therefore strongly suggest that Pax-A functions during early nematogenesis. This is concordant with a recent finding that Pax-A is required for nematogenesis during Nematostella development (20, 33). Similarly, we found evidence that an RX homeobox transcription factor (t22218) functions in basal disk development and an RFX transcription factor (t30134) functions in gland cell specification; the latter was also reported for Nematostella (Fig. 5, C and D) (20). Homeodomain transcription factor binding motifs are enriched in ectoderm tentacle genes (metagene wg71) and the analysis recovered aristaless-related homeobox gene HyAlx (t16456) as a regulator (table S5 and fig. S11A). A role for HyAlx during tentacle formation has been established previously (34). We provide all transcription factors that met our criteria as candidate regulators, including cases such as the basal disk where multiple TFs are both expressed in the proper context and predicted to bind an enriched motif (table S5). Overall, we identified several candidates for regulators of Hydra cell fate specification.

A molecular map of the Hydra nervous system

The Hydra nervous system consists of two nerve nets, one embedded in the ectodermal epithelial layer and one embedded in the endodermal epithelial layer. Neurons are concentrated at the oral and aboral ends of the polyp (35). To determine the molecular nature of neuronal subtypes, we extracted neural progenitors and differentiated neurons from the dataset for subcluster analysis (fig. S10, E and F, and fig. S36). We identified 15 clusters: Three clusters consist of neuronal progenitor cells, expressing progenitor genes such as Myb/Myc3, and the remaining 12 clusters are differentiated neuronal subtypes (Fig. 6A and figs. S37 to S40). To place these 12 neuronal subtypes into the ectodermal or endodermal nerve net, we performed TagSeq (36) on separated tissue layers and conducted differential gene expression analysis to identify genes with significantly higher expression in the endodermal or ectodermal epithelial layer (fig. S37, see methods). Because the neurons remained attached to the epithelia, differentially expressed genes included neuron-specific genes, which allowed us to score the neuronal clusters as ectodermal or endodermal (Fig. 6A and fig. S37).

Fig. 6 Molecular map of the Hydra nervous system with spatial resolution.

(A) Subclustering of neurons and neuronal progenitors. Cell states are annotated with cell layer, localization along the body column, tentative neuronal subtype category [sensory (S) or ganglion (G)], and gene markers used in annotations. (B) Heat map shows top 12 markers for neuronal cell states. (C to E) First molecular markers for endodermal neurons. (C) Transgenic line NDF1(t14976)::GFP expressing GFP in endodermal ganglion neurons along the body column (cluster en1). [(D) and (E)] Body column cross section of transgenic line Alpha-LTX-Lhe1a-like(t33301)::GFP expressing GFP in putative sensory neurons (cluster en2). Phalloidin staining (red) marks actin filaments running along the ECM; Hoechst (blue) marks nuclei. en, endoderm; ec, ectoderm.

To determine the location of the ectodermal neuronal subtypes along the oral-aboral axis, we generated a list of neuronal markers and selected genes to test spatial location using a combination of new and previously published in situ expression patterns (Fig. 6, A and B, and figs. S38 to S41). To test the endodermal identity of clusters en1 and en2, we examined NDF1 (t14976, specific to cluster en1) and Alpha-LTX-Lhe1a-like (t33301, specific to cluster en2) expression by generating GFP reporter lines. For NDF1, GFP is expressed in endodermal ganglion neurons in the entire body except tentacles (Fig. 6C and fig. S40, N and O). For Alpha-LTX-Lhe1a-like, GFP is expressed in sensory neurons along the body column in the endoderm (Fig. 6, D and E). Therefore, the transgenic reporter lines confirm endodermal localization of clusters en1 and en2 and demonstrate our ability to identify specific biomarkers for each neuronal subtype. In summary, we have produced a molecular map of the Hydra nervous system that describes 12 molecularly distinct neuronal subtypes and their in situ locations.


We present an extensively validated gene expression map of Hydra cell states and differentiation trajectories, thus providing access to transcription factors expressed at key developmental decision points. Several recent studies have similarly demonstrated the value of conducting whole-animal (2022) or whole-embryo scRNA-seq (20, 24, 3739) to uncover cell type diversity and the regulatory programs that drive cell type specification. Conducting scRNA-seq on a diversity of organisms will provide insights into the core regulatory modules underlying cell type specification and the evolution of cellular diversity (40). Thus, our Hydra dataset provides an additional opportunity for comparisons to be made in an evolutionary context.

Analysis of Hydra by scRNA-seq uncovered new technical challenges, and we provide solutions to these challenges that will likely be applicable to many systems. For example, Hydra epithelial cells are highly phagocytic (41), a phenomenon that has been observed in a variety of systems, and thus will likely present a challenge for interpretation of scRNA-seq results in future studies (4244). We implemented an approach that has been incorporated into URD, in which we use NMF as an unbiased method to identify anomalies in the data that likely represent cell doublets or phagocytic events. We envision that our approach could be applied to other systems and will be particularly useful in animals where existing expression data are limiting.

Our gene expression map for a dynamic and regenerative nervous system opens the door to understanding the molecular basis of neuronal plasticity and regeneration. Of the 12 neuronal subtypes we have identified, three (the endodermal neurons) were previously uncharacterized molecularly. Three distinct neuronal circuits have been described in Hydra: two in the ectoderm [rhythmic potential 1 (RP1) and contraction burst (CB)] and one in the endoderm [rhythmic potential 2 (RP2)] (45). These circuits are likely composed of ganglion neurons connected throughout the body. The characteristic localization of these circuits, combined with the in situ locations of the ganglion neuron molecular subtypes we identified (Fig. 6A), suggest the molecular identities of the neurons that constitute these distinct circuits. We propose the following: (i) The endodermal neurons of cluster en1 (Fig. 6, A and B) make up the RP2 circuit; (ii) the neurons of clusters ec3A, ec3B, and ec3C make up the RP1 circuit; and (iii) the neurons of clusters ec1A, ec1B, and ec5 make up the ectodermal CB circuit. This is supported by the observation that the RP1 circuit is active in the basal disk (cluster ec3A), whereas the CB circuit extends aborally only to the peduncle (cluster ec5) (45). Neuron subtype–specific transgenes, such as the two examples presented here, will provide powerful tools for experimental perturbations to test neuronal function and nervous system regeneration by enabling precise alterations to these neural circuits. Nervous system function in such engineered animals can be tested using newly developed microfluidic tools that allow for simultaneous electrical and optical recordings in behaving animals (46).

The interstitial lineage differentiation trajectories provided several new insights. First, we identified a marker that may be specific to the multipotent stem cell population, which could provide a powerful tool for understanding stem cell function and fate decisions. Second, our data suggest the existence of a cell state that is shared by the neuron and gland cell trajectory (Fig. 3D). This interpretation is supported by the colocalization of neural and gland cell progenitors in several independent clustering analyses (Fig. 1F, Fig. 3A, and fig. S31) that consider different sets of variable genes and sets of cells, and by the overlap of gene modules for neurogenesis and gland cell differentiation (fig. S42). The shared stem cell of gland cells, neurons, and nematocytes suggests a shared evolutionary history of these cell types. The data further suggest that the evolution of nematocytes coincided with the emergence of a distinct progenitor. We thus propose a model in which multipotent ISCs first decide between a nematocyte or gland/neuron fate and then a second decision is made by the common gland/neuron progenitor. This contrasts with previous models that posit a common neuron/nematocyte progenitor (47). However, an alternative explanation is that gland and neuronal progenitors are separate populations that share early transcriptional events; thus, future fate-mapping experiments will be crucial. Additionally, our data suggest a model in which a bipotential gland/neuron progenitor born in the ectodermal layer, where multipotent ISCs reside, traverses the extracellular matrix to provide the endodermal layer with both gland cells and neurons (Fig. 3G).

Adult Hydra polyps, which are in a constant state of development, enable the capture of all states of cellular differentiation using scRNA-seq. An important future goal is to use scRNA-seq to rapidly assess the effect of mutations on all cell types (24, 39, 48). Hydra has a diversity of fate specifications from multiple stem cell types, yet is simple enough to be completely captured by a relatively small number of sequenced single cells from one life stage. Thus, we are now able to study organism-wide changes at a single-cell level in response to perturbations. The transcription factors that we identified at key developmental decision points are exciting candidates to test using this approach. In conclusion, this resource and the experimental approaches we describe open doors in multiple fields including developmental biology, evolutionary biology, and neurobiology.

Methods summary

Hydra vulgaris AEP and Hydra vulgaris transgenic lines were dissociated into single cells and were prepared for Drop-seq (49); FACS was used to enrich for neurons. Sequencing reads were mapped to a de novo assembled transcriptome and a Hydra genome reference, and clustering was performed. Subclustering was performed on the following subsets of the data: epithelial ectodermal cells, epithelial endodermal cells, interstitial cells, and neurons and neuronal progenitors. The in situ location of neuron subclusters was determined using in situ hybridization and differential gene expression analysis of separated epithelial layers. URD (24) was used to build differentiation trajectories for the interstitial and male germline lineages and to analyze the spatial expression of genes in the ectodermal, endodermal, and gland lineages. To analyze regulatory regions, we identified coexpression modules using NMF, performed ATAC-seq to identify regions of open chromatin, and used motif enrichment analysis to identify candidate regulators of the gene modules. We used or performed colorimetric in situ hybridization, FISH, immunohistochemistry, and generation of transgenic lines to validate biomarkers and cell states. For complete methods, see supplementary materials.

Supplementary Materials

Materials and Methods

Figs. S1 to S43

Tables S1 to S9

Supplementary Analysis

Gene Expression Cascades

References (57132)

References and Notes

Acknowledgments: We thank R. Monroy for collection of separated endoderm and ectoderm epithelial layers for TagSeq; R. Abhineet, J. Ho, Q. Li, N. Sierra, and A. Cho for wet lab validations by RNA in situ hybridization; Y. Wang for help with NMF analysis and helpful discussion; R. Steele, C. Dana, and K. Glauber for kindly providing transgenic lines PT1, enGreen1, and nGreen; T. C. G. Bosch for kindly providing Hydra vulgaris AEP; L. Froenicke, V. Rashbrook, S. Ashtari, and E. Kumimoto of the UC Davis DNA Technologies Core for expert advice and assistance with sequencing; the MCB Light Microscopy Imaging Facility, which is a UC Davis Campus Core Research Facility, for the use of an Olympus FV1000 confocal microscope and M. Paddy for excellent support; C. David for providing numerous and plentiful insights on Hydra biology that helped with the interpretation of the data; B. Teefy for helpful discussion and moral support; and A. Schier, R. Steele, G. Wessel, B. Draper, S. Materna, J. Robinson, and C. Dunn for critical reading of the manuscript. Funding: The confocal microscope used in this study was purchased using NIH Shared Instrumentation Grant 1S10RR019266-01. This study was supported by DARPA contract HR0011-17-C-0026 (C.E.J.), UC Davis Start-up funding (C.E.J.), and NIH grant K99HD091291 (J.A.F.). Author contributions: S.S., J.A.F., J.F.C., and C.E.J. conceived the study; S.S., J.A.F., J.F.C., and C.E.J. wrote the paper with revisions by A.S.P., Y.A., and C.E.S.; S.S. and J.A.F. collected single-cell transcriptomes; Y.A., J.F.C., A.S.P., and S.S. performed ISH validation experiments; S.S. and Y.A. performed imaging; A.S.P. performed transgenic validation experiments of neurons and FACS sorting; J.F.C. performed ATAC-seq; S.S., J.A.F., and J.F.C. processed the raw data and conducted data analysis; S.S. and C.E.S. built transcriptome and gene models; J.A.F. built differentiation trajectories; J.F.C. and S.S. performed regulatory module analysis. Competing interests: The authors declare no competing interests. Data and materials availability: The raw data reported in this paper are archived at NCBI GEO (accession number GSE121617) and in processed and interactively browsable forms in the Broad Single-Cell Portal ( RNA-seq data: Bioproject PRJNA497966. Transcriptome Shotgun Assembly project: GHHG01000000. Transcriptome Blast and ATACseq read/peak display: Analysis code and objects: github and Dryad

Stay Connected to Science

Navigate This Article