Special Reviews

A Genomic Regulatory Network for Development

Science  01 Mar 2002:
Vol. 295, Issue 5560, pp. 1669-1678
DOI: 10.1126/science.1069883


Development of the body plan is controlled by large networks of regulatory genes. A gene regulatory network that controls the specification of endoderm and mesoderm in the sea urchin embryo is summarized here. The network was derived from large-scale perturbation analyses, in combination with computational methodologies, genomic data, cis-regulatory analysis, and molecular embryology. The network contains over 40 genes at present, and each node can be directly verified at the DNA sequence level by cis-regulatory analysis. Its architecture reveals specific and general aspects of development, such as how given cells generate their ordained fates in the embryo and why the process moves inexorably forward in developmental time.

The mechanism causing cats to beget cats and fish to beget fish is hardwired in the genomic DNA, because the species specificity of the body plan is the cardinal heritable property. But despite all the examples of how individual genes affect the developmental process, there is yet no case where the lines of causality can be mapped from the genomic sequence to a major process of bilaterian development. One reason for this is that most of the developmental systems that have been intensively studied produce adult body parts, such as the third instar Drosophila wing disc, or the vertebrate hindbrain during rhombomere specification, or the heart anlagen of flies and mice (1). These systems present tough challenges because they go through successive stages of pattern formation in order to generate complex morphologies, and their development is initiated from states that are already complex. Furthermore, traditional molecular, genetic, and developmental biological approaches have focused on determining the functions of one or a few genes at a time, an approach that is not adequate for analysis of large regulatory control systems organized as networks. The heart of such networks consists of genes encoding transcription factors and the cis-regulatory elements that control the expression of those genes. Each of these cis-regulatory elements receives multiple inputs from other genes in the network; these inputs are the transcription factors for which the element contains the specific target site sequences. The functional linkages of which the network is composed are those between the outputs of regulatory genes and the sets of genomic target sites to which their products bind. Therefore, these linkages can be tested and verified by cis-regulatory analysis. This means identifying the control elements and their key target sites, and experimentally determining their functional significance. 
The view taken here is that “understanding” why a given developmental process occurs as it does requires learning the key inputs and outputs throughout the genomic regulatory system that controls the process as it unfolds.

In mechanistic terms, development proceeds as a progression of states of spatially defined regulatory gene expression. Through this progression, specification occurs: This is the process by which cells in each region of the developing animal come to express a given set of genes. The spatial cues that trigger specification in development are generally signaling ligands produced by other cells, in consequence of their own prior states of specification. In addition to intercellular signals, maternal molecules of regulatory significance are distributed to particular cells with the egg cytoplasm and partitioned spatially during cleavage. Ultimately, either inter- or intracellular spatial cues affect the course of events in development by causing the activation (or repression) of particular genes encoding transcription factors. But although it is these genes that do the transcriptional regulatory work of spatial specification, the locus of programmatic control for each developmental event is the sequence of the particular cis-regulatory elements that respond to the inputs presented. Genes encoding transcription factors are typically used at many times and places in the life cycle, and so the uniqueness of any given developmental regulatory network lies in its operative cis-regulatory modules. Such cis-regulatory systems produce new and often more refined spatial patterns than those described by their inputs: They add regulatory or informational value. For example, cis-regulatory elements active in spatial specification often use “and” logic, in that two different transcription factors, each present in a given spatial domain, must be bound to the cis-regulatory DNA at once in order for transcription to be activated (1). The gene is expressed only where the input patterns overlap, and this defines a new spatial regulatory state. 
By determining the succession of DNA sequence–based cis-regulatory transactions that govern spatial gene expression, closure can be brought to the question of why any particular piece of development actually happens.
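
The "and" logic described above can be illustrated with a minimal sketch. It treats each input pattern as a set of cells and the target gene's expression domain as their intersection. The cell labels and factor names are hypothetical and are not taken from the network.

```python
# Illustrative sketch only: cell labels and factor names are hypothetical.

def and_gate_expression(domain_a, domain_b):
    """Cells where a target gene under "and" logic is expressed.

    Each argument is the set of cells in which one input
    transcription factor is present. The target is active only
    where both inputs are present, i.e. in the overlap.
    """
    return domain_a & domain_b


factor_a_domain = {"cell1", "cell2", "cell3"}   # hypothetical input pattern A
factor_b_domain = {"cell3", "cell4"}            # hypothetical input pattern B

target_domain = and_gate_expression(factor_a_domain, factor_b_domain)
print(sorted(target_domain))
```

The output domain is narrower than either input domain, which is the sense in which the cis-regulatory element adds spatial information.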

The most closely examined example of a cis-regulatory information processing system is that which controls developmental expression of the endo16 gene of the sea urchin embryo. Endo16 encodes a large polyfunctional protein that is secreted into the lumen of the embryonic and larval midgut. Endo16 is expressed in the early embryo in the progenitors of the endomesoderm, then throughout the gut, and finally only in the midgut (2–4), a not very elaborate spatial sequence. But its control system turns out to be an elegantly organized and complex information processing device that responds to both positive and negative inputs to set the boundaries of expression. Early and late expression phases are controlled by two different subregions of the regulatory sequence, or modules, each several hundred base pairs long. Together these are serviced by nine different DNA sequence–specific transcription factors. The functional role(s) of each interaction were determined (5, 6), and a computational model was derived to describe how this system responds to its time-varying regulatory inputs and to mutations and combinations of its target sites. The functions that the endo16 regulatory system performs are conditional on the inputs, and they include linear amplification of these inputs, but also many nonlinear operations such as an intermodule switch that transfers control from the early to the late module, detection of input thresholds, and various logic operations (5, 6). The model affords precise predictions of the responses of this cis-regulatory system under all conditions.

Uses of a First-Stage Regulatory Network Model

A complete cis-regulatory network model would portray both the overall intergenic architecture of the network and the information processing functions of each node, at the level achieved for the endo16 cis-regulatory system. The complete model could then handle the kinetic flow of regulatory inputs around the whole system. Because of the nonlinear processing functions at each node, inputs into the network are unlikely to be propagated through it in a linear fashion. But the primary necessity is to discover the logic map of the intergenic regulatory interactions, and to represent this map as a first-stage regulatory network model. Its function is just to define precisely those inputs and outputs to each cis-regulatory element that derive from other genes in the network. We have derived such a model for endomesoderm specification in the sea urchin embryo. Although in absolute terms there is an uncomfortably large number of genes in the endomesoderm network (almost 50 at present), they are only a tiny fraction of the total being expressed in the embryo, which is estimated at about 8500 (1).

There are two ways to consider such network models, which are roughly equivalent to the functional genomics point of view and the developmental biology point of view (7, 8). In what we term the “view from the genome,” all relevant inputs into each cis-regulatory element that occur in all cells at all times in the developmental process are shown at once. This gives the genetically determined architecture of the network and predicts the target site sequences that should be functional in the genomic cis-regulatory DNA. The second, the “view from the nucleus,” highlights only those interactions occurring in given nuclei in the particular time frame of that view. It explains why given genes are or are not being expressed at given times and in given cells.
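
The relation between the two views can be sketched as one data structure queried in two ways. This is a hedged illustration, not the representation used by the authors: the gene names, domains, and time windows below are hypothetical placeholders.

```python
from typing import NamedTuple


class Linkage(NamedTuple):
    source: str      # regulatory gene supplying the input
    target: str      # gene whose cis-regulatory element receives it
    sign: int        # +1 activation, -1 repression
    domain: str      # spatial domain where the interaction occurs
    start_h: float   # hours after fertilization (hypothetical values)
    end_h: float


# Hypothetical linkages, for illustration only.
LINKAGES = [
    Linkage("geneA", "geneB", +1, "endoderm", 6, 24),
    Linkage("geneA", "geneC", +1, "mesoderm", 12, 24),
    Linkage("geneB", "geneC", -1, "endoderm", 15, 24),
]


def view_from_genome(linkages):
    """All inputs into each target, over all cells and all times."""
    inputs = {}
    for link in linkages:
        inputs.setdefault(link.target, set()).add((link.source, link.sign))
    return inputs


def view_from_nucleus(linkages, domain, time_h):
    """Only the interactions operating in one domain at one time."""
    return [
        link for link in linkages
        if link.domain == domain and link.start_h <= time_h <= link.end_h
    ]


print(view_from_genome(LINKAGES))
print(view_from_nucleus(LINKAGES, "endoderm", 10))
```

The first query collapses space and time and so predicts the full set of target sites in each cis-regulatory element; the second is a filter on the same data.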

Endomesoderm Specification in the Sea Urchin Embryo

The biology of the sea urchin embryo offers natural advantages for a regulatory network analysis of development. Not many regulatory steps separate the initial zygotic gene expressions that first distinguish a given patch of embryonic cells from the activation of terminal differentiation genes in the progeny of these cells (1, 9, 10). Furthermore, the sea urchin embryo gives rise only to a very simply constructed larva that consists of single-cell-thick structures and only 10 to 12 cell types (10), rather than to a morphologically complex juvenile version of the adult body plan, as in the development of insects and vertebrates.

Not only is the molecular and developmental biology of the sea urchin embryo well known (1,10–12), but dozens of developmentally regulated genes have been cloned, the overall embryonic expression patterns are well described, and the genome has been at least somewhat characterized (13–15). A large collection of arrayed cDNA and bacterial artificial chromosome (BAC) libraries is available (13). Most important for present purposes, the sea urchin embryo provides a high-throughput test bed for cis-regulatory analysis by gene transfer (6, 16–18).

The endomesoderm of the sea urchin embryo forms from cell lineages at the south pole (the “vegetal” pole) of the early embryo (Fig. 1). The endomesodermal constituents of the embryo ultimately consist of the skeletogenic mesenchyme, which arises from the micromere lineage; several other mesodermal cell types; and the gut endoderm. Most of the gut endoderm and all but the skeletogenic mesodermal cell types derive from the progeny of a ring of eight sixth cleavage cells, called “veg2”; the remainder of the gut endoderm derives from their eight sister cells, “veg1”, which also give rise to some ectoderm. What happens in the specification of the lineages is now reasonably well understood as a result of a long series of experimental studies to which many different labs have contributed [see the compressed summary of major steps in Table 1, and see (10) and (19) for reviews]. The specification of the micromere lineages occurs as soon as these cells are formed at fourth cleavage, because if isolated then and cultured, their progeny will express skeletogenic functions just as they do in their natural situation (10). Their specification depends initially on localized maternal cues.

Figure 1

Schematic diagrams of S. purpuratus embryos displaying specified territories (10). Drawings were traced off differential interference contrast images of embryos. The color coding shows the disposition of endomesoderm components and also refers to the network diagrams that follow: lavender, skeletogenic lineage; darker purple, the small micromere precursors of adult mesoderm; light green, endomesodermal veg2 lineage that later gives rise to endoderm, yellow, and to mesoderm, light blue. Light gray indicates oral ectoderm; darker gray indicates aboral ectoderm; white indicates regions yet to be specified at the stages shown. Ten-hour (10 h) embryo: a median optical section of an early blastula, at about seventh cleavage. 15 h blastula: a similar view, at about ninth cleavage. There is now a single cell-deep ring of mesodermal precursors directly abutting the skeletogenic micromere lineage. 24 h mesenchyme blastula-stage embryo: specification of veg2 endoderm and of mesodermal cell types completed. 55 h late gastrula stage embryo (about 800 cells): The drawing shows the later disposition of all the endomesodermal cell types about midway through embryonic morphogenesis.

Table 1

Phenomenological aspects of endomesoderm specification in sea urchin embryos: developmental process (55).

Specification of the veg2 lineage in endomesodermal progenitor cells begins immediately as well. There are two inputs required: one a signal passed from the micromeres to the immediate ancestors of the veg2 ring, at fourth to sixth cleavage (20, 21), and the other the nuclearization of β-catenin (that is, its accumulation in the nuclei of all prospective endomesodermal cells) (22). β-catenin is a cofactor of the Tcf transcription factor, and its initial nuclearization is autonomous rather than signal dependent. However, the endomesodermal cells soon activate a gene encoding the signaling ligand Wnt8 (23), which, when bound by the adjacent cells, stimulates a signal transduction pathway that results in further nuclearization of β-catenin/Tcf. Endomesodermal functions downstream of the Tcf transcription input are thereby reinforced by an intra-endomesodermal signaling loop (19).

At seventh through ninth cleavage, the descendants of the micromeres, now located in the center of the disc of veg2 cells (Fig. 1, 10-hour embryo), emit the ligand Delta (24, 25), which activates the Notch (N) signal transduction system in the adjacent veg2 cells and is required to specify them as mesoderm [Fig. 1, 15-hour embryo (26–28)]. If we now imagine the specification map from the bottom rather than from the side as in Fig. 1, the pattern of cell fates (and by now of gene expression) would display a concentric arrangement (10): In the center are the “small micromeres,” the fifth-cleavage sister lineage of the skeletogenic micromeres; surrounding them are the skeletogenic precursors; the veg2 mesoderm precursors; and finally the veg2 endoderm precursors. The embryo is still an indifferent-looking hollow ball of cells, but the specification map is well on its way to completion. At 20 to 24 hours, the skeletogenic cells move inside the blastocoel (Fig. 1, 24-hour embryo), leaving behind a now fully specified central disc of prospective mesodermal cell types, and peripheral to them, the endoderm precursors. After this, a late Wnt8 signal from the veg2 endoderm causes the adjacent veg1 progeny to become specified as endoderm as well, and gastrular invagination ensues. The problem that we set ourselves was to discover the network of regulatory interactions underlying the events of endomesoderm specification during the first 24 hours, by which point some mesodermal and endodermal differentiation genes are already being expressed in a cell type–specific manner.

Analyzing the Network

The cis-regulatory network for endomesoderm specification that we show in the following was derived in part from a large-scale perturbation analysis in which the expression of many different regulatory genes and the operation of several signaling processes were altered experimentally. The effects on many other genes were then measured with quantitative real-time fluorescence polymerase chain reaction [QPCR (29)] (see Fig. 2 for the kinds of perturbations applied and illustration of their effects). For an input to be considered significant, the effect of the perturbation had to be greater than threefold with respect to the control; that is, the level of the target gene transcript must be <30% or >300% of normal as a result of the perturbation. Numerical QPCR data (updated as additional measurements are made) are available online (30).
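
The significance criterion stated above can be written as a short sketch. It assumes that transcript levels have already been converted to relative abundances (the conversion from QPCR cycle numbers is not shown), and the function names are hypothetical. The second function applies the usual reading of a loss-of-function perturbation: a significant decrease suggests an activating input, a significant increase a repressing one.

```python
def classify_effect(perturbed_level, control_level, lower=0.30, upper=3.0):
    """Classify a perturbation effect on one target gene.

    Levels are relative transcript abundances (assumed already
    derived from QPCR data). The defaults follow the criterion in
    the text: <30% or >300% of the control level is significant.
    """
    if control_level <= 0 or perturbed_level < 0:
        raise ValueError("levels must be non-negative and control must be > 0")
    ratio = perturbed_level / control_level
    if ratio < lower:
        return "decreased"
    if ratio > upper:
        return "increased"
    return "not significant"


def inferred_input(effect_of_loss_of_function):
    """Interpret the effect of removing a regulator's function."""
    return {
        "decreased": "activator",      # target falls when the regulator is lost
        "increased": "repressor",      # target rises when the regulator is lost
        "not significant": None,
    }[effect_of_loss_of_function]


print(classify_effect(0.2, 1.0), inferred_input(classify_effect(0.2, 1.0)))
```

Note that the inferred input may still be indirect; the sketch says nothing about whether the regulator binds the target's cis-regulatory element.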

Figure 2

Perturbations and functional knockouts used in the network analysis. (A) Effect of a MASO, from (25). Eggs giving rise to control embryos were injected with an mRNA encoding a fusion between the 5′ leader plus the initial part of the coding sequence of a gene encoding the Pmar1 transcription factor (25), fused to the GFP coding sequence. The control eggs also contained an irrelevant morpholino oligonucleotide. Lateral views of control embryos are shown. The top left panel displays normal embryonic morphology at 24 hours (compare Fig. 1), and the fluorescence display, top right, shows that all cells in the embryo express GFP. Eggs giving rise to the embryos in the two bottom panels were injected with the same GFP fusion plus a MASO targeted to the leader sequence of the pmar1 mRNA. The abnormality of the morphological phenotype that results is not yet evident (left panel, viewed from the vegetal pole), but it can be seen that GFP expression is totally abolished (right panel): The gain in this image is about 100 times that in the top right panel, so that the outline of the embryo can be seen. At the same gain as the control, the image is black. (B) Effect of the introduction of a form of Krox1 that acts as an obligate repressor of its target genes. The morphology of the control embryo is shown at 72 hours, oral side down, as well as that of an embryo of the same age expressing an injected mRNA that encodes a fusion between the DNA binding domain of the Krox1 transcription factor (63) and the Drosophila Engrailed repressor domain (64). Gut formation has not occurred, other severe abnormalities affect the ectoderm and skeleton formation, and there are excess pigment cells as well as other mesodermal cell types. (C) Effect of blocking β-catenin nuclearization. 
A 48-hour control embryo is shown laterally, with the oral side on the left; and an embryo of the same age expressing an injected mRNA that encodes the intracellular domain of cadherin is shown on the right (image from A. Ransick). The cadherin embryo consists of a hollow ball of ectoderm; endomesodermal specification has been completely wiped out. (D) Effect of the introduction of a negatively acting derivative of the N receptor. A control 37-hour late gastrula is shown on the left, and on the right is an embryo of the same age expressing an injected mRNA encoding the extracellular domain of the N receptor (negN) (image from C. Calestani). This embryo has a normal complement of skeletogenic mesenchyme cells and a well-formed gut but only a very few mesodermal cells of veg2 origin as compared with the control.

Most of the network linkages discovered in this study were based on perturbations that remove functions (19), such as morpholino-substituted antisense oligonucleotides (Fig. 2A), or blockade of all endomesoderm specification (Fig. 2C), or blockade of mesoderm specification (Fig. 2D). One mRNA encoding a transcription factor and mRNAs encoding four different Engrailed domain fusions to transcription factors were used as well (31, 32). These mRNAs were all introduced into the egg in amounts that would produce levels within an order of magnitude of the natural mRNA concentrations per cell, sometimes within a few fold of these concentrations (in reality less because of continuing decay of the exogenous mRNA).

In itself, perturbation analysis cannot distinguish between direct and indirect effects: Blockade of the expression of a gene that encodes a transcriptional activator may decrease expression of both immediately and secondarily downstream target genes; and if it encodes a repressor, blockade of its expression may increase expression of both. Direct effects are those in which a perturbation in the expression or function of a transcription factor causes changes in the expression of another gene, because target sites for that factor are included in a cis-regulatory element of the gene. cis-Regulatory analysis can therefore be used to resolve whether effects on a given control element are indeed direct. Another approach that we have used at several key nodes of the network is the attempted rescue of a perturbation effect by introduction of appropriate amounts of mRNA encoding a different factor, which might be mediating an indirect effect of the perturbation (33). Where a rescue experiment indicates an indirect effect, or where the effect must be indirect because the affected and the perturbed genes are expressed in different cells or at different times, the implied relationships are omitted from the network models. This is because only direct effects imply specific genomic target site sequences in the cis-regulatory systems of the affected genes, and an object of the network model is to make explicit a testable map of cis-regulatory interrelations.
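
The exclusion rule in this paragraph can be sketched as a filter over perturbation effects. The record fields and example values are hypothetical; the sketch only encodes the stated criteria (shared cells, overlapping time, and no rescue through an intermediate factor) and cannot replace cis-regulatory analysis.

```python
def is_candidate_direct(effect):
    """Decide whether a perturbation effect stays in the network model.

    `effect` is a dict with hypothetical fields:
      regulator_domains, target_domains : sets of spatial domains
      regulator_hours, target_hours     : (start, end) expression windows
      rescued_by_intermediate           : True if supplying another factor's
                                          mRNA rescued the effect
    """
    shares_cells = bool(effect["regulator_domains"] & effect["target_domains"])
    r_start, r_end = effect["regulator_hours"]
    t_start, t_end = effect["target_hours"]
    overlaps_in_time = r_start <= t_end and t_start <= r_end
    return shares_cells and overlaps_in_time and not effect["rescued_by_intermediate"]


# Hypothetical example records.
same_cells = {
    "regulator_domains": {"endoderm"}, "target_domains": {"endoderm", "mesoderm"},
    "regulator_hours": (6, 24), "target_hours": (12, 24),
    "rescued_by_intermediate": False,
}
different_cells = dict(same_cells, target_domains={"ectoderm"})
rescued = dict(same_cells, rescued_by_intermediate=True)

print(is_candidate_direct(same_cells), is_candidate_direct(different_cells))
```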

In an iterative process, the inferences from the experimental perturbation results were checked against the network model, further experiments were designed, the model was altered according to their results if necessary, and so forth. The model was constructed with the program Netbuilder (34), a new tool for the construction of computational models that allows simulations to be performed, so as to test whether its relationships generate the appropriate outputs. But from the start, the model had to conform to the facts from experimental embryology (Table 1).

A major gene discovery effort was undertaken in order to clothe with real genes the armature of interactions implied by the embryology, and to add to the collection of genes already known to be involved in endomesoderm specification. Several screens were carried out (Table 2) in which endomesoderm specification was perturbed so as to generate material for use with a very sensitive subtractive hybridization technology designed for use with large-scale arrays of ∼10⁵ clone cDNA libraries (macroarrays) (35). The purpose was to create probes in which sequences differentially expressed in the endomesoderm are greatly enriched (by 20- to 30-fold, which affords the possibility of isolating very rare transcripts). The probes were used for high-C₀t (concentration × time) hybridization to the macroarrays, and the results were digitized and analyzed with a new image analysis program, BioArray, which was designed for analysis of differential macroarray screens (34). New regulatory genes were recovered, as well as genes encoding differentiation proteins of the endoderm and mesoderm (19, 36–39). Most of the transcriptional regulatory genes that are specifically involved in endomesoderm specification up to 24 hours are probably now known (36). On the other hand, only a small sample of endomesodermal differentiation genes have so far been recovered, because most of the screens were directed at the earlier stages of the specification process (Table 2).

Table 2

Differential gene discovery screens. Macroarray filter screens were carried out with probes prepared by high-C₀t subtractive hybridization, using single-stranded driver and selectate, as described (35). “Selectate” denotes the cDNA preparation that contains the sequences of interest, in contrast to the “driver,” the nucleic acid present in excess in the hybridization reaction, which lacks these sequences. In the subtractive hybridizations, the reactions were carried out to near termination with respect to driver, and nonhybridized selectate sequences were recovered by hydroxyapatite chromatography (35).

Direct cis-regulatory analysis is essential to test the predicted network linkages, but the task of finding these elements on the scale of the network required an approach different from the traditional methods, which boil down to searching experimentally over all the genomic DNA surrounding a gene of interest [the average intergenic distance in Strongylocentrotus purpuratus is about 30 kb (13)]. To solve this problem, we turned to computational interspecific sequence analysis. BAC recombinants containing the genes of interest in a more or less central position were recovered from two sea urchin species. These were S. purpuratus, on which all the experiments were carried out, and Lytechinus variegatus, which develops in a very similar manner. The last common ancestor of these species lived about 50 million years ago (40, 41). The sequences of BACs representing most of the genes in the network at present were obtained and annotated (19). A new program, FamilyRelations, was built for the purpose of recognizing short patches of conserved sequence in long stretches of genomic DNA (34). Applied to the Strongylocentrotus-Lytechinus species pair, this approach efficiently served to identify cis-regulatory elements that score positively in gene transfer tests (42).
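
The idea of locating short conserved patches between two long sequences can be illustrated with a brute-force windowed comparison. This is a minimal sketch under stated assumptions and is not the FamilyRelations algorithm: the window size, identity threshold, function names, and toy sequences are all hypothetical, and the quadratic search would be far too slow for BAC-length sequences.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def conserved_windows(seq_a, seq_b, window=20, min_identity=0.9):
    """Return (i, j) start positions of window pairs that are conserved.

    Every window of seq_a is compared with every window of seq_b,
    without gaps; pairs at or above min_identity are reported.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            if identity(win_a, seq_b[j:j + window]) >= min_identity:
                hits.append((i, j))
    return hits


# Toy sequences sharing one 8-base patch; not real genomic data.
shared = "GATAAGCA"
toy_a = "CCCCC" + shared + "TTTTT"
toy_b = "AGAGAGAG" + shared + "CTCTCT"

hits = conserved_windows(toy_a, toy_b, window=8, min_identity=1.0)
print(hits)
```

Patches found this way are only candidates; as the text notes, their regulatory function has to be confirmed by gene transfer tests.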

In summary, three software packages were developed and used for this project: Netbuilder, FamilyRelations, and BioArray (34). These programs are all available online; for access, go to http://sea-urchin.caltech.edu/software.

Provisional Endomesoderm cis-Regulatory Network: The View from the Genome

The overall network (Fig. 3) combines all significant perturbation data (19, 30); information on time and place of gene expression, as determined by whole mount in situ hybridization (WMISH) and QPCR measurements (19); computational and experimental cis-regulatory data where available; the results of rescue experiments; and all the underlying information from experimental embryology. The outputs from each gene in the diagram are color-coded: for instance, that from the gatae gene (GenBank accession number, AF077675), shown in dark green, provides inputs to the lim, otxβ, foxa, foxb, not, bra, elk, pks, and nrl genes. These particular relations were derived from studies (19, 43) of the effects of an α-gatae morpholino antisense oligonucleotide (MASO). Of course many other genes were entirely unaffected by this MASO treatment (30).

Figure 3

Regulatory gene network for endomesoderm specification: the view from the genome. The current version of the model in this figure and the perturbation data on which it is based are available on a Web site (30); for additional details and discussion, see (19). At the top, above the triple line, are the earliest interactions; in the middle tier, the spatial domains are color-coded (Fig. 1), and genes are placed therein according to their final loci of expression. As indicated (black background labels), the lavender area at the left represents the skeletogenic micromere (mic) domain before ingression; the light green area indicates the veg2 endomesoderm domain, with genes eventually expressed in endoderm on yellow backgrounds and genes eventually expressed in mesoderm on blue backgrounds; the tan box at right represents the veg1 endoderm domain. Many genes are initially expressed over broader ranges, and their expression later resolves to the definitive domains. The rectangles in the lower tier of the diagram show downstream differentiation genes (PMC, “primary” or skeletogenic mesenchyme). Short horizontal lines from which bent arrows extend represent cis-regulatory elements responsible for expression of the genes named beneath the line. Embryonic gene expression was perturbed in specific ways as in Fig. 2. The arrows and barred lines indicate the inferred normal function of the input (activation or repression), as deduced from changes in transcript levels due to the perturbations. Each input arrow constitutes a prediction of specific transcription factor target site sequence(s) in the cis-regulatory control element. In some cases, the predicted target sites have been identified in experimentally defined cis-regulatory elements that generate the correct spatial pattern of expression (solid triangles). At the upper left, the light blue arrow represents the maternal β-catenin (cβ) nuclearization system (χ). 
This transcriptional system (nβ−TCF) is soon accelerated and then taken over by zygotic Wnt8 (dark blue lines); its initial activation, of mixed zygotic and maternal origins, is shown in light blue. Data for the roles of SoxB1 and Krüppel-like (Krl) are from (50, 51). Data for the role of Ets are from (52, 65). “Micr/Nuc Mat Otx” refers to the early localization of maternal Otx in micromere nuclei at fourth cleavage (56). Genes labeled “Repressor” are inferred; all other genes shown are being studied at the DNA sequence level and by multiplexed QPCR. “Ub” indicates a ubiquitously active positive input inferred on the basis of ubiquitous expression seen by whole-mount in situ hybridization, under conditions in which a spatial repression system that normally confines expression has been disarmed. Dotted lines in the diagram indicate inferred but indirect relationships. Arrows inserted in arrow tails indicate intercellular signaling interactions. Small open or closed circles indicate perturbation effects that resist rescue by the introduction of mRNA where there is a possibility that the effect seen is actually an indirect result of an upstream interaction; that is, this possibility of such an indirect effect has been experimentally excluded, and both sites are shown as probable direct inputs (19). Large open ovals represent cytoplasmic biochemical interactions at the protein level, such as those responsible for nuclearization of β-catenin, for the effect of Delta on N (66); or for the effect of Neuralized, an E3 ubiquitin ligase with specificity for Delta (67, 68).

The early cleavage stage events in endomesoderm specification take place in the veg2 endomesoderm lineage, indicated in light green above the triple line at the top, and in the micromere lineage shown in lavender at the left. The central light green endomesodermal domain of the diagram in Fig. 3 portrays genes that ultimately (that is, by 24 hours) function in either endoderm or mesoderm; however, many of these genes are initially expressed throughout the veg2 domain. At the bottom, in three boxes, are shown several differentiation genes: skeletogenic genes on the left, mesodermal genes (mainly pigment cell genes) in the center, and endodermal genes on the right. So the first take-home lesson of the diagram in Fig. 3 is that, except for these differentiation genes, almost every gene in the network encodes a DNA sequence–specific transcription factor, and that most of the linkages in the network consist of cis-regulatory interactions amongst these genes. There are also three genes encoding signaling ligands: the wnt8 gene, the delta gene, and the unknown gene responsible for the micromere-to-veg2 signal (M→V2L). But on the network scale, it is plain to see that most of the regulatory work of specification is done by the cis-regulatory elements of genes encoding transcription factors. This is a general fact of life that should be true for all major developmental programs (1).

The model provides explanations of specific developmental processes. One example is spatial control by negative transcriptional interactions, illustrated here by the functions of the foxa gene. The foxa gene is expressed in the endoderm, as gastrulation proceeds, primarily in the foregut and midgut. Perturbation experiments with α-foxa MASO resulted in a sharp increase in target gene transcript levels (30), implying that foxa encodes a repressor (black barred lines emanating from this gene in Fig. 3). Two target genes are foxb and bra: foxb is expressed in the hindgut and blastopore (19, 44) and bra in the blastopore (37, 45). The network diagram suggests that this repression by foxa serves to restrict the expression of these genes spatially. Hence, an experiment was carried out in which a reporter gene controlled by a cis-regulatory element of bra was introduced into embryos bearing an α-foxa MASO. The result was that expression now spread forward into the anterior gut (46). Comparative observations have also been made on the embryo of a starfish, a distantly related echinoderm. Here too, foxa is used in endomesoderm specification as a repressor, servicing the same target genes as in the S. purpuratus network (47). So the network provides an explanation of why those target genes are expressed where they are: partly as a result of spatial transcriptional repression. In addition, the network implies a temporal aspect of foxa expression. The foxa gene is seen to repress itself as well; combined with the continuing positive inputs (from GataE and other factors), the result should in principle be an oscillation. And indeed, QPCR measurements of foxa mRNA show that its level rises, falls, and then rises again late in gastrulation (48).
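
Why self-repression combined with a continuing positive input can oscillate is easiest to see in a toy Boolean model with a time delay. The sketch below is an illustration under strong assumptions and is not a model of foxa kinetics: the gene is either on or off, the activating input is constant, and the repressor acts after a fixed, hypothetical delay. Whether a real gene oscillates depends on delays, decay rates, and nonlinearity that this sketch ignores.

```python
def simulate_self_repressor(steps, delay=2, activator_on=True):
    """Boolean time course of a gene that represses itself.

    At each step the gene is on if the activating input is present
    and the gene was off `delay + 1` steps earlier, i.e. no delayed
    repressor is acting on it. Returns a list of 0/1 states.
    """
    history = [0] * (delay + 1)          # gene is off before the simulation starts
    for _ in range(steps):
        repressed = history[-(delay + 1)] == 1
        history.append(1 if (activator_on and not repressed) else 0)
    return history[delay + 1:]


print(simulate_self_repressor(9))
```

In this toy model the state rises, falls, and rises again with a period of 2 × (delay + 1) steps; without the activating input the gene simply stays off.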

The network explains some of the phenotypes observed when given processes are perturbed, in terms of its consequential regulatory logic. For example, as shown in Fig. 2C, if β-catenin nuclearization is prevented by introduction of mRNA encoding the intracellular domain of cadherin, neither endodermal nor mesodermal cell types and structures appear. In default of β-catenin/Tcf inputs, the embryo becomes a hollow ball of ectoderm. Note, however, that all the perturbation data underlying the network in Fig. 3 were obtained between 6 and 24 hours, long before any gastrulation phenotypes can be seen (30). Interference with β-catenin nuclearization produces such a catastrophic result because multiple endodermal and mesodermal regulatory genes depend on a β-catenin/Tcf input. For these genes, only a few percent of control transcript levels survive cadherin mRNA injection (19, 30). Another interesting phenotype is obtained when embryos are treated with α-gcm MASO. The result is albino larvae (49). The gene gcm is ultimately expressed in pigment cells (36), and a downstream target of gcm is the pks (polyketide synthase) gene, which is also expressed in pigment cells (38, 39). This product (and other pigment cell genes under gcm control, not shown) is likely to be required for synthesis of the red quinone pigment these cells produce. Upstream, the network shows gcm to be a target of the N signaling system, because its expression is severely depressed by the introduction of a negatively acting N derivative (19) (Fig. 2D). In fact, gcm expression begins in the single ring of mesoderm progenitor cells that directly receives the Delta micromere signal (36). So we now have a sequence of DNA-based interactions that leads from the initial specification to the terminal differentiation of pigment cells and that explains the albino phenotype. Similarly, the network explains the α-gatae MASO phenotype. This treatment produces a severe interference with endoderm specification and gut development (43), which is no less than would be expected from the branching regulatory effects of gatae expression indicated in the network.
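The way a perturbation propagates along a chain of linkages, as in the albino phenotype, can be sketched as a small Boolean network. This is a hedged illustration, not the network model itself: only the N signal → gcm → pks → pigment chain described above is included, the node names and the steady_state function are hypothetical, and the rule that a gene turns on only when all its listed activators are on is an assumption.

```python
# Activating inputs for each node, taken from the chain described in the text.
# Node names are illustrative labels, not identifiers from the original model.
ACTIVATORS = {
    "gcm": ["N_signal"],
    "pks": ["gcm"],
    "pigment": ["pks"],
}


def steady_state(inputs_on, knocked_down=()):
    """Return the set of nodes that end up on.

    Assumption: a node switches on once all of its activators are on,
    and a knocked-down node (e.g. by MASO) can never switch on.
    """
    blocked = set(knocked_down)
    on = set(inputs_on) - blocked
    changed = True
    while changed:
        changed = False
        for gene, activators in ACTIVATORS.items():
            if gene in on or gene in blocked:
                continue
            if all(a in on for a in activators):
                on.add(gene)
                changed = True
    return on


control = steady_state({"N_signal"})
gcm_maso = steady_state({"N_signal"}, knocked_down={"gcm"})
print("pigment" in control, "pigment" in gcm_maso)
```

In this toy version, removing gcm leaves pks and pigment off even though the N signal is present, which mirrors the logic of the albino phenotype; removing the N signal has the same downstream effect.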

The network explains the role of the signaling interactions required in endomesodermal specification in terms of their inputs into cis-regulatory systems (except for the early micromere-to-veg2 signal, the targets of which remain unknown). The gene encoding Wnt8 is itself a target of a β-catenin/Tcf input and it is, in addition, under the control of the early endomesoderm regulator krox. These inputs show how the autonomous nuclearization of β-catenin soon causes the Wnt8 loop to start up in all endomesoderm cells, strengthening the set of regulatory relationships indicated by the blue lines in Fig. 3.

The view from the genome provides a qualitative DNA-level explanation for the spatial domains of expression of many endomesodermal regulatory genes. No two of these genes have identical inputs: Each cis-regulatory information processing system has its own job to do. The network shows that the downstream targets of a few of these regulatory genes, such as bra (37), include differentiation proteins that were discovered in our differential screens, but for many of the regulatory genes the downstream targets are still unknown.

System-Level Insights into the Developmental Process

Physiological transcriptional responses flicker on after the advent of stimuli, then return to their ground state; for example, after changes in the level of nutrients or the advent of toxins in the bloodstream, or after the appearance of pathogens. In contrast, the fundamental feature of developmental transcriptional systems in higher (bilaterian) animals is that they always move inexorably forward, never reversing direction. This property is clearly evident in the developmental process considered here, and the network provides a concrete mechanistic explanation. To see this, we consider views from the nuclei at successive stages (Figs. 4 and 5).

Figure 4

Initial events in endomesoderm specification. (A) View from veg2 endomesoderm and micromere nuclei, about fourth to seventh cleavage. Maternal inputs are shown in blue boxes (see Fig. 3 for abbreviations) and blue lines, except for the autonomous nuclearization of β-catenin, shown in a hatched blue line. Four early zygotic transcriptional activations are indicated in red: krox, krl, wnt8 in the endomesodermal domain (all of which require the β-catenin/Tcf input), and pmar1 in the micromere (mic) domain, which requires this and a maternal Otx input [suggested by cis-regulatory as well as perturbation evidence (19)]. Directly or indirectly, pmar1 is also required for expression of the ligand conveying the early micromere to veg2 signal (M→V2L). The negatively acting subnetworks discussed in text are shown in green. All other gene expressions and interactions in the network are indicated in gray. (B through G) Whole-mount in situ hybridization displays, from (25). The gene, expression of which is being displayed, is shown at the upper right, and the mRNA injected into the egg at the lower right; the age of the embryo is at lower left. (B) Expression of pmar1 specifically in micromeres. (C) Expression of delta specifically in micromeres. (D) Expression of delta in all embryonic cells when pmar1 mRNA is translated everywhere, after injection into the egg. Exactly the same result is obtained if an Engrailed domain fusion is instead expressed (25); because the Engrailed fusion acts as an obligate repressor of pmar1 target genes, pmar1 must normally act as a repressor. (E) Expression of sm50, a skeletogenic differentiation gene, exclusively in skeletogenic mesenchyme cells (69). (F) Global expression of sm50 in embryos expressing pmar1 globally. (G) Expression of the skeletogenic regulator tbr in embryos expressing pmar1 mRNA globally. (F) and (G) show that the whole embryo has been converted to a state of skeletogenic mesenchyme differentiation. Note the rounded form of the cells at 24 hours in (F), as compared to the control in (E), due to their tendency to behave mesenchymally.

Figure 5

Lock-down functions and expression of the complete regulatory state. (A) Institution of regulatory lock-down devices, shown in color. This view from the endomesoderm nuclei extends from about sixth cleavage to midblastula stage (Fig. 1). The features illuminated are the zygotic Wnt8/Tcf loop (hatched blue), and zygotic auto- and cross-regulations (red), as discussed in text. The N signal transduction input into the gcm gene is shown in hatched orange. (B) Complete activation of the endomesodermal regulatory system: the view from the nuclei from midblastula to after mesenchyme blastula (Fig. 1). By this point, both endoderm and mesoderm specifications have become final, and all genes shown are being expressed. All can be accounted for in terms of the set of inputs included in the color key at the bottom. Except for the Delta and Wnt8 signal-mediated inputs, which are transient, these regulatory inputs have by now achieved stabilization by the interactions shown in (A).

The initial events in endomesoderm specification occur in the micromeres and in the veg2 lineage, as summarized above. The maternal inputs provide the initial state, with respect to regulatory transactions. There are two consequences of the initial zygotic transcriptional responses (Fig. 4A, shown in red). The first is to begin the activation of the endomesodermal zygotic control apparatus; here, by turning on the krox (35) and krl [krüppel-like (50)] genes in the veg2 endomesoderm and the pmar1 gene in the micromeres. The second is a surprise: An immediate sequel, in both domains, is to engage repressive subnetworks (shown in green) of interactions that have the effect of stabilizing the initial definition of the endomesodermal and micromere territories by cutting off the possibility of similar transcriptional activations elsewhere. The krl gene encodes a repressor that prevents expression of soxb1 in the endomesoderm, though it is expressed everywhere else (50, 51). The SoxB1 protein antagonizes nuclearization of β-catenin. The krl/soxb1 loop is an early lock-down device to keep the endomesodermal cells endomesodermal (because they have elevated nuclear β-catenin from the start) and to prevent other cells from going the same way. The pmar1 gene active in the micromeres also encodes a repressor. Its target is an unknown gene that produces another repressor of key regulators of micromere-specific function. Like soxb1, it too is potentially active everywhere, except where it itself is repressed, which is the role accomplished by pmar1 in the micromeres. Micromere regulators that are micromere-specific only because of the pmar1 repression system include the gene that produces the Delta signal to the surrounding veg2 cells and the regulatory genes that are responsible for installing the skeletogenic state of differentiation in the micromere progeny [the t-brain (tbr) gene, the ets gene, and the deadringer (dri) gene (19, 25, 52)]. Some evidence for the pmar1 repression system is reproduced in Fig. 4, B through G. Expression of the delta gene, the tbr skeletogenic control gene, and sm50, a skeletogenic differentiation gene, all occur globally if pmar1 mRNA is expressed globally (25) (Fig. 4). Almost the first thing accomplished by zygotic genes activated in both the veg2 endomesoderm and the micromeres is to activate local negative control of otherwise global repressors of the respective states of specification. The network reveals active repression of these endomesodermal regulatory states in all the cells of the embryo, except those where krl and pmar1 are respectively activated.
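The double-negative logic of the pmar1 system (a repressor of a repressor) can be written out as a minimal Boolean sketch. The function names and cell labels below are hypothetical, and the two rules are simplifying assumptions: the unknown repressor is taken to be on wherever pmar1 is off, and the micromere targets (delta, tbr, ets, dri) are taken to be on wherever that repressor is off.

```python
def micromere_targets_on(pmar1_on):
    """Double-negative gate for one cell (illustrative assumption)."""
    # Assumption: the unknown repressor is active wherever pmar1 is absent.
    repressor_on = not pmar1_on
    # Assumption: targets such as delta and tbr are active wherever the
    # repressor is absent.
    return not repressor_on


def embryo_pattern(pmar1_by_cell):
    """Map each cell label to whether the micromere targets are expressed."""
    return {cell: micromere_targets_on(on) for cell, on in pmar1_by_cell.items()}


# Normal embryo: pmar1 only in micromeres, so targets are micromere-specific.
normal = embryo_pattern({"micromere": True, "veg2": False, "ectoderm": False})
# Global pmar1 mRNA: the repressor is silenced everywhere, so targets are global.
global_pmar1 = embryo_pattern({"micromere": True, "veg2": True, "ectoderm": True})
print(normal)
print(global_pmar1)
```

Under these assumptions the sketch reproduces the qualitative result in Fig. 4, where delta, tbr, and sm50 are expressed globally when pmar1 mRNA is expressed globally.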

The system next proceeds to stabilize positively, and to expand, the endomesodermal regulatory state (Fig. 5A, red interactions). The result is essentially to lock the process into forward drive: “commitment,” here seen to be hardwired into the regulatory circuitry. The Wnt8/Tcf loop discussed above is a piece of this process, which consists mainly of positive cis-regulatory feedbacks; that is, auto- and cross-regulations. In the future mesodermal domain, the gcm gene autoregulates after its initial activation through the N pathway (49). Similarly, the krox gene positively autoregulates, in addition to stimulating expression of the wnt8 gene, which locks wnt8 and krox in a positive regulatory embrace. The krox gene product also activates one of the transcription units of the otx gene (19, 30, 42). In turn, Otx stimulates the krox gene. The otx gene now provides an input into the gatae gene, the importance of which was discussed above; but note that the β-otx cis-regulatory system in turn responds positively to GataE input (30, 43). This is a further positive feedback that links the gatae gene, a dedicated endomesodermal activator, into the stabilization circuitry. As illustrated by the color coding in Fig. 5B, the regulatory state illustrated in Fig. 5A suffices to provide inputs to every one of the known transcriptional regulatory genes in the endomesodermal domain. The drivers are Krox, Otx, GataE, Tcf, and whatever Enhancer of Split-like factor operates in this embryo downstream of N signal transduction. After this, the expression of the wnt8 gene falls off [probably the gene is repressed by one of the Otx isoforms (19, 30, 42, 53)]; and during the late blastula stage, β-catenin disappears from the veg2 endomesoderm nuclei (22). By now, the regulatory system is locked in and has no further need of this input, which was so important in the initial phases of the specification process.
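The lock-down idea, that positive auto- and cross-regulation keeps the state on after the initiating input is gone, can be illustrated with a small synchronous Boolean simulation. This is a sketch under stated assumptions, not the published model: only the krox, otx, and gatae linkages named above are included, each gene is assumed to turn on if any one of its activating inputs is on (OR logic), and the functions step and run are hypothetical names.

```python
def step(state, tcf_input):
    """One synchronous update of the toy krox/otx/gatae circuit.

    Linkages follow the text: krox autoregulates and is stimulated by Otx;
    krox and GataE activate otx; otx activates gatae. OR logic is assumed.
    """
    krox, otx, gatae = state["krox"], state["otx"], state["gatae"]
    return {
        "krox": tcf_input or krox or otx,  # transient beta-catenin/Tcf input
        "otx": krox or gatae,
        "gatae": otx,
    }


def run(pulse_steps, total_steps):
    """Apply the Tcf input for the first `pulse_steps` steps, then remove it."""
    state = {"krox": False, "otx": False, "gatae": False}
    for t in range(total_steps):
        state = step(state, tcf_input=(t < pulse_steps))
    return state


# A transient input is enough: the loop stays on long after the input ends.
print(run(pulse_steps=3, total_steps=10))
# Without any input the circuit never starts.
print(run(pulse_steps=0, total_steps=10))
```

In this toy circuit the state persists once established, which is the sense in which the feedbacks make the system independent of the early β-catenin/Tcf input.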

Here we can see how an active cis-regulatory network produces the developmental phenomenon of progressivity. Later, epigenetic processes such as changes in chromatin structure, methylation, etc., may contribute to further stabilization of the differentiated state. But the processes highlighted in Figs. 4 and 5 are sufficient to explain the progression from the initial maternal inputs, to early zygotic responses and stabilization of the state of specification, and thence to the full-fledged program of regulatory gene expression.


Developmental regulatory network analysis can be done in any organism where the necessary genomics, a high-throughput method of gene transfer, and the ancillary molecular methods are available. But it requires a new mix of technologies and a new level of close interactions between system-minded biologists and computational scientists. It seems no more possible to understand development from an informational point of view without unraveling the underlying regulatory networks than to understand where protein sequence comes from without knowing about the triplet code. To understand the operation of whole systems of regulatory interactions, computational models are essential: for organizing experimental extensions and tests at each stage of construction of the model, to check on consistency, and to integrate experimental results with the current network architecture by means of simulation. The cis-regulatory systems at the nodes of the network in reality each process kinetic input information: the rise and fall of the activities of the transcription factors to which they respond. But even from the first-stage model, which just states the interactions that occur at each node, there emerge system properties that can only be perceived at the network level. Examples are the features of the system treated in Figs. 4 and 5. These features explain the means by which maternal spatial cues are used to activate the zygotic transcriptional network, the progressivity of the developmental process, and its lock-down mechanisms. The network model relates these and other developmental features of the process of endomesoderm specification (19) directly to the genome, because it is couched in terms of cis-regulatory interactions at the DNA level. The model thus represents an outline of the heritable developmental program, but the program is not the machine. The DNA regulatory network coexists with many other multicomponent systems that constitute the machine. 
These systems execute biochemical functions, produce signal transduction pathways, and cause cell biological changes to occur. They sum to the majority of the working parts of the cell. Their mobilization is controlled by the transcriptional switches that hook them into the genomic regulatory control system.

The development of complex body plans is a definitive property of the Bilateria, and encoding the developmental process is a major regulatory function of the genome. It has been clear for a long time that the evolution of body plans has occurred by change in the genomic programs for the development of these body plans (54), and it is now clear that we need to consider this in terms of change in regulatory networks. The bilaterians all have more or less the same genetic toolkit, and in particular rely on essentially the same repertoire of regulatory genes to control the developmental organization of their body plans (1). Network analysis affords the means to focus on the exact consequences of differences in the use of these genes. To solve the questions of body plan evolution will require learning how architectural changes in developmental networks could be added on at each evolutionary stage, while yet preserving the workability of what was there before. It will be necessary to consider regulatory gene networks as evolutionary palimpsests—patterns of regulatory interactions that are successively overlain with new regulatory patterns. In the last analysis, understanding what a given animal is, including us, will mean understanding where each linkage of our developmental networks arose, what other forms share them, which are new, and which are ancient.

  • * To whom correspondence should be addressed. E-mail: davidson{at}caltech.edu

  • Present address: European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

  • Present address: Altera European Technology Centre, Holmers Farm Way, High Wycombe, Buckinghamshire HP12 4XF, UK.

