A High-Resolution Root Spatiotemporal Map Reveals Dominant Expression Patterns

See allHide authors and affiliations

Science  02 Nov 2007:
Vol. 318, Issue 5851, pp. 801-806
DOI: 10.1126/science.1146265


Transcriptional programs that regulate development are exquisitely controlled in space and time. Elucidating these programs that underlie development is essential to understanding the acquisition of cell and tissue identity. We present microarray expression profiles of a high-resolution set of developmental time points within a single Arabidopsis root and a comprehensive map of nearly all root cell types. These cell type–specific transcriptional signatures often predict previously unknown cellular functions. A computational pipeline identified dominant expression patterns that demonstrate transcriptional similarity between disparate cell types. Dominant expression patterns along the root's longitudinal axis do not strictly correlate with previously defined developmental zones, and in many cases, we observed expression fluctuation along this axis. Both robust co-regulation of gene expression and potential phasing of gene expression were identified between individual roots. Methods that combine these profiles demonstrate transcriptionally rich and complex programs that define Arabidopsis root development in both space and time.

The development of multicellular organisms is regulated by transcriptional networks that act to specify cell types and to provide positional information (1). An understanding of the spatial and temporal control of an organ's transcriptional complexity requires detailed knowledge of its transcriptional states at a resolution specific for cell type and developmental stage. Here, we present, at high resolution, the first microarray-based expression map of a single organ and a profile of nearly all Arabidopsis root cell types. With a computational pipeline we developed to analyze this immense data set, we show the output of a complex transcriptional network that underlies root spatiotemporal development, including evidence of fluctuating expression over developmental time and considerable expression variation between individual roots.

The cellular organization of the Arabidopsis root reduces the complexity of analyzing its spatiotemporal development. The root consists of 15 cell types that are organized around its radial axis, many of which display rotational symmetry (Fig. 1 and table S1) (2). To a first approximation, the root's rotational symmetry permits the analysis of cell types to be confined to one dimension, the radial axis. The different cell types arise from the quiescent centre (QC), where initial cells that surround a mitotically less active stem cell niche divide. Cell types are constrained within cell files, so that each new cell division successively displaces an older cell distal to the quiescent centre. Cells undergo division, elongation, and differentiation when they enter the meristematic, elongation, and maturation zones, respectively, along the longitudinal axis (Fig. 1). Because cells are constrained within these files and new cells are born at the root apex, a cell's developmental time line can be tracked along the root's longitudinal axis.

Fig. 1.

Microarray expression profiles of 19 fluorescently sorted GFP-marked lines were analyzed (39, 23, 24). The colors associated with each marker line reflect the developmental stage and cell types sampled. Thirteen transverse sections were sampled along the root's longitudinal axis (red lines) (10). CC, companion cells.

A previous analysis combining fluorescence-activated cell sorting of green fluorescent protein (GFP)–marked cell populations with microarray analysis described expression profiles of five tissue types and three developmental zones in the root (3). These tissue and developmental zone profiles revealed a greater transcriptional complexity than profiles of the organ alone (3). To accurately describe all transcriptional patterns that occur in the root, however, requires a higher-resolution data set that profiles all cell types and developmental stages within an organ. Using the fluorescence cell-sorting expression analysis method (3), we obtained expression profiles of eight new GFP-marked cell populations [S17 (4), S32 (4), COBL9 (5), JO121 (6), S4 (4), SUC2 (7), J2501, and RM1000 (table S1)]. We combined these with 11 previously published experiments [PET111 (8), AGL42 (8), LRC (3), GL2 (3), CORTEX (4), SCR (3), J0571 (3), J2661 (9), APL (4), WOL (3), and S18 (4)] to form a comprehensive data set of 19 experiments (radial data set) that profile expression of 14 nonoverlapping cell types in the Arabidopsis root (table S1). To profile developmental stages in the root, we microdissected a single root into 13 sections, with each section encompassing approximately 3 to 5 cells along the longitudinal axis, and we profiled the expression of each by microarray (10). Temporal expression variation in shoot tissue has been demonstrated, and noise in gene expression between genetically identical organisms can determine cell fate (11, 12). Therefore, we also microdissected a second root to assess expression variation between roots in developmental time.

We first asked how expression between individual cell types differed, by using a mixed-model ANOVA (analysis of variance) to determine significant differential expression (q < 0.001) (9). We selected 10 marker lines that profile individual cell types, two lines that explore expression variation within the pericycle, and four lines that profile varying developmental stages of xylem and phloem (table S1). Contrary to previous interpretation, the vascular cell types contain the highest number of enriched genes (fig. S1, table S2) (3). Functions of individual cell types have been characterized by genetic or physiological studies, but no comprehensive analysis has described the putative functions of each cell type as defined by its transcriptome. We therefore examined Gene Ontology (GO) term enrichment using the hypergeometric distribution, correcting for multiple hypothesis testing to infer cellular function (13). A number of genes have also been associated with biological processes through microarray expression analysis. We therefore mined the literature for such gene associations and for features that link genes to transcriptional networks through cis-element enrichment (figs. S2 to S4) (10). Two interesting trends were observed for GO term enrichment: First, the majority of enriched terms were associated with individual cell types, and second, those GO terms associated with multiple cell types represent more general biological processes (e.g., peroxisomal activity and membrane localization) (Fig. 2A, and table S3 and fig. S2) (14). As a measure of the reliability of our method in identifying cell type–enriched genes and of our ability to correctly annotate biological processes to a cell type, we compared a list we generated of genes enriched in root hair cells with a previous study that profiled root hair morphogenesis. In accord with this published report (15), we found root hair cell differentiation genes, kinases, and cell wall–loosening genes enriched in hair cells (P = 2.1E–4, 5.3E–6, 3.8E–4) (Fig. 2B). Our analysis further increased the spectrum of biological processes associated with root hair development to include calcium ion transport, vesicle docking during exocytosis, and nicotinamide adenine dinucleotide and/or nicotinamide adenine dinucleotide phosphate (NAD+/NADP+) activity (P = 2E–4, 1.5E–4, 1.1E–4, 5E–4). With this added confidence in our method, we then used enriched gene categories to infer the function of other cell types (table S3 and figs. S2 to S4). Of particular note were xylem cells in the meristematic zone that were enriched for translation initiation and elongation, RNA binding and processing, and mitosis (P = 7.2E–8, 5E–4, 6.1E–13, 1.8E–5, 7.3E–10) (16), (fig. S5). Strong colocalization of the first two enzymes in flavonoid biosynthesis in the cortex has been demonstrated by immunolocalization (17). We provide further evidence for cortex-enriched flavonoid biosynthesis with enriched expression of four of the five flavonoid biosynthetic enzymes [P = 6.12E–6 (table S3)]. Another striking example is the enrichment of genes involved in defense, heat, and oxidative stress responses in the endodermis [P = 4.3E–4, 8.97E–5, 8.23E–6 (table S3)].

Fig. 2.

(A) The majority of enriched GO terms (hierarchically clustered) are associated with individual cell types (blue bar). A smaller number are present across multiple cell types (green bar). (fig. S2) (B) GO category enrichment for hair cells confirms a previous report (15). Enriched cis-elements and an enriched TF family were also identified. (C) From the top 50% of varying probe sets, 51 dominant radial patterns were identified. Pattern expression values were mean-normalized (rows) and log2 transformed to yield relative expression indices for each marker line (columns). Marker line order is the same for all figures; see table S1 for marker line abbreviations. (D) Pattern expression peaks were found across one to five cell types. (E to G) Patterns where expression is enriched in single and multiple cell types support transcriptional regulation of auxin flux and synthesis. In all heat maps with probe sets, expression values were mean-normalized and log2 transformed. Expression is false-colored in representations of a root transverse section, a cut-away of a root tip, and in a lateral root primordium. (E) Auxin biosynthetic genes (CYP79B2, CYP79B3, SUPERROOT1, and SUPERROOT2) are transcriptionally enriched in the QC, lateral root primordia, pericycle, and phloem-pole pericycle (P = 1.99E–11, pattern 5). All AGI identifiers and TAIR descriptions are found in table S14. (F) Auxin amido-synthases GH3.6 and GH3.17 that play a role in auxin homeostasis show enriched expression in the columella, just below the predicted auxin biosynthetic center of the QC (P = 8.82E–4, pattern 13). (G) The expression of the auxin transporter, PIN-FORMED2, and auxin transport regulators (PINOID, WAG1) are enriched in the columella, hair cells, and cortex (P = 1.03E–4, pattern 31).

We were also interested in identifying complex spatial expression patterns that are specific to two or more cell types, and therefore, we developed an unsupervised approach to identify dominant, relative expression patterns across cell types and along developmental time (10). If one does not know the a priori expression patterns present in a biological context, there are two methods that can identify them. The first enumerates all possible patterns. However, given the length and complexity of the patterns we are looking for, the number of patterns would be intractably large. Consequently, our pipeline uses fuzzy k-means clustering to generate an initial collection of patterns with strong support. Heuristic filters were applied to reduce the collection to a smaller set of distinct expression patterns (10), and 51 distinct, dominant radial and 40 dominant longitudinal patterns resulted (Fig. 2C, Fig. 3A, figs. S17 and S18, and table S4). In comparison, in a previous analysis of five root tissues and three developmental zones, only eight dominant expression patterns were identified using Principle Component Analysis on differentially expressed genes (3). The increased number of identified patterns clearly demonstrates the value of higher-resolution data sets and of the need for more sophisticated computational methods to elucidate underlying transcriptional programs.

Fig. 3.

(A) Forty diverse and distinct patterns were identified from the top 50% of varying probe sets (rows) from sections along the longitudinal axis (columns). (B) Most patterns show an expression increase across two longitudinal sections. (C) Two pattern categories were identified: a single expression peak (top) and fluctuating expression in developmental time (bottom). (D to F) Co-regulation of probe set expression between roots. Patterns were identified from root 1. Root 2 was treated as the replicate. Probe sets are listed in the same order. Red dashed lines indicate co-regulated groups (10). Note that there is no columella section in root 2. (D) Robust co-regulation of expression for a single expression peak (pattern 25). (E) Robust co-regulation of expression for a fluctuating pattern (pattern 19). (F) Extensive between-root variation with multiple co-regulated groups (pattern 37). (G to J) Validation of longitudinal expression patterns with transcriptional GFP fusions. The expression profiles are able to resolve in vivo expression at high resolution: AGAMOUS-LIKE21 (G) expression validates a peak in section 1, whereas WEREWOLF (H) shows a peak in section 2; (I) At4g05170 (pattern 39) confirms the microarray data with a fluorescence peak in the maturation zone. (J) Expression conferred by the At5g60200 promoter confirms a fluctuating expression profile (pattern 37) with two peaks of expression in the meristematic region and the distal elongation region. This image contains portions distal to the sections dissected for microarray analysis. A third peak of expression is found in this distal maturation zone.

We analyzed these patterns to determine the breadth of expression pattern distributions. Seventeen dominant patterns identified from the radial data set show enriched expression in a single cell type (10) (Fig. 2D). In some of the 34 patterns where expression was enriched in multiple cell types, peaks of expression occurred in ontologically or spatially related tissues; in other cases, they were found in cell types that are spatially separated and have no known ontological relationship (Fig. 2, E to G, and fig. S6).

Among the 40 patterns identified from the longitudinal data set (Fig. 3A and fig. S7), the majority showed maximal expression across two or more contiguous root sections (Fig. 3B). This suggests that, in most cases, the sections we sampled are smaller than the region in which expression is being regulated, which, in turn, indicates that we are at the necessary resolution for detecting differential expression along the longitudinal axis. This also demonstrates a greater complexity of transcriptional programs that underlie previously defined root developmental zones. Development is usually described as a progressive unidirectional process, and it was therefore surprising that 17 of the 40 dominant expression patterns show expression changes that fluctuate over developmental time, presenting multiple peaks of expression (Fig. 3C). Although oscillatory transcriptional mechanisms have been described (18, 19), until recently (20) they had not been described during root development.

To determine the biological relevance of these transcriptional programs, we identified sets of genes that correlated well with the dominant patterns and assessed these lists for biological process enrichment (10) (figs. S8, S19, and S20, tables S5 to S10, and folders S1 and S2). Coexpressed genes assigned to three patterns support transcriptional mechanisms in the regulation of auxin flux across single and multiple cell types. The expression of tryptrophan-dependent auxin biosynthetic genes is concordantly enriched in the QC, lateral root primordia, and pericycle [P = 1.99E–11 (Fig. 2E)]; polar auxin transporters and genes that regulate transporter polarity show enrichment in the columella, root hair cells, and cortex (P = 1.03E–4); and genes that regulate auxin homeostasis are specifically enriched in the columella [P = 8.82E–4 (Fig. 2, F and G)]. In addition to these genes of known function, coexpressed genes of unknown function can be hypothesized to play a role in these processes. In cases where expression is enriched in cell types that have no known ontological or spatial relationship, the biological significance of a shared transcriptional pathway that links these cell types can be inferred. An example is found in hair cells and xylem, which undergo cell wall deposition when they undergo terminal differentiation, and in which a corresponding enrichment of genes involved in cell wall deposition was identified [P = 1.44E–4 (fig. S6)]. Hair cells deposit cell wall materials during hair morphogenesis when the cell rapidly expands, and xylem cells, during the deposition of secondary cell wall. Identifying the functional importance of dominant expression patterns that span multiple cell types should contribute important insights into the network of transcriptional interactions that regulate Arabidopsis root development.

Analysis of the genes assigned to longitudinal patterns allowed us to infer and confirm biological processes that occur during root development. Starch-containing amyloplasts in columella cells play a role in gravity sensing and a corresponding enrichment of starch catabolism genes was identified here (P = 2.59E–5), whereas genes enriched in meristem determinacy and mitosis-specific expression colocalize with the sections containing the QC and just above the dividing initials (P = 2.22E–4, 3.79E–5). By contrast, only one of the patterns that showed fluctuating expression in developmental time correlated with a known biological process—cell cycle activity with expression peaks associated with the root apical meristem and the initiation of presumptive lateral root meristems (Fig. 3E and table S9). For the remaining patterns with fluctuating expression (fig. S9), the majority are enriched in gene functions associated with metabolic or transport activity (table S9).

The pattern-finding described above was performed on expression profiles from a single root because of variability in root growth. To assess the reproducibility of observed expression between replicate roots, we compared the expression of every probe set with itself, as well as with randomly selected probe sets. The resulting distributions showed that longitudinal expression is largely reproducible across roots (10) (fig. S10). Next, we asked how robust coexpression is for probe sets assigned to our dominant expression patterns. Methods used to analyze other microarray time series are not suitable because the sections are not identical between roots and the spacing between developmental time points is unknown (21). We therefore determined the extent to which probe sets assigned to dominant patterns in our first root were coexpressed in the second root (10) (Fig. 3, D to F). In the majority of patterns, greater than 90% of the probe sets maintained coexpression (figs. S11 and S12, table S11, and folder S3), which indicated that patterns are quite robust between replicates. In most cases where low coexpression was observed (table S11), subsets of probe sets showed remarkable co-regulation or phasing in several discrete regions along the root. This is reminiscent of oscillatory expression dependent on the phase of a developmental process (Fig. 3F) (18).

Longitudinal expression profiles were validated by analyzing expression conferred by upstream regulatory regions (4) of representative genes fused to GFP. Expression matched microarray profiles with relative peaks of high expression along the longitudinal axis in all four cases examined as quantified by image analysis (Fig. 3, G to J, and figs. S13 and S14). A gene that displays potentially oscillatory dynamic behavior was also validated (fig. S15), which further demonstrated the reliability of this approach in determining accurate in vivo temporal expression programs.

Together, these data sets describe the averaged expression within a cell-type population along portions of the longitudinal axis or among multiple cell types at specific developmental time points. To identify genes with high, relative expression in both space and time, we conditionally added the log2-normalized values of the radial and longitudinal dominant expression patterns (Fig. 4, A and B, and folder S4). The resulting high relative peaks at the intersections of expression were visualized in a heat map of the three dimensional (3D) root (Fig. 4B and folder S4). Good correlations were found between expression from the transcriptional fusions and the predicted spatiotemporal expression patterns (Fig. 4B and fig. S16).

Fig. 4.

Combining radial and longitudinal expression data to identify gene expression in space and time using dominant expression patterns (A and B) or binary logic (C to E). (A) Radial (x axis) and longitudinal (y axis) patterns are combined by conditional addition to yield relative enriched peaks of expression by cell type and developmental time. The heat map predicts an enriched relative peak of expression in the cortex and more weakly in the endodermis at the distal region of the meristematic zone and elongation zone. (B) A 3D representation of the conditional addition in (A) (left). The At3g05150 transcriptional GFP fusion confirms this expression. (C to E) All expression values were converted to binary representation and combined using the logical AND operator. Presence (red) or absence (blue) of expression in the radial data set (x axis) and in the longitudinal data set (y axis) is indicated. Yellow indicates expression found in both data sets. Marker line and longitudinal section order are as in Figs. 2 and 3. (C) Ubiquitous expression in all cell types and developmental stages. (D) Expression only in the maturation zone. (E) Expression in all root zones and cell types except maturing xylem.

Our computational pipeline revealed a number of dominant expression patterns, but the initial data-processing step eliminated genes that were not expressed or that displayed low expression variation. Converting expression to binary (present or absent) values captures different aspects of gene expression discarded when analyzing relative expression. Binary patterns from both data sets were superimposed using the AND logical operator, so that positive expression in both data sets was required, and the resulting 158 clusters were analyzed (10) (folder S5). By this method, 2018 probe sets were identified as being expressed ubiquitously (Fig. 4C). A number of clusters were found that mark developmental zones (Fig. 4D), cell types at specific developmental stages, and cases where expression was absent in a single cell type throughout developmental time (Fig. 4E).

Expression data associated with our identified patterns has been able to resolve transcriptional programs at high spatial and temporal resolution. We should then be able to use these groups of coexpressed genes to infer transcriptional regulatory modules. Cis-element enrichment within coexpressed gene groups can provide clues as to upstream regulatory transcription factors. We analyzed genes assigned to all patterns for enriched cis-elements and for the presence of corresponding transcription factors. We were able to infer a MYB-domain transcription factor–regulated auxin biosynthesis module, an auxin response factor (ARF)–regulated module in the columella, and two WRKY-regulated modules in hair cells at different developmental time points (fig. S21). The auxin biosynthesis phenotype conferred by an allele of the Altered Tryptophan Regulation 1 (ATR1) MYB-domain transcription factor supports the validity of our method in inferring these modules (10, 22). More sophisticated algorithms should be able to extend this analysis to identify the complement of transcriptional regulatory modules important for root development.

Together, the data obtained from this high-resolution data set both highlight and revise our view of the rich and complex network of transcriptional programs that underlie cell type and temporal aspects of root development. In the future, it will be interesting to compare the spatial and temporal transcriptional complexity that underlies organ development in other multicellular organisms.

Supporting Online Material

Materials and Methods

Figs. S1 to S21

Tables S1 to S14


Folders S1 to S6

References and Notes

View Abstract

Navigate This Article