Conservation of RET Regulatory Function from Human to Zebrafish Without Sequence Similarity


Science  14 Apr 2006:
Vol. 312, Issue 5771, pp. 276-279
DOI: 10.1126/science.1124070


Evolutionary sequence conservation is an accepted criterion to identify noncoding regulatory sequences. We have used a transposon-based transgenic assay in zebrafish to evaluate noncoding sequences at the zebrafish ret locus, conserved among teleosts, and at the human RET locus, conserved among mammals. Most teleost sequences directed ret-specific reporter gene expression, with many displaying overlapping regulatory control. The majority of human RET noncoding sequences also directed ret-specific expression in zebrafish. Thus, vast amounts of functional sequence information may exist that would not be detected by sequence similarity approaches.

A current hypothesis is that sequences conserved over greater evolutionary distances are more likely to be functional than those conserved over lesser distances (1). Many recent publications have focused attention on the regulatory potential of “ultra-conserved” noncoding sequences, conserved across great evolutionary distances, e.g., human to fugu (2–9) [≥300 million years, or average 74% protein identity (10)]. These are frequently enhancers associated with developmental genes, consistent with strong selective pressure to preserve critical mechanisms. Analyses of identified sequences have generally fallen into two categories: analyses confined to mammals, with functional verification done in mice, or analyses including mammalian and teleost sequences, focusing on highly conserved sequences alignable at the extremes. However, simply because an expression pattern is preserved through evolution, it does not necessarily follow that the cis-regulatory elements controlling that expression in one species will function in a second.

We have explicitly tested two hypotheses: First, using selective pressure as a guide across moderate evolutionary distances, we can identify the majority of enhancers controlling expression at a particular locus by functional testing in a comprehensive, unbiased manner, and second, regulatory function of noncoding sequences will be conserved over evolutionary distances beyond the limit of overt sequence conservation.

We have focused on the regulatory control of the gene encoding the RET receptor tyrosine kinase. RET is expressed in neural crest, urogenital precursors, adrenal medulla, and thyroid during embryogenesis, and in specific central and peripheral neurons and endocrine cells during development and postnatally (11). Although RET expression is highly conserved across evolution (12–15), only the exons encoding the tyrosine kinase domain are overtly conserved [≥70%, ≥100 base pairs (bp)] from humans to zebrafish (16–18). We first compared the genomic sequence of a ∼200-kilobase (kb) segment encompassing the zebrafish ret gene with the orthologous interval in fugu (Fig. 1), using AVID/VISTA (19, 20). We generated 10 ZCS (zebrafish conserved sequence) amplicons, corresponding to 14 discrete noncoding sequences (table S1).
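The conservation criterion quoted above (≥70% identity over ≥100 bp) can be illustrated with a minimal sliding-window sketch. This is not the AVID/VISTA implementation. It assumes the two sequences have already been aligned to equal length with `-` for gaps, and the function names `window_identity` and `conserved_windows` are hypothetical.

```python
def window_identity(a, b):
    """Fraction of alignment columns in which both sequences carry the same base.

    Columns where both sequences have a gap ('-') are not counted as matches.
    """
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def conserved_windows(aln_a, aln_b, window=100, min_identity=0.70):
    """Return merged (start, end) half-open intervals, in alignment coordinates,
    covered by at least one window whose identity meets the threshold.

    Illustrative only: the defaults mirror the >=70% / >=100 bp criterion
    quoted in the text, not the internals of any published tool.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    regions = []
    for start in range(len(aln_a) - window + 1):
        end = start + window
        if window_identity(aln_a[start:end], aln_b[start:end]) >= min_identity:
            if regions and start <= regions[-1][1]:
                # Overlaps or abuts the previous passing window: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Passing a smaller `window` to the same function shows how shortening the window changes what is reported as conserved.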

Fig. 1.

Comparative sequence analysis of teleost ret loci reveals putatively functional noncoding sequences. VISTA plot displaying the alignment of the zebrafish ret locus with the orthologous fugu region. Red peaks represent conserved noncoding sequences; shaded green boxes represent ZCS amplicons. Boxes bordered by dashed lines denote amplicons containing two or more conserved sequences. ret exons are denoted by blue peaks. Red peaks boxed and shaded in blue denote 5′ and 3′ flanking genes pcbd and galnact2, respectively.

We also used these criteria to identify conserved noncoding human sequences, comparing a ∼200-kb segment encompassing human RET with the orthologous genomic intervals in 12 nonhuman vertebrates (16). We selected sequences shared among human and at least three nonprimate mammals (21). In total 13 HCS (human conserved sequence) amplicons, encompassing 28 discrete conserved sequences (table S2) were generated for analysis.

Although zebrafish transgenesis has been used to evaluate the regulatory potential of conserved noncoding sequences (2, 7, 22), its efficacy is compromised by mosaicism in injected (G0) embryos. We developed a reporter vector based on the Tol2 transposon; reporter expression in G0 embryos, driven from the ubiquitous ef1a promoter, was extensive and was dependent on transposase RNA (23).

All but one ZCS amplicon drove reporter expression consistent with endogenous ret expression (Table 1). As in the mouse, zebrafish ret is expressed in sensory neurons of the cranial ganglia, motor neurons in the ventral hindbrain, cells of the hypothalamus and pituitary primordia, sensory and motor neurons in the spinal cord, and primary sensory neurons in the olfactory pit (13, 14). We discovered elements driving expression consistent with all of these cell populations (Table 1), including small groups of cells, e.g., olfactory neurons (Fig. 2A) and lateral line placode ganglion (Fig. 3, A and B). Although ret is also expressed in amacrine and horizontal cell layers of the retina, we did not detect expression in the retina of G0 embryos with any of the tested elements.

Fig. 2.

Conserved noncoding sequences at the zebrafish and human ret loci drive reporter expression in zebrafish embryos consistent with the endogenous gene. Shown are GFP expression patterns in representative G0 embryos. (A to D) Zebrafish elements drive expression in (A) bilateral olfactory pits (arrowheads; ZCS-83); (B) hindbrain neuron consistent with nVII facial motor neuron (arrowhead; ZCS-19.7); (C) pronephric duct before 24 hours (arrowhead; ZCS-34); and (D) pronephric duct at 3 days (arrowheads; ZCS-7.6). (E to H) Human elements drive expression in (E) pituitary (encircled; HCS+16); (F) dorsal spinal cord neurons (arrowheads; HCS-32; fp, floor plate; nc, notochord); (G) pronephric duct (solid white arrowheads) and enteric neurons (open arrowhead; HCS+9.7); and (H) enteric neurons (open arrowheads; HCS+9.7).

Fig. 3.

Mosaic G0 expression accurately reflects expression in G1 fish. (A) ZCS-35.5 G0 embryos display GFP in cells of the anterior (open arrowhead) and posterior (solid white arrowhead) lateral line placode ganglia. (B) ZCS-35.5 G1 embryos display GFP in the anterior (open arrowhead) and posterior (solid white arrowhead) lateral line placode ganglia, as in (A). (C) GFP detected by in situ hybridization (ISH) in the distal pronephric duct of ZCS+7.6 G1 embryo at 24 hours, consistent with ret expression at the same stage (D). (E and F) GFP detected by ISH in the pituitary (open arrowhead), trigeminal nuclei (arrow), and migrating nVII facial motor neurons [arrowhead in (E) and (F)] of a HCS+16 G1 embryo. (G) GFP detected by ISH in the retina of G1 ZCS-19.7 embryo.

Table 1.

Noncoding sequences from zebrafish ret or human RET direct expression consistent with endogenous ret. The elements are described by their species of origin and distance in kilobases from the translation start site (e.g., ZCS-50, HCS+16). Abbreviations: CG, cranial ganglia; SC, spinal cord; ENS, enteric nervous system; PND, pronephric duct; IM, intermediate mesoderm; NTC, notochord; OLF, olfactory pit/placode; +, present.

Constructs  Brain  SC  CG  ENS  NTC  OLF  Retina  Heart  IM/PND  Fin bud
ZCS-83 + + + +
ZCS-50 + + + + + +
ZCS-36 + +*
ZCS-34 + +
ZCS-31 + +
ZCS-19.7 + + + + + +
ZCS-14.7 + +
ZCS-9.5 + + +
ZCS+7.6 +
ZCS+35.5 + + + +
HCS-32 + + + + +
HCS-30 +
HCS-23 +
HCS-12 + +*
HCS-8.7 +
HCS-5.2 + + +
HCS+9.7 + +
HCS+16 +
HCS+19 + +
* Expression before 24 hours.
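The element naming convention used in Table 1 (species of origin plus distance in kilobases from the translation start site) can be decoded with a short sketch like the one below. The helper `parse_element` and the `Element` type are hypothetical, and reading a negative sign as "upstream of the translation start" is an assumption, suggested by the text's description of the ZCS-36, ZCS-34, and ZCS-31 amplicons as lying upstream of ret.

```python
import re
from typing import NamedTuple


class Element(NamedTuple):
    species: str      # "zebrafish" or "human"
    offset_kb: float  # signed distance from the translation start, in kb


# Prefixes as defined in the text: ZCS = zebrafish conserved sequence,
# HCS = human conserved sequence.
_PREFIXES = {"ZCS": "zebrafish", "HCS": "human"}
_NAME = re.compile(r"^(ZCS|HCS)([+-])(\d+(?:\.\d+)?)$")


def parse_element(name):
    """Split a name such as 'ZCS-19.7' into species and signed kb offset.

    Assumption: '-' denotes a position upstream (5') of the translation
    start site and '+' a position downstream of it.
    """
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized element name: {name!r}")
    prefix, sign, digits = match.groups()
    offset = float(digits)
    if sign == "-":
        offset = -offset
    return Element(_PREFIXES[prefix], offset)
```

For instance, `parse_element("HCS+16")` yields a human element at +16 kb, which makes it straightforward to sort or group constructs by position.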

We found significant redundancy in the control of ret expression in the pronephric duct (Table 1; Fig. 2, C and D). Five elements drove expression in the intermediate mesoderm or pronephric duct; one was responsible for transient early expression (Fig. 2C), one for expression in the distal duct after 3 days (Fig. 2D), and three apparently act redundantly to control expression in the intervening period. Although three amplicons lie within a 5-kb region upstream of ret, they function independently in our assay. Similarly, all but two ZCS amplicons drove expression in one or more cell populations of the central nervous system (Table 1), wherein ret is also dynamically expressed.

Surprisingly, 11 out of 13 HCS amplicons drove expression in cell populations consistent with zebrafish ret (Table 1). These included cells not present in mammals, such as the afferent neurons of the lateral line ganglia. We also observed multiple sequences driving expression in the excretory system, despite its developmental and anatomical differences between fish and mammals (Fig. 2G). Two sequences contained within a genomic interval deleted from the rodent lineage also functioned in zebrafish, in one case driving expression in the pituitary (Figs. 2E and 3E). Several pairs of elements drove similar expression patterns, despite lack of detectable sequence conservation (Table 1). To rule out the possibility that nonconserved sequences could fortuitously display enhancer activity, we analyzed expression from vectors containing nonconserved zebrafish (n = 5) or human (n = 3) genomic DNA, from the RET intervals (tables S1 and S2). None of these nonconserved sequences provided reproducible patterns of expression.

Through analysis of G0 expression, we identified enhancers active in small cell populations such as the cranial ganglia and olfactory neurons (Fig. 2), suggesting that mosaicism is not a significant limitation. We have passed a subset of transgenes through the germline (Fig. 3, A to C and E to G), to directly compare expression in G0 and G1 embryos. Expression of each transgene was largely consistent with that observed in G0 embryos (Fig. 3, A and B), although in some cases we observed additional expression, particularly in small groups of cells and at later time points [retina (Fig. 3G)]. We also evaluated many G1 embryos using in situ hybridization (ISH) to detect gfp transcripts, which confirmed that green fluorescent protein (GFP) signal was present in ret-positive cells (Fig. 3, C and D).

While still functioning as tissue-specific enhancers in zebrafish, some HCSs directed expression differing in timing or location from that of the endogenous ret gene. For example, HCS-32 drives GFP expression in dorsal spinal cord neurons, apparent between embryonic days 2 and 3. ISH analyses of G1 transgenic embryos revealed expression at earlier stages in the posterior neural plate, where ret is not normally expressed. Additionally, two elements, HCS-23 and ZCS-50, directed expression strongly to the notochord, again not a site of endogenous ret expression. One possible reason for these discrepancies is that we are assaying elements out of context. Also, physical proximity does not mean that these elements normally regulate ret expression. In the case of HCSs, individual transcription factor–binding sites (TFBSs) may have evolved sufficiently to display different functions (e.g., binding related proteins, binding with different affinity), reflected in altered regulatory activity of the element as a whole.

HCS function in zebrafish may arise from sequence elements ≤100 bp that are conserved but fail to meet our original criteria for identification. Consequently, we repeated our sequence analysis with AVID/VISTA, reducing the window size to 30 bp. We also analyzed the RET orthologous intervals using the anchored alignment algorithms Multi-LAGAN and Shuffle-LAGAN (24), the latter designed to detect alignable sequences in the presence of inversions and rearrangements. We also attempted to align each RET HCS independently, in both orientations, with the zebrafish ret interval (25). All analyses failed to detect sequences alignable between human and zebrafish RET intervals. We further searched the entire zebrafish genome (26) for homologies to the examined HCSs. Sixty-five sequences within these HCSs of ≥20 nucleotides in length demonstrated ≥70% identity with nonorthologous, intergenic zebrafish sequences, within 100 kb of a known or predicted gene; 41 out of 65 contain conserved TFBS motifs (table S3). However, we also aligned the nonconserved HCSs with the zebrafish genome and found alignments containing TFBSs at a similar frequency, which suggested that such analyses are not predictive of regulatory function. We posit that the responsible functional components in the conserved elements are single or multiple TFBSs (4 to 20 bp), beyond the ability of our current in silico tools to reliably detect. Our data suggest that restricting in vivo functional analyses to sequences conserved over great evolutionary distances (e.g., human to teleost) detects only a small fraction of functional information in the genome.
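Searching for a short sequence in both orientations, at a fixed identity threshold, can be sketched as a brute-force ungapped scan. This is only an illustration of the idea and is not how AVID/VISTA, Multi-LAGAN, or Shuffle-LAGAN work. The function names are hypothetical and the 70% default simply echoes the threshold quoted above.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def ungapped_hits(query, target, min_identity=0.70):
    """List (strand, target_start, identity) for every ungapped placement of
    `query` on `target` that meets the identity threshold.

    Both orientations are tried: '+' is the query as given, '-' is its
    reverse complement. Runtime is O(len(query) * len(target)) per strand,
    so this is suitable only for small illustrative inputs.
    """
    hits = []
    for strand, q in (("+", query), ("-", reverse_complement(query))):
        for start in range(len(target) - len(q) + 1):
            segment = target[start:start + len(q)]
            identity = sum(x == y for x, y in zip(q, segment)) / len(q)
            if identity >= min_identity:
                hits.append((strand, start, identity))
    return hits
```

With queries as short as a single binding site (4 to 20 bp), a scan of this kind returns many placements in any large target, which is one way to see why such matches alone are poor predictors of function.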

We have developed an efficient method to evaluate putative enhancer elements, allowing rapid assessment of in vivo function in a vertebrate embryo. This method is suitable for screening putative enhancers on a large scale, even where the orthologous zebrafish sequence is not available. Our approach represents a significant advance over previous methods because of the decreased mosaicism and improved germline transmission achieved with Tol2 vectors. The transparent external development of zebrafish facilitates dynamic analysis of reporter activity throughout embryogenesis, allowing detection of biological activity throughout development. This has allowed us to survey without bias all conserved sequences at a single, complex locus.

Our data strongly suggest that functional information is conserved in vertebrate sequences at levels below the radar of large-scale genomic sequence alignment, consistent with prior anecdotal observations (27, 28). Two alternative models could be invoked to explain our data. First, overall similar expression of the RET genes could be achieved through assemblage of analogously acting, although not orthologous, enhancers. A second, more parsimonious, explanation is that orthologous enhancer elements control expression of both RET genes, but have evolved beyond recognition through small changes in TFBSs, rearrangement of sites within enhancers, or multiple coevolved changes. Examination of enhancer evolution in Drosophila species reveals examples of these types of sequence changes, confounding traditional sequence alignment approaches while preserving enhancer function across species (29–31). Comparison of human and mouse enhancer sequences suggests that similar widespread turnover of TFBSs is observed in vertebrate evolution (28), although there are no corresponding functional data to confirm that such changes occur while preserving the function of the enhancers. Our data cannot distinguish between these two models; however, it must be the case that largely the same set of transcription factors regulates expression of both genes, and the binding of these is conserved from mammalian to teleost enhancer elements, which allows the HCSs to function in zebrafish. These data may now significantly alter the manner in which the biological relevance of vertebrate noncoding sequences is evaluated.

Supporting Online Material

Materials and Methods

Tables S1 to S3


