Divergence of Transcription Factor Binding Sites Across Related Yeast Species


Science  10 Aug 2007:
Vol. 317, Issue 5839, pp. 815-819
DOI: 10.1126/science.1140748


Characterization of interspecies differences in gene regulation is crucial for understanding the molecular basis of both phenotypic diversity and evolution. By means of chromatin immunoprecipitation and DNA microarray analysis, the divergence in the binding sites of the pseudohyphal regulators Ste12 and Tec1 was determined in the yeasts Saccharomyces cerevisiae, S. mikatae, and S. bayanus under pseudohyphal conditions. We have shown that most of these sites have diverged across these species, far exceeding the interspecies variation in orthologous genes. A group of Ste12 targets was shown to be bound only in S. mikatae and S. bayanus under pseudohyphal conditions. Many of these genes are targets of Ste12 during mating in S. cerevisiae, indicating that specialization between the two pathways has occurred in this species. Transcription factor binding sites have therefore diverged substantially faster than ortholog content. Thus, gene regulation resulting from transcription factor binding is likely to be a major cause of divergence between related species.

Differences in related individuals are generally attributed to changes in gene composition and/or alterations in their regulation. Previous efforts to examine divergence of regulatory information have relied on the analysis of conserved sequences in putative promoter regions (1, 2). However, these approaches are limited because transcription factor (TF) binding sites are often short and degenerate, making their computational detection difficult (3). In addition, requiring the conservation of motifs across species precludes the detection of sequences that are evolutionarily divergent.

The detection of binding sites with chromatin immunoprecipitation and microarray (ChIP-chip) analysis (4, 5) offers the ability to globally map TF binding locations experimentally rather than computationally. For species such as yeasts, where genome sequences of numerous related species are available (6), this approach can allow for the evolutionary comparison of binding sites of conserved TFs across species.

We have used this approach to investigate evolutionary divergence in the targets of two developmental regulators in the Saccharomyces sensu stricto yeasts S. cerevisiae, S. mikatae, and S. bayanus. In S. cerevisiae diploids, Ste12 and Tec1 act cooperatively to regulate genes during pseudohyphal development (7–9), whereas in haploid cells, Ste12 regulates mating genes (10). The binding sites of Ste12 and Tec1 were mapped in all three species under low-nitrogen (pseudohyphal) conditions with the use of triplicate ChIP-chip experiments and species-specific high-density oligonucleotide tiling microarrays (fig. S1) (11). Ste12 bound to 380, 167, and 250 discrete sites, whereas Tec1 bound to 348, 185, and 126 sites, in S. cerevisiae, S. mikatae, and S. bayanus, respectively (tables S1 to S6). For each species, the two factors bound to a high proportion of common regions (86, 80, and 87% for S. cerevisiae, S. mikatae, and S. bayanus, respectively), suggesting that the cooperative interaction observed between Ste12 and Tec1 in S. cerevisiae is conserved across the three Saccharomyces species.
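As an illustration of how a "proportion of common regions" between two factors could be computed, the following minimal Python sketch counts the fraction of one factor's bound regions that overlap any region of the other. The interval representation, the overlap rule (at least one shared base), and the function names are assumptions made for this sketch; the text does not specify how common regions were defined, and the coordinates are made up.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_shared(regions_a, regions_b):
    """Fraction of regions in regions_a that overlap at least one region in regions_b."""
    if not regions_a:
        return 0.0
    shared = sum(any(overlaps(a, b) for b in regions_b) for a in regions_a)
    return shared / len(regions_a)


# Made-up coordinates for illustration only (not data from the study).
toy_ste12 = [("chrIX", 100, 200), ("chrIX", 500, 600), ("chrX", 100, 200)]
toy_tec1 = [("chrIX", 150, 250), ("chrX", 300, 400)]
print(fraction_shared(toy_ste12, toy_tec1))
```

The quadratic scan is adequate for a few hundred sites per factor; a sorted sweep would be preferable for much larger region sets.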

Analysis of the signal tracks allowed for global comparisons in TF binding to be made among the species, revealing qualitative and quantitative differences in ChIP binding regions (Fig. 1A). To systematically perform interspecies comparisons, we removed regions that were not represented across all three yeast genomes (12). Comparison of the overlap in binding across species as a function of rank order revealed significant binding differences throughout the rank order, indicating that even strong targets from one species may not be bound in the others (Fig. 1B). As a control, replicate experiments from S. cerevisiae (12) displayed >98% concordance in binding.
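The rank-order comparison described above can be sketched as a cumulative overlap curve: for each cutoff N, the fraction of the top N targets in one species that are also bound in another. This is a hedged sketch, not the procedure from the supporting methods (12); the function name and the toy gene labels are hypothetical.

```python
def rank_order_overlap(ranked_targets, other_bound):
    """For each cutoff N (1..len), return the fraction of the top-N targets
    from one species that are also bound in the other species.

    ranked_targets: list of target IDs, strongest binding first.
    other_bound: set of target IDs bound in the comparison species.
    """
    fractions = []
    hits = 0
    for n, target in enumerate(ranked_targets, start=1):
        if target in other_bound:
            hits += 1
        fractions.append(hits / n)
    return fractions


# Hypothetical labels for illustration only.
curve = rank_order_overlap(["a", "b", "c", "d"], {"a", "c"})
print(curve)
```

A curve that stays well below 1.0 even at small N would correspond to the observation that strong targets in one species may be unbound in the others.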

Fig. 1.

(A) Ste12 and Tec1 bind to discrete regions of chromosome IX of S. cerevisiae and to orthologous regions of S. mikatae and S. bayanus. ChIP-chip enrichment by Tec1 and Ste12 (log2 ratios) is shown relative to ORFs of S. cerevisiae (red), S. mikatae (blue), and S. bayanus (green). bp, base pairs. (B) Rank-order analysis of Ste12 and Tec1 ChIP-chip targets in S. cerevisiae (red), S. mikatae (blue), and S. bayanus (green) (12). (C) Gene target overlap across the three Saccharomyces species.

Overall, three classes of TF binding events were observed: those conserved across all three species, those present in two of the three species, and species-specific binding events (Fig. 1C). Of the 221 and 255 targets bound in total by Ste12 and Tec1, respectively, only 47 (Ste12, 21%) and 50 (Tec1, 20%) targets were conserved across all three species (Figs. 1C and 2A). The conserved binding events were present throughout the rank order, indicating that both highly occupied and less-occupied regions are conserved (tables S7 and S8). To ensure that these binding differences were not due to the scoring threshold used, we calculated signal distributions for unbound orthologs of target regions (12). Of the unbound orthologous regions, 80% had signals similar to background, indicating that most will be unaffected by threshold changes (fig. S2). Even when identical binding regions were used, 23% differed in their intensity by at least 1.5-fold between species (0% between S. cerevisiae replicates), suggesting that quantitative differences exist in site occupation or binding strength between species (Fig. 2B and tables S9 and S10). Thus, most target genes were bound in only one or two of the three species, indicating considerable divergence in binding sites across these yeasts (Fig. 2C). Because the fraction of nonconserved genes among S. cerevisiae, S. mikatae, and S. bayanus is less than 0.05% (2), the amount of variation in TF binding is substantially larger than that of gene variation.
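The three classes of binding events, and the conserved percentages quoted above, can be reproduced with a short Python sketch. The classification function and the toy gene IDs are assumptions for illustration; only the counts 47 of 221 (Ste12) and 50 of 255 (Tec1) come from the text.

```python
from collections import Counter


def classify_targets(bound_by_species):
    """Count targets by the number of species in which they are bound.

    bound_by_species: dict mapping species name -> set of bound target IDs.
    Returns a Counter keyed by 1 (species-specific), 2, or 3 (conserved).
    """
    all_targets = set().union(*bound_by_species.values())
    return Counter(
        sum(target in bound for bound in bound_by_species.values())
        for target in all_targets
    )


def percent_conserved(n_conserved, n_total):
    """Percentage of all bound targets that are conserved."""
    return 100.0 * n_conserved / n_total


# Made-up gene IDs for illustration only (not data from the study).
toy = {
    "S. cerevisiae": {"g1", "g2", "g3"},
    "S. mikatae": {"g1", "g3", "g4"},
    "S. bayanus": {"g1", "g5"},
}
print(classify_targets(toy))

# Counts reported in the text.
print(round(percent_conserved(47, 221)))  # Ste12
print(round(percent_conserved(50, 255)))  # Tec1
```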

Fig. 2.

Comparison of binding by Ste12 and Tec1 across S. cerevisiae (red), S. mikatae (blue), and S. bayanus (green). (A) Conserved binding. (B) Conserved binding with quantitative signal differences. (C) Conserved binding with loss of consensus sequences in one species. (D) Species-specific binding despite conserved consensus sequences. (E) Binding only in S. mikatae and S. bayanus. ChIP-chip enrichment signals are shown (log2 ratios). Circles and squares represent matches to Tec1 PWM and Ste12 PWM, respectively. Triangles, nonconserved peaks; **, >2-fold difference in peak signal intensity; *, >1.5-fold difference in peak signal intensity.

One possible cause for the interspecies differences in the ChIP binding locations is divergence in binding site sequences. To examine this possibility, we investigated sequence motifs in both bound and orthologous unbound regions across the three Saccharomyces species. Position weight matrices (PWM), representing the putative binding motifs for Ste12 and Tec1, were generated from the ChIP-chip data (13). Analysis of the Tec1 targets of the three species revealed an overrepresented sequence motif that matched the known Tec1 consensus (7) (Fig. 3A), whereas the targets of Ste12 in S. cerevisiae and S. mikatae revealed a motif that was similar to the known binding sequence (14) (Fig. 3B). This sequence was not overrepresented in S. bayanus.

Fig. 3.

Motif analysis of ChIP binding targets. Logo representations of the PWM for Tec1 (A) and Ste12 (B) (12). (C and D) Classes of binding targets after classification by both the conservation of ChIP binding and the presence or absence of consensus motifs. (E and F) Compiled proportions of binding targets and PWM matches for Tec1 and Ste12.

With the use of the PWM sequences, ChIP bound regions and orthologous unbound regions from each species were then scored for the presence of each motif (15). There were several significant classes of TF binding events, with those genes bound in all three species present near the top of both the Tec1 (all bound, motif in all) and Ste12 (all bound, with and without motif) lists (Fig. 3, C and D). For promoter regions that displayed evolutionarily conserved ChIP binding in all three Saccharomyces species, 83% (Tec1) and 24% (Ste12) of the regions contained at least one significant occurrence of the PWM motif for that factor in each species (Fig. 3, E and F). In contrast, 2% (Tec1) and 62% (Ste12) of the promoters that displayed conserved ChIP binding did not contain a match to the PWM in at least two of the three species. Thus, the Ste12 motif is not present in a high proportion of pseudohyphal-responsive genes, implying that Tec1 may target Ste12 to these regulatory regions (16).
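Scoring a region for the presence of a motif is commonly done by sliding a log-odds matrix along the sequence and keeping the best window. The Python sketch below shows that general technique under stated assumptions: the three-column matrix is a toy, not the Tec1 or Ste12 PWM from the study; a uniform background of 0.25 per base is assumed; only the forward strand of an uppercase A/C/G/T sequence is scanned; and the threshold is arbitrary. The scoring procedure actually used is described in the supporting methods (15).

```python
import math

BASES = "ACGT"


def log_odds_matrix(prob_columns, background=0.25):
    """Convert per-position base probabilities into log2 odds against a
    uniform background (an assumption of this sketch)."""
    return [
        {base: math.log2(column[base] / background) for base in BASES}
        for column in prob_columns
    ]


def best_pwm_score(seq, lod):
    """Return (score, start) of the best-scoring window, or None if the
    sequence is shorter than the motif. Forward strand only."""
    width = len(lod)
    best = None
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(lod[i][base] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


def has_match(seq, lod, threshold):
    """True if any window scores at or above the (arbitrary) threshold."""
    best = best_pwm_score(seq, lod)
    return best is not None and best[0] >= threshold


# Toy 3-bp matrix favouring "ACG"; purely illustrative.
toy_lod = log_odds_matrix([
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
])
print(best_pwm_score("TTACGTT", toy_lod))
```

Applying `has_match` to a bound region and to its unbound orthologs in the other species gives the per-species presence or absence calls that the classification in Fig. 3 relies on.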

In contrast to the previous results in which experimentally determined binding correlated with the presence of predicted motifs, there were many examples where a species-specific loss of binding and/or a loss of sequence have occurred. There were 48 (Tec1, 14% of total binding events) and 35 (Ste12, 10% of total binding events) experimentally bound regions that contained a PWM match where the orthologous region in at least one other species neither was bound nor contained a motif match. For these loci, loss of ChIP binding is concordant with the loss of the motif for this factor, providing clear examples of regions where network evolution occurred through the gain or loss of regulatory sequences.

Furthermore, there were 45 (Tec1, 12%) and 9 (Ste12, 3%) instances where a PWM match occurred in all three species but where that region was experimentally bound in only two species (Fig. 2D). Either these loci are occupied at other times in the life cycle or they are not functional. Conversely, in 11 (Tec1, 3%) and 22 (Ste12, 6%) instances, genomic regions displayed conserved ChIP binding, but at least one species was missing a PWM motif match (Fig. 2E). Thus, sequence conservation does not readily predict binding.
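The categories discussed in the last two paragraphs combine two per-species calls for each orthologous region: whether it was bound by ChIP and whether it contains a PWM match. A minimal Python sketch of such a cross-classification follows. The category labels and the function are hypothetical and only approximate the classes described in the text; they are not the definitions used for Fig. 3.

```python
def categorize(bound, motif):
    """Classify one orthologous region.

    bound, motif: dicts mapping species name -> bool (same keys in both).
    """
    species = list(bound)
    n_species = len(species)
    n_bound = sum(bound[s] for s in species)
    n_motif = sum(motif[s] for s in species)

    if n_bound == 0:
        return "unbound in all species"
    if n_bound == n_species:
        if n_motif == n_species:
            return "bound in all, motif in all"
        return "bound in all, motif missing in at least one species"
    if n_motif == n_species:
        return "motif in all, binding missing in at least one species"
    if all(bound[s] == motif[s] for s in species):
        return "binding and motif lost together"
    return "other"


# Made-up calls for illustration only.
example = categorize(
    {"S. cerevisiae": True, "S. mikatae": False, "S. bayanus": True},
    {"S. cerevisiae": True, "S. mikatae": False, "S. bayanus": True},
)
print(example)
```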

To further examine the role of conserved versus nonconserved ChIP binding events and motifs, we compared these results with expression microarray studies of pseudohyphal formation in S. cerevisiae (17). Of the ChIP binding gene targets that had significantly altered expression (∼20% of the ChIP targets, a several-fold enrichment), there was no enrichment for genes with conserved binding (11% bound versus 14% unbound) or PWM matches (12% with motif versus 16% without motif) (table S11). Thus, sequence-based motif analyses in the absence of experimentally determined binding data are not sufficient for the accurate prediction of TF binding profiles and gene function. In addition, the presence of nonconserved ChIP targets upstream of pseudohyphal-regulated genes at the same frequency as conserved targets indicates that nonconserved sites are likely to be functional.

To elucidate the biological importance of both the conserved and species-specific gene targets, we mapped each bound region to its downstream target genes by identifying open reading frames (ORFs) that were 3′ of and directly flanking each ChIP binding event (tables S7 and S8). Conserved Ste12 and Tec1 gene targets displayed enrichment for two gene ontology (GO) (18) categories: “filamentous growth” and “regulation of transcription from RNA polymerase II promoters” (Fig. 4A). Because most of the genes from within the second category encode TFs, the predicted downstream TF networks of S. bayanus and S. mikatae were compared to those of S. cerevisiae (19) to determine which connections had been maintained during the evolution of the Saccharomyces sensu stricto group (Fig. 4C). The binding of Ste12 and Tec1 to downstream TFs was shown to be highly conserved (73% across the three species). The network of S. mikatae was most diverged and had several key regulatory omissions including Flo8 (not bound by either Ste12 or Tec1) and Mga1 (not bound by Ste12). Thus, although important differences can be found, TF binding to the promoters of other TFs was highly conserved between species relative to the level of conservation observed for other genes.

Fig. 4.

(A) Ste12 and Tec1 bind to common and distinct sets of genes across the Saccharomyces sensu stricto lineage. Overrepresented GO terms are listed for each combinatorial category. (B) Mating genes bound specifically by Ste12 in S. mikatae and S. bayanus. (C) TF network conservation in S. cerevisiae (red), S. mikatae (blue), and S. bayanus (green).

From those groups of genes that did not display conserved binding across the three species, one notable class was bound by Ste12 specifically in S. mikatae and S. bayanus and was enriched in genes involved in mating (GO category: “reproduction in single-celled organisms”) in S. cerevisiae (Fig. 4, A and B). Unlike the gene targets in the diploid cells used in this study, these genes are targets of Ste12 in haploid S. cerevisiae cells (20, 21), and this differential binding occurs despite the presence of conserved Ste12 binding motifs (fig. S3). Thus, Ste12 binding targets may be occupied under different conditions across related species. In S. cerevisiae, Ste12 binds to these sites only during mating, whereas in S. mikatae and S. bayanus, Ste12 binds to these same regions in diploid cells.

To extend this study outside of Saccharomyces yeasts, we also mapped the binding of the Candida albicans Ste12 ortholog, Cph1 (22). Cph1 functions in the dimorphic switch of this yeast, a process that shares many genetic components with pseudohyphal growth (23). A total of 52 significant Cph1 ChIP binding events (table S12) was detected under dimorphic growth conditions, with many residing upstream of known pathogenicity determinants (24–27). From these gene targets, 33 have recognizable orthologs in S. cerevisiae, and of these orthologs, 10, 10, and 13 displayed conserved binding with S. cerevisiae, S. mikatae, and S. bayanus, respectively. Although most gene targets of Cph1 in C. albicans are not conserved with the Saccharomyces species, the C. albicans orthologs bound by Ste12, like those from S. mikatae and S. bayanus, included a significant number of genes that function during reproduction and mating in S. cerevisiae (P = 4 × 10⁻³) (18). Thus, in C. albicans, as in S. mikatae and S. bayanus, the Ste12 ortholog also binds to genes required for mating in S. cerevisiae under filamentous growth conditions, raising the possibility that these genes have become more specialized in S. cerevisiae.
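Enrichment of a functional category among a set of bound genes is often assessed with a one-sided hypergeometric test. The Python sketch below shows that calculation as one common choice; it is an assumption that a test of this kind underlies the reported P value, since the text does not name the test. The function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k): probability of seeing at least k annotated genes when
    n genes are drawn without replacement from a genome of N genes,
    K of which carry the annotation."""
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)


# Toy numbers for illustration only: 10 genes, 3 annotated, 2 drawn, both annotated.
print(hypergeom_tail(2, 2, 3, 10))
```

In practice the result would also be corrected for the number of categories tested.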

We find that extensive regulatory changes can exist in closely related species, which is consistent with a recent study that showed that distinct regulatory circuits can produce similar regulatory outcomes in S. cerevisiae and C. albicans (28). Furthermore, although S. cerevisiae and S. mikatae are quite similar to one another at the nucleotide sequence level, they are equally different to each other and to S. bayanus in their TF profiles. We expect that the extensive binding site differences observed in this study reflect the rapid specialization of these organisms for their distinct ecological environments and that differences in transcription regulation between related species may be responsible for rapid evolutionary adaptation to varied ecological niches.

Supporting Online Material

Materials and Methods

Figs. S1 to S5

Tables S1 to S14


References and Notes
