A Rapidly Evolving Homeobox at the Site of a Hybrid Sterility Gene

Science  20 Nov 1998:
Vol. 282, Issue 5393, pp. 1501-1504
DOI: 10.1126/science.282.5393.1501


The homeodomain is a DNA binding motif that is usually conserved among diverse taxa. Rapidly evolving homeodomains are thus of interest because their divergence may be associated with speciation. The exact site of the Odysseus (Ods) locus of hybrid male sterility in Drosophila contains such a homeobox gene. In the past half million years, this homeodomain has experienced more amino acid substitutions than it did in the preceding 700 million years; during this period, it has also evolved faster than other parts of the protein or even the introns. Such rapid sequence divergence is driven by positive selection and may contribute to reproductive isolation.

The homeodomain is a stretch of 60 to 62 amino acids that was first discovered to be conserved between many homeotic genes in Drosophila (1). Proteins containing this DNA binding motif are usually transcription factors and have been found in most metazoans (2). Evolutionary conservation of homeoboxes has been well documented; for example, the homeodomains in the Antp gene of Drosophila and grasshoppers have identical sequences (3). Such a high degree of conservation in the protein sequences is often taken as evidence for conservation of the underlying functions. It is thus of great interest to find exceptions in such a highly conserved gene family. Are there homeobox genes that have evolved rapidly and, if there are, what are their functions? What are the selective forces that make them deviate from the norm for this class of genes? It is not inconceivable that their sequence divergence and the evolution of the underlying functions may even play a role in differentiation among closely related species. In this report, we describe the cloning of a new homeobox gene that has experienced accelerated evolution in the Drosophila melanogaster clade. The acceleration is 100 to 1000 times greater than the rate experienced by its homologs in other taxa. The new homeobox gene was discovered in the search for a “speciation gene” that causes hybrid male sterility.

In a series of studies, several genetic elements responsible for reproductive isolation between Drosophila simulans and Drosophila mauritiana have been identified (4). One of them, mapped to the cytological interval 16D on the X chromosome, is named Odysseus (Ods) (5, 6). The introgression of an appropriate Ods-containing region of D. mauritiana into D. simulans renders males completely sterile. The allelic state of Ods is nearly fixed in both species (7). In other words, Ods-induced hybrid male sterility is observable between any pairwise combination of D. mauritiana and D. simulans lines.

To delineate the Ods locus precisely, we generated 190 new recombinants with progressively shortened introgressions (Fig. 1). With eight molecular markers (8), 63 of these introgressions are male fertile and the remaining 127 are male sterile. In agreement with (6), the distinction between fertile lines (>90% fertility penetrance for each line) and sterile lines (0%) is clear-cut. The two longest fertile introgressions and the two shortest sterile introgressions define the location of the Ods locus. Because the breakpoints of the four introgressions all fall within a genomic clone of 8.4 kb (U8 in Fig. 1), it is plausible that the Ods gene, or at least part of it, resides within this clone.
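The mapping logic in the preceding paragraph can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes a single locus that every sterile introgression must carry and that no fertile introgression carries, and all coordinates and the function name `candidate_region` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_kb, end_kb); coordinates are hypothetical


def candidate_region(sterile: List[Interval], fertile: List[Interval]) -> List[Interval]:
    """Return the sub-intervals present in every sterile introgression
    and absent from every fertile introgression."""
    # Every sterile introgression is assumed to carry the locus: intersect them.
    start = max(s for s, _ in sterile)
    end = min(e for _, e in sterile)
    if start >= end:
        return []
    region = [(start, end)]
    # No fertile introgression is assumed to carry the locus: subtract each one.
    for f_start, f_end in fertile:
        remaining = []
        for r_start, r_end in region:
            if f_end <= r_start or f_start >= r_end:
                remaining.append((r_start, r_end))  # no overlap, keep as is
                continue
            if f_start > r_start:
                remaining.append((r_start, f_start))  # piece left of the fertile segment
            if f_end < r_end:
                remaining.append((f_end, r_end))  # piece right of the fertile segment
        region = remaining
    return region


# Toy example: two sterile and two fertile introgressions (made-up positions in kb).
sterile_lines = [(0.0, 30.0), (18.0, 60.0)]
fertile_lines = [(0.0, 21.0), (27.0, 60.0)]
region = candidate_region(sterile_lines, fertile_lines)
print(region)
```

In this toy case the shortest sterile and longest fertile introgressions bound the candidate interval, which mirrors how the four breakpoints in the text bracket the locus. The sketch ignores the possibility, noted in the text, that only part of the gene lies within the delimited clone.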

Figure 1

Molecular mapping of the Ods gene. Thick line represents cytological interval 16D covered by two P1 phage clones; line segments below indicate subclones, with the number denoting their size in kilobases. Open boxes represent introgressions from D. mauritiana. Introgressions are grouped according to size and the number of lines in each size class is given. The difference between the fertile and sterile introgressions is mapped to the U8 clone. Three of the four exons of the OdsH transcript are contained within U8.

We first obtained the complete DNA sequence of the U8 clone from D. melanogaster and used DNA software programs to identify three putative exons. On the basis of the putative exon sequences, we designed polymerase chain reaction (PCR) primers to analyze transcripts in a series of experiments. By the reverse transcriptase–PCR (RT-PCR) procedure, we could detect transcripts of the predicted sizes spanning exons 2 and 3 in both larval and adult stages (8). Then, a near full-length cDNA was obtained from a testis cDNA library by PCR amplification with primers in the exons and in the cloning vector. Finally, the 3′ end was determined by the RACE (rapid amplification of cDNA ends) protocol (8). Translation of this cDNA sequence including exon 1, which is located distal to U8, is shown in Fig. 2. The putative protein is 349 amino acids long. Because of the presence of a homeobox in exons 2 and 3, we have named this new transcript OdsH (for Ods-site homeobox gene). The name implies the correspondence in position between the genetic and molecular data without stating their functional equivalence.

Figure 2

Amino acid sequence of OdsH translated from the testis cDNA clone of D. melanogaster. The homeodomain is underlined and the intron positions are indicated with arrowheads. The first methionine was putatively identified by analyzing DNA sequences from both D. melanogaster and D. mauritiana. GenBank accession number for the cDNA sequence is AF095575.

The best and highly significant matches with OdsH in the database are the unc-4 gene of Caenorhabditis elegans, uncx4.1 of mouse (and its rat homolog), and an unpublished sequence from planaria (9). These sequences comprise a homologous cluster belonging to the paired-type subfamily of homeobox genes. Some of these homologous genes from very divergent taxa (notably Drosophila, rodents, and planaria) also have significant matches beyond the homeodomain. For example, 13 of the 14 amino acids adjoining the COOH-terminus of the homeodomain are identical in mouse and Drosophila. A high level of conservation extends for 33 amino acids from the COOH-terminus. In this report, we focus on exons 2 and 3 because their products contain the conserved homeodomain, which allows us to contrast long-term evolutionary stability (such as between mammals and Drosophila) with recent rapid changes (between sibling species). Homology with non-Drosophila species outside these two exons is too low to be informative.

In comparing homologous genes from different species, it is important to distinguish between orthologous and paralogous sequences. The former share a common ancestry due to speciation and the latter by gene duplication. Paralogous genes from different species must have diverged for a longer time than the orthologous copies. The mouse and C. elegans homeodomains in Fig. 3A have diverged at only 7 sites of the 60 residues. Between the mammalian homolog and OdsH of D. melanogaster, there are 17 differences.

Figure 3

(A) Amino acid sequence of exons 2 and 3 from four sibling species of Drosophila: D. simulans (sim), D. sechellia (sec), D. mauritiana (mau), and D. melanogaster (mel). Multiple alleles (three to six) from each species are sequenced. With the exception of D. simulans, which has three polymorphic sites, all species are invariant in their amino acid sequence. Arrow indicates intron position and thick bar covers the homeodomain. Ods homologs from rat and C. elegans are also indicated. Alignment outside the homeodomain is not always possible. (B) Sequences of the intron between exons 2 and 3 from the three species of Drosophila. Only 100 base pairs contiguous with either exon 2 or exon 3 are presented.

Although the homologs of OdsH appear conserved among distantly related taxa like mammals and Drosophila, OdsH has evolved rapidly among sibling species of the D. melanogaster clade. Amino acid sequences corresponding to exons 2 and 3 of OdsH in Drosophila species are presented in Fig. 3A. D. simulans, Drosophila sechellia, and D. mauritiana are close enough to produce fertile F1 females (10) and are estimated to have diverged for about half a million years (11). Each of the trio can also be crossed to D. melanogaster (10), which probably diverged from the trio about 1 million years ago (12, 13). We sequenced multiple alleles from each species to ensure that no anomaly is associated with sampling or laboratory procedures. Within-species variation (or the lack thereof) will be useful for future population genetic analysis.

The most visually striking feature of Fig. 3 is the surge of amino acid changes between these sibling species within the homeodomain itself. Between D. simulans and D. mauritiana, the species pair analyzed for the Ods-induced hybrid sterility, there are 15 amino acid differences in the homeodomain of OdsH. This difference is almost as large as that between OdsH of Drosophila and uncx4.1 of mouse, and it is larger than that between C. elegans and mouse. For a comparison with another closely related species pair, the homologs between mouse and rat have only 1 amino acid difference after more than 20 million years of divergence (14). It should be noted that the acceleration in OdsH is much greater than in other well-known fast-evolving homeoboxes; for example, the ftz homeobox differs by 10 or 11 amino acids in different orders of insects (2), less divergent than the OdsH of Drosophila sibling species. Finally, the homeodomain appears to have evolved more rapidly than the remaining part of the protein or the adjoining intron sequences between D. simulans and D. mauritiana (Fig. 3).
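As a rough back-of-the-envelope check on the scale of this acceleration, the figures quoted above can be turned into crude pairwise rates. The sketch below is only illustrative: the function name `pairwise_rate` is hypothetical, the rodent divergence time is taken as exactly 20 million years although the text says "more than," and no correction is made for multiple substitutions at a site.

```python
def pairwise_rate(differences: int, divergence_my: float) -> float:
    """Crude rate: amino acid differences per million years of separation."""
    return differences / divergence_my


# Figures quoted in the text (homeodomain only).
sibling_rate = pairwise_rate(15, 0.5)   # D. simulans vs. D. mauritiana
rodent_rate = pairwise_rate(1, 20.0)    # mouse vs. rat (assumed exactly 20 My)

fold = sibling_rate / rodent_rate
print(f"sibling species: {sibling_rate:.2f} differences per My")
print(f"mouse vs. rat:   {rodent_rate:.2f} differences per My")
print(f"fold acceleration: about {fold:.0f}x")
```

Under these assumptions the ratio comes out at roughly 600-fold, which falls within the 100- to 1000-fold range stated in the introduction.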

Given the large differences in the OdsH sequences of the sibling species, it is imperative to show that these genes are orthologous. Introns and the surrounding noncoding sequences were obtained from the corresponding genomic regions. Two short stretches of sequence from both ends of the intron between exons 2 and 3 are shown in Fig. 3B (the whole 1.3 kb has been analyzed as indicated in Table 1). The orthologous relationship is unambiguous. The difference between D. simulans and D. mauritiana in this entire intron of 1.3 kb is 1.44%, which agrees very well with the average value for other noncoding sequences at 1% to 2% between this species pair (11, 13).
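For readers who want to reproduce this kind of noncoding divergence estimate, a minimal sketch follows. It computes the proportion of differing sites between two aligned sequences and applies the Jukes-Cantor correction for multiple hits. The choice of Jukes-Cantor is an assumption made for illustration; the text does not say here which estimator underlies the 1.44% figure (Table 1 cites reference 21 for its method), and the toy sequences are made up.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    skipping alignment columns that contain a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return diffs / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor estimate of substitutions per site from a p-distance."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment (hypothetical sequences): 1 difference in 10 ungapped sites.
p_toy = p_distance("ACGTACGTAC", "ACGTACGAAC")
d_toy = jukes_cantor(p_toy)
print(f"p = {p_toy:.3f}, Jukes-Cantor d = {d_toy:.4f}")

# At a divergence as low as the 1.44% quoted in the text, the correction is tiny.
d_low = jukes_cantor(0.0144)
print(f"d at p = 0.0144: {d_low:.5f}")
```

At divergences of 1% to 2%, corrected and uncorrected values are nearly identical, so the choice of estimator matters little for the intron comparison.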

Table 1

Divergence in the homeodomain-containing exons of OdsH between sibling species of Drosophila. Numbers are estimated substitutions per base pair (bp) (21). Most nucleotide changes in the exons are nonsynonymous (see Fig. 4). Sim, D. simulans; mau, D. mauritiana; mel, D. melanogaster.

We now ask whether OdsH has become a nonessential gene in Drosophila with relaxed selection or whether positive selection for favorable amino acids has been driving the rapid evolution. From the results of Table 1, the rate of nucleotide substitution in the homeodomain between D. simulans and D. mauritiana is 8 times higher than that in the intron, which should be close to the neutral rate. The difference is highly significant (P < 0.01 by Fisher's exact test) and is most compatible with the positive-selection interpretation (15). On the other hand, the rate in the nonhomeodomain is slightly less than the intron rate. In the comparisons between more distantly related pairs involving D. melanogaster, the substitution rates in the homeodomain are only twice as high as those in the intron but the nonhomeodomain appears to have evolved rapidly as well. Therefore, a conservative conclusion is that negative selection on OdsH has generally been relaxed in the D. melanogaster clade but positive selection can be inferred in some cases, such as in the homeodomain of D. mauritiana and D. simulans.

In a more detailed analysis, we compare the number of nucleotide substitutions that result in amino acid replacements (R) and those that are silent (S) along each branch (8). The R/S numbers in the homeodomain are 10:1 in the branch leading to D. mauritiana but 10:9 in the branch between node A of Fig. 4 and D. melanogaster. The difference is significant (P < 0.05 by Fisher's exact test), which suggests an excess in replacements in D. mauritiana relative to that in D. melanogaster. We may also compare the replacement numbers in the homeodomain and nonhomeodomain, both about 60 residues, between species. The replacement numbers in the two domains are 10:1 in the D. mauritiana lineage and 10:15 in the D. melanogaster lineage (P < 0.01), which suggests that the selective pressure in the homeodomain, relative to that in the nonhomeodomain, is very different in these two species. This test also reveals a marginally significant difference between D. simulans and D. melanogaster (6:1 and 10:15, respectively, with P = 0.041 in the one-tailed test). Apparently, the homeodomain has been evolving rapidly in all three Drosophila species but the acceleration is most dramatic in D. mauritiana.
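The three 2 x 2 comparisons above can be checked with a short one-tailed Fisher's exact test. The sketch below is an independent illustration written with the Python standard library (Python 3.8 or later for `math.comb`); the function name `fisher_one_tailed` is hypothetical, and only the third comparison is stated in the text to be one-tailed, so applying the same tail to the first two is an assumption.

```python
from math import comb


def fisher_one_tailed(a: int, b: int, c: int, d: int) -> float:
    """One-tailed Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of a table whose
    top-left cell is at least as large as the observed value a.
    """
    row1 = a + b          # first row total
    col1 = a + c          # first column total
    n = a + b + c + d     # grand total
    total = comb(n, row1)
    tail = 0
    for k in range(a, min(row1, col1) + 1):
        # Hypergeometric term: k from column 1 and (row1 - k) from column 2.
        tail += comb(col1, k) * comb(n - col1, row1 - k)
    return tail / total


# Counts quoted in the text.
p_rs = fisher_one_tailed(10, 1, 10, 9)        # R:S, D. mauritiana vs. D. melanogaster
p_dom_mau = fisher_one_tailed(10, 1, 10, 15)  # homeodomain:nonhomeodomain, mau vs. mel
p_dom_sim = fisher_one_tailed(6, 1, 10, 15)   # homeodomain:nonhomeodomain, sim vs. mel

print(f"R:S, mau vs. mel:     P = {p_rs:.4f}")
print(f"domains, mau vs. mel: P = {p_dom_mau:.4f}")
print(f"domains, sim vs. mel: P = {p_dom_sim:.4f}")
```

The third value reproduces the reported P = 0.041, and the first two fall below the reported thresholds of 0.05 and 0.01, respectively.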

Figure 4

Summary of sequence evolution in OdsH and its homologs. (A) Numbers of inferred amino acid changes in the homeodomain are indicated next to the branches. Assignment is according to the method of Fitch (20). Because of multiple nucleotide substitutions within each codon, the numbers of nonsynonymous nucleotide substitutions underlying these amino acid changes are 8.4 (to C. elegans), 8.1 (to the mouse/rat ancestor), and 20.6 (to Drosophila). (B) Numbers of amino acid changes in the homeodomain are indicated next to the branches as in (A). All three branches are between node A and the extant species. Numbers of nucleotide substitutions underlying these amino acid replacements in the homeodomain are indicated in the first row below (R). The numbers of silent nucleotide changes are also indicated (S). For comparisons, changes in the nonhomeodomain are also indicated.

What may be the nature of the positive selection that drives the evolution of OdsH? We note that the putative homologs of OdsH in mammals and nematodes function in neural tissues (9) but OdsH is also expressed in the Drosophila testis. It is plausible that the accelerated evolution of OdsH is concomitant with its acquisition of a male germ line function. [There is preliminary evidence for ancient duplication of OdsH in Drosophila, and this paralogous gene is expressed in neural tissue (16); therefore, OdsH could have been selected for new functions.] Male reproductive function is sometimes accompanied by opportunities for sexual selection driving rapid sequence evolution (17). If the hypothesis is correct, one would expect to observe accelerated evolution in species whose OdsH homolog has acquired male germ line expression. Whatever the cause of the rapid amino acid substitution, a by-product of these changes could be hybrid male sterility. Because males bearing either fertile or sterile introgressions indicated in Fig. 1 differ only in the species origin of the region of exons 3 and 4 of OdsH, which, moreover, is expressed in both fertile and sterile males, the cause(s) of hybrid sterility must be primarily in the amino acid sequences of these exons.

Studies of speciation among closely related species have barely entered the molecular era, notably in marine invertebrate systems (18). At the other end of the spectrum, homeobox genes have been extensively studied but mostly among very distantly related taxa (19). Molecular characterization of genes like OdsH may allow us to combine speciation studies and molecular evolution analysis into a coherent discipline.

* To whom correspondence should be addressed. E-mail: ciwu{at}

