Pesticide Resistance via Transposition-Mediated Adaptive Gene Truncation in Drosophila


Science  29 Jul 2005:
Vol. 309, Issue 5735, pp. 764-767
DOI: 10.1126/science.1112699


To study adaptation, it is essential to identify multiple adaptive mutations and to characterize their molecular, phenotypic, selective, and ecological consequences. Here we describe a genomic screen for adaptive insertions of transposable elements in Drosophila. Using a pilot application of this screen, we have identified an adaptive transposable element insertion, which truncates a gene and apparently generates a functional protein in the process. The insertion of this transposable element confers increased resistance to an organophosphate pesticide and has spread in D. melanogaster recently.

The Drosophila genome contains a large number of active transposable element (TE) families that generate new TE insertions (1, 2). Many such TEs are deleterious at least partly because ectopic recombination among them scrambles chromosomes (3, 4). Thus, many new TE insertions cannot reach high population frequencies unless they either recombine infrequently (4) or they lead to a sufficiently beneficial change to overcome the disadvantage of ectopic recombination.

To search for unusually frequent and therefore putatively adaptive TE insertions, we conducted a population survey of all of the 16 identified insertions of long interspersed element (LINE)–like Doc TEs (5) located in regions of high recombination in the sequenced D. melanogaster genome (4). All Doc elements except one (Doc1420) appeared to be subject to strong purifying selection. Doc1420 occurred unusually frequently (4) despite being neither unusually short nor unusually divergent, which suggests that it either generates or is closely linked to an adaptive mutation.

Doc1420 is very frequent worldwide (∼80%) but is much rarer (G test; P = 0.0001) in putatively ancestral African populations of Drosophila in Zimbabwe and Malawi (table S1). This pattern is consistent with a genetic linkage between Doc1420 and an adaptation associated with global expansion of D. melanogaster out of Africa.

To investigate the possible adaptive effects of Doc1420, we examined its effects on neighboring genes. Doc1420 interrupts a predicted gene, CG10618 (which we named CHKov1). CHKov1 contains four exons [of length 175, 494, 367, and 197 base pairs (bp)], and Doc1420 has inserted into the second exon (6). CHKov1 appears to be functional; the D. yakuba ortholog of CHKov1 [D. yakuba diverged from D. melanogaster ∼5 million years ago (7)] shows conservation of all six intron junctions and of the predicted open reading frame. Moreover, there is a significant lack of amino acid substitutions (Ka), compared with the number of synonymous substitutions (Ks) between these two sequences (Ka/Ks = 0.1332; 95% confidence interval is 0.0486 to 0.3648), indicating that purifying selection has been preserving the sequence of the CHKov1 protein for the past ∼10 million years.

The Doc1420 insertion in CHKov1 generates two sets of altered transcripts (Fig. 1B) (6). In the presence of Doc1420, the transcript containing all four exons (and thus the original protein) appears to be absent. CHKov1 and its paralogs share the PFAM domain DUF227 (amino acids 94 to 321), whose function is unknown (8), and a SMART domain CHK (amino acids 130 to 321), with a putative choline kinase function (9). The developmental profiles (10) of several of the closest paralogs of CHKov1 have a consistent pattern of expression: high levels of expression in larvae and adults and low levels of expression in the embryo and pupae (fig. S2). This pattern is consistent with involvement of the paralogs of CHKov1 (and by implication, with CHKov1 itself) in digestion or detoxification. The altered transcripts (Fig. 1B) lack most of the conserved PFAM and SMART domains, which suggests that the original enzymatic function of CHKov1 is likely to be lost in the presence of Doc1420.

Fig. 1.

Structure of CHKov1. (A) The structure of the ancestral transcript (before the insertion of Doc1420). The four exons are represented by rectangles. The transcript is depicted with black arrows. The striped and crosshatched areas represent PFAM domain DUF227 and SMART domain CHK, respectively. (B) Display of the altered transcripts in the strain possessing Doc1420. The first altered transcript contains the first exon, part of the second exon on the 5′ side of Doc1420, and ends 391 to 394 bp inside the Doc1420 sequence. Identified transcripts in the second set start 3465 bp inside Doc1420 and contain a short region of variable length (21, 38, and 77 bp in the three sequenced cases); they splice out, respectively, 1555, 1685, and 1549 bp regions containing part of the second exon on the 3′ side of Doc1420 and part of the third exon; and they proceed until the end of the gene. In addition, one of the sequenced transcripts fails to splice out the intron between the third and fourth exons.

If Doc1420 is adaptive (or linked to an adaptive mutation), we expect to find the signature of an incomplete selective sweep: a sharp reduction of variability among the alleles linked to Doc1420. We tested this prediction by sequencing two regions [1.9 kilobases (kb) in the 5′ direction and 1.5 kb in the 3′ direction] immediately adjacent to Doc1420 in an unstratified sample of 43 isofemale North American strains and seven isofemale Zimbabwe strains. We also sequenced this region in a single isofemale strain of D. simulans.

The haplotype structure near Doc1420 (Fig. 2) strongly supports the hypothesis of a recent incomplete selective sweep. We found only six distinct haplotypes among the 34 alleles containing the Doc1420 insertion, whereas each one of the 23 alleles without Doc1420 is a unique haplotype. We conducted a series of coalescent simulations (6) to demonstrate that this structure is unexpected under neutrality. In 100,000 simulations, we failed to generate any samples with fewer than 18 distinct haplotypes (compared with the observed six distinct haplotypes) linked to a polymorphism as frequent as Doc1420 [P < 0.00001, assuming recombination rate of 3.3 cM/Mbp (11); P = 0.0009, assuming no recombination].
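The bound P < 0.00001 follows directly from the number of simulations: when no simulated sample is as extreme as the observed one, the empirical P value can only be bounded above by one over the simulation count. The following is a minimal sketch of that bookkeeping only; the helper name is hypothetical and the coalescent simulations themselves (6) are not reproduced.

```python
def empirical_p_value(n_as_extreme, n_simulations):
    """Summarize a simulation-based P value.

    n_as_extreme  -- simulated samples at least as extreme as the observation
    n_simulations -- total number of simulated samples
    Returns (relation, value), where relation is "<" when no simulated
    sample was as extreme (only an upper bound is available) and "=" otherwise.
    """
    if n_simulations <= 0:
        raise ValueError("n_simulations must be positive")
    if n_as_extreme == 0:
        return "<", 1.0 / n_simulations
    return "=", n_as_extreme / n_simulations


# None of the 100,000 simulated samples had as few distinct haplotypes
# as observed, so only an upper bound can be reported.
relation, value = empirical_p_value(0, 10**5)
print(f"P {relation} {value:g}")
```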

Fig. 2.

Sequence of the 3.4-kb region flanking Doc1420. The figure shows the segregating sites (SS) within the 3.4-kb region flanking the Doc1420 insertion. The SS number and the distance from Doc1420 are at the top. The SS within coding regions are in bold and are identified as replacement (R) or synonymous (S) polymorphisms. The SS within CHKov1 map to either derived transcript 1 (TRANSCRIPT 1) or to derived transcript 2 (TRANSCRIPT 2). Doc1420 is shown as a black rectangle; the absence of Doc1420 is shown as an empty rectangle. The state for all of the segregating sites in the D. simulans sequence is shown at the bottom.

If an incomplete selective sweep is associated with Doc1420, its signature should decay at increasing distances from Doc1420. Indeed, at a distance of ∼18 kb in both the 5′ and 3′ directions from Doc1420 (fig. S1), neutrality cannot be rejected, whereas at the two closer regions (15 kb in the 5′ direction and 11 kb in the 3′ direction from Doc1420), we found a modest signature of positive selection (P < 0.05 in each case). These results suggest that the incomplete selective sweep is centered at or very near to Doc1420.

We can use the rate of decay of the haplotype structure to estimate (12) that the latest incomplete selective sweep in this region occurred ∼500 to 2400 generations ago. If one assumes 10 to 20 generations per year, this translates into ∼25 to 240 years ago. The spread of Doc1420 in the worldwide population of D. melanogaster appears at about the same time that drastic anthropogenic changes in the environment occurred, and possibly concurrently with the worldwide expansion of D. melanogaster out of Africa.
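The conversion from generations to years pairs the smallest generation count with the fastest turnover and the largest count with the slowest. A short sketch of that arithmetic, with a hypothetical function name:

```python
def generations_to_years(generations, generations_per_year):
    """Convert an age in generations to an age in years."""
    return generations / generations_per_year


# Lower bound: fewest generations at the highest assumed turnover.
youngest = generations_to_years(500, 20)
# Upper bound: most generations at the lowest assumed turnover.
oldest = generations_to_years(2400, 10)
print(f"sweep age: about {youngest:.0f} to {oldest:.0f} years")
```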

What is the adaptive effect that might be responsible for this recent expansion of the Doc1420-containing allele? We deduced that changes in CHKov1 (a putative choline kinase) might affect choline metabolism in general and possibly the function of acetylcholine esterase. Given that acetylcholine esterase is the target of several types of pesticides, including organophosphates, we hypothesized that Doc1420 insertion into CHKov1 might confer resistance to organophosphates.

To test this possibility, we used repeated backcrosses to generate two D. melanogaster strains, with largely identical genetic backgrounds except for that at CHKov1, where one strain [Doc+ (intro)] carried the insertion of Doc1420 and the other [Doc– (intro)] lacked it (6). We assayed these two strains and the two parental strains for resistance to a commonly used organophosphate pesticide [azinphos-methyl-phosphate (AZM)]. The results (Fig. 3) indicate that the presence of Doc1420 is associated with increased resistance to organophosphates in both the parental and the introgression strains. Doc+ (intro) has substantially lower mortality in this assay (20%) than does Doc– (intro) (68%) (P < 0.001).

Fig. 3.

Pesticide sensitivity assays. The average mortality in the presence of AZM for the four studied strains averaged over three independent experiments. The time of exposure and the dosage of AZM were chosen to achieve ∼50% mortality of the more susceptible parental strain (K1). The error bars represent standard errors.

To understand the history of this locus more completely, we investigated the molecular evolution of CHKov1 and CHKov2 (a paralog of CHKov1 located immediately on the 5′ side, also known as CG10675). We specifically asked whether the Doc1420-containing alleles have evolved in a pattern consistent with a complete knockout of CHKov1 or whether there is evidence of functionality of any of the resulting transcripts. We used the McDonald-Kreitman test (13), which tests the equality of the ratio of amino acid and synonymous polymorphisms to the ratio of the number of amino acid and synonymous substitutions expected under a neutral model of evolution. This test (Table 1) revealed no violations of neutral expectations either for CHKov2 or for the haplotypes of CHKov1 lacking Doc1420. In contrast, haplotypes of CHKov1 bearing Doc1420 exhibited excess amino acid replacement polymorphism (Table 1), but only within the first and not the second set of derived transcripts (Fig. 1B). This excess is largely due to seven apparently derived amino acid polymorphisms (segregating sites 57, 58, 61, 62, 63, 64, and 65 in Fig. 2). Moreover, at these sites the derived states are fixed within the alleles containing Doc1420, whereas the ancestral states are fixed within the alleles lacking Doc1420. Additional sequencing of the transcript-1 region of CHKov1 in 32 strains from Asia, Europe, Australia, and South America (table S3) confirmed this pattern. Thus, it appears that these amino acid changes have been newly generated within an allele containing Doc1420.

Table 1.

The McDonald-Kreitman test of selection acting on CHKov1. Divergence is calculated in comparison with the sequence of D. simulans. The numbers in parentheses refer to the number of segregating sites in the haplotypes lacking Doc1420.

Gene region    | Coding effect | Divergence | Polymorphism | P value
CHKov2         | Synonymous    | 31         | 16 (13)      | 0.96 (0.80)
               | Replacement   | 8          | 4 (4)        |
CHKov1         | Synonymous    | 36 (38)    | 12 (10)      | 0.03 (0.45)
               | Replacement   | 19 (19)    | 17 (8)       |
Transcript 1*  | Synonymous    | 7          | 1            | 0.007
               | Replacement   | 3          | 8            |
Transcript 2*  | Synonymous    | 16         | 5            | 0.66
               | Replacement   | 9          | 4            |

* Results for the positions within the first derived transcript or the second set of transcripts (assuming correct splicing of the third intron)
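The McDonald-Kreitman comparison can be sketched as a test of independence on the 2 × 2 table of synonymous and replacement counts for divergence and polymorphism. The text does not state which statistic produced the P values in Table 1; the sketch below assumes an uncorrected G test with one degree of freedom, which appears to give values close to those reported. The function name is hypothetical, and the counts are the values outside the parentheses in Table 1.

```python
import math


def mk_g_test(syn_div, syn_poly, rep_div, rep_poly):
    """G test of independence on a 2x2 McDonald-Kreitman table.

    Assumption: uncorrected G statistic, 1 degree of freedom.
    Returns (G, P value).
    """
    observed = [[syn_div, syn_poly], [rep_div, rep_poly]]
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to G
                expected = row_sums[i] * col_sums[j] / total
                g += 2.0 * o * math.log(o / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Counts from Table 1: (syn. divergence, syn. polymorphism,
#                       repl. divergence, repl. polymorphism)
table_1 = {
    "CHKov2": (31, 16, 8, 4),
    "CHKov1": (36, 12, 19, 17),
    "Transcript 1": (7, 1, 3, 8),
    "Transcript 2": (16, 5, 9, 4),
}
for region, counts in table_1.items():
    g, p = mk_g_test(*counts)
    print(f"{region}: G = {g:.3f}, P = {p:.3f}")
```

An excess of replacement polymorphism relative to replacement divergence, as in the Transcript 1 row, is what drives a small P value in this layout.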

These seven new amino acid changes cluster together at the C terminus of the putative protein derived from derived transcript 1 (Fig. 1B and Fig. 2). Overall, they exchange primarily hydrophobic with primarily hydrophilic amino acids. In six out of seven cases, the changes are to either arginine or asparagine. No pattern of adaptive evolution is evident in the sequences on the 3′ side of Doc1420. These observations strongly suggest that positive natural selection has acted at the coding level of the new truncated polypeptide generated by the truncation of CHKov1 by Doc1420 (Fig. 1B; derived transcript 1). By implication, this new protein is likely to be functional.

Although the spread of the Doc1420-containing allele apparently took place 25 to 240 years ago, this allele contains eight independent changes, which suggests that it is much older and had undergone substantial evolution before its recent expansion. Indeed, the divergence of Doc1420 from the consensus Doc sequence at 11 positions implies that Doc1420 inserted ∼90,000 years ago and certainly more than 240 years ago [G test, P = 0.0004; assuming 3 × 10^-8 substitutions per bp per year (14)]. Further substantiating this claim, the pairwise divergence per nucleotide between the Doc1420-containing and Doc1420-lacking alleles (0.012 bp^-1) is more than twice as large (P < 0.05, as determined by bootstrapping) as the divergence among the alleles lacking Doc1420 (0.005 bp^-1). The Doc1420-containing allele appears unusually old, even if the analysis is limited only to synonymous sites in CHKov1 and CHKov2 (table S2).
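The age estimate rests on a simple molecular clock: the number of substitutions separating the inserted copy from the consensus, divided by the product of sequence length and substitution rate. The length of Doc1420 is not given in this section, so the sketch below does not assume one; instead it inverts the relation to show what compared length the ∼90,000-year figure would imply. Function names are hypothetical, and the model assumes all 11 differences accumulated on the inserted copy after insertion.

```python
def insertion_age_years(n_substitutions, length_bp, rate_per_bp_per_year):
    """Clock estimate: age = substitutions / (length * rate)."""
    return n_substitutions / (length_bp * rate_per_bp_per_year)


def implied_length_bp(n_substitutions, age_years, rate_per_bp_per_year):
    """Invert the clock relation to recover the compared length."""
    return n_substitutions / (age_years * rate_per_bp_per_year)


RATE = 3e-8          # substitutions per bp per year, as stated in the text
N_SUBSTITUTIONS = 11  # differences from the consensus Doc sequence

# Length that would make 11 substitutions correspond to ~90,000 years.
length = implied_length_bp(N_SUBSTITUTIONS, 90_000, RATE)
print(f"implied compared length: about {length:.0f} bp")

# Round trip: the forward estimate with that length returns the same age.
print(f"age: about {insertion_age_years(N_SUBSTITUTIONS, length, RATE):.0f} years")
```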

We propose that at some point in the past, one allele of CHKov1 went through a number of drastic changes: first the insertion of Doc1420 generated a functional truncated protein, which subsequently underwent rapid amino acid evolution. The final allele expanded 25 to 240 years ago in the worldwide population of D. melanogaster, apparently driven by positive natural selection for resistance to organophosphates. We speculate that the Doc1420-containing allele has been evolving in an isolated population of D. melanogaster for a substantial length of time, possibly ∼90,000 years. Thus, although the recent expansion of the Doc1420-containing allele might have been caused by organophosphate resistance, the original reasons for its fast evolution and persistence must be related to some other phenotypic effect. To evaluate these possibilities, we need to understand the function of the truncated version of CHKov1 and the phenotypic effects of the loss of its original function.

Recently, resistance to some Bacillus thuringiensis (Bt) toxins in Heliothis virescens was mapped to the TE-induced loss of Bt-toxin receptors (15). Moreover, resistance to dichloro-diphenyl-trichloroethane (DDT) in D. melanogaster (16) and possibly in D. simulans (17) is largely caused by the TE-induced up-regulation of a cytochrome P450 gene. Our research suggests another mechanism of pesticide resistance, which is mediated either by the loss of a nontarget gene or by the generation of a new protein. These cases underscore the importance of TEs to the evolution of pesticide resistance in particular and to adaptive evolution in general. The screen for adaptive TEs described in this work should help us understand the process and frequency of TE-generated adaptive mutations.

Supporting Online Material

Materials and Methods

Figs. S1 and S2

Tables S1 to S3

