Research Article

Extensive introgression in a malaria vector species complex revealed by phylogenomics

See allHide authors and affiliations

Science  27 Nov 2014:
DOI: 10.1126/science.1258524


Introgressive hybridization is now recognized as a widespread phenomenon, but its role in evolution remains contested. Here we use newly available reference genome assemblies to investigate phylogenetic relationships and introgression in a medically important group of Afrotropical mosquito sibling species. We have identified the correct species branching order to resolve a contentious phylogeny, and show that lineages leading to the principal vectors of human malaria were among the first to split. Pervasive autosomal introgression between these malaria vectors means that only a small fraction of the genome, mainly on the X chromosome, has not crossed species boundaries. Our results suggest that traits enhancing vectorial capacity may be gained through interspecific gene flow, including between non-sister species.

The notion that species boundaries can be porous to introgression is increasingly accepted. Charismatic cases, such as gene flow between Neanderthals and anatomically modern humans (1), or between Heliconius butterflies (2, 3), show that introgression can transfer beneficial alleles between closely related species. Yet the broader role of introgression in evolution remains contentious and poorly documented, partly due to the challenges involved in accurately identifying introgression in the very groups where it is most likely to occur. Recently diverged species often have incomplete reproductive barriers, hence may hybridize in sympatry. However, another feature of rapid radiations is that ancestral polymorphism predating lineage splitting may be sorted stochastically among descendant lineages in a process known as incomplete lineage sorting (ILS). Alleles shared through ILS can be difficult to distinguish from those shared through secondary contact and introgression. Newly developed methods can differentiate these two processes (1, 4), but only if the correct species branching order is known. Because both introgression and ILS cause discordance between gene trees and the species tree, inference of the correct species phylogeny (i.e., the historical branching order of the taxa) is notoriously challenging for recent radiations (57).

Since the discovery that the single mosquito taxon described in 1902 as Anopheles gambiae (8) was actually a complex of several closely related and morphologically indistinguishable sibling species (known as the An. gambiae complex) (9), the correct species branching order has remained controversial and unresolved. This Afrotropical complex (1013) contains three widely distributed and extensively sympatric species that rank among the world’s most important vectors of human malaria owing to their association with humans (An. gambiae sensu stricto, its closest relative and sister species, Anopheles coluzzii, and Anopheles arabiensis) (Fig. 1A). Anopheles merus and Anopheles melas, salt-tolerant species that breed in brackish coastal waters of eastern and western Africa, respectively, are minor vectors. Anopheles quadriannulatus plays no role in malaria transmission despite vector competence for Plasmodium falciparum, as it tends to bite animals other than humans. On the basis of ecology and behavior, one might predict phylogenetic clustering of the highly anthropophilic vector species. Yet such clustering has not been supported by chromosomal inversion phylogenies (10, 14). The apparent phylogenetic affinity between An. arabiensis and An. gambiae supported by molecular markers and shared chromosomal inversion polymorphisms was instead attributed to introgression (15, 16). Introgression is plausible between any geographically overlapping pair of species in the complex, as reproductive isolation is incomplete: adult female F1 hybrids—although uncommon in nature—are fertile and vigorous (only F1 hybrid males are sterile) (10). Nonetheless, comprehensive evidence for introgression between An. arabiensis and the other major vector lineage (An. gambiae + An. coluzzii) has been lacking until now, because of insufficient genomic resources.

Fig. 1

Distribution and phylogenetic relationships of sequenced members of the An. gambiae complex. (A) Schematic geographic distribution of An. gambiae (gam, formerly An. gambiae S form), An. coluzzii [col, formerly An. gambiae M form (13)], An. arabiensis (ara), An. quadriannulatus (qua), An. merus (mer), and An. melas (mel). (B) Species topology as estimated from a maximum likelihood (ML) phylogeny of the X chromosome (see text) compared with the ML phylogeny estimated from the whole-genome sequence alignment. The scale-bar denotes nucleotide divergence as calculated by RAxML. Red branches of the latter tree highlight topological differences with the species tree. (C) Schematic of the species topology rooted by An. christyi (An. chr) and An. epiroticus (An. epi) showing the major introgression events (green arrows) and the approximate divergence and introgression times (in million years before present, Myr, ± 1 standard deviation, SD), assuming a substitution rate of 1.1 × 10−9 per-site per-generation and 10 generations per year. The grey bar surrounding each node shows ± 2 SD. Nodes marked by asterisks are not fully resolved owing to high levels of incomplete lineage sorting. (D) Neighbor-Joining tree displaying the Euclidian distance between individuals from population samples of each species, calculated using sequence data from the X chromosome. Each species is represented by a distinct symbol shape or color.

Important malaria vectors comprise a small fraction of the genus Anopheles, but almost invariably are embedded in species complexes similar to that of An. gambiae sensu lato (17). Thus, resolving the phylogeny of the An. gambiae complex has the potential to yield insights into the origin and evolution of traits that are associated with highly successful malaria vectors across the genus as a whole. Here we have used newly available whole genome reference assemblies (18) to infer the species phylogeny. We found that the two most important malaria vector lineages in the complex are not the most closely related, and we have uncovered pervasive introgression between them. The extent of introgression is such that the phylogenetic tree inferred from whole genome alignments supports the incorrect species branching order with high confidence.

The X chromosome reflects the species branching order

As a first step in phylogenomic analysis, alignments were generated for existing (19, 20) and newly available genome assemblies, representing six species of the An. gambiae complex and two Pyretophorus outgroup species (An. christyi and An. epiroticus) [supplementary text S1, fig. S1] (21). Among the ingroup species, the alignment included ~60% of non-gap and non-masked base pairs of the An. gambiae PEST reference assembly. Across all assemblies, the proportion of aligned base pairs was lower (~53%), but nevertheless spanned more than 93 Mb (~40% of the euchromatic genome).

As the An. gambiae complex reference genomes were assembled from laboratory colony-derived sequencing template [except for An. melas; (18)], we also performed whole genome shotgun sequencing of individual field-collected specimens sampled from at least one population of each of the six ingroup species (supplementary text S2). Sequencing reads were aligned to the species-appropriate reference assembly to avoid considerable interspecific mapping bias (supplementary text S2; figs. S3 to S4, table S3), and single nucleotide variant positions were converted to a common coordinate system relative to the reference genome alignment (supplementary text S2 to S3; fig. S2). The results presented below are based on the field-collected samples, but analyses were performed in parallel on the reference genome assemblies and our findings were consistent in all cases.

To infer the correct species branching order in the face of anticipated ILS and introgression, maximum-likelihood phylogenies were constructed from 50 kb non-overlapping windows across the alignments (referred to here as “gene trees” regardless of their protein-coding content), considering six ingroup species rooted alternatively with An. christyi or An. epiroticus (n = 4063 windows; supplementary text S3). As the choice of outgroup did not materially alter our results, we present our findings based on the more closely related species, An. christyi. When the 85 tree topologies observed at least once across the genome were sorted by chromosome arm and relative frequency, the most commonly observed trees were strongly discordant between the X chromosome and the autosomes (Fig. 2; table S9; fig. S16). Although weak disagreement among tree topologies concerning the branching order of basal nodes was a consequence of poor resolution due to ILS (Fig. 1C; fig. S16B), the striking discordance between the X chromosome and the autosomes was not a trivial consequence of lack of phylogenetic signal; conflicting topologies had strong bootstrap support (fig. S16B). Because the major disagreement between the X chromosome and the autosomes concerned the relative phylogenetic positions of An. arabiensis and An. quadriannulatus, for simplicity we grouped the most frequently observed topologies into three sets (Fig. 2; fig. S16), (i) A+CG: those that supported An. arabiensis clustering with An. coluzzii + An. gambiae; (ii) A+CG, R+Q: those that supported both (An. arabiensis (An. coluzzii + An. gambiae)) and An. quadriannulatus + An. merus; and (iii) A+Q: those that supported the clustering of An. arabiensis and An. quadriannulatus. On the X chromosome, all three of the most frequently observed topologies (inferred from 64% of windows) strongly supported the relationship (melas (arabiensis, quadriannulatus)), indicating a sister-taxon relationship between An. arabiensis and An. quadriannulatus (i.e., A+Q; orange shades in Fig. 2). This relationship was shared among all field-collected samples (Fig. 1D). Notably, the X chromosomal windows supporting these topologies are concentrated distal to the centromere (Fig. 3D; supplementary text S3), in a ~15 Mb region corresponding to the Xag inversion whose orientation is ancestral to the An. gambiae complex and shared by An. gambiae, An. coluzzii and An. merus (see supplementary text S5).

Fig. 2

Phylogenies inferred from regions on the autosomes differ strongly from phylogenies inferred from regions on the X chromosome. Maximum likelihood rooted phylogenies were inferred from n = 4063 50 kb genomic windows for An. arabiensis (A), An. coluzzii (C), An. gambiae (G), An. melas (L), An. merus (R), and An. quadriannulatus (Q), with An. christyi as outgroup. The nine most commonly observed topologies across the genome (trees i-ix) are indicated on each of the five chromosome arms (if found) by correspondingly colored blocks whose length represents the proportion of all 50 kb windows on that arm which support that topology. Topologies specific to 2La are represented by the dark grey block on 2L, and all other topologies observed on each chromosome arm were pooled and their combined frequencies are indicated by the light grey blocks. The X chromosome most often indicates that A and Q are sister taxa (trees vii–ix), while the autosomes indicate that A and C+G are sister groups (trees i–vi). Large portions of the autosomes (particularly 3L and 3R) indicate R and Q are sister taxa (trees iv–vi). Additional diversity in inferred trees is the result of rearrangement of three groups (R, C+G, and L+A+Q) due to incomplete lineage sorting. The most common phylogeny on the X chromosome (vii) represents the most likely species branching order. The 2L arm has a markedly different distribution of phylogenies due to the unique history of the 2La inversion region (see Fig. 5), which creates unusual phylogenetic topologies found nowhere else in the genome. The autosome-like trees on the X chromosome (i and ii) are entirely found in the pericentromeric region (15-24 Mb; see Fig. 3D), where introgression between An. arabiensis and An. gambiae + An. coluzzi has previously been implicated.

Fig. 3

Tree height reveals the true species branching order in the face of introgression. (A) Color-coded trees show the three possible rooted phylogenetic relationships for An. arabiensis (A), An. gambiae s.s. (G), and An. melas (L), with outgroup An. christyi. For any group of three taxa, there are two species divergence times (T1 and T2). When introgression has occurred, a strong decrease is expected in these divergence times. (B), Trees on the X chromosome have significantly higher mean values of T1 and T2 than the autosomes, indicating introgression is more likely to have occurred on the autosomes than the X chromosome (diamond = mean, whiskers ± SEM offset to the right for visual clarity; both P < 1.0×10−40). (C) Even among only autosomal loci, regions with the X majority relationship G(LA) (orange) have significantly higher T1 and T2 (both P < 1.0×10−40), indicating that the autosomal majority topology (green) is the result of widespread introgression between An. arabiensis and An. gambiae. The X majority relationship G(LA) therefore represents the true species branching relationships. (D) The gene tree distributions or “chromoplots” for all five chromosomal arms show that the autosomal and X chromosome phylogenies strongly disagree. Three-taxon phylogenies were inferred from 10 kb chromosomal windows across the genome, and colored vertical bars represent the relative abundance of the three alternative topologies (panel A) in a 200 kb window. The proportions of the three 10 kb gene trees for each chromosome arm (left) indicate that the autosomes all strongly show a closer relationship between An. gambiae and An. arabiensis (green), while the X indicates a closer relationship between An. arabiensis and An. melas (orange). Black semi-circles indicate the location of centromeres.

In stark contrast to the X chromosome, the overwhelming majority of window-based topologies across the autosomes supported An. arabiensis as sister to An. gambiae + An. coluzzii (green and purple shades in Fig. 2). On chromosomes 3 and 2R a subset of these topologies also supported a sister-taxon relationship between An. quadriannulatus and An. merus, in further disagreement with the X chromosome (purple shades, Fig. 2).

Autosomal introgression between An. arabiensis and the ancestor of An. gambiae + An. coluzzii has long been postulated (10, 22), and could explain the strong discordance between the dominant tree topologies of the X and autosomes. However, prior to this study the correct species branching order was unresolved, precluding definitive interpretation of these conflicting signals. To infer the correct historical branching order, we applied a strategy based on sequence divergence (supplementary text S3; fig. S16). Because introgression will reduce sequence divergence between the species exchanging genes, we expect that the correct species branching order revealed by gene trees constructed from non-introgressed sequences will show deeper divergences than those constructed from introgressed sequences. If the hypothesis of autosomal introgression is correct, this implies that the topologies supported by the X chromosome should show significantly higher divergence times between An. arabiensis and either An. gambiae or An. coluzzii than topologies supported by the autosomes.

To test this hypothesis we used An. arabiensis, An. gambiae, and An. melas, as these three species show strongly discordant trees on the X and autosomes. For this trio, there are three possible phylogenetic relationships and two divergence times for each, T1 and T2 (Fig. 3A). Using the metrics T1 and T2, we conducted two different tests. Initially, we compared mean sequence divergence between X chromosome-based versus autosome-based topologies. As predicted, the estimated mean values of T1 and T2 for trees inferred from 10 kb windows on the X chromosome were significantly higher than their counterparts on the autosomes (both P < 1.0×10−40; Fig. 3B). However, the possibility exists that this result was confounded or entirely driven by other factors that differ between the X and autosomes, including a faster rate of evolution on the X (23). Indeed, we found evidence supporting faster evolution of genes on the X chromosome (supplementary text S3; fig. S23). To avoid this problem, we conducted a second test on the autosomes alone, focusing on trees inferred from 10 kb windows and the mean divergence levels among those supporting the three possible phylogenetic relationships illustrated in Fig. 3A. The result, congruent with the first test but providing unequivocal evidence for the correct species branching order, was that the set of autosomal trees supporting the majority X chromosome topology (gambiae (melas, arabiensis)) again had the highest mean values of T1 and T2 (both P < 1.0×10−40; Fig. 3C). This indicates that the species branching order inferred from the X chromosome is the correct one (Fig. 1B), despite extensive amounts of An. gambiae-An. arabiensis autosomal introgression (Fig. 3D; fig. S16).

The total evidence approach to phylogenetic reconstruction (24) is premised on the notion that the best estimate of species relationships arises from character congruence when all available data are considered simultaneously. Alternative approaches seek taxonomic congruence between different data sets or data partitions, analyzed separately (25). Either way, standard phylogenomic practice entails some form of “majority rule,” based on the assumption that as the amount of data increases so will the probability of converging on the correct species branching order [apart from exceptional cases (26)]. Due to both ILS and introgression, the historical species branching order for the An. gambiae complex is represented by only 1.9% of 50 kb windows across the entire genome (Fig. 2). As a result, when we inferred a maximum likelihood tree for the An. gambiae complex based on whole genome alignments, we recovered the wrong species branching order supported by 100% of the bootstrap replicates at each node (Fig. 1B; supplementary text S3; figs. S17A and S18A). The extent of autosomal introgression in the An. gambiae complex has the paradoxical effect that as an increasing amount of the genome is sampled, support for the incorrect species branching order is maximized.

Autosomal permeability of species boundaries

Early cytotaxonomic evidence (10, 12, 27), and more recent rDNA-based evidence (28, 29), supports rare occurrences (<0.1%) of natural female F1 hybrids between An. arabiensis-An. gambiae + An. coluzzii, An. arabiensis-An. quadriannulatus, and An. melas-An. gambiae + An. coluzzii, although there are no reports of hybrids involving An. merus. The evolutionary importance of these rare hybrids as bridges to interspecific gene flow has remained controversial. Inference of the correct species branching order for the An. gambiae complex (Fig. 1B) allowed a systematic analysis of introgression across the genomes of six members of this complex using the D (1, 4) and DFOIL (30) statistics (supplementary text S4).

As expected, such tests revealed pervasive introgression across all autosomes between An. arabiensis and the ancestor of An. gambiae + An. coluzzii (figs. S24 and S25). Although introgression was detected in both directions, the majority involved genetic transfer from An. arabiensis into the ancestor of An. gambiae + An. coluzzii. This recent and massive episode of introgression impedes our ability to detect older introgression events between these species. Unexpectedly, we also found evidence of extensive autosomal introgression between another species pair, An. merus and An. quadriannulatus (Fig. 4; supplementary text S3; figs. S24 and S25). One of the most striking of the introgressed regions was a contiguous block of genes coincident with the ~22 Mb 3La chromosomal inversion (31). The corresponding sequence originally present in ancestral populations of An. quadriannulatus has been entirely replaced by its counterpart from An. merus, a conclusion supported by the clustering of An. merus with An. quadriannulatus in gene trees constructed from sequences in the 3La inversion (figs. S19, C and D, and S21B). Extant populations of both of these species, and indeed all recognized species in the An. gambiae complex, are fixed for the standard (3L+a) orientation except An. melas and its putative sister species An. bwambae, both fixed for the 3La orientation (31). Considering that the exact ~22 Mb 3La region was replaced between species whose contemporary populations are collinear for 3L+a, it is conceivable that ancestral An. quadriannulatus populations originally carried 3La prior to the 3L+a introgression. The expected reduced recombination between inverted and standard arrangements in inversion heterozygotes may explain why the introgressed region is coincident with the entire 22 Mb 3La region.

Fig. 4

Introgression between An. merus and An. quadriannulatus. Chromoplots for all five chromosomal arms show a highly spatially heterogeneous distribution of phylogenies inferred from 50 kb genomic regions, particularly on 3L. The three possible rooted phylogenetic relationships for An. quadriannulatus (Q), An. melas (L), and An. merus (R), with outgroup An. christyi are shown, colored according to the key in the lower-right. The region on 3L corresponding to the 3La inversion shows strong evidence of R-Q introgression, and a strong negative deviation of the D-statistic. The region on 2L corresponding to the 2La inversion is highly enriched for the R(LQ) relationship, as expected given that L and Q both have the 2L+a orientation, while R has 2La (see Fig. 5). Across all the autosomes, the D-statistic generally trends toward negative values, indicating weak or ancient R-Q introgression may have been occurring across the autosomes (see supplementary text S3).

The introgression profile was consistent between samples from natural populations and the reference genome assemblies, involving the same species pairs and the same chromosomal locations (supplementary text S4; fig. S24). Moreover, based on population samples from multiple geographic locations in Africa, patterns of introgression between An. arabiensis and An. gambiae + An. coluzzii, and between An. merus and An. quadriannulatus, are similar across their geographic range (fig. S25; table S10). These findings refute the possibility that introgression was an unnatural artifact of colonization and laboratory maintenance of multiple species. The lack of geographic variation in patterns of introgression also suggests that autosomal introgression occurred sufficiently long ago to have spread across sub-populations. By contrast, mtDNA revealed patterns consistent with ongoing gene flow between An. arabiensis and An. gambiae or An. coluzzii (SOM 3.3; fig. S22).

Trans-specific inversion polymorphism

The unusually high sequence divergence between alternative orientations of a chromosomal inversion polymorphic within An. gambiae and An. coluzzii (2La and 2L+a) has been noted previously (32, 33) but inadequately explained. Other species in the complex are fixed for either 2La (An. arabiensis, An. merus) or 2L+a (An. quadriannulatus, An. melas) (31) (Fig. 5A). In An. gambiae and An. coluzzii, the 2La arrangement has been shown to confer superior resistance to desiccating environments (34, 35) relative to 2L+a, and its frequency correlates with environmental gradients of aridity (10, 32). It has been argued that 2La introgressed from An. arabiensis into the presumed 2L+a ancestor of An. gambiae + An. coluzzii (36), a crucial step facilitating the range expansion of a presumed forest-adapted species into drier savannas. Sequence divergence-based estimates of the age of 2La and 2L+a arrangements relative to the age of the species complex (Fig. 1C) suggest a radically different scenario in which 2La/2L+a is an ancient inversion polymorphism that predates the initial diversification of the entire complex but is maintained as polymorphic only in An. gambiae and An. coluzzii (Fig. 5). Consistent with this scenario, the topology of the gene tree built from sequences inside the inversion boundaries indicates that species are grouped by their 2La or 2L+a karyotype (Fig. 5B). Furthermore, contrary to the longstanding assumption (36), our data suggest that 2La introgressed from ancestral An. gambiae into An. arabiensis, not the other way around—eventually replacing the An. arabiensis 2L+a arrangement (Fig. 5A). Our inference about the direction of 2La introgression, as well as the underlying phylogenetic hypothesis for the An. gambiae complex, are consistent with the genome-enabled chromosomal inversion phylogeny of the An. gambiae complex reconstructed from ancestral and derived gene orders at the breakpoints of ten fixed chromosomal inversions (supplementary text S5; fig. S27).

Fig. 5

Ancient trans-specific polymorphism of an inversion predates radiation of the An. gambiae complex. (A) All species in the complex are fixed for either the 2La (An. arabiensis, ara; An. merus, mer) or 2L+a (An. melas, mel) orientation of the 2La rearrangement except An. gambiae (gam) and An. coluzzii (col), which remain polymorphic for both orientations. The unique phylogenies observed in the 2La region (Fig. 2) are the result of differential loss (×) of the 2La and 2L+a orientations and the introgression of 2La from the ancestral population of An. gambiae + coluzzii into An. arabiensis (dotted arrow). The overall divergence between the 2La and 2L+a orientations inferred from sequences inside the inversion breakpoints is higher on average than the predicted divergence time of the species complex. (B) Maximum likelihood phylogeny inferred from sequences of the two different orientations of the 2La region (2La and 2L+a) shows that the divergence between opposite orientations is greater than the divergence between the same orientation present in different species (all nodes, 100% bootstrap support). Particularly notable is the separation of the sister taxa An. gambiae-2L+a and An. coluzzii-2La, as these are known sister taxa. The scale-bar denotes nucleotide divergence as calculated by RAxML.

Functional insights from differential genetic exchange

We found that introgression mainly involved the autosomes. Our data suggest that the X chromosome is largely resistant to introgression, consistent with studies in this and other systems indicating that the X (or Z) disproportionately harbors factors responsible for reproductive isolation (3, 6, 3741). The nature, number and chromosomal organization of these X-linked factors are unsolved puzzles for future research, but our data offer one tantalizing clue. An. gambiae males deliver to females large amounts of 20-hydroxyecdysone (20E) (42), a steroid hormone that increases female fertility (43) and fecundity (44) and regulates mating behavior and success (45). The cytochrome p450 gene CYP315A1 (AGAP000284) that synthesizes the 20E precursor ecdysone is located near the distal end of the X chromosome. Furthermore, this region is associated with male hybrid sterility between An. gambiae and An. arabiensis, with the An. gambiae X chromosome causing inviability in an An. arabiensis genetic background (40). Combined with our data showing that male An. arabiensis produce significantly less 20E compared to An. gambiae (supplementary text S6; fig. S28), these observations prompt the hypothesis that divergence in 20E function between the two species may have a role in speciation through possible effects on the reproductive fitness of hybrid males.

Pervasive autosomal introgression between An. arabiensis and the An. gambiae-clade ancestor is consistent with the paucity of sterility factors across much of the autosomes, although several autosomal QTL have been mapped in both species (40). Accordingly, we explored the small subset of autosomal genes (n = 485) that showed no indication of introgression (supplementary text S6), as these are candidates contributing to reproductive isolation. Interestingly, we found a significant over-representation of genes encoding cyclic-nucleotide phosphodiesterases, enzymes that regulate the levels of the messengers cAMP and cGMP (46), which in turn control (among other processes) ecdysone synthesis (47, 48) (table S16).

Implications for the evolution of vectorial capacity

Initial radiation of the An. gambiae complex was both recent and rapid. Counter to the traditional view (49), it is now clear that the ancestor of the principal malaria vectors An. gambiae and An. coluzzii separated from other species in the group approximately 2 Myr ago, and that An. gambiae and An. coluzzii are distantly related to the other primary vector in the group, An. arabiensis. Extant populations of An. gambiae and An. coluzzii are highly anthropophilic vectors, dependent upon humans for blood meals, adult shelter, and larval breeding sites, yet anthropogenic influences are unlikely to have triggered their cladogenesis an estimated 0.5 Myr ago. Instead, anthropophilic traits are likely to have developed in conjunction with the expansion of Neolithic human populations that occurred more recently. Despite a history of extensive introgression with An. arabiensis, An. gambiae and An. coluzzii are behaviorally, physiologically, ecologically, and epidemiologically distinct from An. arabiensis; the same is true for An. merus and An. quadriannulatus. Notably, experimental introgressions of certain autosomal inversions result in stable heterotic polymorphisms, while other introgressed autosomal and X chromosome inversions are rapidly eliminated (22), consistent with a role for natural selection in the fate of introgressed regions. Given evidence that the 2La inversion polymorphism is maintained by selection in An. gambiae and An. coluzzii (32, 50), it seems likely that its introgression into An. arabiensis was adaptive, and bidirectional introgressions across the genome between these species probably contributed to their wide ecological flexibility and their vectorial capacity. Broad overlap exists between geographic ranges of these species, and the potential for ongoing hybridization and introgression remains, including the opportunity for introgression of insecticide resistance alleles (51). Our study establishes a foundation for further study of adaptive introgression in this species complex, and its role in shaping vectorial capacity in this and other malaria vector species complexes.


Supplementary Text

Figs. S1 to S28

Tables S1 to S16

References (52110)

References and Notes

  1. D. E. Neafsey, R. M. Waterhouse, M. R. Abai, S. S. Aganezov, M. A. Alekseyev, J. E. Allen, J. Amon, B. Arcà, P. Arensburger, G. Artemov, L. A. Assour, H. Basseri, A. Berlin, B. W. Birren, S. A. Blandin, A. I. Brockman, T. R. Burkot, A. Burt, C. S. Chan, C. Chauve, J. C. Chiu, M. Christensen, C. Costantini, V. L. M. Davidson, E. Deligianni, T. Dottorini, V. Dritsou, S. B. Gabriel, W. M. Guelbeogo, A. B. Hall, M. V. Han, T. Hlaing, D. S. T. Hughes, A. M. Jenkins, X. Jiang, I. Jungreis, E. G. Kakani, M. Kamali, P. Kemppainen, R. C. Kennedy, I. K. Kirmitzoglou, L. L. Koekemoer, N. Laban, N. Langridge, M. K. N. Lawniczak, M. Lirakis, N. F. Lobo, E. Lowy, R. M. Maccallum, C. Mao, G. Maslen, C. Mbogo, J. McCarthy, K. Michel, S. N. Mitchell, W. Moore, K. A. Murphy, A. N. Naumenko, T. Nolan, E. M. Novoa, S. O'Loughlin, C. Oringanje, M. A. Oshaghi, N. Pakpour, P. A. Papathanos, A. N. Peery, M. Povelones, A. Prakash, D. P. Price, A. Rajaraman, L. J. Reimer, D. C. Rinker, A. Rokas, T. L. Russell, N. Sagnon, M. V. Sharakhova, T. Shea, F. A. Simão, F. Simard, M. A. Slotman, P. Somboon, V. Stegniy, C. J. Struchiner, G. W. C. Thomas, M. Tojo, P. Topalis, J. M. C. Tubio, M. F. Unger, J. Vontas, C. Walton, C. S. Wilding, J. H. Willis, Y. Wu, G. Yan, E. M. Zdobnov, X. Zhou, F. Catteruccia, G. K. Christophides, F. H. Collins, R. S. Cornman, A. Crisanti, M. J. Donnelly, S. J. Emrich, M. C. Fontaine, W. Gelbart, M. W. Hahn, I. A. Hansen, P. I. Howell, F. C. Kafatos, M. Kellis, D. Lawson, C. Louis, S. Luckhart, M. A. T. Muskavitch, J. M. Ribeiro, M. A. Riehle, I. V. Sharakhov, Z. Tu, L. J. Zwiebel, N. J. Besansky, Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes. Science 346, 10.1126/science.1258522 (2014).
  2. Supplementary text, figures, and tables are available on Science Online.
  3. ACKNOWLEDGMENTSWe thank M. Kern and B. J. White for early assistance with this project. Genome assemblies and sequence reads generated by the Broad and the BGI have been submitted to NCBI under two umbrella BioProject IDs: PRJNA6751 and PRJNA254046. The mitochondrial genome alignments, VCF files of field individual samples, the MAF TBA genome alignment of reference assemblies, and the MAF genome alignment from high depth field samples are available on Dryad: doi:10.5061/dryad.f4114. Sequencing was funded at the Broad Institute by a grant from the National Human Genome Research Institute. Individual laboratories were funded by the NIH [R01AI076584 (N.J.B., M.W.H.), R21AI101459 (N.J.B., H.A.S.), R21AI094289 and R21AI099528 (I.V.S.), R01AI104956 (F.C.), R01AI085079 (M.A.S.), HHSN272200900039C (S.J.E.), T32GM007757 (J.B.P.)], FNIH through the VCTR program of the Grand Challenges in Global Health Initiative (N.J.B., M.C.F.), Marie Curie International Outgoing Fellowship (R.M.W.), Department of Education GAANN fellowship (A.S.), Richard and Peggy Notebaert Premier Fellowship and Jack Kent Cooke Foundation Graduate Scholarship (R.R.L.), Fralin Life Science Institute of Virginia Tech (I.V.S.), MRC Career Development Award (M.K.L.), European Research Council FP7 (Starting Grant 260897 to F.C.). Author Contributions: Conceived project: N.J.B.; Coordinated project: N.J.B., M.C.F.; Supervised project at the Broad: D.E.N.; Genome alignments: R.M.W.; Genomic data processing, read mapping, development of pipelines for variant calling and scripts for coordinate conversion, mtDNA genome reconstruction: A.S., M.C.F., S.J.E.; Provided access to An. gambiae genomic sequences from Mali: R.R.L.; Window-based phylogeny reconstruction, development and implementation of chromoplots, development and implementation of divergence-based method to infer species branching order; estimation of sequence-based divergence times: J.B.P., M.W.H.; Introgression testing: J.B.P., M.C.F., M.W.H.; Polymorphism and divergence analyses, phylogenomic analyses of nuclear and mtDNA genomes: M.C.F.; Orthology predictions and gene-based phylogenetic reconstruction: R.M.W., Y.-C.W.; Identification of inversion breakpoints, chromosomal inversion phylogeny, estimation of divergence time based on rates of rearrangement: I.V.S., A.B.H., X.J.; Genomic library construction, H.A.S., M.C.F.; functional enrichment analyses: H.A.S., M.C.F., J.B.P.; Ecdysteroid quantification: F.C., E.K., S.N.M.; Molecular evolution: D.E.N., M.K.L., M.A.S.; Wrote paper: N.J.B., M.C.F., M.W.H. with input from other authors.
View Abstract

Navigate This Article