A Polytene Chromosome Analysis of the Anopheles gambiae Species Complex


Science  15 Nov 2002:
Vol. 298, Issue 5597, pp. 1415-1418
DOI: 10.1126/science.1077769


Field-collected specimens of all known taxa in the Anopheles gambiae complex were analyzed on the basis of chromosome inversions with reference to a standard polytene chromosome map. The phylogenetic relationships among the seven described species in the complex could be inferred from the distribution of fixed inversions. Nonrandom patterns of inversion distribution were observed and, particularly on chromosome arm 2R, provided evidence for genetically distinct populations in A. gambiae, A. arabiensis, and A. melas. In A. gambiae from Mali, stable genetic differentiation was observed even in populations living in the same region, suggesting a process of incipient speciation that is being confirmed by studies with molecular markers. The possible role of chromosome differentiation in speciation of the A. gambiae complex and in the emergence of distinct chromosomal forms within the nominal species is discussed in relation to human malaria.

The Afrotropical malaria vector Anopheles gambiae sensu stricto (1) is a member of a group of at least seven closely related, morphologically indistinguishable species known as the A. gambiae complex (2). As do nearly all mosquitoes, they share a mitotic karyotype with two pairs of autosomes and one pair of sex chromosomes. The polytene complement consists of five chromosome arms, with readily discernible correspondence of the banding patterns among the different species [(3–6), see polytene map fig. S1]. Paracentric chromosome inversions are abundant in this complex (7). Ten inversions are fixed in different species in the complex (i.e., found only as inverted homozygotes in natural populations) and can be used to differentiate individual specimens, but more than 120 polymorphic inversions have been detected in natural populations (7, 8). Only those observed in field samples from multiple localities and/or dates have been analyzed for this report (Fig. 1 and table S1). A. gambiae and A. arabiensis, the two species with near-continent-wide distributions in Africa, have the highest number of inversion polymorphisms (7), followed by A. melas, the brackish water breeding species from West Africa (9). Very few inversion polymorphisms have been recorded in A. bwambae and A. quadriannulatus species A, and no inversion polymorphisms have been observed in A. quadriannulatus species B and A. merus (10, 11).

Figure 1

The main fixed and polymorphic 2R paracentric inversions in the sibling species of the A. gambiae complex. Fixed rearrangements characterize 2R of A. melas (2Rm, with ↑m indicating the boundaries of the inversion) and A. merus (2Rop, with ↑o and ↑p indicating the boundaries of the overlapping o and p inversions). Brackets indicate polymorphic inversions. In A. gambiae, A. arabiensis, A. bwambae, and A. quadriannulatus (species A from southern Africa) all polymorphic inversions are based on the standard 2R arrangement except for inversions 2Rf and 2Rr in A. arabiensis and 2Rk in A. gambiae. The latter three inversions are based on a preexisting 2Rb inversion and thus are designated 2Rbf, 2Rbr, and 2Rbk. Inversion 2Rc was recorded almost exclusively in combination with 2Rb and 2Ru as 2Rbc or 2Rcu, and 2Re was always found in combination with 2Rb as 2Rbe. Most of the 2R inversions occurred as groups of nonoverlapping, contiguous inversions that resulted in sets of nearly parallel inversion polymorphism (43) in these two species. The A. gambiae inversion arrangements 2Rbc, 2Rcu, 2Rbcu, 2Rbcd, 2Rjd, 2Rjbcu, and 2Rjcu covered roughly the same regions of 2R as the A. arabiensis arrangements 2Rbc, 2Rbe, 2Rbcd1, 2Rad1, and 2Rabe (the superscript 1 in 2Rd1 identifies an inversion that is morphologically similar but distinct from 2Rd). The 2Rb and 2Rc inversions appear cytologically identical in A. gambiae and A. arabiensis (i.e., with the same telomeric and centromeric breakpoints), and so were probably transferred by introgression. Parallel 2R inversion polymorphisms are also found in allopatric populations of A. melas, where 2Rmm1 (2Rm1 in combination with the fixed 2Rm) is observed in Congo and Angola, 2Rmn1 (2Rn1 in combination with the fixed 2Rm) is present in Benin and Togo, and 2Rmn (2Rn in combination with the fixed 2Rm) is present in The Gambia, Senegal, the Casamance region of Senegal, and Guinea Bissau. Geographical barriers between these A. melas populations probably consist of sandy or rocky coastal areas without mangroves. The region spanned by 2Rbcu (yellow) is involved in virtually all nonstandard chromosome 2R rearrangements differentiating the members of the A. gambiae complex.

The polytene chromosome relationships among taxa are shown diagrammatically in Fig. 2 and fig. S2. Chromosomal differentiation supports independent speciation processes for the saltwater taxa, A. melas in West Africa and A. merus in East Africa, contrasting with their remarkably similar ecologies and morphological characteristics (6, 7). Similarly, A. gambiae and A. arabiensis can be regarded as the most similar species ecologically in view of their common adaptation to human environments. This is in marked contrast to the relationship suggested by fixed chromosome inversions, which indicates two independent speciation processes (Fig. 2) (7). Most molecular data are consistent with species relationships inferred from the fixed inversions. The ecological and morphological similarity between A. melas and A. merus appears to reflect evolutionary convergence, whereas ecological similarity between A. gambiae and A. arabiensis, the two main malaria vectors in the complex, probably reflects both convergence and genetic introgression (2, 12–15).

Figure 2

Diagrammatic representation of chromosomal relationships among the sibling species of the A. gambiae complex and their chromosomal forms. Whereas A. quadriannulatus species A is widespread in southern Africa, species B occurs in Ethiopia (44). The inversion nomenclature follows, with minor changes, the convention implemented in Drosophila by Wasserman (45) and Carson and co-workers (46). All recorded inversions in a group refer to the conventional standard and are designated by lowercase letters independently for each chromosomal arm. Consequently, each nonstandard chromosomal sequence is designated by the letter(s) of the inversion(s) involved, following the chromosomal arm in which the rearrangement occurs (e.g., Xbcd, 2Rop, 2La, etc.). The heterozygosity symbol (a/+) shows that inversion “a” is polymorphic, and the notation “+” is used to indicate the standard whole chromosomal arm and/or any intraspecific arrangement alternative to a. Thus, 3R(a/+) shows the coexistence in the same taxon of two alternative whole-arm arrangements, i.e., 3Ra (inverted) and 3R+ (standard), whereas 2Rm(n/+) refers to a polymorphism involving the arrangements 2Rm and 2Rmn. For the sake of simplicity on this figure, we have designated the uninverted or basic sequence of an inversion or group of inversions with the notation “+.” For example, a heterokaryotype for inversion 2La would be written 2L a/+ or 2L +/a. For designating specifically the uninverted or basic sequence of inversion a, we use the notation “+/a” (as seen on the poster) not necessarily corresponding to the whole-arm standard. In the case of multiple, independent (nonoverlapping) inversions a, b, and d on the same chromosomal arm, we use the notation “+/a, +/b, +/d” (for +a, +b, +d on the poster) in order to designate unequivocally the uninverted or standard sequence of each inversion. 
With respect to this standard sequence, we treat the alternate arrangements of each inversion as alleles at one locus independently from the whole chromosomal arm arrangement. In A. gambiae, the standard “+” arrangement is almost fixed in rain forest samples but polymorphic in all savanna populations. The same polymorphism in Mopti segregates as “+” again as it approaches the rain forest. The two arrows indicate the taxonomic contribution of the molecular markers for the S and M lineages (19).
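The polymorphism notation described in this caption can be illustrated with a small Python sketch. It is only an illustration of the two worked examples given above, 3R(a/+) and 2Rm(n/+); the function name `expand_polymorphism`, the regular expression, and the set of arm labels (X, 2R, 2L, 3R, 3L) are assumptions introduced here, and superscripted inversion names such as 2Rn1 are not handled.

```python
import re

# Hypothetical pattern: arm label, optional fixed inversion letters,
# then the polymorphic inversion letters written as "(letters/+)".
_NOTATION = re.compile(r"^(X|[23][RL])([a-z]*)\(([a-z]+)/\+\)$")


def expand_polymorphism(notation):
    """Return (inverted, alternative) arrangements for a notation like '2Rm(n/+)'."""
    match = _NOTATION.match(notation.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognized notation: {notation!r}")
    arm, fixed, polymorphic = match.groups()
    # The inverted arrangement carries the fixed letters plus the polymorphic ones.
    inverted = arm + fixed + polymorphic
    # The alternative is the fixed background, or '+' when nothing is fixed.
    alternative = arm + (fixed if fixed else "+")
    return inverted, alternative


for example in ("3R(a/+)", "2Rm(n/+)"):
    print(example, "->", expand_polymorphism(example))
```

Under these assumptions, the sketch reproduces the caption's reading of the two examples: a whole-arm polymorphism yields the inverted arm and the "+" standard, whereas a polymorphism on a fixed background yields the background with and without the added inversion.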

The nonrandom distribution of inversions within species supports additional taxonomic splitting within A. arabiensis, A. melas, and A. gambiae. These genetic discontinuities within A. arabiensis and A. melas involve geographically isolated (allopatric) populations. In A. gambiae populations in Mali, however, three distinct chromosomal forms coexist in time and space (sympatric). The stability of these chromosomal forms of A. gambiae (13, 16, 17) is evidence of assortative mating consistent with the hypothesis of reproductively isolated, incipient species. These have been named Bamako, Savanna, and Mopti (17, 18). Each is characterized by different chromosome 2R arrangements, namely, jcu and jbcu for Bamako; bc, u, and + for Mopti; and b, cu, bcu, and + for Savanna. Moreover, 2La is generally fixed or nearly fixed in Bamako and Mopti populations, respectively, yet this inversion is usually found at significantly lower frequencies, around 90%, in sympatric Savanna populations (table S2). Although laboratory crossing experiments reveal no genetic incompatibility among these three cytogenetic forms, chromosomal data from natural populations consistently support the stability of genetic differentiation among them, suggesting the existence of intrinsic mechanisms of reproductive isolation acting at the premating level (13, 17). Only one Bamako/Mopti hybrid heterokaryotype has been detected in field samples, although the expected number, if one assumes random mating, should exceed 2000. Potential Mopti/Savanna and Bamako/Savanna heterokaryotypes have been identified, although at levels that are significantly less than would be expected in the absence of genetic isolating mechanisms (table S3). 
Molecular studies of these potential hybrids with ribosomal DNA markers “M” and “S” (19) suggest that they are a consequence of the distribution of the 2R arrangements b and bc that, although typical of Savanna and Mopti respectively, are present at very low frequencies in the other taxon as well (20–22).
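The comparison of observed and expected heterokaryotypes rests on the usual random-mating expectation: if two arrangements occur at frequencies p_a and p_b in a pooled sample of N individuals, about 2 × p_a × p_b × N carriers of both are expected. The Python sketch below shows only that arithmetic. The function name and the numbers in the example are hypothetical placeholders, not values from table S3, and the calculation actually used for the report may have differed.

```python
def expected_heterokaryotypes(freq_a, freq_b, n_individuals):
    """Expected count of a/b heterokaryotypes under random mating: 2 * p_a * p_b * N."""
    for freq in (freq_a, freq_b):
        if not 0.0 <= freq <= 1.0:
            raise ValueError("arrangement frequencies must lie in [0, 1]")
    if freq_a + freq_b > 1.0:
        raise ValueError("frequencies of two distinct arrangements cannot sum above 1")
    return 2.0 * freq_a * freq_b * n_individuals


# Hypothetical pooled sample: arrangement frequencies 0.30 and 0.40, 1000 specimens.
expected = expected_heterokaryotypes(0.30, 0.40, 1000)
observed = 1  # hypothetical observed count, for illustration only
print(f"expected {expected:.0f}, observed {observed}")
```

A large deficit of observed relative to expected heterokaryotypes, as in this made-up example, is the kind of pattern the text interprets as evidence against random mating among the forms.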

If one assumes the random occurrence of inversion breakpoints, the expected number of inversions on each polytene chromosome arm would depend on its relative length. However, we have observed that chromosome X, which represents 11% of the total euchromatic complement, has 5 of 10 fixed inversions, whereas 18 of the 31 polymorphic inversions (58%) are on chromosome 2R, which represents less than 30% of the polytene complement (expected = 9.31, Poisson test, P = 0.003, see table S1 and the polytene map, fig. S1). Moreover, breakpoints of at least three different inversions (c, d, and u) are cytologically coincident (Fig. 1, table S1). This nonrandom pattern of inversion distribution strongly suggests that these rearrangements are the product of selection. Greater ecological flexibility and more efficient exploitation of different niches may be achieved through the capture and stabilization within inversions of blocks of coadapted genes. An inversion-based speciation model (23) emphasizes the importance of transitional isolates in geographically or ecologically marginal zones in this process. Transitory population expansions and crashes and the attendant genetic drift and/or strong directional selection pressures would favor genetic mechanisms like inversions that can stabilize novel, adaptive gene associations. Chromosomal inversions can protect part of the genome from recombination in the heterozygous state, whereas inverted homozygotes found in more marginal environments would be capable of further ecological expansions. Very stable associations of sets of 2R inversions have developed in both A. gambiae and A. arabiensis, the two species in the complex with the largest geographic ranges (Fig. 1).
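The excess of polymorphic inversions on 2R can be checked with a short Poisson calculation. The Python sketch below takes the observed count (18 of 31) and the expected count (9.31) from the text and computes an upper-tail probability. The helper name `poisson_upper_tail` is introduced here, and because the text does not say which tail or which variant of the test was used, the result need not match the reported P value exactly.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), summing the lower tail exactly."""
    lower = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - lower


observed_on_2r = 18      # polymorphic inversions on 2R (from the text)
total_polymorphic = 31   # polymorphic inversions analyzed (from the text)
expected_on_2r = 9.31    # expected count quoted in the text

share = observed_on_2r / total_polymorphic
p_upper = poisson_upper_tail(observed_on_2r, expected_on_2r)

print(f"share of polymorphic inversions on 2R: {share:.0%}")
print(f"P(X >= {observed_on_2r} | lambda = {expected_on_2r}) = {p_upper:.4f}")
```

Whichever tail convention is used, the probability under this model is well below 0.05, in line with the conclusion that the distribution of inversions among arms is nonrandom.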

The central part of the chromosomal arm 2R appears to play a major role in both chromosomal changes between species and in the polymorphism within species of the A. gambiae complex. This is the area around inversion c and corresponding to the rearrangements bcu, bcd in A. gambiae or bce, bcd1 in A. arabiensis. The same area is involved in the interspecific fixed inversions 2Rm and 2Rop as well as in the inversion polymorphism characterizing the allopatric populations of A. melas, with inversions n, n1, and m1, all based on 2Rm (Fig. 1). The choice of the oviposition site is the most obvious ecological characteristic correlated with these chromosomal changes. Different chromosomal forms and species have specialized to use larval breeding sites that include pools that are dependent on rain, man-made excavations, agricultural irrigation, or tides (24).

The most obvious ancestor of the A. gambiae complex would have the chromosomal features of an A. quadriannulatus–like taxon. The standard chromosome arrangement of this species occupies a central position relative to other species in the complex (Fig. 2). Furthermore, A. quadriannulatus has several less specialized traits expected of an ancestral form: a large number of hosts and feeding habits associated with ingesting animal blood. Its tolerance for temperate climates and its disjunct distribution (A. quadriannulatus A is found in southern Africa, whereas A. quadriannulatus B is found only in Ethiopia) suggest a wider range during the African Pleistocene pluvial period. Apart from A. quadriannulatus, all other members of the A. gambiae complex are vectors of human malaria parasites although their vectorial capacities are much lower than that of A. gambiae and vary significantly between species (25). A. gambiae seems to be the least likely candidate for the ancestral line, as this highly anthropophilic species appears to be the product of a speciation process driven by human impact on the environment subsequent to the Neolithic revolution. It is difficult to hypothesize the origin of human-adapted A. gambiae in the savanna, as this environment would not favor biting humans rather than animals. If the standard chromosome 2 arrangement is associated with the ancestral form of an anthropophilic A. gambiae, the equatorial forest would be the likely original environment. The high rainfall there would create enormous opportunities for larval breeding, provided the vegetation cover were opened by human agriculture (26–28). Unlike most other anopheline mosquitoes, A. gambiae and A. arabiensis characteristically breed in temporary, sunlit pools with bare soil at the edge. African agriculture initially exploited Sudan-Savanna zones, with some groups approaching the forest fringe. True forest settlements, however, were probably uncommon until about 4000 years ago (29, 30). 
Mass human penetration in the Central African forest probably started around 2800 to 2500 years ago with a significant change in the African climate, which transformed wide forest areas into savanna and lasted for about 500 years (31, 32). Retention in the regrown forest of pockets of agricultural Bantu groups hosting the A. quadriannulatus–like progenitor may have created the opportunity for the evolution of a highly anthropophilic mosquito. In the rain forest environment, all anthropophilic traits would have been under strong selection as humans represented not only the available host for blood meals but also the biological indicator of unique larval breeding opportunities.

Spreading of anthropophilic A. gambiae from the rain forest into savanna areas was probably achieved also through close association with humans. One hypothetical step favoring this process could have been contact with the savanna-adapted A. arabiensis and the introgression from this species into A. gambiae of chromosome inversions 2Rb and 2La, which conferred adaptive fitness to the drier, savanna environment (14). These two inversions appear to be the most prevalent and ancient alternatives to the standard form of 2R, and polymorphisms of these inversions in A. gambiae populations are widespread throughout Africa, with frequencies of the inverted arrangements increasing with aridity (7, 33). The chromosomal inversions apparently constitute a mechanism for ecotypic differentiation in these A. gambiae populations. As a high chromosomal diversity is expected to involve some degree of niche partitioning and compression (i.e., the restriction of each chromosomal form to a more narrowly defined ecological niche), this should result in increases in such fitness parameters of vectorial capacity (34) as longevity and population density.

The highest level of chromosomal variability in A. gambiae is observed in south-eastern Senegal; northern Guinea; southern Mali; most of Burkina Faso; and the northern parts of Ivory Coast, Ghana, Togo, and Benin. In these regions of West Africa, there is an increase of 2R polymorphisms in the absence of any clear clinal intergradation of karyotypes. The most reasonable explanation of these polymorphisms is the cytotaxonomic recognition of different sympatric but genetically isolated savanna populations of A. gambiae, namely the Bamako, Savanna, and Mopti cytogenetic forms (20, 21). The evolutionary differentiation of these forms is likely to have been extremely recent, because the Bamako and Mopti forms are still quite localized in relation to their potential for spreading to similar environments. All three forms are characterized by high anthropophily and endophily, as would be expected of the splitting of a taxon already adapted to humans.

If one assumes the origin of A. gambiae in the African rain forests during the third millennium before the present, it is only with the penetration of this powerful vector into the savanna areas that the intensity of Plasmodium falciparum transmission could have reached its current levels of highly stable endemicity. The area now showing the highest level of transmission intensity overlaps closely that with the highest level of A. gambiae chromosomal diversity (35). These savanna areas of West Africa also have the highest frequency of hemoglobin C (36), a hemoglobin variant that confers protection against malaria primarily to individuals homozygous for the C allele (37). This is consistent with the assertion by Modiano and co-workers that the ideal epidemiological context for selection of such a protective genetic factor is one with very high rates of malaria transmission (37). The selection among human populations of other genetic traits protective against malaria appears also to be consistent with the above entomological inference because it offers good evidence that mortality due to P. falciparum has been common only within the past 6000 years or less (38, 39). The tremendous rise in malaria transmission that accompanied the speciation of A. gambiae may have influenced the emergence of modern P. falciparum from a less pathogenic, ancestral parasite (27). The large increase in the rate of parasite transmission could have favored the selection of fast-growing, aggressive strains responsible for acute, short-lived infections. Such recent emergence of the pathogenic P. falciparum is supported by at least some of the genetic studies on the parasite (40, 41).

The availability of the complete A. gambiae genome will greatly accelerate study of the evolution of this complex taxon and its siblings. The close linkage between the genome sequence and the polytene chromosome complement (42) is already being used to analyze sequences at the breakpoints of major polymorphic inversions, particularly those that are diagnostic for chromosomal forms. Molecular assays for these inversions will allow analysis of the population genetics and ecology of chromosomal forms to be extended to all life stages, including early instar larvae and adult males in which polytene chromosomes cannot be analyzed directly. However, it will almost certainly be analysis of sets of alleles balanced by inversions covering genes in the central area of 2R that will be among the most important and rewarding challenges for post genomic study of A. gambiae and its siblings.

Supporting Online Material

Materials and Methods

Figs. S1 and S2

Tables S1 to S3

  • * To whom correspondence should be addressed. E-mail: mario.coluzzi{at}

