Extensive Nuclear DNA Sequence Diversity Among Chimpanzees


Science  05 Nov 1999:
Vol. 286, Issue 5442, pp. 1159-1162
DOI: 10.1126/science.286.5442.1159


Although data on nucleotide sequence variation in the human nuclear genome have begun to accumulate, little is known about genomic diversity in chimpanzees (Pan troglodytes) and bonobos (Pan paniscus). A 10,154–base pair sequence on the chimpanzee X chromosome is reported, representing all major subspecies and bonobos. Comparison to humans shows the diversity of the chimpanzee sequences to be almost four times as high and the age of the most recent common ancestor three times as great as the corresponding values of humans. Phylogenetic analyses show the sequences from the different chimpanzee subspecies to be intermixed and the distance between some chimpanzee sequences to be greater than the distance between them and the bonobo sequences.

To place the genomic variation in humans in a relevant evolutionary perspective, it is necessary to study nuclear DNA sequence variation in the African apes. Also, intraspecific variability in the apes is relevant to the understanding of physiological and cultural differences between and within species. For example, it has been shown that chimpanzee populations differ in behavior (1); to assess whether these differences could be due to genetic factors, it is important to study the variation in the nuclear gene pools of chimpanzee populations.

Studies on genomic diversity in chimpanzees have yielded contradictory results. Although some loci involved in the immune response show higher diversity in chimpanzees than in humans, others show less (2). Similarly, mitochondrial DNA (mtDNA) sequences in chimpanzees are more variable than those in humans (3–5), whereas microsatellites are less so (5, 6). The latter observation has been attributed to both a putative overall shorter length of chimpanzee microsatellites (6) and an ascertainment bias resulting from studying microsatellites originally selected to be variable in humans (7).

Noncoding DNA at Xq13.3 is well suited for obtaining an initial view of the variation in the nuclear genome of the apes. First, a worldwide study of human variation at Xq13.3 is available (8), allowing a direct comparison between humans and the great apes. Second, this locus is noncoding and therefore unlikely to be the direct target of selection. Third, its low mutation rate (8), combined with a low recombination rate (9), allows evolutionary analyses to be performed without much influence from multiple substitutions and recombination events.

Because polymorphism in ancestral populations may cause different parts of the human genome to be most closely related to either chimpanzees or gorillas (10), we first determined how the African ape species are related to humans at Xq13.3. A phylogenetic tree was estimated with a maximum likelihood approach (11) with human, chimpanzee, bonobo, gorilla, and orangutan sequences (Fig. 1). In this tree, humans fall together with the chimpanzee and bonobo to the exclusion of the gorilla. Thus, the analysis of Xq13.3 agrees with most other studies in identifying the chimpanzees and bonobos as the closest relatives of humans (12). Consequently, we decided to study intraspecific variation at Xq13.3 in these species.

Figure 1

Phylogenetic tree (11) relating the human and great ape Xq13.3 sequences. The sequences used are from a Sumatran orangutan (Pongo pygmaeus abelii), a western lowland gorilla (Gorilla gorilla gorilla) (8), a human (Buriat) (8), and the bonobo “B4” and western African chimpanzee “W2” from this study. Numbers refer to “Puzzle” reliability values in percent (11).

First, we investigated whether the Xq13.3 region evolves at a constant rate among humans and apes. The human, chimpanzee, bonobo, and gorilla sequences differ at 287 to 296 positions from the orangutan, indicating a similar overall evolutionary rate in humans and the African great apes. A test comparing the likelihoods of trees reconstructed with and without a clock assumption (11) confirms that the sequences evolve at a constant rate. Thus, patterns of intraspecies diversity of humans and chimpanzees cannot be attributed to differences in evolutionary rates. This finding does not support the hypothesis of a general slow-down in evolutionary rates on the human lineage (13).

To obtain an overview of the chimpanzees' diversity at Xq13.3 (14), we sequenced about 10,000 base pairs (bp) (15) from 30 chimpanzees representing the three currently recognized major subspecies: central African chimpanzees (Pan troglodytes troglodytes), western African chimpanzees (P. troglodytes verus), and eastern African chimpanzees (P. troglodytes schweinfurthii) (4, 16). In addition, we determined the homologous sequence in five bonobos. Among the chimpanzees, we identified 84 variable positions defining 24 different sequences (Fig. 2). By comparison, only 33 variable positions (20 sequences) were found in humans (8), even though more than twice as many individuals (n = 70) were sequenced. The mean pairwise sequence difference (MPSD) among chimpanzees is 0.13%, about four times that of the human sequences (0.037%) (Fig. 3). The central African chimpanzees, which carry 64 of the 84 variable positions observed and have an MPSD of 0.18%, contribute most to the high variation in chimpanzees, whereas western African chimpanzees carry 23 variable positions (MPSD = 0.05%). This result is in contrast to mtDNA, for which western African chimpanzees show the greatest diversity (6.2%), whereas diversity is lower in central African chimpanzees (4.7%) (Fig. 3). Additional nuclear and mtDNA studies are needed to exclude the possibility that this discrepancy is due to sampling differences.
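The MPSD statistic used above can be illustrated with a minimal sketch. It assumes aligned sequences of equal length and counts every mismatching column as one difference; how deletions and ambiguous bases were treated in the actual analysis is not stated here, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def mean_pairwise_difference(sequences):
    """Mean number of differing positions over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching columns for each pair, then average over pairs.
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def mpsd_percent(sequences):
    """MPSD expressed as a percentage of the aligned sequence length."""
    return 100.0 * mean_pairwise_difference(sequences) / len(sequences[0])


# Toy alignment (invented data, not taken from the study).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
toy_mean = mean_pairwise_difference(toy)
toy_percent = mpsd_percent(toy)
```

Applied to the real alignment, the same calculation would give a value on the scale of the percentages quoted in the text; the ratio of the two reported values, 0.13% to 0.037%, is roughly 3.5.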

Figure 2

Variable positions of the chimpanzee and bonobo (B) DNA sequences (31) and the corresponding nucleotides of humans and a gorilla. The geographical origin of the chimpanzees is indicated as central (C), eastern (E), and western (W) (32). On top, the homologous gorilla (Go) and human (Hu) sequences, as well as the chimpanzee/bonobo consensus sequence (Con), are given. “d” indicates a deletion, and “K” indicates that either a guanine (G) or thymine (T) nucleotide is present at that position. Numbers on the right designate the different sequences (1 to 27).

Figure 3

MPSD given as numbers of differences per sequence [Xq13.3: 10,154 bp (chimpanzees) and 10,163 bp (humans) (8); mtDNA: 320 bp (chimpanzees) (33) and 360 bp (humans) (21)] (34). Abbreviations: C, central African chimpanzees; W, western African chimpanzees; E, eastern African chimpanzees.

A possible explanation for the lower diversity observed in humans relative to chimpanzees is a selective event that would have reduced the variation at Xq13.3 in humans. However, when a test for selection (17) is used to compare the variation at Xq13.3 with that at three other nuclear loci for which human population sequence data are available (18), no indication of selection is detected. Furthermore, when the nucleotide diversity observed at Xq13.3 in humans (0.037%) is compared with that at seven other loci on the X chromosome (19), it is higher than at three loci, identical to that at one, and lower than at the remaining three. Thus, the variation observed at Xq13.3 in humans seems to be similar to the variation at other loci on the X chromosome. Lower genetic diversity in humans than in chimpanzees has also been observed in a survey by denaturing gradient gel electrophoresis of a 1000-bp segment of the chimpanzee HOXB6 locus (20), as well as for mtDNA (21), which carries about three times as much variation in chimpanzees as it does in humans (Fig. 3). Therefore, the results from Xq13.3 are likely to reflect a generally higher diversity in the chimpanzee genome than in the human genome and thus to result from a difference in population history between the species, for example, a recent founder effect in humans (22).

To estimate the extent to which recombination or parallel mutation events (or both) have shaped the sequences observed in the chimpanzees, we used a test (23) based on the assumption that if no recombination occurs, the minimum number of substitutions required in a maximum parsimony tree should not be different from the number of variable positions in the data. Because the number of substitutions in a tree relating the Xq13.3 sequences (97) exceeds the number of variable positions (84) by 13, recombination or parallel substitutions have occurred. However, no reshuffling of large blocks of sequence is apparent in the data (Fig. 2). In view of this, as well as of the low recombination rate at Xq13.3 (9), recurrent mutations (or gene conversion) may predominate over recombination events. Therefore, a coalescent approach (24) that allows for parallel substitutions was used to estimate the time to the most recent common ancestor (MRCA). Assuming a separation of the chimpanzee and human lineages of 5 million years, the chimpanzee effective population size (Ne) was estimated to be 35,000 and the age of the MRCA to be 2,100,000 years (95% confidence interval: 1,400,000 to 3,300,000 years). When the MRCA for the human sequences was similarly estimated, it was found to be 675,000 years (95% confidence interval: 525,000 to 975,000 years) (Ne = 11,000), indicating that the genetic history of the chimpanzee nuclear genome is about three times as deep as that of the human genome.
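The arithmetic behind these two comparisons can be written out as a small sketch. It is not the published test (23) or the coalescent method (24); it only restates the counting argument and the ratio of the reported MRCA ages, and the function name is hypothetical.

```python
def excess_changes(tree_substitutions, variable_positions):
    """Substitutions in the parsimony tree beyond one per variable position.

    With no recombination and no recurrent mutation, each variable position
    would need exactly one change, so the excess would be zero.
    """
    if tree_substitutions < variable_positions:
        raise ValueError("a tree needs at least one change per variable position")
    return tree_substitutions - variable_positions


# Counts reported in the text for the chimpanzee Xq13.3 sequences.
excess = excess_changes(tree_substitutions=97, variable_positions=84)

# Ratio of the reported MRCA point estimates (chimpanzee / human).
mrca_ratio = 2_100_000 / 675_000
```

A nonzero excess indicates that some positions changed more than once on the tree, which is the signal the text attributes to recurrent mutation, gene conversion, or recombination; the ratio of the point estimates comes out slightly above three.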

When the MRCAs of the two chimpanzee subspecies for which multiple samples were available were estimated, they were found to be 1,755,000 years (95% confidence interval: 915,000 to 3,660,000 years) for the central chimpanzees and 502,000 years (95% confidence interval: 270,000 to 1,010,000 years) for the western chimpanzees. For the bonobo sequences, a small effective population size (Ne = 4600) and a recent MRCA (277,000 years; 95% confidence interval: 70,500 to 1,180,000 years) were estimated. However, the small sample size (n = 5) makes any conclusions regarding the bonobos highly tentative.

In a phylogenetic tree (Fig. 4), central African chimpanzees are more widely distributed than the other subspecies. Furthermore, the first two branches in the chimpanzee tree lead to exclusively central African chimpanzee sequences. Thus, central African chimpanzees carry the oldest chimpanzee lineages. However, the subspecies are highly intermixed. For example, the single eastern chimpanzee sequence falls within a clade containing central as well as western chimpanzee sequences. Furthermore, one sequence is identical between a western (W17) and a central African chimpanzee (C1). Thus, for Xq13.3, monophyly of the subspecies is not observed. This result is in contrast to mtDNA (3, 4), for which the different subspecies form monophyletic clades. A likely reason for this discrepancy is that the effective population size for X-linked sequences is three times as great as that of mtDNA, which is only maternally inherited. Other factors being equal, it is therefore expected to take three times as long to achieve monophyly for Xq13.3 as for mtDNA. Thus, it is likely that the separation of the chimpanzee subspecies postdates the variation at Xq13.3 but predates the variation in mtDNA. In addition, it is possible that gene flow among chimpanzee subspecies has contributed to the intermixing of nuclear genetic lineages.
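The factor of three can be motivated by a simple copy count. The sketch below is an illustration under stated assumptions, not the authors' calculation: it assumes equal numbers of breeding females and males, treats each female as carrying two X copies and each male one, and treats mtDNA as a single transmitted copy per female. The function name is hypothetical.

```python
def x_to_mtdna_copy_ratio(females, males):
    """Ratio of transmitted X-chromosome copies to transmitted mtDNA copies."""
    if females <= 0 or males < 0:
        raise ValueError("need a positive number of females and no negative counts")
    x_copies = 2 * females + males   # two X per female, one per male
    mtdna_copies = females           # mtDNA is passed on only by females
    return x_copies / mtdna_copies


# Equal sex ratio (an assumption): 500 breeding females and 500 breeding males.
ratio_equal_sexes = x_to_mtdna_copy_ratio(females=500, males=500)
```

With an unequal breeding sex ratio the count departs from three, which is one reason the text qualifies the expectation with "other factors being equal".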

Figure 4

Phylogenetic tree of chimpanzee and bonobo Xq13.3 sequences (35). A human sequence (“Chukchi”) (8) was used as an outgroup. Letters and numbers identifying the individuals refer to Fig. 2. “**” indicates branches of significantly positive length (P = 0.01). The scale indicates the length of the branches in units of expected nucleotide substitutions per site (×10).

The absence of subspecies-specific clusters is notable in view of mtDNA studies indicating that chimpanzee subspecies are very old. For example, the estimated time for the divergence of western chimpanzees from the other two subspecies based on mtDNA is 1.58 million years (4). In fact, the mtDNA results have been used to suggest that western chimpanzees might be elevated to species status provided the mtDNA results could be confirmed by nuclear loci (4). The intermixing of Xq13.3 lineages is evidence against a long independent genetic history of the chimpanzee subspecies. This finding is interesting in view of the fact that it is difficult or impossible to distinguish members of the different subspecies on the basis of morphological characters (25). There seems to be no obvious correlation between different chimpanzee “cultures” and the geographical location or subspecies of the groups studied (1). This supports the notion that different behavioral traits observed in different chimpanzee groups are transmitted culturally rather than genetically.

Bonobos are monophyletic in the Xq13.3 tree (Fig. 4). This is consistent with a recent evolutionary history distinct from chimpanzees (26). However, there are many chimpanzees that differ from each other at 22 to 29 positions, whereas chimpanzees differ from the bonobos at only 13 to 23 positions. Thus, some chimpanzees are more distant from each other than they (or other chimpanzees) are from bonobos. Moreover, only seven nucleotide differences (Fig. 2) are unique to the bonobos. On the basis of the mutation rate at Xq13.3 (8) of about one mutation per 100,000 years and the average number of substitutions observed between the species, we calculated a divergence time between chimpanzees and bonobos of 930,000 years (range: 690,000 to 1,220,000 years). This period is shorter than the mtDNA and the β-globin gene estimates of 2,500,000 years (27, 28) and 2,780,000 years (29), respectively. Reports that bonobos and chimpanzees can interbreed (30) are relevant to this finding, because they raise the possibility that certain loci, for example, Xq13.3, may have crossed the “species barrier” much later than other loci. Consequently, not only chimpanzee subspecies, but also bonobos and chimpanzees, may have an intermixed genetic relationship.
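The divergence-time arithmetic can be sketched as follows. This is an assumption-laden illustration, not the authors' procedure. It uses the rounded rate of one mutation per 100,000 years quoted in the text, assumes that differences between two sequences accumulate along both lineages (hence the division by two), and plugs in the raw range of 13 to 23 differences. The exact rate and substitution counts behind the reported figures are not given here, so the output is expected to land near the reported range rather than match it.

```python
def divergence_time_years(differences, years_per_mutation=100_000):
    """Approximate split time from pairwise differences between two lineages.

    Differences accumulate on both lineages since the split, so the elapsed
    time is half of (differences x years per mutation).
    """
    if differences < 0:
        raise ValueError("differences must be non-negative")
    return differences * years_per_mutation / 2


# Raw range of chimpanzee-bonobo differences quoted in the text.
low_estimate = divergence_time_years(13)
high_estimate = divergence_time_years(23)
```

Under these assumptions the sketch gives 650,000 to 1,150,000 years, the same order as the reported range of 690,000 to 1,220,000 years; the gap is consistent with the published estimate resting on an unrounded rate and on average substitution counts.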

  • * To whom correspondence should be addressed. E-mail: kaessmann{at}

