Chimpanzee genomic diversity reveals ancient admixture with bonobos


Science  28 Oct 2016:
Vol. 354, Issue 6311, pp. 477-481
DOI: 10.1126/science.aag2602

Of chimpanzees and bonobos

Modern non-African human genomes contain genomic remnants that suggest that there was interbreeding between ancient humans and archaic hominoid lineages. Now, de Manuel et al. show similar ancestral interbreeding between the ancestors of today's chimpanzees and bonobos (see the Perspective by Hoelzel). The study also provides population-specific genetic markers that may be valuable for conservation efforts.

Science, this issue p. 477; see also p. 414


Our closest living relatives, chimpanzees and bonobos, have a complex demographic history. We analyzed the high-coverage whole genomes of 75 wild-born chimpanzees and bonobos from 10 countries in Africa. We found that chimpanzee population substructure makes genetic information a good predictor of geographic origin at country and regional scales. Multiple lines of evidence suggest that gene flow occurred from bonobos into the ancestors of central and eastern chimpanzees between 200,000 and 550,000 years ago, probably with subsequent spread into Nigeria-Cameroon chimpanzees. Together with another, possibly more recent contact (after 200,000 years ago), bonobos contributed less than 1% to the central chimpanzee genomes. Admixture thus appears to have been widespread during hominid evolution.

Compared with our knowledge of the origins and population history of humans, much less is known about the extant species closest to humans, chimpanzees (Pan troglodytes) and bonobos (Pan paniscus). Unraveling the demographic histories of our closest living relatives provides an opportunity for comparisons with our own history and thus for studying processes that might have played a recurring role in hominid evolution. Because of a paucity of fossil records (1), our understanding of the demographic history of the Pan genus has primarily relied on population genetic data from mitochondrial genomes (2, 3), nuclear fragments (4, 5), and microsatellites (6, 7). More recently, the analysis of whole-genome sequences from chimpanzees and bonobos has hinted at a complex evolutionary history for the four taxonomically recognized chimpanzee subspecies (8). However, although chimpanzees and bonobos hybridize in captivity (9), the extent of interbreeding among chimpanzee subspecies and between chimpanzees and bonobos in the wild remains unclear.

We analyzed 75 complete genomes from the Pan genus, of which 40 were sequenced for this project to a mean sequence coverage of 25-fold. Our samples span 10 African countries, from the westernmost to the easternmost regions of the chimpanzee range (Fig. 1A). We discovered 32% more variable sites than previously identified (8, 10), highlighting the value of our sampling scheme. Different lines of evidence suggest larger historical effective population sizes in central chimpanzees, including haplotype diversity in each subspecies (fig. S5), Y chromosome diversity (fig. S3), fixation index (FST)–based phylogenies (fig. S16), and genome-wide linkage disequilibrium (fig. S6). An analysis of the long-term demographic history with the Pairwise Sequentially Markovian Coalescent model (11) (fig. S17) and a composite-likelihood modeling approach performed to fit the observed joint site frequency spectrum (SFS) (12) indicate a high long-term population size in central chimpanzees (10). The apportionment of genetic diversity among Pan populations reveals that central chimpanzees retain the highest diversity in the chimpanzee lineage, whereas the western, Nigeria-Cameroon, and eastern subspecies harbor signals of population bottlenecks.
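As an illustration of the joint SFS mentioned above, the following is a minimal sketch of how such a spectrum could be tabulated for two populations. It is not the pipeline used in the study; the function name `joint_sfs` and its inputs (per-site derived allele counts and chromosome sample sizes) are hypothetical and chosen only for this example.

```python
import numpy as np

def joint_sfs(counts_a, counts_b, n_a, n_b):
    """Tabulate a two-population joint site frequency spectrum.

    counts_a, counts_b: derived allele counts per site in each population.
    n_a, n_b: number of sampled chromosomes in each population.
    Entry [i, j] is the number of sites with i derived copies in
    population A and j derived copies in population B.
    """
    counts_a = np.asarray(counts_a, dtype=int)
    counts_b = np.asarray(counts_b, dtype=int)
    sfs = np.zeros((n_a + 1, n_b + 1), dtype=int)
    # np.add.at accumulates correctly when the same cell occurs repeatedly.
    np.add.at(sfs, (counts_a, counts_b), 1)
    return sfs
```

A demographic model is then judged by how well its expected spectrum matches a table of this kind; the fitting step itself is outside the scope of this sketch.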

Fig. 1 Chimpanzee geography and genetic substructure.

(A) Geographic distribution of Pan populations. Reported coordinates for chimpanzee individuals are shown as circles colored by broad region of origin. Grouping is based on prior information about geographical origin (table S1), with lines connecting to clustered locations within the current range of subspecies. No further coordinates were available for Equatorial Guinea and Nigeria-Cameroon. DRC, Democratic Republic of the Congo. (B and C) PCA plots of chromosome 21 single-nucleotide polymorphism data for (B) central and (C) eastern chimpanzees. PCA coordinates were modified by Procrustes transformation. Samples of unknown origin are colored in gray. Circles, high-coverage genomes; squares, low-coverage genomes; triangles, chromosome 21 captured from fecal samples. These GPS-labeled samples cluster within the range of regional genetic variation reported in whole-genome sampling.

We explored chimpanzee population structure to determine the extent to which genetic information can predict geographic origin. This is important because determining the geographical origin of confiscated individuals can help to localize hotspots of illegal trafficking (13). Principal component analysis (PCA) and population clustering analyses both reveal local stratification in central and eastern chimpanzees (Fig. 1, B and C) (10). Although we could not include enough geolocalized samples to assess fine-scale population structure in Nigeria-Cameroon and western chimpanzees, we expect that similar stratification would be found with broader sampling. To test the accuracy of our predictions, we produced low-coverage whole-genome sequences for six additional individuals whose geographical origins were known (table S1) and sequenced chromosome 21 from four Global Positioning System (GPS)–labeled fecal samples, all from central and eastern chimpanzees (10). The genetic predictions are accurate to the levels of country and region within a country (Fig. 1, B and C). In the future, the origins of confiscated chimpanzees will probably be discernible with sufficient data from reference populations, with implications for the in situ and ex situ management of this species.
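The PCA and Procrustes steps can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a samples-by-sites genotype matrix and assumes that the Procrustes target is a set of two-dimensional reference coordinates (for example, sampling longitude and latitude), which the text here does not specify. The function names are hypothetical.

```python
import numpy as np

def pca_coordinates(genotypes, n_components=2):
    """Project samples onto their top principal components.

    genotypes: array of shape (n_samples, n_sites), e.g. 0/1/2 allele counts.
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)  # center each site
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def procrustes_align(source, target):
    """Translate, scale, and rotate/reflect `source` to best match `target`
    in the least-squares sense (orthogonal Procrustes with scaling)."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src = source - source.mean(axis=0)
    tgt = target - target.mean(axis=0)
    u, s, vt = np.linalg.svd(src.T @ tgt)
    rotation = u @ vt                      # optimal orthogonal map
    scale = s.sum() / (src ** 2).sum()     # optimal isotropic scale
    return scale * src @ rotation + target.mean(axis=0)
```

After alignment, a sample of unknown origin can be compared with reference samples in the same coordinate frame, which is the sense in which genetic data "predict" location in this sketch.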

Given that multiple events of gene flow between modern and archaic humans have been described (14–17), we explored similar evidence of admixture within the Pan genus. In our SFS-based demographic model, we found support for gene flow among chimpanzee subspecies embedded in an improved picture of the complex population history (10, 12). Previously, gene flow between chimpanzees and bonobos was not supported in analyses of low-coverage genomes (18). However, we found that central, eastern, and Nigeria-Cameroon chimpanzees share significantly more derived alleles with bonobos than western chimpanzees do (Fig. 2A and fig. S26). An excess of derived allele sharing has been reported previously, but it was attributed to greater genetic drift in the western subspecies (6, 19) or described as inconclusive owing to insufficient sampling (20); using high-coverage data from more individuals allows us to investigate the possibility of migration. Because the chance of derived alleles being introduced through gene flow from bonobos into chimpanzees increases with the frequency in the donor population, alleles that occur at high frequencies are expected to exhibit greater sharing (15, 17). Indeed, derived alleles that occur at high frequencies in bonobos are disproportionately shared with central chimpanzees relative to western chimpanzees (Fig. 2B and fig. S28). Because we used sites with high sequence coverage, we can exclude contamination as a potential source of unequal allele sharing. Furthermore, gene flow should introduce bonobo alleles into chimpanzee populations at low frequencies. We found that these shared derived alleles do segregate at low and moderate frequencies in the nonwestern chimpanzee populations (Fig. 2C and fig. S30). This observation suggests ancient low-level gene flow from bonobos, with a minority of introgressed alleles drifting to moderate frequencies after segregating in the chimpanzee populations, a scenario that is supported by the demographic model, particularly with respect to gene flow into the ancestor of central and eastern chimpanzees (Fig. 3). An alternative explanation for such shared ancestry through incomplete lineage sorting would predict such alleles to drift toward fixation after their separation (10).
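The comparison of derived allele sharing is summarized with a D statistic of the form D(X,Y; bonobos, human) (Fig. 2A). The sketch below shows one common frequency-based way to compute such a statistic, together with a delete-one block jackknife standard error. It is an illustration under stated assumptions rather than the study's code: alleles are assumed to be polarized by the outgroup (so the outgroup derived frequency is 0), the sign convention (positive when Y shares more derived alleles with bonobos than X does) is an assumption, and blocks are formed from equal numbers of consecutive sites rather than fixed genomic lengths.

```python
import numpy as np

def d_statistic(p_x, p_y, p_bonobo):
    """Frequency-based D(X, Y; bonobo, outgroup).

    Inputs are per-site derived allele frequencies, polarized by the
    outgroup. Positive values mean Y shares more derived alleles with
    bonobos than X does (sign convention assumed for this sketch).
    """
    p_x, p_y, p_b = (np.asarray(a, dtype=float) for a in (p_x, p_y, p_bonobo))
    abba = (1 - p_x) * p_y * p_b   # derived allele shared by Y and bonobo
    baba = p_x * (1 - p_y) * p_b   # derived allele shared by X and bonobo
    return (abba.sum() - baba.sum()) / (abba.sum() + baba.sum())

def block_jackknife_se(p_x, p_y, p_bonobo, n_blocks=10):
    """Delete-one block jackknife standard error of D."""
    p_x, p_y, p_b = (np.asarray(a, dtype=float) for a in (p_x, p_y, p_bonobo))
    blocks = np.array_split(np.arange(p_x.size), n_blocks)
    estimates = []
    for block in blocks:
        keep = np.ones(p_x.size, dtype=bool)
        keep[block] = False  # leave this block of sites out
        estimates.append(d_statistic(p_x[keep], p_y[keep], p_b[keep]))
    estimates = np.array(estimates)
    n = len(estimates)
    return float(np.sqrt((n - 1) / n * np.sum((estimates - estimates.mean()) ** 2)))
```

Under this convention, a D value whose interval of two standard errors excludes zero would indicate asymmetric sharing between X and Y; distinguishing gene flow from other causes still requires the additional analyses described in the text.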

Fig. 2 Genome-wide statistics support gene flow between chimpanzees and bonobos.

(A) Population-wise D statistic of the form D(X,Y; bonobos, human). Error bars correspond to two standard errors. Nonwestern chimpanzees share more derived alleles with bonobos than western chimpanzees do. (B) Western and central chimpanzee allele sharing with bonobos, binned by derived allele frequency in bonobos (Dj); bonobo alleles are more often shared with central chimpanzees across bonobo allele frequencies. Top, real data; middle, simulations without gene flow; bottom, simulations of a model with gene flow into nonwestern chimpanzees. Lines of best fit are in gray. (C) Western and central chimpanzee allele sharing with bonobos, stratified by both bonobo and chimpanzee derived allele frequency, calculated at a given frequency in bonobos and at least one of the chimpanzee subspecies (the color gradient represents the extent of sharing). (D) Divergence between chimpanzee subspecies versus minimum divergence to bonobos at sites with bonobo derived allele frequencies of ≥90% in windows of 50 kbp. Error bars represent 95% confidence intervals from 500 bootstrap replicates. Segments in the genomes of central chimpanzees with low divergence to bonobos show high divergence to western chimpanzees. Top, real data; middle, simulated data without gene flow; bottom, simulated data with gene flow.

Fig. 3 Conceptual model of a complex population history.

SFS-based modeling indicates several contacts between chimpanzees and bonobos after their divergence. Split times (thousands of years ago, kya) and migration rates correspond to 95% confidence intervals obtained with the demographic model with western, central, and eastern chimpanzees (10). Gene flow is quantified as the scaled migration rate (2Nm). Red arrows, gene flow from bonobos into chimpanzees. The ancestral population of central and eastern chimpanzees received the highest amount of bonobo alleles, whereas central chimpanzees received additional, more recent gene flow (<200,000 years ago). Blue arrows, highest inferred migrations within chimpanzee subspecies (intense gene flow between central and eastern chimpanzees). α (dotted line), putative ancient gene flow between the ancestors of all chimpanzees and bonobos. β, more recent gene flow from chimpanzees into bonobos. Shaded area, range of estimates across all chimpanzee populations. γ, admixture between Nigeria-Cameroon and central and eastern chimpanzees; indirect gene flow from bonobos into Nigeria-Cameroon chimpanzees might have occurred through these contacts. δ, divergence time between western and Nigeria-Cameroon chimpanzees, estimated by using MSMC2 (10).

If bonobos contributed alleles to chimpanzees, these should be recognizable as introgressed segments in chimpanzees (i.e., regions with unusually low divergence to bonobos and unusually high heterozygosity). Following an approach used to identify gene flow from modern humans into Neandertals (17), we calculated the divergence from bonobos to the chimpanzee alleles that result in the minimum divergence to derived alleles occurring at high frequencies (≥90%) in sequence windows of 50 kilo–base pairs (kbp) (10), and we compared it with the maximum divergence between chimpanzee subspecies. Genomic regions in the nonwestern chimpanzees that are least divergent to bonobos are more divergent to western chimpanzees than vice versa (Fig. 2D and fig. S36). These windows also show an increase in heterozygosity (fig. S37).
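The window-based calculation described above can be sketched roughly as follows. This is a deliberately simplified illustration and not the published method: it assumes phased chimpanzee haplotypes coded 0 (ancestral) or 1 (derived), measures divergence as the fraction of informative sites at which a haplotype lacks the high-frequency bonobo derived allele (rather than per base pair), and uses a hypothetical function name and signature.

```python
import numpy as np

def min_divergence_per_window(positions, bonobo_freq, haplotypes,
                              window=50_000, min_freq=0.9):
    """Minimum divergence to high-frequency bonobo derived alleles per window.

    positions: site coordinates (bp).
    bonobo_freq: derived allele frequency in bonobos at each site.
    haplotypes: array (n_haplotypes, n_sites), 1 = derived, 0 = ancestral.
    Returns {window_start: minimum mismatch fraction across haplotypes}.
    """
    positions = np.asarray(positions)
    bonobo_freq = np.asarray(bonobo_freq, dtype=float)
    haplotypes = np.asarray(haplotypes)
    keep = bonobo_freq >= min_freq           # sites nearly fixed derived in bonobos
    window_ids = positions[keep] // window
    haps = haplotypes[:, keep]
    result = {}
    for w in np.unique(window_ids):
        cols = window_ids == w
        # Fraction of informative sites where each haplotype is ancestral.
        mismatches = (haps[:, cols] == 0).mean(axis=1)
        result[int(w) * window] = float(mismatches.min())
    return result
```

Windows with unusually low values are candidates for bonobo-like segments in this sketch; in the text, such windows are then compared with divergence between chimpanzee subspecies and with heterozygosity.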

We identified discrete putatively introgressed regions in the individual genomes harboring heterozygous bonobo-like and chimpanzee haplotypes (17). We detected almost an order of magnitude more of such haplotypes in central chimpanzees than in western chimpanzees, making up a total of ~2.4% across the genomes of 10 individuals, whereas the amount is smaller in eastern and Nigeria-Cameroon chimpanzees (Fig. 4A and table S6). Furthermore, central chimpanzees carry an excess of haplotypes that do not overlap with any of the other subspecies (P < 0.01, G-test). These regions also show a significant depletion in background selection (21) (P < 0.01, Wilcoxon rank test), suggesting that bonobo alleles might have been disadvantageous in a chimpanzee genetic background (10). This observation, together with the X chromosome not carrying more derived alleles shared between bonobos and nonwestern chimpanzees (fig. S31), resembles patterns in modern and archaic human genomes (16, 22, 23).

Fig. 4 Introgressed segments and inferred age of introgressed haplotypes.

(A) Numbers of putatively introgressed heterozygous segments per population (top of each bar) and percentage of the chimpanzee genome that they constitute (bars). Dark bars represent segments uniquely found in each population; gray bars represent simulated data without gene flow. (B) Age distribution of bonobo-like haplotypes in chimpanzee populations, as estimated by ARGweaver. Chimpanzee subspecies are compared pairwise, and bonobo-like haplotypes are defined as regions of at least 50 kbp that coalesce within the bonobo subtree before coalescing with the other chimpanzee population (inset). Error bars represent 95% confidence intervals across Markov chain Monte Carlo replicates (10).

Further support for a scenario of gene flow between chimpanzees and bonobos is provided by a model-based inference from TreeMix (24) (fig. S24) and the SFS-based demographic models described above (10), which have significantly higher likelihoods when they include multiple gene flow events between species (figs. S50 and S52). The best-fitting models indicate a complex admixture history, including low-level gene flow from bonobos to central chimpanzees and the ancestors of central and eastern chimpanzees, as well as from chimpanzees into bonobos (Fig. 3 and figs. S51 and S54).

We used simulations to test whether these differential allele-sharing patterns would be expected in the absence or presence of gene flow under the demographic history inferred from our models (10). Only a scenario including gene flow reproduced stratified D statistics that were different from zero (fig. S33), and even substantial genetic drift in the western subspecies could not explain the asymmetries in derived alleles shared with bonobos (figs. S34 and S35). Additionally, only models with gene flow from bonobos into the ancestors of central and eastern chimpanzees, and to a lesser extent into Nigeria-Cameroon chimpanzees, reproduced the observed patterns in sequence windows (Fig. 2D and figs. S38 to S43) and heterozygous regions (fig. S45). In sum, the unequal allele and haplotype sharing is unlikely to result from alternative demographic models without gene flow from bonobos into chimpanzee populations. Although alternative models—e.g., with different gene flow events or differences in population size—may explain some features of the data, none of those tested in this study could reproduce all features of the data (10).

Lastly, if gene flow occurred between the Pan species after their separation 1.5 to 2.1 million years ago (Fig. 3), haplotypes younger than this should be shared among them. Using ARGweaver (25), we estimated the age of haplotypes for which one chimpanzee subspecies coalesces within the subtree of bonobos more recently than with another chimpanzee subspecies (10). A fraction of these haplotypes may result from incomplete lineage sorting, but it has been shown that gene flow introduces an excess of young haplotypes into the receiving population (17). We found that central chimpanzees carry 4.4-fold more bonobo-like haplotypes than western chimpanzees do and that these are longer (P < 0.01, Wilcoxon rank test), whereas eastern and Nigeria-Cameroon chimpanzees carry smaller amounts (table S10). These haplotypes are inferred to coalesce 200,000 to 550,000 years ago, consistent with gene flow from bonobos into the ancestors of central and eastern chimpanzees less than 650,000 years ago (Figs. 3 and 4B). The smaller amount of such haplotypes in Nigeria-Cameroon and western chimpanzees might result from subsequent gene flow between chimpanzee populations. Additionally, central chimpanzees carry a slightly larger proportion of younger haplotypes (from 100,000 to 200,000 years ago), supporting another, more recent phase of secondary contact between chimpanzees and bonobos. These estimates agree with the phases of gene flow before and after the split of central and eastern chimpanzees (<180,000 years ago) inferred from our demographic model (Fig. 3) and with the excess of bonobo-like alleles and haplotypes in central chimpanzees. We estimate the overall contribution to individual genomes at less than 1% (10).

Through the analysis of multiple high-coverage genomes, we were able to reconstruct a complex history of admixture within the Pan genus. It appears that there was gene flow from an ancestral bonobo population mostly into nonwestern chimpanzees several hundred thousand years ago. Although we cannot distinguish whether the gene flow occurred at low levels over a long time or in discrete pulses, it seems likely that at least two phases of secondary contact between the two species took place. This study reveals that our closest living relatives experienced a history of admixture similar to that within the Homo clade. Thus, gene flow might have been widespread during the evolution of the great apes and hominins.

Correction (26 October 2016): REPORT: "Chimpanzee genomic diversity reveals ancient admixture with bonobos" by M. de Manuel et al. (28 October 2016, p. 477). In the print version of the article, the affiliation given for authors C.T.-S. and Y.X. was incorrect. Their correct affiliation is the Wellcome Trust Sanger Institute (#7). The HTML and PDF have been corrected.

Supplementary Materials

Materials and Methods

Figs. S1 to S58

Tables S1 to S19

References (26–90)

Data S1

References and Notes

  10. Materials and methods are available as supplementary materials on Science Online.
Acknowledgments: All sequence data have been submitted to the European Nucleotide Archive and are available under accession code PRJEB15086. We greatly appreciate all the sample providers: Centre de Conservation pour Chimpanzés, Chimfunshi Wildlife Orphanage Trust, Centre de Primatologie, Centre International de Recherches Médicales de Franceville, Jeunes Animaux Confisqués Au Katanga, Ngamba Island Chimpanzee Sanctuary, Sweetwaters Chimpanzee Sanctuary, Stichting AAP, Bioparc Valencia, Edinburgh Zoo, Furuviksparken, Kolmårdens Djurpark, Zoo de Barcelona, Parc le Pal, Zoo Aquarium de Madrid, Zoo Parc de Beauval, and Zoo Zürich. We respectfully thank the Agence Nationale des Parcs Nationaux (Gabon), Centre National de la Recherche Scientifique (Gabon), Uganda National Council for Science and Technology, and the Ugandan Wildlife Authority for permission to collect fecal samples in Gabon and Uganda. We particularly thank K. Prüfer for advice on technical aspects; M. Meyer and A. Weihmann for helping with the capture of fecal samples; M. J. Hubisz for advice on ARGweaver; and J. Bertranpetit, C. Lalueza-Fox, M. Mondal, and S. Han for reading the manuscript and providing valuable comments. M.d.M. is supported by a Formació de personal Investigador fellowship from Generalitat de Catalunya (FI_B01111). M.K. is supported by a Deutsche Forschungsgemeinschaft fellowship (KU 3467/1-1). V.C.S., I.D., and L.E. are supported by Swiss National Science Foundation grants 31003A-143393 and 310030B-16660. T.D. is funded by the Gates Cambridge Trust. O.L. is supported by a Ramón y Cajal grant from Ministerio de Economía y Competitividad (MINECO) (RYC-2013-14797) and MINECO grant BFU2015-68759-P [Fondo Europeo de Desarrollo Regional (FEDER)]. P.H. is supported by Estonian Research Council grant PUT1036. J.M.S., A.M.A., and S.C. are funded by the Max Planck Society. J.P.-M., C.T.-S., and Y.X. were supported by The Wellcome Trust (098051). J.M.H.-G. is supported by the María de Maeztu Programme (MDM-2014-0370). A.S. is supported by an Isaac Newton Trust/Wellcome Trust Institutional Strategic Support Fund Joint Research Grant. J.N. had support from a U.S. NIH U01CA198933 grant, and B.M.P. is supported by a Swiss National Science Foundation postdoctoral fellowship. A.N. is supported by MINECO grant BFU2015-68649-P. The collection of fecal samples was supported by the Max Planck Society and Krekeler Foundation’s generous funding for the Pan African Programme. T.M.-B. thanks ICREA; the European Molecular Biology Organization Young Investigator Programme 2013; MINECO grants BFU2014-55090-P (FEDER), BFU2015-7116-ERC, and BFU2015-6215-ERCU01; U.S. NIH grant MH106874; Fundacio Zoo Barcelona; and Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement de la Generalitat de Catalunya for the support to his laboratory.