The Shaping of Modern Human Immune Systems by Multiregional Admixture with Archaic Humans

Science  07 Oct 2011:
Vol. 334, Issue 6052, pp. 89-94
DOI: 10.1126/science.1209202


Whole genome comparisons identified introgression from archaic to modern humans. Our analysis of highly polymorphic human leukocyte antigen (HLA) class I, vital immune system components subject to strong balancing selection, shows how modern humans acquired the HLA-B*73 allele in west Asia through admixture with archaic humans called Denisovans, a likely sister group to the Neandertals. Virtual genotyping of Denisovan and Neandertal genomes identified archaic HLA haplotypes carrying functionally distinctive alleles that have introgressed into modern Eurasian and Oceanian populations. These alleles, of which several encode unique or strong ligands for natural killer cell receptors, now represent more than half the HLA alleles of modern Eurasians and also appear to have been later introduced into Africans. Thus, adaptive introgression of archaic alleles has significantly shaped modern human immune systems.

Whether or not interbreeding occurred between archaic and modern humans has long been debated (1, 2). Recent estimates suggest that Neandertals contributed 1 to 4% to modern Eurasian genomes (3), and Denisovans, a likely sister group to the Neandertals, contributed 4 to 6% to modern Melanesian genomes (4). These studies, based on statistical genome-wide comparisons, did not address if there was selected introgression of functionally advantageous genes (5). We explored whether the highly polymorphic HLA class I genes (HLA-A, -B, and -C) (fig. S1) of the human major histocompatibility complex (MHC) are sensitive probes for such admixture. Because of their vital functions in immune defense and reproduction, as ligands for T cell and natural killer (NK) cell receptors, maintaining a variety of HLA-A, -B, and -C proteins is critical for long-term human survival (6). Thus, HLA-A, -B, and -C are subject to strong multiallelic balancing selection, which, with recombination, imbues human populations with diverse HLA alleles and haplotypes of distinctive structures and frequencies (7).

An exceptionally divergent HLA-B allele is HLA-B*73:01 (8, 9). Comparison with the other >2000 (10) HLA-B alleles and chimpanzee and gorilla alleles from the same locus (MHC-B) shows that HLA-B*73:01 is most closely related to subsets of chimpanzee and gorilla MHC-B alleles (11) (figs. S2 to S4). This relation extends throughout a ~9-kilobase (kb) region of the B*73:01 haplotype (Fig. 1A) and defines a deeply divergent allelic lineage (MHC-BII), distinct from the MHC-BI lineage to which other human HLA-B alleles belong. These two lineages diverged ~16 million years ago (Fig. 1B), well before the split between humans and gorillas, but whereas MHC-BI comprises numerous types and subtypes, MHC-BII is only represented in modern humans by B*73:01 (fig. S5). HLA-B*73:01 combines ancient sequence divergence with modern sequence homogeneity, properties compatible with modern humans having recently acquired HLA-B*73:01 through introgression.

Fig. 1

Modern humans acquired HLA-B*73 from archaic humans. (A) The B*73 haplotype contains segments most closely related to chimpanzee and gorilla MHC-B alleles (green) and flanking segments highly related to other HLA-B (blue) (brown segment is related to HLA-C) (fig. S4). (B) B*73’s divergent core has its roots in a gene duplication that occurred >16 million years ago (MYA on figure). (Left to right) MHC-B duplicated and diverged to form the MHC-BI and BII loci. One allele of BII recombined to the BI locus, giving rise to the ancestor of B*73 and its gorilla and chimpanzee equivalents. B*73 is thus the only remnant in modern humans of a deeply divergent allelic lineage. §Mean and 95% credibility interval. (C to E) B*73:01 is predominantly found outside Africa (C) as is C*15:05 (D), which is strongly associated with B*73 in 3676 individuals worldwide (E). (C and D) Color scale bars give allele frequency (af) categories (top number, highest tick mark). Individuals with the B*73 haplotype were categorized on the basis of their geographic origin and the status of the most commonly linked (C*15) and second-most commonly linked (C*12:02) HLA-C alleles (fig. S24). Number sign (#) includes Hispanic-Americans, double number sign (##) includes African-Americans. (F) Archaic admixture (model a) or African origin (model b) could explain the distribution and association of B*73 with C*15:05; simulations favor the former (α = 0.01 to 0.001) (figs. S9 to S11) (11). The large dotted box indicates the part of the models examined by simulation; kya, thousand years ago.

In modern humans, HLA-B*73 is concentrated in west Asia and is rare or absent in other regions (12) (Fig. 1C and fig. S6). This distribution is consistent with introgression of HLA-B*73 in west Asia, a site of admixture between modern and archaic humans (3). Also consistent with introgression is the linkage disequilibrium (LD) between B*73:01 and HLA-C*15:05 (13), an allele having wider distribution than B*73, but concentrated in west and southeast Asia (Fig. 1D).
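
As a rough illustration of how an association such as that between B*73:01 and C*15:05 can be quantified, the sketch below computes two common summaries from haplotype counts: the fraction of haplotypes carrying the first allele that also carry the second, and the standard r² measure of LD. The function name `ld_summary` and the counts used are hypothetical; this is not the procedure or the data used in the study.

```python
def ld_summary(n_ab, n_a_only, n_b_only, n_neither):
    """Summarize association between allele A (locus 1) and allele B (locus 2).

    Inputs are counts of haplotypes carrying both alleles, only A, only B,
    or neither. All counts here are illustrative placeholders.
    """
    total = n_ab + n_a_only + n_b_only + n_neither
    p_a = (n_ab + n_a_only) / total      # frequency of allele A
    p_b = (n_ab + n_b_only) / total      # frequency of allele B
    p_ab = n_ab / total                  # frequency of the A-B haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both alleles must be polymorphic in the sample")
    d = p_ab - p_a * p_b                 # classical LD coefficient D
    return {
        "D": d,
        "r2": d * d / denom,             # squared correlation between alleles
        "co_carriage": n_ab / (n_ab + n_a_only),  # share of A haplotypes with B
    }


# Hypothetical counts: 9 of 10 A-bearing haplotypes also carry B.
print(ld_summary(9, 1, 40, 950))
```

The co-carriage fraction is the quantity closest to a statement such as "most B*73 carriers also carry C*15:05", whereas r² also depends on how common each allele is on its own.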

Worldwide, ~98% of people carrying B*73 also carry C*15:05 (Fig. 1E and fig. S7). In Africans, the LD reaches 100%, but in west Asians, it is weaker (~90%). These data are all consistent with introgression in west Asia of an archaic B*73:01-C*15:05 haplotype that expanded in frequency there, before spreading to Africa and elsewhere. HLA-B*73 is absent from Khoisan-speaking and pygmy populations who likely diverged from other Africans before the Out-of-Africa migration (14) (fig. S8). That Khoisan and pygmies uniquely retain ancient mitochondrial and Y-chromosome lineages (14, 15), as well as MHC-BI diversity (fig. S8), suggests that B*73 was probably not present in any African population at the time of the migration. These data argue for models in which modern humans acquired B*73 by archaic admixture in west Asia and against models in which B*73 arose in Africa and was carried to other continents in the Out-of-Africa migration (Fig. 1F), as do the results of coalescence simulations that implement rejection-based approximate Bayesian inference (16) (α = 0.01 to 0.001) (figs. S9 to S11).
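
To make the idea of rejection-based approximate Bayesian inference concrete, the following minimal sketch applies it to a deliberately simple toy problem: inferring an allele frequency from a count of carriers. It is not the coalescent simulation used in the study; the simulator, the uniform prior, the tolerance, and all names (`rejection_abc`, `simulate_carriers`) are assumptions chosen for illustration only.

```python
import random


def rejection_abc(observed, prior_sample, simulate, distance,
                  tolerance, n_draws, rng):
    """Keep parameter draws whose simulated data fall close to the observation."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                 # draw a parameter from the prior
        simulated = simulate(theta, rng)          # simulate data under that parameter
        if distance(simulated, observed) <= tolerance:
            accepted.append(theta)                # accept; otherwise reject the draw
    return accepted


def simulate_carriers(freq, rng, sample_size=100):
    """Toy simulator: number of carriers among `sample_size` independent draws."""
    return sum(rng.random() < freq for _ in range(sample_size))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
posterior = rejection_abc(
    observed=30,                                  # hypothetical observed carrier count
    prior_sample=lambda r: r.random(),            # uniform prior on [0, 1)
    simulate=simulate_carriers,
    distance=lambda a, b: abs(a - b),
    tolerance=2,
    n_draws=5000,
    rng=rng,
)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), round(posterior_mean, 3))
```

The accepted draws approximate the posterior distribution of the parameter. Comparing competing models in the same framework would, under the same assumptions, amount to comparing how often simulations from each model are accepted.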

By reanalyzing genomic sequence data (3, 4, 11), we characterized archaic HLA class I content from a Denisovan and three Neandertals. The Denisovan’s two HLA-A and two HLA-C allotypes are identical to common modern allotypes, whereas one HLA-B allotype corresponds to a rare modern recombinant allotype, and the other has never been seen in modern humans (Fig. 2B and fig. S12). The Denisovan’s HLA type is thus consistent with an archaic origin and the known propensity for HLA-B to evolve faster than HLA-A and HLA-C (17, 18).

Fig. 2

Effect of adaptive introgression of Denisovan HLA class I alleles on modern Asian and Oceanian populations. (A) Simplified map of the HLA class I region showing the positions of the HLA-A, -B, and -C genes. (B) Five of the six Denisovan HLA-A, -B, and -C alleles are identical to modern counterparts. Shown at the left for each allele is the number of sequence reads (4) specific to that allele and their coverage of the ~3.5-kb HLA class I gene. Center columns give the modern-human allele (HLA type) that has the lowest number of single-nucleotide polymorphism (SNP) mismatches to the Denisovan allele. The next most similar modern allele and the number of SNP differences are shown in the columns on the right. ¶A recombinant allele with 5′ segments originating from B*40. §The coding sequence is identical to C*15:05:02. (C and D) Worldwide distributions of the two possible Denisovan HLA-A to -C haplotype combinations. Both are present in modern Asians and Oceanians but absent from sub-Saharan Africans. (E to G) The distribution of three Denisovan alleles: HLA-A*11 (E), C*15 (F), and C*12:02 (G), in modern human populations shows they are common in Asians but absent or rare in sub-Saharan Africans. (H) Estimation of divergence times shows that A*11, C*15, and C*12:02 were formed before the Out-of-Africa migration. Shown on the left are the alleles they diverged from, on the right are the divergence time estimates: median, mean, and range.
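
The matching described for panel (B), in which each archaic allele is paired with the modern allele showing the fewest SNP mismatches, can be sketched as below. The sequences and allele labels are toy placeholders rather than real HLA data, and skipping uncovered positions (coded `N`) is an assumption made here to reflect incomplete read coverage.

```python
def count_mismatches(query, reference, missing="N"):
    """Count differing positions, ignoring sites missing in either sequence."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for q, r in zip(query, reference)
        if q != missing and r != missing and q != r
    )


def rank_alleles(query, panel):
    """Return (allele, mismatches) pairs sorted from most to least similar."""
    scored = ((name, count_mismatches(query, seq)) for name, seq in panel.items())
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Toy aligned sequences; names and bases are hypothetical.
panel = {
    "allele_1": "ACGTACGTAC",
    "allele_2": "ACGTTCGTAC",
    "allele_3": "TCGTTCGAAC",
}
archaic_query = "ACGNACGTAC"  # one uncovered position

ranking = rank_alleles(archaic_query, panel)
print(ranking)  # [('allele_1', 0), ('allele_2', 1), ('allele_3', 3)]
```

The first entry of the ranking corresponds to the closest modern allele and the second to the next most similar one, mirroring the center and right-hand columns of the panel.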

Not knowing the haplotype phase, we examined all possible combinations of Denisovan HLA-A and HLA-C for their current distribution worldwide. All four combinations are present in Asia and Oceania, but absent from sub-Saharan Africa, and uncommon in Europe (Fig. 2, C and D, and fig. S13). Genome-wide comparisons showed that modern and archaic non-African genomes share only 10 long, deeply divergent haplotypes (3), which are all considerably shorter (100 to 160 kb) than the ~1.3-megabase (Mb) HLA-A-C haplotype (Fig. 2A). Because modern HLA haplotypes diversify rapidly by recombination (17–19), it is improbable that the HLA-A-C haplotypes shared by modern humans and Denisovans were preserved on both lines since modern and Denisovan ancestors separated >250 thousand years ago (kya) [~10,000 generations (4)]. More likely is that modern humans acquired these haplotypes by recent introgression from Denisovans [note II.6 in (11)]. Both alternative haplotype pairs are common in Melanesians and reach 20% frequency in Papua New Guinea (PNG), consistent with genome-wide assessment of Denisovan admixture in Melanesians (4). The current distribution of the Denisovan haplotypes (Fig. 2, C and D, and fig. S13) shows, however, that Denisovan admixture widely influenced the HLA system of Asians and Amerindians.
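
Because the phase is unknown, two heterozygous loci give four candidate HLA-A-C haplotypes, which group into two alternative phase configurations. The short sketch below enumerates them for the Denisovan allele groups named in the text; the function names are hypothetical and the code illustrates only the combinatorics, not the distribution analysis itself.

```python
from itertools import product

# Denisovan allele groups as named in the text.
HLA_A = ("A*02", "A*11")
HLA_C = ("C*12:02", "C*15")


def candidate_haplotypes(a_alleles, c_alleles):
    """Every A-C pairing that could sit on a single chromosome."""
    return list(product(a_alleles, c_alleles))


def phase_configurations(a_alleles, c_alleles):
    """The two ways of assigning the alleles to the individual's two chromosomes."""
    a1, a2 = a_alleles
    c1, c2 = c_alleles
    return [
        ((a1, c1), (a2, c2)),
        ((a1, c2), (a2, c1)),
    ]


for haplotype in candidate_haplotypes(HLA_A, HLA_C):
    print(haplotype)
for config in phase_configurations(HLA_A, HLA_C):
    print(config)
```

Each phase configuration is a pair of haplotypes that together account for all four alleles, which is why the four candidate haplotypes correspond to two alternative haplotype pairs.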

Of the two Denisovan HLA-A alleles (Fig. 2B), A*02 is widespread in modern humans, whereas A*11 is characteristically found in Asians (Fig. 2E) and reaches 50 to 60% frequency in PNG and China, is less common in Europe, and is absent from Africa (fig. S14). This distribution, coupled with the sharing of long HLA-A-C haplotypes between Denisovans and modern Asians, particularly Papuans (fig. S13), indicates that Denisovan admixture contributed, at a minimum, the A*11:01-C*12 or A*11:01-C*15 haplotype to modern Asians. A*11:01, which is carried by both these archaic haplotypes, is by far the most common A*11 subtype (12). Because HLA alleles evolve subtype diversity rapidly (17, 18), it is highly improbable that A*11:01 was preserved independently in Denisovan and modern humans throughout >250,000 years (4), as would be required if the Out-of-Africa migration contributed any significant amount of A*11. The more parsimonious interpretation is that all modern A*11 is derived from Denisovan A*11 and that, after introgression, it increased in frequency to ~20% to become almost as common in Asia as A*02 at ~24% (11).

Denisovan HLA-C*15 and HLA-C*12:02 are also characteristic alleles of modern Asian populations (Fig. 2, F and G, and fig. S14). At high frequency in PNG, their distribution in continental Asia extends further west than A*11 does, and in Africa, their frequencies are low. C*12:02 and C*15 were formed before the Out-of-Africa migration (Fig. 2H and fig. S15) and exhibit much higher haplotype diversity in Asia than in Africa (fig. S16), contrasting with the usually higher African genetic diversity (20). These properties fit with C*12:02 and C*15 having been introduced to modern humans through admixture with Denisovans in west Asia, with later spreading to Africa (21, 22) (Fig. 1F and fig. S11 for C*15). Given our minimal sampling of the Denisovan population, it is remarkable that C*15:05 and C*12:02 are the two modern HLA-C alleles in strongest LD with B*73 (Fig. 1E). Although B*73 was not carried by the Denisovan individual studied, the presence of these two associated HLA-C alleles provides strong circumstantial evidence that B*73 was passed from Denisovans to modern humans.

Genome-wide analysis showing that three Vindija Neandertals exhibited limited genetic diversity (3) is reflected in our HLA analysis: Each individual has the same HLA class I alleles (fig. S17). Because these HLA identities could not be the consequence of modern human DNA contamination of Neandertal samples, which is <1% (3), they indicate that these individuals likely belonged to a small and isolated population (fig. S18). Clearly identified in each individual were HLA-A*02, C*07:02, and C*16; pooling the three sequence data sets allowed identification of HLA-B*07, -B*51 and either HLA-A*26 or its close relative A*66 as the other alleles (Fig. 3A). As done for the Denisovan, we examined all combinations of Neandertal HLA-A and HLA-C for their current distribution worldwide. All four combinations have highest frequencies in Eurasia and are absent in Africa (Fig. 3, B and C, and fig. S19). Such conservation and distribution strongly support introgression of these haplotypes into modern humans by admixture with Neandertals in Eurasia. The Neandertal HLA-B and -C alleles were sufficiently resolved for us to study their distribution in modern human populations (fig. S20); their frequencies are high in Eurasia and low in Africa (Fig. 3, D to G, and fig. S21). Our simulations of HLA allele introgression predicted the increased frequency and haplotype diversity in Eurasia that we observed (Fig. 1 and fig. S11), which were particularly strong for B*51 and C*07:02 (fig. S22), and indicated that the presence of such alleles in Africa was due to back-migrations. Thus, Neandertal admixture contributed B*07-, B*51-, C*07:02-, and C*16:02-bearing haplotypes to modern humans and was likely the sole source of these allele groups. Unlike the distributions of Denisovan alleles, which center in Asia (Fig. 2, E to G), Neandertal alleles display broader distributions peaking in different regions of Eurasia (Fig. 3, D to G).

Fig. 3

Effect of adaptive introgression of Neandertal HLA class I alleles on modern human populations. (A) All six Neandertal HLA-A, -B, and -C alleles are identical to modern HLA class I alleles. Shown at the left for each allele is the number of allele-specific sequence reads (3) and their coverage of the ~3.5-kb HLA gene. Center columns give the modern-human allele (HLA type) having the lowest number of SNP differences from the Neandertal allele. The next most similar modern allele and the number of SNP differences are shown in the columns on the right. Alleles marked with § include additional rare alleles. (B and C) Worldwide distributions of the two possible Neandertal HLA-A to -C haplotype combinations. Both are present in modern Eurasians, but absent from sub-Saharan Africans. (D to G) Distribution of four Neandertal alleles: HLA-B*07:02/03/06 (D), B*51:01/08 (E), C*07:02 (F), and C*16:02 (G), in modern human populations.

Modern populations with substantial levels of archaic ancestry are predicted to have decreased LD (23). From analysis of HapMap populations (20), we find that HLA class I recombination rates are greater in Europeans (1.7- to 2.5-fold) and Asians (2.9- to 7.7-fold) than in Africans, consistent with their higher frequencies of archaic HLA class I alleles (Fig. 4A). Enhanced LD decay correlates with the presence of archaic alleles (Fig. 4B and fig. S23), and the strongest correlation was with HLA-A, for which the six haplotypes exhibiting enhanced LD decay are restricted to non-Africans. These haplotypes include A*24:02 and A*31:01, along with the four archaic allele groups we characterized (A*11, A*26, and two A*02 groups). A*24:02 and A*31:01 are common in non-Africans and thus likely also introgressed from archaic to modern humans. From the combined frequencies of these six alleles, we estimate the putative archaic HLA-A ancestry to be >50% in Europe, >70% in Asia, and >95% in parts of PNG (Fig. 4, C and D). These estimates for HLA-A are much higher than the genome-wide estimates of introgression (1 to 6%), which shows how limited interbreeding with archaic humans has, in combination with natural selection, substantially shaped the HLA system in modern human populations outside of Africa. Our results demonstrate how highly polymorphic HLA genes can be sensitive probes of introgression, and we predict that the same will apply to other polymorphic immune-system genes, for example, those encoding the killer-cell immunoglobulin-like receptors (KIR) of NK cells. Present in the Denisovan genome (11), a candidate KIR for introgression is KIR3DS1*013 (Fig. 4E), rare in sub-Saharan Africans, but the most common KIR3DL1/S1 allele outside Africa (24).
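
The estimate of putative archaic HLA-A ancestry from combined allele frequencies can be illustrated with a simple sum. In the sketch below, the frequencies are made-up placeholders for a hypothetical population, and the labels for the two A*02 groups are placeholders as well, since the text does not name them; the function name `combined_frequency` is likewise an assumption.

```python
def combined_frequency(allele_freqs, archaic_alleles):
    """Sum the frequencies of alleles flagged as putatively archaic."""
    total = sum(allele_freqs.values())
    if not 0.0 < total <= 1.0 + 1e-9:
        raise ValueError("allele frequencies must be positive and sum to at most 1")
    return sum(f for allele, f in allele_freqs.items() if allele in archaic_alleles)


# Allele groups named in the text; the two A*02 groups are not specified there,
# so "A*02 (group 1)" and "A*02 (group 2)" are placeholder labels.
PUTATIVE_ARCHAIC = {
    "A*11", "A*26", "A*24:02", "A*31:01",
    "A*02 (group 1)", "A*02 (group 2)",
}

# Made-up frequencies for one hypothetical population (not data from the study).
example_freqs = {
    "A*11": 0.25,
    "A*24:02": 0.125,
    "A*02 (group 1)": 0.125,
    "A*01": 0.25,
    "A*03": 0.25,
}

archaic_share = combined_frequency(example_freqs, PUTATIVE_ARCHAIC)
print(archaic_share)  # 0.5
```

Under this simplification, the ancestry estimate for a population is simply the share of its HLA-A alleles that belong to the putative archaic set.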

Fig. 4

LD decay patterns of modern HLA haplotypes identify putative archaic HLA alleles. (A) HLA class I recombination rates in Eurasia exceed those observed in Africa. We focused on the three intergenic regions between HLA-A, -B, and -C (left-most column) in the four HapMap populations (center column) (20). Recombination rates were corrected for effective population size (11). (B) Enhanced HLA class I LD decay significantly correlates with archaic ancestry (α = 0.0042) (11). Shown for each HapMap population are (top row) the number of distinct HLA-A alleles present and (second row) the number exhibiting enhanced LD decay [all allele-defining SNPs (correlation coefficient r² > 0.2) are within 500 kb of HLA-A (31)]. The allele names are listed (rows 3 to 8) and colored green when observed in archaic humans (Figs. 2 and 3) or associated with archaic-origin haplotypes (fig. S25). HLA-B and -C are shown in fig. S23. Dashed line indicates absent in the population. (C) Predicted archaic ancestry at HLA-A [on the basis of the six alleles of panel (B)] for the four HapMap populations and six populations from PNG; for the latter, mean and extreme values are given. (D and E) Worldwide distribution in modern human populations of putative archaic HLA-A alleles (D) and KIR3DS1*013, a putative archaic NK cell receptor (E).

On migrating out of Africa, modern humans encountered archaic humans, residents of Eurasia for more than 200,000 years, who had immune systems better adapted to local pathogens (25). Such adaptations almost certainly involved changes in HLA class I, as exemplified by the modern human populations who first colonized the Americas (17, 18). For small migrating populations, admixture with archaic humans could restore HLA diversity after population bottleneck and also provide a rapid way to acquire new, advantageous HLA variants already adapted to local pathogens. For example, HLA-A*11, an abundant archaic allotype in modern Asian populations, provides T cell–mediated protection against some strains of Epstein-Barr virus (EBV) (26), and in combination with a peptide derived from EBV, is one of only two HLA ligands for the KIR3DL2 NK cell receptor (27). HLA-A*11 is also the strongest ligand for KIR2DS4 (28). Other prominent introgressed HLA class I proteins are good KIR ligands. HLA-B*73 is one of only two HLA-B allotypes carrying the C1 epitope, the ligand for KIR2DL3 (29). Prominent in Amerindians, C*07:02 is a strong C1 ligand for KIR2DL2/3 and both B*51 and A*24 are strong Bw4 ligands for KIR3DL1 (30). Such properties suggest that adaptive introgression of these HLA alleles was driven by their role in controlling NK cells, lymphocytes essential for immune defense and reproduction (6). Conversely, adaptive introgression of HLA-A*26, -A*31, and -B*07, which are not KIR ligands, was likely driven by their role in T cell immunity. Adaptive introgression provides a mechanism for rapid evolution, a signature property of the extraordinarily plastic interactions between MHC class I ligands and lymphocyte receptors (6).

Supporting Online Material

Materials and Methods

Figs. S1 to S26

References (32–87)

References and Notes

  1. Materials and methods are available as supporting material on Science Online.
  2. Acknowledgments: We thank individual investigators and the Bone Marrow Donors Worldwide (BMDW) organization for kindly providing HLA class I typing data, as well as bone marrow registries from Australia, Austria, Belgium, Canada, Cyprus, Czech Republic, France, Ireland, Israel, Italy, Lithuania, Norway, Poland, Portugal, Singapore, Spain, Sweden, Switzerland, Turkey, United Kingdom, and United States for contributing typing data through BMDW. We thank E. Watkin for technical support. We are indebted to the large genome sequencing centers for early access to the gorilla genome data. We used sequence reads generated at the Wellcome Trust Sanger Institute as part of the gorilla reference genome sequencing project. These data can be obtained from the National Center for Biotechnology Information (NCBI) Trace Archive. We also used reads generated by Washington University School of Medicine; these data were produced by the Genome Institute at Washington University School of Medicine in St. Louis and can be obtained from the NCBI Trace Archive. Funded by NIH grant AI031168, Yerkes Center base grant RR000165, NSF awards (CNS-0619926, TG-DBS100006), by federal funds from the National Cancer Institute (NCI), NIH (contract HHSN261200800001E), and by the Intramural Research Program of the NCI, NIH, Center for Cancer Research. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. government. Sequence data have been deposited in GenBank under accession nos. JF974053 to 70.