Traces of Human Migrations in Helicobacter pylori Populations


Science  07 Mar 2003:
Vol. 299, Issue 5612, pp. 1582-1585
DOI: 10.1126/science.1080857


Helicobacter pylori, a chronic gastric pathogen of human beings, can be divided into seven populations and subpopulations with distinct geographical distributions. These modern populations derive their gene pools from ancestral populations that arose in Africa, Central Asia, and East Asia. Subsequent spread can be attributed to human migratory fluxes such as the prehistoric colonization of Polynesia and the Americas, the neolithic introduction of farming to Europe, the Bantu expansion within Africa, and the slave trade.

Geographic subdivisions exist for a variety of human pathogens and commensals, including JC virus (1), Mycobacterium tuberculosis (2), Haemophilus influenzae (3), and Helicobacter pylori (4–8). H. pylori, a Gram-negative bacterium that colonizes the human gastric mucosa for decades and does not spread epidemically (9), has the potential to be informative about human migrations (10). Sequence diversity within H. pylori is greater than that of most other bacteria (4) and about 50-fold greater than that of human beings (11). Furthermore, frequent recombination between different H. pylori strains (12–14) implies that only partial linkage disequilibrium exists between polymorphic nucleotides within genes (15), which increases the information content for population genetic analysis. In this report, we use a population genetic tool that we have developed (16) on a large, global sample of H. pylori isolates to define modern populations and reconstruct their ancestral sources.

Previous data with 20 H. pylori isolates from East Asia, Europe, and Africa show that the sequences of fragments of seven housekeeping genes and one virulence-associated gene (vacA) differ according to the continent of origin (4). We sequenced the same fragments from 370 strains isolated from 27 geographical, ethnic, and/or linguistic human groupings (Table 1). Of the 3850 nucleotides sequenced for each isolate, 1418 were polymorphic and were used to define bacterial populations (15).
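The step of picking out polymorphic nucleotides can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and treats any column with more than one distinct character as polymorphic; the function name `polymorphic_sites` and the toy sequences are hypothetical and are not taken from the study.

```python
def polymorphic_sites(alignment):
    """Return 0-based column indices at which the aligned sequences differ.

    Assumption: every character (including gaps or ambiguity codes)
    counts as a distinct state; real pipelines usually filter those first.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    # A column is polymorphic when it holds more than one distinct state.
    return [i for i in range(length)
            if len({seq[i] for seq in alignment}) > 1]


# Toy alignment of three hypothetical isolates, six nucleotides each.
toy = ["ACGTAC", "ACGTTC", "ACCTAC"]
sites = polymorphic_sites(toy)
print(sites, len(sites))
```

Only the polymorphic columns carry information for clustering, which is why the analysis is restricted to them.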

Table 1

Assignments of 370 H. pylori isolates from diverse continents to seven populations and subpopulations by Structure (no admixture). Language classifications were based on the Ethnologue online database (www. The two strains of H. pylori whose genomes have been sequenced belong to hpEurope (26695, isolated in the UK) and hpAfrica1 (J99, isolated from a white American in Tennessee).

The program Structure (16, 17) implements a Bayesian approach for deducing population structure from multilocus data by a variety of models, including the no-admixture model, which assumes that each individual has derived all of its ancestry from only one population. We used this model to identify four modern populations (15), designated hpAfrica1, hpAfrica2, hpEastAsia, and hpEurope on the basis of their current distributions (Table 1 and Fig. 1A). Further analyses split hpEastAsia into the hspAmerind, hspEAsia, and hspMaori subpopulations, and hpAfrica1 into hspWAfrica and hspSAfrica (Fig. 1B). These results confirm and extend previous data showing geographical subdivisions (4, 7, 8).

Figure 1

Relationships between modern populations (A), modern subpopulations (B), and ancestral populations (C) of H. pylori. The black lines show neighbor-joining population trees as measured by δ̂, the net nucleotide distance between populations (15). The circle diameters indicate their genetic diversity, measured as the average genetic distance between random pairs of individuals. The larger circles in (A) versus (C) reflect the effects of admixture between ancestral populations. Filled arcs reflect the number of isolates (A and B) or nucleotides (C) in each population. Color coding is consistent in different parts of the figure, except for modern hpEurope, which is an admixture between the ancestral AE1 and AE2 populations. Scales are at lower right.
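The two quantities in this legend can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: it uses the uncorrected proportion of differing sites as the genetic distance, and it assumes that the net distance follows the standard net divergence form, the mean between-population distance minus the average of the two within-population means. All function names and sequences are hypothetical.

```python
from itertools import combinations, product


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_within(pop):
    """Diversity: mean distance over all pairs within one population."""
    pairs = list(combinations(pop, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_x, pop_y):
    """Mean distance over all pairs drawn one from each population."""
    pairs = list(product(pop_x, pop_y))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def net_distance(pop_x, pop_y):
    """Assumed form: d_XY - (d_X + d_Y) / 2."""
    within = (mean_within(pop_x) + mean_within(pop_y)) / 2
    return mean_between(pop_x, pop_y) - within


# Toy populations of two hypothetical isolates each (need >= 2 per population).
pop_x = ["AAAA", "AAAT"]
pop_y = ["TTAA", "TTAT"]
print(mean_within(pop_x), mean_between(pop_x, pop_y), net_distance(pop_x, pop_y))
```

Subtracting the within-population term means that a population made more diverse by admixture does not automatically appear more distant from its neighbors.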

Almost all H. pylori strains isolated from various countries in East Asia were assigned to the hspEAsia subpopulation. The hspMaori subpopulation was isolated exclusively from Maoris and other Polynesians in New Zealand, whereas the hspAmerind strains were isolated from Inuits and from Amerinds in North and South America.

The hspSAfrica and hpAfrica2 populations were found only in South Africa, where they made up a majority of the strains isolated. The hspWAfrica strains were found at low frequency in South Africa but at high frequency in West Africa and also in the Americas, particularly among African Americans in Louisiana and Tennessee. The hpEurope population contained almost all H. pylori from Europeans as well as from Turks, Israelis, Bangladeshis, Ladakhis, and Sudanese. These bacteria were also isolated from the Americas and Australia, and from whites, blacks, and Cape Coloured in South Africa, where they were predominantly associated with whites.

The current global sample is still incomplete, and additional isolates from large parts of Asia and Africa and from aboriginal groups around the world will be needed to determine whether additional populations exist. However, our definition of seven modern populations and subpopulations provides a solid basis for deducing the global patterns of spread of H. pylori with their human hosts.

Our attempts to define subpopulations by the same method among the 200 hpEurope isolates were not successful because of inconsistent clustering (15). We hypothesized that this inconsistency reflected the complex history of Europe, which was populated in several independent waves of migration (18) of unknown genetic composition (19). We have therefore developed an approach, the linkage model in Structure, that can reconstruct ancestral populations even after substantial genetic hybridization (16). This approach uses the mosaic ancestry of genomes within breeding species, assigning individual nucleotides to ancestral populations on the basis of their linkage to neighboring nucleotides.
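The idea of assigning each nucleotide by its linkage to neighbors can be illustrated with a toy hidden Markov model. The sketch below is a deliberate simplification and not the Structure linkage model itself, which is Bayesian and estimates its parameters from the data: here the allele frequencies in each ancestral population and the probability `stay` that adjacent sites share ancestry are assumed known, sites are biallelic (0 or 1), and the single most likely ancestry path is found with the Viterbi algorithm. All names and values are hypothetical.

```python
import math


def viterbi_ancestry(alleles, freqs, stay=0.9):
    """Most likely ancestral population for each site.

    alleles: observed allele (0 or 1) at each polymorphic site.
    freqs[k][i]: assumed frequency of allele 1 at site i in population k
                 (must lie strictly between 0 and 1).
    stay: assumed probability that adjacent sites share ancestry.
    Requires at least two populations.
    """
    K, n = len(freqs), len(alleles)
    switch = (1.0 - stay) / (K - 1)

    def emit(k, i):
        p = freqs[k][i]
        return math.log(p if alleles[i] == 1 else 1.0 - p)

    def trans(j, k):
        return math.log(stay if j == k else switch)

    # Uniform prior over populations at the first site.
    score = [math.log(1.0 / K) + emit(k, 0) for k in range(K)]
    back = []
    for i in range(1, n):
        new_score, pointers = [], []
        for k in range(K):
            best_j = max(range(K), key=lambda j: score[j] + trans(j, k))
            pointers.append(best_j)
            new_score.append(score[best_j] + trans(best_j, k) + emit(k, i))
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the final site.
    state = max(range(K), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Two hypothetical ancestral populations with opposite allele frequencies.
freqs = [[0.9] * 8, [0.1] * 8]
print(viterbi_ancestry([1, 1, 1, 1, 0, 0, 0, 0], freqs))
```

Because switching ancestry between neighboring sites is penalized, a single discordant allele inside a long run is absorbed into the surrounding chunk, while a sustained run of discordant alleles is called as a separate chunk.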

Analysis of the global H. pylori sample with the linkage model defined five ancestral populations (15), which we named ancestral Africa1, Africa2, EastAsia, Europe1 (AE1), and Europe2 (AE2) (Fig. 1C). H. pylori strains within modern hpEurope are recombinants between AE1 and AE2 bacteria. No single isolate possesses more than 80% estimated ancestry from either of these populations (fig. S1); instead, each genome is a mosaic of multiple small chromosomal chunks (Fig. 2, F and G; fig. S2). In contrast, the other populations are more homogeneous. Despite clear evidence for occasional import (Fig. 2, C and D), many isolates have derived 85 to 98% of their nucleotides from the ancestral population (Fig. 2, A and B; fig. S2).
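Once each polymorphic nucleotide has been assigned to an ancestral population, per-isolate ancestry is a matter of counting. The following minimal Python sketch assumes the assignments are available as a list of population labels, one per nucleotide; the function names, the toy label lists, and the use of 0.8 as a cutoff (taken from the 80% figure above) are illustrative assumptions.

```python
from collections import Counter


def ancestry_proportions(site_labels):
    """Fraction of an isolate's nucleotides assigned to each population.

    site_labels: non-empty list with one ancestral-population label
    per polymorphic nucleotide.
    """
    counts = Counter(site_labels)
    total = len(site_labels)
    return {pop: n / total for pop, n in counts.items()}


def is_mosaic(site_labels, threshold=0.8):
    """True when no single population exceeds the threshold share."""
    return max(ancestry_proportions(site_labels).values()) <= threshold


# Hypothetical isolates: one mixed, one dominated by a single source.
mixed = ["AE1"] * 6 + ["AE2"] * 4
homogeneous = ["Africa1"] * 9 + ["AE2"]
print(ancestry_proportions(mixed), is_mosaic(mixed))
print(ancestry_proportions(homogeneous), is_mosaic(homogeneous))
```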

Figure 2

Ancestral sources of individual nucleotides in eight selected isolates. The origin of each polymorphic nucleotide (colors as in Fig. 1C) is shown for each of the eight gene fragments. The geographical sources of each isolate are shown above each graph.

Recombination between populations alters their genetic distances and blurs the branching order of trees (20). The ability to infer nucleotide pools in ancestral populations now allows more accurate estimates of ancestral relationships and evolutionary history. The ancestral population tree (Fig. 1C) suggests that Africa2 evolved before the other populations split and that AE1 and ancestral East Asia diverged from each other most recently. Additional detailed analyses (15) support these inferences.

Knowledge of ancestral gene pools also allows inferences about gene flow between populations. The high diversity in hpEurope (Fig. 1A) is due to fusion between AE1 and AE2. Within our sample, the proportion of AE1 nucleotides is highest in Finland, Estonia, and Ladakh (Fig. 3A). However, whereas all European isolates also possess AE2 nucleotides, only 3 of 17 isolates from Ladakh do so (fig. S1). Similarly, AE2 nucleotides are most frequent in Spain, Sudan, and Israel, but the isolates from Sudan and Israel possess lower levels of AE1 than do European isolates. Thus, AE1 and AE2 probably reached Europe from different sources, AE1 primarily from the direction of central Asia and AE2 primarily from the Near East and North Africa.

Figure 3

Putative modern and ancient migrations of H. pylori. (A) Average proportion of ancestral nucleotides by source. Numbers correspond to the codes in Table 1 and colors are as in Fig. 1C. (B) Interpretation. Arrows indicate specific migrations of humans and H. pylori populations. BP, years before present.

Further reconstruction of the history of H. pylori is best done in the context of current knowledge about human migration. As with a human population tree (21), hpEurope derives from a short central branch between hpEastAsia and hpAfrica1 (Fig. 1A), hinting at a parallel history of intercontinental gene flow to Europe for humans and bacteria. Furthermore, the relative contribution of AE2 versus AE1 correlates significantly with the first principal component of European human variation (table S1), which is thought to reflect the entry of neolithic farmers into Europe from the Near East (20). The second principal component has been tentatively attributed to the migratory fluxes that brought Uralic languages to Europe, and indeed correlated weakly with AE1 versus AE2 (r = 0.6, P = .13) (table S1). It seems that neither AE1 nor AE2 was harbored by the original Paleolithic hunter-gatherers in Europe, because considerable AE1 or AE2 ancestry is found outside Europe, whereas paleolithic Y-chromosome haplotypes are largely restricted to Europe (18).

Known human migrations can also explain the spread of hpEastAsia and hpAfrica1 populations (Fig. 3B). Current models (22, 23) agree that speakers of Austronesian languages (Maoris and other Polynesians) arrived in New Zealand after sequential island-hopping that is likely to have resulted in repeated human population bottlenecks. Indeed, consistent with population bottlenecks, the genetic diversity within the hspMaori sample is extremely low (Fig. 1), and the pattern of nucleotide polymorphisms within subpopulations implies that there has been strong drift in the evolution of the hspMaori population (15) (fig. S3). The isolation of hpEastAsia from Native Americans (7, 8) can be similarly explained by hpEastAsia's being carried during the colonization of the Americas that began at least 12,000 years ago. Unlike hspMaori, hspAmerind did not show signs of strong drift, implying that H. pylori accompanied the ancestors of modern Amerinds and Inuits in large numbers of individuals and/or was introduced on multiple occasions.

The high degree of similarity between hspWAfrica and hspSAfrica (Fig. 1B, fig. S3) is concordant with the low genetic distances (20) observed between speakers of the Niger-Congo family of languages and is consistent with hspSAfrica's being carried to Southern Africa during the rapid expansion of Bantu farmers from central West Africa (24). Given this scenario, one possible explanation for the extremely distinct hpAfrica2 population is that it colonized the Khoisan hunter-gatherer inhabitants of Southern Africa, who fall on one of the deepest branches of an African human population tree (20) and are very distinct from Bantu.

Modern migrations of slaves from West Africa to the Americas and of Europeans to South Africa, the Americas, and Australasia are probably responsible for the current existence of hspWAfrica and hpEurope in these and other locations (Table 1). According to this interpretation, the few centuries that have passed since these modern human migrations have been too short for the distinctions between multiple bacterial populations to become blurred.

The assignments of particular human migrations to migrations of H. pylori populations can allow dating of the bacterial population tree by archaeological events. The five ancestral populations existed before the separation of hspAmerind from the other hpEastAsia populations (Fig. 1, B and C), which is estimated to have occurred at least 12,000 years ago. Accordingly, H. pylori has probably accompanied anatomically modern humans since their origins. The high sequence diversity in H. pylori allows the recognition of distinct populations after centuries of coexistence in individual geographic locations, as demonstrated in the Americas and South Africa. Even after thousands of years of contact in Europe between bacteria introduced by distinct waves of migration, residual short-range linkage disequilibrium has allowed us to identify ancestral chunks of chromosome. Thus, analysis of H. pylori from human populations could also help resolve details of human migrations.

Elucidation of the pattern of population subdivision is also of medical relevance (25). Geographically variable results regarding the association of putative virulence factors with disease (26) might well reflect differences in the local prevalence of the individual H. pylori populations. Similarly, the development of diagnostic tests, antibiotics, and vaccines needs to account for global diversity and will be aided by the availability of representative isolates.

Supporting Online Material

Materials and Methods

Supporting Text

Figs. S1 to S3

Tables S1 and S2


  • * To whom correspondence should be addressed. E-mail: achtman{at}

