The 5300-year-old Helicobacter pylori genome of the Iceman


Science  08 Jan 2016:
Vol. 351, Issue 6269, pp. 162-165
DOI: 10.1126/science.aad2545

Stomach ache for a European mummy

Five thousand years ago in the European Alps, a man was shot with an arrow, then clubbed to death. His body was mummified by ice until glacier retreat exhumed him in 1991. Since then, this ancient corpse has provided a trove of intriguing information about Copper Age Europeans. Now, Maixner et al. have identified the human pathogen Helicobacter pylori within the mummy's stomach contents. The strain the “Iceman” hosted appears to most closely resemble pathogenic Asian strains found today in Central and Southern Asia.

Science, this issue p. 162


The stomach bacterium Helicobacter pylori is one of the most prevalent human pathogens. It has dispersed globally with its human host, resulting in a distinct phylogeographic pattern that can be used to reconstruct both recent and ancient human migrations. The extant European population of H. pylori is known to be a hybrid between Asian and African bacteria, but there exist different hypotheses about when and where the hybridization took place, reflecting the complex demographic history of Europeans. Here, we present a 5300-year-old H. pylori genome from a European Copper Age glacier mummy. The “Iceman” H. pylori is a nearly pure representative of the bacterial population of Asian origin that existed in Europe before hybridization, suggesting that the African population arrived in Europe within the past few thousand years.

The highly recombinant pathogen Helicobacter pylori has evolved to live in the acidic environment of the human stomach (1). Today, this Gram-negative bacterium is found in approximately half the world’s human population, but fewer than 10% of carriers develop disease that manifests as stomach ulcers or gastric carcinoma (2, 3). Predominant intrafamilial transmission of H. pylori and the long-term association with humans has resulted in a phylogeographic distribution pattern of H. pylori that is shared with its host (4, 5). This observation suggests that the pathogen not only accompanied modern humans out of Africa (6), but that it has also been associated with its host for at least 100,000 years (7). Thus, the bacterium has been used as a marker for tracing complex demographic events in human prehistory (4, 8, 9). Modern H. pylori strains have been assigned to distinct populations according to their geographic origin (hpEurope, hpSahul, hpEastAsia, hpAsia2, hpNEAfrica, hpAfrica1, and hpAfrica2) that are derived from at least six ancestral sources (4, 5, 8). The modern H. pylori strain found in most Europeans (hpEurope) putatively originated from recombination of the two ancestral populations Ancestral Europe 1 and 2 (AE1 and AE2) (6). It has been suggested that AE1 originated in Central Asia, where it evolved into hpAsia2, which is commonly found in South Asia. On the other hand, AE2 appears to have evolved in northeast Africa and hybridized with AE1 to become hpEurope (4). However, the precise hybridization zone of the parental populations and the true origin of hpEurope are controversial. Early studies observed a south-to-north cline in AE2/AE1 frequency in Europe (4, 6). This finding has been attributed to independent peopling events that introduced these ancestral H. pylori components, which eventually recombined in Europe since the Neolithic period. 
More recently, it has been suggested that the AE1/AE2 admixture might have occurred in the Middle East or Western Asia between 10,000 and 52,000 years ago and that recombinant strains were introduced into Europe with the first human recolonizers after the last glacial maximum (7).

In this study, we screened 12 biopsy samples from the gastrointestinal tract of the Iceman, a 5300-year-old Copper Age mummy, for the presence of H. pylori. Stable isotope analyses showed that the Iceman originated and lived in Southern Europe, in the Eastern Italian Alps (10). Genetically, he most closely resembles early European farmers (11–13). The Iceman’s stomach was discovered in a reappraisal of radiological data and contains the food he ingested shortly before his death (Fig. 1) (14). The study material included stomach content, mucosa tissue, and content of the small and large intestines (table S1). By using direct polymerase chain reaction (PCR), metagenomic diagnostics, and targeted genome capture (figs. S1 and S2), we determined the presence of H. pylori and reconstructed its complete genome.

Fig. 1 H. pylori–specific reads detected in the metagenomic data sets of the Iceman’s intestine content samples.

The color gradient displays the number of unambiguous H. pylori reads per million metagenomic reads. Control metagenomic data sets of the Iceman’s muscle tissue and of the extraction blank were included in the analysis. The different intestinal content sampling sites are marked in the radiographic image by the following symbols: asterisk, stomach content; circle, small intestine; square, upper large intestine; triangle, lower large intestine. The sampling site of the muscle control sample is highlighted in the Iceman overview picture (diamond).
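The normalization behind the color gradient (reads per million metagenomic reads) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `reads_per_million` and the example counts are hypothetical and are not taken from table S4.

```python
def reads_per_million(target_reads: int, total_reads: int) -> float:
    """Scale a raw read count to reads per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if target_reads < 0 or target_reads > total_reads:
        raise ValueError("target_reads must lie between 0 and total_reads")
    return target_reads * 1_000_000 / total_reads


# Hypothetical (H. pylori reads, total metagenomic reads) per sampling site.
# These numbers are placeholders for illustration only.
example_counts = {
    "stomach content": (500, 10_000_000),
    "lower large intestine": (20, 8_000_000),
}

normalized = {
    site: reads_per_million(target, total)
    for site, (target, total) in example_counts.items()
}
```

Normalizing by the total read count makes samples sequenced to different depths comparable, which is what allows the abundance gradient along the intestinal tract to be read directly from the figure.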

Metagenomic analysis yielded endogenous ancient H. pylori DNA (15,350 reads) in all gastrointestinal tract contents (Fig. 1 and table S4). A control data set derived from Iceman’s muscle tissue was negative. The distribution of the observed read counts throughout the Iceman’s intestinal tract is similar to that in modern H. pylori–positive humans, with abundance decreasing from the stomach toward the lower intestinal tract (15, 16). The retrieved unambiguous reads were aligned to a modern H. pylori reference genome (strain 26695) and showed damage patterns indicative of ancient DNA (fig. S7) (17). After DNA repair, the H. pylori DNA was enriched up to 216-fold by using in-solution hybridization capture (Agilent) (fig. S5). From this data set, 499,245 nonredundant reads mapped to 92.2% of the 1.6-Mb H. pylori reference genome with an 18.9-fold average coverage (Fig. 2). In comparison with the reference, the Iceman’s ancient H. pylori genome had ~43,000 single-nucleotide polymorphisms (SNPs) and 39 deletions that range from 95 base pairs (bp) to 17 kb and mainly comprise complete coding regions. Owing to deletions, the number of genomic variants is slightly below the range of what can be observed between modern H. pylori strains (table S13). The analysis of SNP allele frequencies does not indicate an infection by more than one strain (supplementary materials S6). In addition, as expected for this highly recombinant bacterium, we found evidence for gene insertions from H. pylori strains that differ from the reference genome (details about the InDels are provided in supplementary materials S8).
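The two mapping statistics quoted above, the fraction of the reference covered (breadth) and the average fold coverage (depth), summarize the same per-position depth profile in different ways. The sketch below shows one way to compute both from read alignment intervals. It is an illustration under stated assumptions rather than the method used in the study: the function names are hypothetical, alignments are taken as 0-based half-open intervals, and mean depth is averaged over the whole reference, which may differ from the convention behind the reported 18.9-fold value.

```python
from typing import Iterable, List, Tuple


def depth_profile(genome_length: int, reads: Iterable[Tuple[int, int]]) -> List[int]:
    """Per-position depth from 0-based, half-open (start, end) alignments."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        if not (0 <= start < end <= genome_length):
            raise ValueError(f"alignment ({start}, {end}) is outside the reference")
        diff[start] += 1
        diff[end] -= 1
    # A running sum of the difference array gives the depth at each position.
    depth: List[int] = []
    running = 0
    for delta in diff[:genome_length]:
        running += delta
        depth.append(running)
    return depth


def coverage_summary(depth: List[int]) -> Tuple[float, float]:
    """Return (breadth, mean depth) over the whole reference."""
    if not depth:
        raise ValueError("depth profile is empty")
    breadth = sum(1 for d in depth if d > 0) / len(depth)
    mean_depth = sum(depth) / len(depth)
    return breadth, mean_depth


# Toy example: a 10-bp "genome" and three hypothetical read alignments.
toy_depth = depth_profile(10, [(0, 4), (2, 6), (2, 8)])
toy_breadth, toy_mean_depth = coverage_summary(toy_depth)
```

In the toy example, eight of ten positions are covered by at least one read, so breadth and mean depth diverge in the same way that 92.2% of the genome can be covered at an 18.9-fold average.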

Fig. 2 Gene coverage and distribution of the enriched and validated Iceman H. pylori reads mapped onto the 1.6 Mb large reference genome H. pylori 26695.

The coverage plot displayed in black is superimposed onto the genomic plot. The bar on the right-hand side indicates a coverage of up to 50×. The gene coding sequences are shown in blue (positive strand) and yellow (negative strand) bars in the genomic plot. The loci of the ribosomal RNA genes, of two virulence genes (vacA and cagA), and of seven genes used for MLST analysis are highlighted in the genome plot.

Subsequent sequence analysis classified the ancient H. pylori as a cagA-positive vacA s1a/i1/m1 type strain that is now associated with inflammation of the gastric mucosa (fig. S11) (18). Using multistep solubilization and fractionation proteomics, we identified 115 human proteins in the stomach metaproteome, of which six were either highly expressed in the stomach mucosa (trefoil factor 2) (19) or present in the gastrointestinal tract and involved in digestion (supplementary materials S10). The majority of human proteins were enriched in extracellular matrix organizing proteins (P = 3.35 × 10^-14) and proteins of immune processes (P = 2 × 10^-3) (fig. S13). In total, 22 proteins observed in the Iceman stomach proteome are primarily expressed in neutrophils and are involved in the inflammatory host response. The two subunits S100A8 and S100A9 of calprotectin (CP) were detected with the highest number of distinct peptide hits in both analyzed samples. Inflamed gastric tissues of modern H. pylori–infected patients also show high levels of CP subunit S100A8 and S100A9 expression (20, 21). Thus, the Iceman’s stomach was colonized by a cytotoxic H. pylori–type strain that triggered CP release as a result of host inflammatory immune responses. However, whether the Iceman suffered from gastric disease cannot be determined from our analysis owing to the poor preservation of the stomach mucosa (fig. S3).

Comparative analysis of seven housekeeping gene fragments with a global multilocus sequence typing (MLST) database of 1603 H. pylori strains with the STRUCTURE (22) no-admixture model assigned the 5300-year-old bacterium to the modern population hpAsia2, commonly found in Central and South Asia (Fig. 3A and fig. S14). The detection of an hpAsia2 strain in the Iceman’s stomach is rather surprising because despite intensive sampling, only three hpAsia2 strains have ever been detected in modern Europeans. Stomachs of modern Europeans are predominantly colonized by recombinant hpEurope strains. Further analysis with the STRUCTURE linkage model (23), used to detect ancestral structure from admixture linkage disequilibrium, revealed that the ancient H. pylori strain contained only 6.5% [95% probability intervals (PI) 1.5 to 13.5%] of the northeast African (AE2) ancestral component of hpEurope (Fig. 3B). Among European strains, this low proportion of AE2 is distinct and has thus far only been observed in hpAsia2 strains from India and Southeast Asia. In contrast, the three European hpAsia2 strains (Fig. 3B, black arrows) contained considerably higher AE2 ancestries than that of the H. pylori strain of the Iceman (Finland 13.0%, PI 5.9 to 21.7; Estonia 13.2%, PI 6.2 to 22.3; and the Netherlands 20.8%, PI 11.5 to 31.7), although 95% probability intervals did overlap. A principal component analysis (PCA) of the MLST sequences of the hpAsia2, hpEurope, and hpNEAfrica populations revealed a continuum along PC1 that correlates with the proportion of AE2 ancestry versus AE1 ancestry of the isolates (Fig. 3C). The Iceman’s ancient H. pylori was separated from modern hpEurope strains, and its position along PC1 was close to modern hpAsia2 strains from India, reflecting its almost pure AE1 and very low AE2 ancestry.
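The statement that the 95% probability intervals overlap can be checked directly from the values quoted above. The following sketch does only that. The interval bounds are copied from the text, whereas the function and variable names are hypothetical. Overlapping intervals are a descriptive observation here and not a formal statistical test.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]

# AE2 ancestry 95% probability intervals in percent, as quoted in the text.
ICEMAN_PI: Interval = (1.5, 13.5)
EUROPEAN_HPASIA2_PI: Dict[str, Interval] = {
    "Finland": (5.9, 21.7),
    "Estonia": (6.2, 22.3),
    "Netherlands": (11.5, 31.7),
}


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Compare the Iceman strain with each modern European hpAsia2 strain.
overlap_with_iceman = {
    country: intervals_overlap(ICEMAN_PI, interval)
    for country, interval in EUROPEAN_HPASIA2_PI.items()
}
```

With these bounds, each of the three modern European hpAsia2 intervals begins below the Iceman's upper bound of 13.5%, so all three overlap the Iceman interval even though their point estimates are higher.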

Fig. 3 Multilocus sequence analyses.

(A) Bayesian cluster analysis performed in STRUCTURE displays the population partitioning of hpEurope, hpAsia2, and hpNEAfrica and the Iceman’s H. pylori strain (details about the worldwide population partitioning of 1603 reference H. pylori strains are available in fig. S14). (B) STRUCTURE linkage model analysis showing the proportion of Ancestral Europe 1 (from Central Asia) and Ancestral Europe 2 (from northeast Africa) nucleotides among strains assigned to populations hpNEAfrica, hpEurope, and hpAsia2 and the Iceman’s H. pylori strain on the extreme right. The black arrows indicate the position of the three extant European hpAsia2 strains. (C) Principal component analysis of contemporary hpNEAfrica, hpEurope, and hpAsia2 strains and the Iceman’s H. pylori strain.

Comparative whole-genome analyses (neighbor joining, STRUCTURE, and principal component analyses) with publicly available genomes (n = 45) confirmed the MLST result by showing that the Iceman’s ancient H. pylori genome has highest similarity to three hpAsia2 genomes from India (figs. S15 to S17). Although the Iceman’s H. pylori strain appears genetically similar to the extant strains from northern India, slight differences were observed along PC2 in both MLST (Fig. 3C) and genome PCAs (fig. S17) and in the neighbor joining tree (fig. S15). To further study genomic-scale introgression, we performed a high-resolution analysis of ancestral motifs using fineSTRUCTURE (24). The resulting linked co-ancestry matrix (Fig. 4) showed that the ancient H. pylori genome shares high levels of ancestry with Indian hpAsia2 strains (Fig. 4, green boxes), but even higher co-ancestry with most European hpEurope strains (Fig. 4, blue boxes). In contrast, the Iceman’s H. pylori shares low ancestry with the hpNEAfrica strain, a modern representative of AE2 (Fig. 4, black box), and with European strains originating from the Iberian Peninsula, where the proportion of AE2 ancestry is relatively high (Fig. 4, white box) (4). Our sample size (n = 1) does not allow further conclusions about the prevalence of AE1 in ancient Europe and the course or rate of AE2 introgression. However, the ancient H. pylori strain provides the first evidence that AE2 was already present in Central Europe during the Copper Age, albeit at a low level. If the Iceman H. pylori strain is representative of its time, the low level of AE2 admixture suggests that most of the AE2 ancestry observed in hpEurope today is a result of AE2 introgression into Europe after the Copper Age, which is later than previously proposed (4, 6). Furthermore, our co-ancestry results indicate that the Iceman’s strain belonged to a prehistoric European branch of hpAsia2 that is different from the modern hpAsia2 population from northern India.
The high genetic similarity of the ancient strain to bacteria from Europe implies that much of the diversity present in Copper Age Europe is still retained within the extant hpEurope population, despite millennia of subsequent AE2 introgression.

Fig. 4 Comparative whole-genome analysis.

Co-ancestry matrix showing H. pylori population structure and genetic flux. The color in the heat map corresponds to the number of genomic motifs imported from a donor genome (column) to a recipient genome (row). The inferred tree and the H. pylori strain names are displayed on the top and left of the heat map. Strain names are colored according to the H. pylori population assignment provided in the legend below the heat map. Signs for population ancestry are highlighted in the heat map with green, blue, black, and white boxes.

Supplementary Materials

Materials and Methods

Figs. S1 to S17

Tables S1 to S13

References (25–93)

References and Notes

Acknowledgments: We acknowledge the following funding sources: the South Tyrolean grant legge 14 (F.M., N.O.S., G.C., V.C., M.S., and A.Z.), the Ernst Ludwig Ehrlich Studienwerk, dissertation completion fellowship of the University of Vienna (D.T.), the Graduate School Human Development in Landscapes and the Excellence Cluster Inflammation at Interfaces (B.K. and A.N.), the European Research Council (ERC) starting grant APGREID (J.K. and A.H.), the National Institutes of Health from the National Institute of General Medical Sciences under grants R01 GM087221 (R.M.), S10 RR027584 (R.M.), and 2P50 GM076547/Center for Systems Biology (R.M.). E. Leproust and O. Hardy are gratefully acknowledged for their help in the RNA bait design. We thank the sequencing team of the Institute of Clinical Molecular Biology at Kiel University for support and expertise. We are grateful to E. Hütten for proofreading of the main text. We are grateful to Olympus, Italy, for providing us with equipment for endoscopy. F.M. and A.Z. conceived the investigation. F.M., B.K., D.T., R.G., J.K., A.N., Y.M., T.R., and A.Z. designed experiments. P.M., L.E., E.E.V., M.S., F.M., and A.Z. were involved in the sampling campaign. F.M., B.K., M.R.H., J.H., U.K., and G.C. performed laboratory work. F.M., B.K., D.T., A.H., M.R.H., N.O.S., B.L., R.L.M., R.G., J.K., Y.M., and T.R. performed analyses. F.M. wrote the manuscript with contributions from B.K., D.T., A.H., M.R.H., U.K., N.O.S., V.C., B.L., R.L.M., R.G., J.K., Y.M., A.N., T.R., and A.Z. Data are available from the European Nucleotide Archive under accession no. ERP012908. The authors declare no competing interests.
