Origins and Genetic Legacy of Neolithic Farmers and Hunter-Gatherers in Europe

Science  27 Apr 2012:
Vol. 336, Issue 6080, pp. 466-469
DOI: 10.1126/science.1216304


The farming way of life originated in the Near East some 11,000 years ago and had reached most of the European continent 5000 years later. However, the impact of the agricultural revolution on demography and patterns of genomic variation in Europe remains unknown. We obtained 249 million base pairs of genomic DNA from ~5000-year-old remains of three hunter-gatherers and one farmer excavated in Scandinavia and find that the farmer is genetically most similar to extant southern Europeans, contrasting sharply to the hunter-gatherers, whose distinct genetic signature is most similar to that of extant northern Europeans. Our results suggest that migration from southern Europe catalyzed the spread of agriculture and that admixture in the wake of this expansion eventually shaped the genomic landscape of modern-day Europe.

The transition from the hunter-gatherer lifestyle to a sedentary farming economy spread throughout Europe from the southeast, starting ~8400 years before the present (yr B.P.) (1). Early population genetic studies argued that clines in genetic variation within Europe favored a model in which this expansion occurred in concert with substantial replacement of resident hunter-gatherer populations (2–4), contradicting models emphasizing the culturally mediated spread of farming economy (5). Current results from extant populations are inconclusive (4, 6); however, ancient DNA analyses (7–10) present tentative evidence for population replacement but suffer from uncertainties associated with single-locus studies of the Y chromosome or the mitochondrion. Population genomic analysis of ancient human remains (6, 11) shows promise, but poses technical difficulties due to low DNA yield and the risk of present-day human contamination (6). In addition, geographical and temporal differences between ancient samples make observations of genetic differentiation difficult to interpret. Thus, more robust interpretations of ancient DNA might be gained by analyzing samples from cultural complexes occurring in the same region and during the same time period. In this study, we focused on the Neolithic era of northern Europe, where the relatively late arrival of farming (~6000 yr B.P.) was followed by more than 1000 years of coexistence between hunter-gatherer and farming cultures (12).

We obtained genomic DNA sequences from three samples (Ajv52, Ajv70, and Ire8) from a hunter-gatherer context [“Neolithic hunter-gatherers” associated with the Pitted Ware Culture (PWC)], one sample from a farming context [“Neolithic farmer” associated with the Funnel Beaker Culture or Trichterbecher kultur (TRB)], and two animal remains as contamination controls. The human remains were chosen from a larger panel on the basis of their molecular preservation and previously yielded reproducible single-locus genetic data (8, 13, 14). The Neolithic farmer sample (Gök4) was excavated from a megalithic burial structure in Gökhem parish, Sweden, and has been directly 14C-dated to 4921 ± 50 calibrated yr B.P. (cal yr B.P.), similar to the age (5100 to 4900 cal yr B.P.) of the majority of other finds in the area (15). There were no indications from the burial context suggesting that Gök4 was different from other TRB individuals (15, 16), and strontium isotope analyses indicate that Gök4 was born less than 100 km from the megalithic structure, similar to all other analyzed TRB individuals from the area (17). The three Neolithic hunter-gatherer samples were excavated from burial grounds with single inhumation graves on the island of Gotland, Sweden, for which associated remains have been dated to 5300 to 4400 cal yr B.P. (16).

Indexed libraries were prepared from decontaminated DNA extracts (8, 13) in dedicated ancient DNA facilities and sequenced with the Illumina GAIIx platform (16). The fraction of nonredundant reads that could be mapped (18) to the human genome in the two animal samples was 0.02 to 0.12%, which is substantially lower than the 2.4 to 6.3% found in the human samples (Table 1). Furthermore, the sequences from the human remains showed characteristic features of ancient DNA degradation, including short fragment read length [average ~55 base pairs (bp)], nucleotide misincorporations, and an increased fraction of purines close to sequence read termini (19) (fig. S12). Moreover, the mitochondrial DNA (mtDNA) sequences assembled to ~4- to 14-fold coverage from our shotgun data (Table 1) matched results previously obtained by polymerase chain reaction–based sequencing (8). These lines of evidence point to the presence of substantial amounts of endogenous DNA. After quality filtering, we analyzed 249 million bp of autosomal sequence data and extracted genetic variants from the Neolithic individuals at previously identified single-nucleotide polymorphisms (SNPs) in various reference data sets (Table 1), excluding SNPs that could be affected by postmortem nucleotide misincorporation and randomly sampling a single haploid variant from both ancient and modern individuals (16).
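The haploid sampling step described above can be illustrated with a minimal Python sketch. The exact filter is given only in the supplementary methods (16); here we assume, purely for illustration, that the SNPs vulnerable to postmortem misincorporation are the transition SNPs (C/T and G/A). The function names and the input layout are hypothetical.

```python
import random

# Assumption: transition SNPs (C/T and G/A) are treated as the ones that
# postmortem misincorporation could affect. The filter actually used is
# described in the supplementary methods, not in this text.
TRANSITIONS = {frozenset("CT"), frozenset("GA")}


def is_damage_prone(allele_a, allele_b):
    """Return True if the two SNP alleles form a transition pair."""
    return frozenset((allele_a, allele_b)) in TRANSITIONS


def sample_haploid_call(observed_bases, allele_a, allele_b, rng):
    """Pick one observed base at random among reads matching either allele.

    Returns None when the SNP is excluded or no usable read covers it.
    """
    if is_damage_prone(allele_a, allele_b):
        return None
    usable = [base for base in observed_bases if base in (allele_a, allele_b)]
    if not usable:
        return None
    return rng.choice(usable)
```

Passing a seeded `random.Random` instance as `rng` keeps the draw reproducible. Applying the same single-allele sampling to the modern reference individuals, as the text describes, treats ancient and modern samples alike.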

Table 1

Summary of ancient genomic sequence data from Neolithic individuals.

Because the incomplete coverage causes little overlap between the typed SNPs in the different Neolithic individuals, we performed principal component analysis (PCA) (20) separately for each Neolithic individual, together with a particular reference panel, and combined PC1 and PC2 loadings from each independent analysis, using a novel approach based on Procrustes transformation (16). We found that compared to a worldwide set of 1638 individuals (21–23), all four Neolithic individuals clustered within European variation (fig. S5). However, when the analysis was focused on 505 individuals of European and Levantine descent, the three Neolithic hunter-gatherers appeared largely outside the distribution of the modern sample but in the vicinity of Finnish and northern European individuals (Fig. 1A). In contrast, the Neolithic farmer clustered with southern Europeans but was differentiated from Levantine individuals. This general pattern persisted for a geographically broader reference data set of 1466 extant individuals of European ancestry (22, 24) (Fig. 1B), for a much larger number of markers from 241 individuals in the 1000 Genomes Project (25) (Fig. 1C), and using model-based clustering (26, 27) (Fig. 1D). Although all Neolithic individuals were excavated in Sweden, neither the Neolithic farmer nor the Neolithic hunter-gatherers appeared to cluster specifically within Swedish variation, a pattern that remained also for a larger sample of 1525 individuals from across Sweden (28) (figs. S9, S21, and S22).
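The idea of combining separate PCAs through a Procrustes transformation can be sketched as follows. This is a simplified illustration under our own assumptions, not the procedure of the supplementary methods (16): it uses plain SVD-based PCA without genotype normalization, aligns each per-individual PCA to a reference-only PCA through the shared reference samples, and all function names are hypothetical.

```python
import numpy as np


def pca_scores(genotypes, n_components=2):
    """Project samples (rows) onto their top principal components."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def procrustes_fit(source, target):
    """Least-squares scale, rotation and shifts mapping source onto target."""
    src_mean, tgt_mean = source.mean(axis=0), target.mean(axis=0)
    src0, tgt0 = source - src_mean, target - tgt_mean
    u, s, vt = np.linalg.svd(src0.T @ tgt0)
    rotation = u @ vt  # orthogonal matrix (rotation or reflection)
    scale = s.sum() / (src0 ** 2).sum()
    return scale, rotation, src_mean, tgt_mean


def procrustes_apply(points, fit):
    """Apply a fitted transformation to any points in the source space."""
    scale, rotation, src_mean, tgt_mean = fit
    return scale * (points - src_mean) @ rotation + tgt_mean


def place_ancient(ref_all_snps, ref_subset, ancient_subset):
    """Place one ancient sample into the reference-only PC1/PC2 space.

    ref_all_snps:   reference genotypes at every SNP (samples x SNPs)
    ref_subset:     same reference samples at the SNPs covered in the ancient sample
    ancient_subset: the ancient sample at those SNPs (1 x SNPs)
    """
    target = pca_scores(ref_all_snps)
    joint = pca_scores(np.vstack([ref_subset, ancient_subset]))
    fit = procrustes_fit(joint[:-1], target)  # align on the shared reference samples
    return procrustes_apply(joint[-1:], fit)[0]
```

Calling `place_ancient` once per Neolithic individual, each with its own SNP subset, yields coordinates in one common space, so that individuals with little SNP overlap can be shown in the same plot.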

Fig. 1

Population genetic structure in Neolithic northern Europe. (A) to (C) show results of PCA from each Neolithic individual and a reference sample combined with Procrustes transformation. Blue symbols denote the Neolithic hunter-gatherers, and the red diamond denotes the Neolithic farmer. (A) Results using 505 individuals from Europe and the Levant genotyped at ~520,000 positions {from HapMap 3 [HM3 (23)]}, the Finnish HapMap [FINHM (22)], and the Human Genome Diversity Panel [HGDP (21)] (16). Centroids are indicated with the population label. (B) Results using 1466 European individuals genotyped at ~280,000 positions {FINHM and the Population Reference Sample [POPRES (24)]} (16). Selected centroids are indicated with the population label. (C) Results using 241 individuals from Europe genotyped at ~2.3 million positions {1000 Genomes Project [1KGPomni (25)]} (16). (D) Population structure in Neolithic individuals and extant individuals from Europe and the Levant inferred by model-based clustering (25). Each individual is shown as a vertical line partitioned into colored components representing the inferred membership in four genetic clusters.

Although several observations suggest that authentic ancient molecules were sequenced, it is possible that some degree of contamination from modern humans could be present in the data. To investigate whether contamination could influence our analyses, we partitioned the data on the basis of the fraction of cytosine deamination toward the ends of the sequence reads (Fig. 2A), a property that has been observed to be enriched in authentic ancient DNA but not in contaminating modern human molecules (19, 29, 30). We divided the data into sequences that had evidence of cytosine deamination in the first 10 bp of the read and those that did not. Examining the first principal component obtained for Neolithic individuals and extant individuals from Italy and Finland, combined by means of Procrustes transformation, we found a robust separation between the Neolithic hunter-gatherers and the Neolithic farmer in both the data with evidence of cytosine deamination and the data without (Fig. 2B).

Fig. 2

Assessing authenticity by population genetic analysis of degraded and nondegraded molecules. Authentic ancient DNA molecules show an increased rate of cytosine-to-thymine (C→T) mismatches due to nucleotide misincorporation, at the 5′ ends of sequences. We use this feature to test whether sequences with evidence of nucleotide misincorporation and sequences without such evidence have different population genetic affinities, thus providing information on the possible influence of potential modern human contamination. (A) T base frequency in the sequence reads at positions where a C is seen in the human reference genome for all four Neolithic individuals. (B) Procrustes-transformed PCAs of data from each Neolithic individual divided into molecules with a C→T mismatch in the first 10 bp (“degraded”: solid red and blue symbols) and molecules without a C→T mismatch in the first 10 bp (“not degraded”: open red and blue symbols). TSI, individuals from Italy; FIN, individuals from Finland. The x axis displays random numbers to aid visualization.
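The partition into "degraded" and "not degraded" molecules can be expressed as a short Python sketch. It assumes an ungapped alignment in which the read and the matching reference segment are both given 5′ to 3′ in the read's orientation; real alignments with indels or reverse-strand reads would need extra handling. The function names are hypothetical.

```python
def has_terminal_c_to_t(read, reference, window=10):
    """True if the read has T where the reference has C within the first `window` bases.

    Assumes an ungapped alignment: `read` and `reference` have equal length
    and are both written 5'->3' in the read's own orientation.
    """
    if len(read) != len(reference):
        raise ValueError("read and reference must have equal length")
    pairs = zip(read[:window].upper(), reference[:window].upper())
    return any(ref_base == "C" and read_base == "T" for read_base, ref_base in pairs)


def partition_reads(alignments, window=10):
    """Split (read, reference) pairs into 'degraded' and 'not degraded' lists."""
    degraded, not_degraded = [], []
    for read, reference in alignments:
        if has_terminal_c_to_t(read, reference, window):
            degraded.append((read, reference))
        else:
            not_degraded.append((read, reference))
    return degraded, not_degraded
```

Each of the two resulting sets can then be analyzed separately, which is what allows the comparison shown in panel (B).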

To more closely investigate the genetic similarity of extant European populations (22, 24) to Neolithic humans, we determined for each SNP and each extant population the average frequency of the particular allele found in either the Neolithic hunter-gatherers or the Neolithic farmer (16). The Neolithic hunter-gatherers shared most alleles with northern Europeans, and the lowest allele sharing was with populations from southeastern Europe (Fig. 3A). In contrast, the Neolithic farmer shared the greatest fraction of alleles with southeastern European populations (Cypriots and Greeks) and showed a pattern of decreasing genetic similarity to populations from the northwest and northeast extremes of Europe (Fig. 3B). Individuals from Turkey stand out because of low levels of allele sharing with both Neolithic groups, possibly due to gene flow from outside of Europe, but all other European populations can roughly be represented as a cline in which allele sharing with Neolithic hunter-gatherers is negatively correlated with allele sharing with Neolithic farmers (Fig. 3C).
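As a minimal sketch of the allele-sharing measure, the following Python function averages, over SNPs, the frequency in one extant population of the allele carried by a Neolithic sample. The dictionary layout and the function name are assumptions made for illustration; the full procedure is described in the supplementary methods (16).

```python
def allele_sharing(ancient_alleles, population_freqs):
    """Average, over SNPs, the population frequency of the ancient allele.

    ancient_alleles:  dict mapping SNP id -> allele seen in the ancient sample
    population_freqs: dict mapping SNP id -> {allele: frequency} for one
                      extant population
    SNPs absent from `population_freqs` are skipped.
    """
    shared = [
        population_freqs[snp].get(allele, 0.0)
        for snp, allele in ancient_alleles.items()
        if snp in population_freqs
    ]
    if not shared:
        raise ValueError("no overlapping SNPs")
    return sum(shared) / len(shared)
```

For example, with made-up identifiers, an ancient sample carrying `A` at `rs1` and `G` at `rs2`, compared with a population in which those alleles have frequencies 0.8 and 0.4, gives an allele sharing of 0.6.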

Fig. 3

Allele sharing between Neolithic and extant Europeans. (A) Interpolated allele sharing between the Neolithic hunter-gatherers (pooled, a star shows the sampling location) and extant European populations (black squares). (B) Allele sharing between the Neolithic farmer (Gök4, a star shows the sampling location) and extant European populations (black squares). (C) In all extant European populations except for Turkey (black disk, excluded from the test of correlation), a high degree of allele sharing with one Neolithic population tends to be associated with a low degree of allele sharing with the other Neolithic population. The population codes are explained in table S10.

We also conducted tests of population topology using genealogical concordance (16), and found that Neolithic hunter-gatherers have the strongest affinity to modern Finnish individuals [FIN and “Late Settlement” FIN (LSFIN) (22)], whereas the Neolithic farmer appears most related to extant Mediterranean European populations (fig. S17). However, in formal tests for admixture (31) that assumed the topology (Outgroup, Neolithic farmer),(X,LSFIN), we found widespread evidence for gene flow between the Neolithic farmer and other European populations (X) for various non-European outgroups (table S14). To estimate the extent of this putative gene flow, we constructed a hypothetical model in which each of 14 modern European populations (21, 22, 28) is a mixture of genetic material from a population that is most similar to LSFIN and a second population more related to the Neolithic farmer (Gök4) than to Levantine Druze (fig. S18). We estimated that people of southern, central and northern Swedish descent are, on average, of 41 ± 8%, 36 ± 7%, and 31 ± 6% Neolithic farmer–related ancestry, respectively (±1 SE). Across Europe, this fraction decreases from 95 ± 13% in Sardinians to 52 ± 8% in the CEU population (individuals of northwestern European descent) and 11 ± 4% in Russians (table S15). Additionally, on the basis of two complete CEU genomes (25), we estimated similar population divergence times (32) between the CEU population and the Neolithic farmer [9.8 thousand years ago (ka); confidence interval (CI), 1.5 to 18.5 ka] and the Neolithic hunter-gatherers (6.5 ka; CI, 1.5 to 11.5 ka) [(16) fig. S15 and table S11], which is consistent with an intermediate fraction of Neolithic farmer–related ancestry in the CEU population.
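The text does not name the statistic behind the formal admixture tests. One widely used four-population statistic for testing a topology of this form is f4, shown below purely as an assumption about the general approach, not as the test actually applied. Under the topology (Outgroup, Neolithic farmer),(X, LSFIN) without gene flow, allele frequency differences between the two pairs are expected to be uncorrelated, so f4 should be close to zero.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """Mean over SNPs of (a - b) * (c - d), from per-SNP allele frequencies.

    For the topology ((A, B), (C, D)) without gene flow between the pairs,
    the expected value is zero. Illustrative only; significance testing
    (for example a block jackknife) is not included here.
    """
    n = len(freq_a)
    if n == 0 or not (n == len(freq_b) == len(freq_c) == len(freq_d)):
        raise ValueError("frequency lists must be non-empty and of equal length")
    products = [
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    ]
    return sum(products) / n
```

In the notation of the text, the arguments would correspond to the outgroup, the Neolithic farmer, population X, and LSFIN, in that order. A value consistently different from zero would point to gene flow between the farmer lineage and either X or LSFIN.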

In our genomic analyses, the Scandinavian Neolithic hunter-gatherers (PWC) have a genetic profile that is not fully represented by any sampled contemporary population (Fig. 1) and may thus constitute a gene pool that is no longer intact or no longer exists. Although the origin of the Neolithic hunter-gatherers is contentious, the similar mtDNA haplogroup composition of PWC individuals (8) (Table 1) and Mesolithic and Paleolithic individuals (7, 29) indicates some continuity with earlier European populations, but resolving this hypothesis will require pre-Neolithic genomic data.

A parsimonious model for explaining our results is that farming practices were brought to northern Europe by a group of people that were genetically distinct from resident hunter-gatherers. The alternative explanation—that Gök4 is not typical of the Neolithic farmer population—appears less likely, based on isotopic analyses and burial context (16, 17). Furthermore, the mtDNA haplogroups of Gök4 and other investigated TRB individuals (8) occur among central European Neolithic farmers associated with the LBK culture [Linearbandkeramik (9)], a culture with close connection to the TRB culture (5). However, the alternative explanation, that Gök4 is a recent migrant, would still suggest long-range migration across the European continent. Although Neolithic farmers associated with cultures other than TRB could potentially have different histories, the observation that Gök4 is genetically most similar to extant populations found in Mediterranean Europe is in contrast (see e.g. Figs. 1 and 3) to an mtDNA study that suggests extant populations in Turkey and the Near East as being genetically most similar to central European Neolithic farmers (9). The genetic affinity of an individual (Gök4) from the northern frontier of the agricultural expansion to southern Europeans suggests persistent barriers to gene flow between resident and colonizing groups during the initial stages of expansion and settlement, barriers that perhaps became more permeable over time. That gene flow between farmer and hunter-gatherer populations, possibly over a long period, eventually gave rise to the present pattern of genetic variation in Europe is also supported by the observation that most European populations appear genetically intermediate to the two Neolithic groups. 
Regardless of the underlying model, our study provides direct genomic evidence of stratification between Neolithic cultural groups separated by less than 400 km, differentiation that encapsulates the extremes of modern-day Europe and appears to have been largely intact for ~1000 years after the arrival of agriculture. Thus, the genetic composition of contemporary Europeans may have been shaped by prehistoric migration that drove the expansion of agriculture.

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S22

Tables S1 to S15

References (33–85)

References and Notes

  1. See supplementary materials on Science Online.
  2. Acknowledgments: We thank M. Rasmussen, K. Magnussen, E. Salmela, A. Vargas Velasquez, L. Orlando, and K.-G. Sjögren for technical assistance and discussions; G. Burenhult, L. Drenzel, P. Persson, and J. Norderäng for access to the Ajvide material; and the 1000 Genomes Project for access to population data. Computations were performed at the Swedish National Infrastructure for Computing (SNIC-UPPMAX) under project b2010050. The POPRES data were obtained from dbGaP (accession number phs000145.v1.p1). Per Hall, Karolinska Institutet, retains governance over the sample collection from Sweden. Supported by the Lars Hierta Memorial Foundation (grant FO2010-0563 to P.S.), the Nilsson-Ehle Donationerna (P.S.), Marie Curie Actions (M.R.), the Danish National Research Council (E.W. and M.T.P.G.), the Royal Swedish Academy of Science (A.G.), and the Swedish Research Council (grant 2009-5129 to M.J.). P.S., A.G., and M.J. conceived and designed the study; M.R. and H.M. performed experiments coordinated by P.S., E.W., M.T.P.G., A.G., and M.J.; P.S. and H.M. processed the data; P.S. and M.J. analyzed the data; J.S. described the archaeology; J.S. and P.H. contributed samples; and P.S., A.G., and M.J. wrote and edited the manuscript with input from all authors. Data are available from the European Nucleotide Archive under accession no. ERP001114, and data aligned to the human reference genome are also available. The authors declare no competing interests.