Ancient genomes revisit the ancestry of domestic and Przewalski’s horses


Science  06 Apr 2018:
Vol. 360, Issue 6384, pp. 111-114
DOI: 10.1126/science.aao3297

Revisiting the origins of modern horses

The domestication of horses was very important in the history of humankind. However, the ancestry of modern horses and the location and timing of their emergence remain unclear. Gaunitz et al. generated 42 ancient-horse genomes. Their source samples included the Botai archaeological site in Central Asia, considered to include the earliest domesticated horses. Unexpectedly, Botai horses were the ancestors not of modern domestic horses, but rather of modern Przewalski's horses. Thus, in contrast to current thinking on horse domestication, modern horses may have been domesticated in other, more Western, centers of origin.

Science, this issue p. 111


The Eneolithic Botai culture of the Central Asian steppes provides the earliest archaeological evidence for horse husbandry, ~5500 years ago, but the exact nature of early horse domestication remains controversial. We generated 42 ancient-horse genomes, including 20 from Botai. Compared to 46 published ancient- and modern-horse genomes, our data indicate that Przewalski’s horses are the feral descendants of horses herded at Botai and not truly wild horses. All domestic horses dated from ~4000 years ago to present only show ~2.7% of Botai-related ancestry. This indicates that a massive genomic turnover underpins the expansion of the horse stock that gave rise to modern domesticates, which coincides with large-scale human population expansions during the Early Bronze Age.

Horses revolutionized human mobility, economy, and warfare (1). They are also associated with the spread of Indo-European languages (2) and new forms of metallurgy (3) and provided the fastest land transport until modern times. Together with the lack of diachronic changes in horse morphology (4) and herd structure (5, 6), the scarce archaeological record hampered the study of early domestication. With their preponderance of horse remains, Eneolithic sites (fifth and fourth millennia BCE) of the Pontic-Caspian steppe (2, 7) and the northern steppe of Kazakhstan (6, 8) have attracted the most attention.

We reconstructed the phylogenetic origins of the Eneolithic horses associated with the Botai culture of northern Kazakhstan, representing the earliest domestic horses (6, 8). This culture was characterized by a sudden shift from mixed hunting and gathering to an extreme focus on horses and larger, more sedentary settlements (5). Horse dung on site (6), as well as evidence for poleaxing and against selective body-part transportation, suggests controlled slaughter at settlements rather than hunting (9). Tools associated with leather thong production, bit-related dental pathologies (7, 10), and equine milk fats within ceramics support pastoral husbandry, involving milking and harnessing (8).

Geological surveys at the Botai culture site of Krasnyi Yar, Kazakhstan, described a polygonal enclosure of ~20 m by 15 m with increased phosphorus and sodium concentrations (6), likely corresponding to a horse corral. We revealed a similar enclosure at the eponymous Botai site, ~100 km west of Krasnyi Yar (Fig. 1A), that shows close-set post molds, merging to form a palisade trench, and a line of smaller parallel postholes inside (Fig. 1B). Radiocarbon dates on horse bones from these postholes are consistent with the Botai culture (11). The presence of enclosures at Krasnyi Yar and Botai builds on the evidence supporting horse husbandry.

Fig. 1 Sample location and corral enclosure at Botai.

(A) Archaeological sites. The age (years ago) of the genomes considered is reported to the right of each site name. The number of genomes sequenced per site is reported between parentheses if greater than one. Triangles refer to the ancient genomes characterized here, whereas diamonds indicate those previously published. Blue refers to wild ancient individuals, light and dark green to the first domestic clade (Botai and Borly4), and yellow to individuals of the second domestic clade (DOM2). The Botai culture site of Krasnyi Yar is indicated with an asterisk, although no samples were analyzed from this site. (B) Magnetic gradient survey and excavation at Botai, with interpretation. The enclosure and its excavated boundary are indicated by red and yellow squares, respectively. Round black circles correspond to pit houses.

We sequenced the genomes of 20 horses from Botai and 22 from across Eurasia and spanning the past ~5000 years (table S1). With the published genomes of 18 ancient and 28 modern horses, this provided a comparative panel of 3 wild archaic horses (~42,800 to 5100 years ago), 7 Przewalski’s horses (PH, 6 modern and 1 from the 19th century), and 78 domesticates (25 Eneolithic, including 5 from Borly4, Kazakhstan, ~5000 years ago; 7 Bronze Age, ~4100 to 3000 years ago; 18 Iron Age, ~2800 to 2200 years ago; 1 Parthian and 2 Roman, ~2000 to 1600 years ago; 3 post-Roman, ~1200 to 100 years ago; and 22 modern from 18 breeds).
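The composition of the comparative panel can be cross-checked with a short tally. This is only a bookkeeping sketch: the counts are copied from the paragraph above, while the dictionary layout and variable names are illustrative and not part of the study.

```python
# Genome counts as stated in the text; the grouping labels are illustrative.
domesticates = {
    "Eneolithic (Botai and Borly4)": 25,
    "Bronze Age": 7,
    "Iron Age": 18,
    "Parthian": 1,
    "Roman": 2,
    "post-Roman": 3,
    "modern": 22,
}

panel = {
    "archaic wild": 3,
    "Przewalski's horses": 7,
    "domesticates": sum(domesticates.values()),
}

new_genomes = 20 + 22        # Botai + the rest of Eurasia (this study)
published_genomes = 18 + 28  # previously published ancient + modern

total = sum(panel.values())
print(panel["domesticates"], total)  # 78 88
```

Both ways of counting (by source and by group) give the same panel of 88 genomes, which matches the number used in the PCA of Fig. 2A.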

The 42 ancient-horse genomes, belonging to 31 stallions and 11 mares, were sequenced to an average depth of coverage of ~1.1 to 9.3X (median = 3.0X). Damage patterns indicative of ancient DNA were recovered (figs. S8 and S9). Base-quality rescaling and termini trimming resulted in average error rates of 0.07 to 0.14% per site (tables S13 and S14).

Principal component analysis (PCA) revealed PH and the archaic horses as two independent clusters (Fig. 2A). Within domesticates, all 25 Botai-Borly4 Eneolithic specimens grouped together to the exclusion of all remaining horses.

Fig. 2 Horse genetic affinities.

(A) PCA of the genome variation present in 88 ancient- and modern-horse genomes. Only the first two principal components (PCA1 and PCA2) are shown. (B) Phylogenetic relationships. The tree was reconstructed on the basis of pairwise distances calculated with ~14.1 million transversion sites. Node supports derive from 100 bootstrap pseudoreplicates. The archaeological site and age (years ago) of ancient specimens are indicated in the first and last fields of the sample name. (C) Outgroup f3-statistics showing the pairwise genetic affinities.

Phylogenetic reconstruction confirmed that domestic horses do not form a single monophyletic group as expected if descending from Botai (Fig. 2B). Instead, PH form a highly drifted, monophyletic group, unambiguously nested within Botai-Borly4 horses. All remaining domesticates cluster within a second, highly supported monophyletic group (DOM2). Applying TreeMix (12) to the 60 genomes with an average depth of coverage of at least 3.0X confirmed this tree topology (fig. S23).
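The tree in Fig. 2B is described as built from pairwise distances over transversion sites. Restricting to transversions is a common precaution with ancient DNA, because post-mortem cytosine deamination mainly produces spurious transitions. The sketch below shows one simple way such a distance could be computed for two aligned sequences; the function names and the toy sequences are hypothetical, and the study's actual pipeline is described in its supplementary materials, not here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(a: str, b: str) -> bool:
    """True when one base is a purine and the other a pyrimidine."""
    return (a in PURINES and b in PYRIMIDINES) or (
        a in PYRIMIDINES and b in PURINES
    )


def transversion_distance(seq1: str, seq2: str) -> float:
    """Proportion of comparable sites that differ by a transversion.

    Sites where either sequence lacks an A/C/G/T call (for example 'N')
    are skipped, as is usual for low-coverage data.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    valid = PURINES | PYRIMIDINES
    compared = 0
    transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in valid or b not in valid:
            continue
        compared += 1
        if is_transversion(a, b):
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return transversions / compared


# Toy example: 8 comparable sites, 2 transversions (T/A and G/C);
# the C/T difference is a transition and is not counted.
toy_distance = transversion_distance("ACGTNACGT", "ATGAAACCT")
print(toy_distance)  # 0.25
```

A matrix of such distances between all pairs of genomes is the kind of input a distance-based tree method takes; bootstrap support then comes from recomputing the distances on resampled sites.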

Outgroup f3- and D-statistics (13) support PH as genetically closer to Botai-Borly4 individuals than to any DOM2 member (Fig. 2C and figs. S25 and S26). Finally, ancestry tests (14) confirmed Botai horses as the direct ancestors of Borly4 horses, and the Borly4 as ancestral to the only PH in our data set predating their massive demographic collapse and introgression of modern domestic genes (15).
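As a rough illustration of what an outgroup f3-statistic measures, the sketch below computes the uncorrected form f3(O; A, B), the mean over sites of (o - a)(o - b), where o, a, and b are allele frequencies in the outgroup and in the two compared groups. Larger values mean A and B share more drift since splitting from the outgroup. The function name and the toy frequencies are hypothetical, and the bias corrections and standard errors used in real analyses are omitted.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Uncorrected outgroup f3: mean over sites of (o - a) * (o - b).

    Each argument is a list of allele frequencies (0.0 to 1.0),
    one value per site, in the same site order.
    """
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("all frequency lists must cover the same sites")
    if not outgroup:
        raise ValueError("at least one site is required")
    products = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(products) / len(products)


# Toy frequencies (invented for illustration only).
donkey = [0, 0, 0, 0]
group_x = [1, 1, 0, 0]
group_y = [1, 1, 0, 0]  # shares its derived alleles with group_x
group_z = [0, 0, 1, 1]  # carries derived alleles at different sites

f3_xy = outgroup_f3(donkey, group_x, group_y)
f3_xz = outgroup_f3(donkey, group_x, group_z)
print(f3_xy, f3_xz)  # 0.5 0.0
```

In this toy case group_x shares more drift with group_y than with group_z, which is the kind of pattern the text reports for PH relative to Botai-Borly4 versus DOM2.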

Outgroup f3- and D-statistics also revealed that Dunaújváros_Duk2 (Duk2), the earliest and most basal specimen within DOM2, was divergent to all other DOM2 members. This is not due to sequencing errors, because the internal branch that splits from Duk2 and leads to the ancestor of all remaining DOM2 horses is long (Fig. 2B). This suggests instead shared ancestry between Duk2 and a divergent ghost population. We thus excluded Duk2 in admixture graph reconstructions (16) to avoid bias due to contributions from unsampled lineages (Fig. 3).

Fig. 3 Admixture graphs.

(A to F) The six scenarios tested. The scenario in panel (A) received decisive Bayes factor support, as indicated below each corresponding alternative scenario tested. Domestic-Ancient and Domestic-A or -B refer to three phylogenetic clusters identified within DOM2 (excluding Duk2): ancient individuals; modern Mongolian, Yakutian (including Tumeski_CGG101397), and Jeju horses; and all remaining modern breeds. (G) Posterior distributions of admixture proportions. p1 and p2 represent admixture proportions along the dotted branches in the best-supported scenario.

In the absence of admixture, the best admixture graph matched the trees reconstructed above. We also reconstructed admixture graphs for five additional scenarios with one or two admixture event(s), including between PH and domesticates (15). Bayes factors best supported a horse domestication history in which a first lineage gave rise to Botai-Borly4 and PH horses, whereas a second lineage founded DOM2 and provided the source of domestic horses during at least the past ~4000 years, with minimal contribution from the Botai-Borly4 lineage [95% confidence interval (CI) = 2.0 to 3.8%].

The limited Botai-Borly4 ancestry among DOM2 members concurs with slightly significant negative D-statistics in the form of {[(DOM2_ancient,DOM2_modern), Botai-Borly4], donkey} for some DOM2 members, spanning a large geographical (Western Europe, Turkey, Iran, and Central Asia) and temporal range (from ~3318 to ~1143 years ago; fig. S28). This suggests that sporadic introgression of Botai ancestry into multiple DOM2 herds occurred until 1000 years ago. This gene flow was mediated not only through females, because 15 Botai-Borly4 individuals carried mitochondrial haplotypes characteristic of DOM2 matrilines (figs. S12 and S13), but also through males, given the persistence of Botai-Borly4–related patrilines within DOM2 (figs. S15 to S18).
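To make the sign of these D-statistics concrete, here is a minimal sketch of the ABBA-BABA calculation for a configuration {[(H1,H2), H3], outgroup}, using derived-allele frequencies per site. It assumes the convention D = (ABBA - BABA) / (ABBA + BABA), under which a negative value means H1 shares more derived alleles with H3 than H2 does; other software may flip the sign. The function name and the toy frequencies are hypothetical, and real analyses also need block-jackknife standard errors to judge significance.

```python
def d_statistic(h1, h2, h3, outgroup):
    """D for {[(H1,H2), H3], outgroup} from derived-allele frequencies.

    ABBA counts sites where H2 and H3 share the derived allele;
    BABA counts sites where H1 and H3 share it. Frequencies are
    per-site values between 0.0 and 1.0, in the same site order.
    """
    if not (len(h1) == len(h2) == len(h3) == len(outgroup)):
        raise ValueError("all frequency lists must cover the same sites")
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, po in zip(h1, h2, h3, outgroup):
        abba += (1 - p1) * p2 * p3 * (1 - po)
        baba += p1 * (1 - p2) * p3 * (1 - po)
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


# Toy frequencies (invented): three BABA sites and one ABBA site.
dom2_ancient = [1, 1, 1, 0]  # H1
dom2_modern = [0, 0, 0, 1]   # H2
botai_borly4 = [1, 1, 1, 1]  # H3
donkey = [0, 0, 0, 0]        # outgroup, ancestral at every site

d_value = d_statistic(dom2_ancient, dom2_modern, botai_borly4, donkey)
print(d_value)  # -0.5
```

Under this convention, the negative value says the H1 sample shares an excess of derived alleles with H3, which is how the negative statistics above are read as Botai-related ancestry in some ancient DOM2 horses.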

PH are considered to be the last remaining true wild horses, which have never been domesticated (15). Our results reveal that they represent instead the feral descendants of horses first herded at Botai. Their feralization likely involved multiple biological changes.

Metacarpal measurements in 263 ancient and 112 modern horses indicate that PH have become less robust than their Botai-Borly4 ancestors (Fig. 4A). One Botai individual likely showed limited unpigmented areas and leopard spots, as it was heterozygous for four mutations at the TRPM1 locus associated with leopard spotting and carried the ancestral allele at the PATN1 modifier (17, 18) (Fig. 4B). Individuals homozygous for TRPM1 mutations are generally almost completely unpigmented and develop congenital stationary night blindness (17). First maintained at Botai by human management, the haplotype associated with leopard spotting was likely selected against and lost once returning wild, leading to the characteristic PH Dun dilution coloration (19). Genomic regions with signatures of positive selection along the phylogenetic branch separating Borly4 and PH showed functional enrichment for genes associated, in humans, with cardiomyopathies (P ≤ 0.0496), melanosis and hyperpigmentation (P ≤ 0.0468), and skeletal abnormalities (P ≤ 0.0594) (table S18), suggesting that at least some of the morpho-anatomical changes associated with feralization were adaptive.

Fig. 4 Phenotypic and genomic changes associated with ferality.

(A) Indices of the robustness of the third metacarpal bone in various horse populations. Bd, breadth at the middle of the diaphysis; GL, maximal or greatest length. Kent and Kumkeshu-Kozhai represent populations of Kazakhstan from the Iron Age and Eneolithic (Tersek culture), respectively. (B) Genotyping information at the TRPM1 locus (chr1, chromosome 1) and the PATN1 modifier (chr3, chromosome 3) for Botai-Borly4 horses. The absence, heterozygosis, and homozygosis of alleles strongly associated with leopard spotting are depicted in white, dark gray, and red, respectively. Crosses indicate insufficient data. The causative long terminal repeat (LTR) insertion at the TRPM1 locus is indicated by the number of reads overlapping both flanks of the insertion site. (C) Individual-based genetic loads. The purple circle shows the PH specimen from the 19th century.

Additionally, significantly negative D-statistics in the form of {[(DOM2,PH), archaic], donkey} previously suggested that the extinct, archaic lineage formed by ~5100- to 42,800-year-old horses from Taymyr and Yakutia contributed to the genetic ancestry of modern domesticates (20, 21). Although we could confirm such D-statistics (fig. S29), almost all other D-statistics in the form of {[(DOM2,Botai-Borly4), archaic], donkey} were not different from zero (fig. S30). This indicates selection against the archaic ancestry between ~4977 and ~118 years ago (the time interval separating the youngest Borly4 individual and the earliest PH sequenced). Alternatively, the PH lineage admixed with a divergent population of horses, unrelated to both the archaic lineage and the ghost population that contributed ancestry to Duk2, because D-statistics revealed Duk2 as closer to Borly4 than to PH (fig. S31).

Lastly, although the genetic load of PH and Botai-Borly4 genomes was equivalent until ~118 years ago, it drastically increased in modern animals (Fig. 4C). This accumulation of deleterious variants was thus not associated with PH feralization but with the recent introgression of deleterious variants from modern domesticates and demographic collapse, which hampered purifying selection.

That none of the domesticates sampled in the past ~4000 years descend from the horses first herded at Botai entails another major implication. It suggests that during the third millennium BCE, at the latest, another unrelated group of horses became the source of all domestic populations that expanded thereafter. This is compatible with two scenarios. First, Botai-type horses experienced massive introgression capture (22) from a population of wild horses until the Botai ancestry was almost completely replaced. Alternatively, horses were successfully domesticated in a second domestication center and incorporated minute amounts of Botai ancestry during their expansion. We cannot identify the locus of this hypothetical center because of a temporal gap in our data set throughout the third millennium BCE. However, that the earliest DOM2 member was excavated in Hungary adds Eastern Europe to other candidates already suggested, including the Pontic-Caspian steppe (2), Eastern Anatolia (23), Iberia (24), Western Iran, and the Levant (25). Notwithstanding the process underlying the genomic turnover observed, the clustering of ~4023- to 3574-year-old specimens from Russia, Romania, and Georgia within DOM2 suggests that this clade already expanded throughout the steppes and Europe at the transition between the third and second millennia BCE, in line with the demographic expansion at ~4500 years ago recovered in mitochondrial Bayesian Skylines (fig. S14).

This study shows that the horses exploited by the Botai people later became the feral PH. Early domestication most likely followed the “prey pathway,” whereby a hunting relationship was intensified until reaching concern for future progeny through husbandry, exploitation of milk, and harnessing (7). Other horses, however, were the main source of domestic stock over the past ~4000 years or more. Ancient human genomics (26) has revealed considerable human migrations ~5000 years ago involving Yamnaya culture pastoralists of the Pontic-Caspian steppe. This expansion might be associated with the genomic turnover identified in horses, especially if Botai horses were better suited to localized pastoral activity than to long distance travel and warfare. Future work must focus on identifying the main source of the domestic horse stock and investigating how the multiple human cultures managed the available genetic variation to forge the many horse types known in history.

Supplementary Materials

Materials and Methods

Figs. S1 to S34

Tables S1 to S18

References (27–74)

References and Notes

  1. See supplementary materials.

Acknowledgments: We thank the British Institute of Persian Studies in Tehran and the National Museum of Iran for providing access to the material from Iran; the Archaeometry Laboratory of the University of Tehran; the staff of the Danish National High-Throughput DNA Sequencing Center; C. Gamba and C. McCrory Constantz for technical support and/or discussions; and L. Frantz, D. Bradley, and G. Larson for critical reading of the manuscript.

Funding: This work was supported by the Danish Council for Independent Research, Natural Sciences (4002-00152B); the Danish National Research Foundation (DNRF94); Initiative d’Excellence Chaires d’attractivité, Université de Toulouse (OURASI); the Publishing in Elite Journals Program (PEJP-17), Vice Rectorate for Graduate Studies and Scientific Research, King Saud University; the Villum Fonden miGENEPI research project; the European Research Council (ERC-CoG-2015-681605); the Taylor Family-Asia Foundation Endowed Chair in Ecology and Conservation Biology; the Innovation Fund of the Austrian Academy of Sciences (ÖAW); the Austrian Federal Ministry of Agriculture, Forestry, Environment, and Water Management; and the Russian Science Foundation (16-18-10265-RNF).

Author contributions: L.O. conceived the project and designed research; I.J.O., V.Z., and A.K.O. designed and carried out field archaeological work; C.G., A.F., and N.K. performed ancient DNA laboratory work, with input from L.O.; P.L. and L.O. designed and coordinated computational analyses; K.H., P.L., M.S., A.A., and L.O. performed computational analyses; N.Be., K.M., P.W.S., V.P., A.K., G.M., N.Ba., L.L., V.O., J.K., B.B., S.U., D.E., S.L., M.M., H.D., A.M., A.L., V.M., V.Z., and L.O. provided and/or collected samples; O.B.-L., S.A., A.H.A., K.A.S.A.-R., E.W., and L.O. provided reagents, measurements, and material; C.G., A.F., K.H., and L.O. prepared figures and tables, with input from L.O.; C.G., A.F., K.H., P.L., I.J.O., A.K.O., and L.O. wrote the supplementary information; and A.K.O. and L.O. wrote the paper, with input from all other coauthors.

Competing interests: The authors declare no competing interests.

Data and materials availability: The archaeological material from Iran analyzed in this study was part of the collections of the Osteology Department of the National Museum of Iran. A subset of the morphological data described in this study was collected at the UMR 7041-ArScAn CNRS. Individual genome sequence data are available at the European Nucleotide Archive (accession no. PRJEB22390). Alignments underlying analyses are available on DRYAD at
