Widespread Origins of Domestic Horse Lineages

Science  19 Jan 2001:
Vol. 291, Issue 5503, pp. 474-477
DOI: 10.1126/science.291.5503.474


Domestication entails control of wild species and is generally regarded as a complex process confined to a restricted area and culture. Previous DNA sequence analyses of several domestic species have suggested only a limited number of origination events. We analyzed mitochondrial DNA (mtDNA) control region sequences of 191 domestic horses and found a high diversity of matrilines. Sequence analysis of equids from archaeological sites and late Pleistocene deposits showed that this diversity was not due to an accelerated mutation rate or an ancient domestication event. Consequently, high mtDNA sequence diversity of horses implies an unprecedented and widespread integration of matrilines and an extensive utilization and taming of wild horses. However, genetic variation at nuclear markers is partitioned among horse breeds and may reflect sex-biased dispersal and breeding.

The domestication of the horse has profoundly affected the course of civilization. Horses provided meat, milk, and enhanced transportation and warfare capabilities that led to the spread of Indo-European languages and culture and the collapse of ancient societies (1, 2). Horse remains become increasingly common in archaeological sites of the Eurasian grassland steppe dating from about 6000 years ago, suggesting the time and place of their first domestication (3–5). Two alternative hypotheses for the origin of the domestic horse from wild populations can be formulated. A restricted origin hypothesis postulates that the domestic horse was developed through selective breeding of a limited wild stock from a few foci of domestication. Thereafter, domestic horses would have been distributed to other regions. Under this hypothesis, domestication is a complex and improbable process requiring multigeneration selection on traits that permit stable coexistence with humans. Another alternative could be that domestication involved a large number of founders recruited over an extended time period from throughout the extensive Eurasian range of the horse. In this multiple origins scenario, horses may have been independently captured from diverse wild populations and then increasingly bred in captivity as wild numbers dwindled. Consequently, early domestic horses may not represent a stock highly modified by selective breeding.

These two hypotheses for the origin of the domestic horse make distinct predictions with regard to genetic variation in maternally inherited mtDNA. The restricted origin hypothesis predicts that mitochondrial diversity of the horse should be limited to a few founding lineages and those added subsequently by mutation. In contrast, a multiple origins hypothesis predicts diversity greater than that typically found in a single wild population and divergence among lineages that well precedes the first evidence of domestication.

Phylogenetic analysis of 37 different mtDNA control region sequences from domestic horses deposited in GenBank, 616 base pairs (bp) in length (6), revealed at least six divergent sequence clades (clades A to F, Fig. 1A). After correcting for multiple hits and ignoring indels, the mean divergence observed between sequences was 2.6% (range: 0.2 to 5%). The average divergence between donkey (Equus asinus) and horses was 16.1% (range: 14.3 to 19.1%). Assuming that horses diverged from the lineage leading to extant stenoid equids (zebras and asses) at least 2 million years ago (Ma), as the fossil record suggests (7), or about 3.9 Ma, according to molecular data (8), we can estimate an average rate of equid mtDNA sequence divergence of 8.1% or 4.1% per million years, respectively. Therefore, modern horse lineages coalesce at about 0.32 or 0.63 Ma, respectively, long before the first domestic horses appear in the archaeological record (4). Even clade D, having a more recent coalescence time, has a mean sequence divergence of 0.8% (range: 0.2 to 2.0%), which predicts an origin at least 0.1 Ma. These results show that domestic horse lineages have an ancient origin. Thus, given the 6000-year origin suggested by the archaeological record, numerous matrilines must have been incorporated into the gene pool of the domestic horse.
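
The rate and coalescence figures in this paragraph follow from simple division. The Python sketch below is illustrative only and is not the authors' code. It assumes that the divergence rate is the mean donkey-horse divergence divided by the assumed split time, and that a coalescence time is a mean divergence divided by that rate. The function names are hypothetical.

```python
def divergence_rate(pct_divergence, split_time_ma):
    """Percent sequence divergence accumulated per million years."""
    return pct_divergence / split_time_ma


def coalescence_time(pct_divergence, rate_pct_per_my):
    """Time (in millions of years) to accumulate a divergence at a given rate."""
    return pct_divergence / rate_pct_per_my


# Mean divergences reported in the text (percent).
DONKEY_HORSE = 16.1   # donkey versus horse
HORSE_MEAN = 2.6      # among all modern horse sequences
CLADE_D_MEAN = 0.8    # within clade D

# Two calibrations for the horse / stenoid equid split (Ma).
calibrations = {"fossil": 2.0, "molecular": 3.9}

results = {}
for label, split_ma in calibrations.items():
    rate = divergence_rate(DONKEY_HORSE, split_ma)
    results[label] = {
        "rate": rate,
        "all_horses": coalescence_time(HORSE_MEAN, rate),
        "clade_d": coalescence_time(CLADE_D_MEAN, rate),
    }
    print(f"{label}: rate {rate:.2f}%/My, "
          f"all horses {results[label]['all_horses']:.2f} Ma, "
          f"clade D {results[label]['clade_d']:.2f} Ma")
```

The faster rate comes from the younger (fossil) calibration and gives the younger coalescence estimates, so the 0.1 Ma figure for clade D is the minimum of the two.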

Figure 1

Mitochondrial control region sequence trees. (A) Neighbor-joining tree of modern horse haplotypes based on 616 bp of mitochondrial control region sequence. Letters A to F indicate sequence clades consistently supported with different tree-building methods. Support values are indicated at nodes when found in at least 50% of 1000 bootstrap neighbor-joining trees, in a consensus tree of 1000 steps based on the quartet puzzling algorithm, and in a 50% majority rule consensus tree of 26,113 most parsimonious trees. (B) Neighbor-joining tree of modern and ancient horse haplotypes based on 355 bp of control region sequence. Bootstrap support is indicated at nodes if found in more than 50% of 1000 bootstrap trees. Letters A to F correspond to sequence clades in (A). The sequence identification codes refer to GenBank accession numbers, except for the sequences obtained in this study. Modern horses are indicated with the prefix EC, late Pleistocene sequences from permafrost deposits near Fairbanks, Alaska, with a dot and the prefix Pleist, and archaeological sequences from northern Europe with a triangle and the prefix Anc. The sequence obtained from Przewalski's horses is indicated with an asterisk. For some archaeological samples, only partial sequences were obtained: Anc4, 262 bp were sequenced, most similar to sequences in clade C; Anc5, 241 bp, identical to sequences related to clade B; Anc7, 259 bp, identical to sequences in clade A; and Anc8, 223 bp, most similar to sequences in clade A.

To expand the representation of modern and ancient breeds, we sequenced 355 bp of the left domain of the mtDNA control region in 191 horses from 10 distinct breeds (9), including some that are very old such as the Icelandic pony, Swedish Gotland Russ, and British Exmoor pony. A Przewalski's horse was also sequenced. We found 32 different sequences, and a search of GenBank provided 38 additional haplotypes for the same region. We compared all these sequences with those obtained from DNA isolated from long bone remains of eight horses preserved frozen in permafrost deposits near Fairbanks, Alaska, dated 12,000 to 28,000 years ago (10, 11). Additionally, we sequenced DNA of eight horse remains from archaeological sites in southern Sweden and Estonia, dated to 1000 to 2000 years ago (11).

The additional sequences affirm the ancient and diverse origin of domestic horse mtDNA lineages (Fig. 1B). Six of eight permafrost sequences cluster in a group ancestral to modern sequences (Pleist2, Pleist3, Pleist4, Pleist6, Pleist7, and Pleist8), possibly representing a sister taxon of the domestic horse or a lineage not present in modern domestic horses (Fig. 1B) (12). However, the other two permafrost sequences cluster with those of clade C (Pleist1 and Pleist5). These sequences differ by as little as 1.2% from modern counterparts. Similarly, four complete and four incomplete sequences found in archaeological remains are distributed throughout the tree defined by modern horse sequences and are closely related to them (Fig. 1B). Finally, in the most primitive (4) and chromosomally distinct horse (13), the Przewalski's horse, only a single haplotype has been found [EC13B, Fig. 1B; cf. (14)]. The low variability of the Przewalski's horse is not unexpected because the captive population was founded from only 13 individuals (14). This haplotype is not directly ancestral to any sequence cluster. Therefore, because sequences from ancient specimens and Przewalski's horses are very similar to those in modern horses, our results contradict the possibility that a high mutation rate explains the haplotype diversity observed in modern horses. Moreover, modern horse sequences do not define monophyletic groups with respect to wild progenitors, as would be expected if they were founded from a limited wild stock (15–17). The lack of well-supported phylogenetic clades and the presence of numerous matrilines whose divergence exceeds that between late Pleistocene and modern sequences suggest a massive and unprecedented retention of ancestral matrilines. The observation that the sequences obtained from the late Pleistocene Alaskan horses cluster in two discrete groups indicates that the diversity of mtDNA lineages in single natural populations might have been limited. Consequently, the high diversity of matrilines observed among modern horses suggests the utilization of wild horses from a large number of populations as founders of the domestic horse. A single geographically restricted population would not suffice as founding stock.

Although the initial founding of the domestic horse involved incorporation of multiple matrilines, the development of phenotypically distinct breeds may be a different process characterized by a limited founding stock and restricted breeding (18). Consistent with limited genetic exchange among breeds, we found that haplotype frequencies differed significantly in all but one of 45 pairwise breed comparisons (exact test, P < 0.05) (19). However, the sequences found in the 10 sampled modern and ancient breeds never defined monophyletic groups as might be expected if they were derived from a limited founding stock (Fig. 2A). Additionally, genetic diversity within breeds is high; on average, 7.4 haplotypes occurred per breed (range: 3 to 9), and the nucleotide diversity per breed averaged 0.022 (range: 0.012 to 0.027), comparable to that found in large wild ungulate populations (20). Moreover, genetic diversity within breeds is not a recent phenomenon, because Viking Age horse bones 1000 to 2000 years old and from a restricted area in northern Europe also show a diversity of control region lineages (Fig. 1B). The high haplotypic diversity of ancient horse breeds and of a Viking Age population suggests that domestic horse populations were founded by a diversity of matrilines that, as suggested by the archaeological record, was augmented by trade (21, 22).
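
The two within-breed summaries quoted above, haplotype count and nucleotide diversity, can be illustrated with a short Python sketch. It assumes aligned sequences of equal length without gaps and uses the uncorrected proportion of differing sites, averaged over all pairs of sampled sequences. The estimator actually used in the study may differ, and the toy sequences and function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def nucleotide_diversity(seqs):
    """Mean pairwise p-distance over all pairs of sampled sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def haplotype_count(seqs):
    """Number of distinct sequences (haplotypes) in the sample."""
    return len(set(seqs))


# Hypothetical toy sample of four individuals from one breed.
toy_breed = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAT"]

pi = nucleotide_diversity(toy_breed)
n_haplotypes = haplotype_count(toy_breed)
print(f"haplotypes: {n_haplotypes}, nucleotide diversity: {pi:.3f}")
```

On real data the same functions would be applied per breed to the 355-bp alignment. A value near 0.022 corresponds to roughly 8 differences per 355 sites between two randomly chosen horses of the same breed.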

Figure 2

Genetic diversity in horse breeds. (A) Distribution of haplotypes found in 10 horse breeds and in horses of the Orient from GenBank. This distribution is superimposed on a neighbor-joining tree of modern horse haplotypes based on the same 355 bp of the control region sequence, as in Fig. 1B. Each one of the observed haplotypes is indicated by a colored symbol corresponding to its position in the relationship tree. (B) Unrooted neighbor-joining tree of horses from 10 breeds as in (A) based on the proportion of shared alleles between individuals.

High levels of mtDNA variation within and between horse breeds may reflect a bias toward females in breeding and trade. Consequently, we assessed variation in 15 hypervariable, biparentally inherited, microsatellite loci (23–27). The observed microsatellite heterozygosity was moderately high for all breeds, varying from 49.4% ± 5.7% to 62.6% ± 5.8%, and the mean number of alleles per locus varied from 3.6 ± 0.3 to 4.5 ± 0.4. Allele frequencies differed significantly between breeds (exact test, P < 0.01). The microsatellite divergence between breeds was more marked than observed with mtDNA sequences. Individuals from the same breed generally clustered together in a neighbor-joining tree based on allele sharing distance (Fig. 2B), and 95% of individuals could be classified correctly to breed with an assignment test (28). These results show that maternal gene flow has dominated the genetic exchange between breeds and/or that female effective population size within breeds is larger than that of males. A sex bias in ancient breeding is consistent with modern breeding practices in which select breeding males are used as stud for 15 to 20 or more females (29).
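
An allele sharing distance between two individuals can be sketched as follows. This Python example assumes one common definition: at each locus, count the alleles the two diploid genotypes have in common (0, 1, or 2), divide by 2, average over loci, and subtract from 1. Whether the study used exactly this definition is an assumption, and the genotypes (allele sizes in base pairs) and function names are hypothetical.

```python
from collections import Counter


def shared_allele_proportion(genotype_a, genotype_b):
    """Fraction of alleles shared at one locus between two diploid genotypes."""
    # Multiset intersection keeps each allele at its minimum count in the two genotypes.
    shared = Counter(genotype_a) & Counter(genotype_b)
    return sum(shared.values()) / 2


def allele_sharing_distance(ind_a, ind_b):
    """One minus the mean proportion of shared alleles across loci."""
    if len(ind_a) != len(ind_b):
        raise ValueError("individuals must be typed at the same loci")
    per_locus = [shared_allele_proportion(a, b) for a, b in zip(ind_a, ind_b)]
    return 1 - sum(per_locus) / len(per_locus)


# Hypothetical genotypes at three microsatellite loci.
horse_a = [(120, 124), (98, 98), (150, 152)]
horse_b = [(120, 124), (98, 100), (152, 152)]
horse_c = [(118, 122), (100, 102), (148, 148)]

print(f"a-b: {allele_sharing_distance(horse_a, horse_b):.3f}")
print(f"a-c: {allele_sharing_distance(horse_a, horse_c):.3f}")
print(f"b-c: {allele_sharing_distance(horse_b, horse_c):.3f}")
```

A matrix of such pairwise distances is the kind of input a neighbor-joining algorithm takes. Individuals of the same breed are expected to share more alleles and therefore sit closer together in the resulting tree.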

Wild horses were widely distributed throughout the Eurasian steppe during the Upper Paleolithic [35,000 to 10,000 years before present (B.P.)], but in many regions, they disappeared from the fossil record about 10,000 years ago (3, 4). Horse remains became increasingly frequent in archaeological sites of southern Ukraine and Kazakhstan starting about 6000 years ago, where limited evidence from bit wear on teeth suggests that some horses could have been ridden (29, 30). By the beginning of the Iron Age, wild horse populations had declined, and today, only one putative wild population, the Przewalski's horse, remains (4). Therefore, a scenario consistent with the archaeological record and genetic results posits that, initially, wild horses were captured over a large geographic area and used for nutrition and transport. As wild populations dwindled because of exploitation or environmental changes (31), increased emphasis was placed on captive breeding, allowing for multiple matrilines of a single ancestral species to be integrated into the gene pool of domestic horses. This contrasts with previous notions of domestication as a complex multigeneration process that begins with relatively few individuals selected for behavioral characteristics, such as docility and sedentary habits, as a prerequisite to coexistence with humans.

Domestic species such as dogs, cattle, sheep, and goats were established several thousand years before the horse was domesticated (4). The geographic spread of these species was a likely result of their expansion from a limited number of domestication centers. In the horse, the extensive and unparalleled retention of ancestral matrilines suggests that widespread utilization occurred primarily through the transfer of technology for capturing, taming, and rearing wild caught animals (29, 30). In contrast, the export of domesticated horses from a geographically restricted center of origin would have resulted in a more limited diversity of matrilines. Consequently, in the history of the domestic horse, transfer of technology rather than selective breeding may have been the critical innovation leading to their widespread utilization. Moreover, the high value of horses in primitive societies (1, 2) placed a premium on the rapid acquisition of this technology from neighboring communities.

  • * To whom correspondence should be addressed. E-mail: carles.vila{at}

