Genetic Evidence for an East Asian Origin of Domestic Dogs

Science  22 Nov 2002:
Vol. 298, Issue 5598, pp. 1610-1613
DOI: 10.1126/science.1073906


The origin of the domestic dog from wolves has been established, but the number of founding events, as well as where and when these occurred, is not known. To address these questions, we examined the mitochondrial DNA (mtDNA) sequence variation among 654 domestic dogs representing all major dog populations worldwide. Although our data indicate several maternal origins from wolf, >95% of all sequences belonged to three phylogenetic groups universally represented at similar frequencies, suggesting a common origin from a single gene pool for all dog populations. A larger genetic variation in East Asia than in other regions and the pattern of phylogeographic variation suggest an East Asian origin for the domestic dog, ∼15,000 years ago.

Archaeological finds from Mesolithic sites around the world indicate that the dog was the first domestic animal (1). Its origin from wolves is well established from genetic as well as behavioral and morphological data (1–3), but apart from this, available clues give no clear picture of its origin. Interpretation of the archaeological record is problematic because of the difficulty in discriminating between small wolves and domestic dogs (4, 5); however, the earliest finds believed to be from domestic dogs are a single jaw from 14,000 years before the present (yr B.P.) in Germany (5, 6) and an assemblage of small canids from 12,000 yr B.P. in Israel (7, 8). These finds point to an origin in Southwest Asia, where the first farm animals are believed to have originated (9), or in Europe. On the other hand, one osteological feature diagnostic of dogs is also found among Chinese wolves, suggesting an East Asian origin (4, 10). On the basis of the morphology and size of early archaeological finds, an origin from the large North Eurasian or North American wolves seems unlikely (10–12). An origin from several different wolf populations could explain the extreme morphological variation among dog breeds. To determine whether dogs were domesticated in one or several places, and the approximate place and time of these events, we examined the structure of mtDNA sequence variation among domestic dogs worldwide.

We analyzed the genetic variation in 582 base pairs (bp) of mtDNA in 654 domestic dogs from Europe, Asia, Africa, and Arctic America and in 38 Eurasian wolves (13, 14) (tables S1 and S2; fig. S1). It has previously been shown that domestic dogs originate from at least four female wolf lines (2, 15). Phylogenetic analysis of our data assigned the dog sequences into the same four phylogenetic groups (clades A, B, C, and D) and to a fifth “group,” clade E, consisting of an isolated haplotype (Fig. 1 and fig. S2) (14). These groups were interspersed by wolf sequences and were approximately equidistant from a cenancestor of all wolves and dogs. Therefore, we conclude that the domestic dog population originates from at least five female wolf lines. Furthermore, although not separated from clade A by wolf sequences, a sixth group, clade F, is suggested by the separation distance. In the following analyses we will treat these six groups separately. Clade A included three wolf haplotypes found in China and Mongolia; clade B contained three wolf haplotypes, two found in East Europe (one of which is identical to a dog haplotype) and one in Afghanistan (fig. S2). This suggests an origin of clade A in East Asia and of clade B in Europe or Southwest Asia. However, wolves are extremely mobile, resulting in little geographic partitioning of mtDNA haplotypes; Vilà et al. found identical sequences in a Bulgarian and a Saudi Arabian wolf, as well as in a Mongolian and another Saudi Arabian wolf (16). Thus, the finding of a few wolf haplotypes closely related to domestic dog types is not in itself a sufficient basis for establishing the location of domestication. Therefore, the pattern of intraspecific genetic variation among dogs worldwide is crucial for understanding the origin of the dog.

Figure 1

Phylogenetic tree of all dog (unlabeled) and wolf (open squares) haplotypes (14). Six clades (A to F) of dog haplotypes are indicated. Branch lengths are according to the indicated scale; the branch leading to the outgroup (coyote) was reduced by 50%. The nucleotide substitution model (HKY + Γ + I, with α = 0.5960 and I = 0.7367) for our data set was optimized using a hierarchical maximum likelihood (ML) approach (21). A starting tree was calculated under the HKY + Γ + I model using the neighbor-joining method and further searched by 10^6 tree bisection and reconnection (TBR) iterations under the minimum evolution criterion (22). A ML ratio test showed that alternative rooting points were possible. The root giving the most homogeneous rate of evolution, as suggested by ingroup midpointing, was chosen as our best estimate. The branching orders in the individual clades A to F were imploded because many alternative pathways existed, as displayed in the minimum-spanning networks in Fig. 2. This homoplasy also affected our bootstrap values (clades A and B: <50%; clades C, D, and F: 65 to 95%). However, excluding wolf sequences, clades B to F showed high bootstrap support (>74%), and smaller data sets have previously given >50% for clade A (2, 15).

A total of 71.3% of dogs had haplotypes belonging to clade A, and 95.9% had types belonging to clades A, B, or C (Table 1). Clade A was represented in all geographic regions, and clades B and C in all regions except America. Thus, these three clades constitute a common source for a very large proportion of the mtDNA genetic variation in all domestic dog populations. Furthermore, the frequencies of clades A, B, and C were similar in all regions (Table 1). This suggests that, unless a very effective gene flow occurred along the Eurasian continent, the major present-day dog populations had a common origin from a single gene pool containing clades A, B, and C. Moreover, there was no clear division of the main morphologic types of dog (spitz, mastiff, greyhound) or of large and small breeds among the three main clades, except for a lack of greyhounds in clade C (table S1). This suggests that the extreme morphologic variation among dog breeds is not the result of geographically distinct domestications of wolf. Because haplotypes of clades D, E, and F were found only regionally in Turkey, Spain, and Scandinavia; Japan and Korea; and Japan and Siberia, respectively (table S1), we concentrated our analyses on the major clades A, B, and C.

Table 1

Number and proportion of individuals, and number of haplotypes and unique haplotypes, for the phylogenetic clades A, B, and C in different regions. Haplotypes are defined by substitutions only, disregarding indels.

If an ancestral population and a derived population (formed from a subset of the genetic types of the ancestral population) are compared, the number of haplotypes and the nucleotide diversity are expected to be higher in the ancestral population. A rough measure of the amount of genetic variation is given by the mean pairwise sequence distance. For clade A it was 3.39 (SD = 0.13) substitutions in East Asia, 2.28 (SD = 0.23) in Southwest Asia, and 2.97 (SD = 0.08) in Europe. Furthermore, we found a larger number of haplotypes in East Asia than in Southwest Asia and Europe (Table 1). When corrected for sample size by resampling with replacement, there were 20.2 (SD = 2.4) haplotypes among 51 East Asian dogs, which is significantly more than the 16 haplotypes found among the 51 Southwest Asian dogs (P < 0.05; 1000 replicates), while there was no significant difference between Southwest Asia and Europe. Notably, of the 44 types found in East Asia, 30 were unique to this region, i.e., more than the total number of types in Europe (Table 1).
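The two diversity measures used above, mean pairwise sequence distance and a haplotype count corrected for sample size by resampling with replacement, can be sketched as follows. This is a minimal illustration, not the authors' procedure (which is described in the supporting material): the function names, the toy sequences, and the haplotype labels are hypothetical, and sequences are assumed to be pre-aligned and free of indels.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def hamming(a, b):
    """Number of substitutions between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_distance(seqs):
    """Mean number of substitutions over all pairs of individuals."""
    pairs = list(combinations(seqs, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def resampled_haplotype_count(haplotypes, n, replicates=1000, seed=0):
    """Mean and SD of the number of distinct haplotypes among n individuals
    drawn with replacement from the observed per-individual labels."""
    rng = random.Random(seed)
    counts = [len(set(rng.choices(haplotypes, k=n))) for _ in range(replicates)]
    return mean(counts), pstdev(counts)


# Toy data (hypothetical): three individuals, four aligned sites.
toy_seqs = ["ACGT", "ACGA", "TCGA"]
print(mean_pairwise_distance(toy_seqs))

# Toy data (hypothetical): one haplotype label per sampled dog.
toy_labels = ["A1"] * 6 + ["A2"] * 3 + ["A3", "A4", "A5"]
print(resampled_haplotype_count(toy_labels, n=8))
```

Drawing the same number of individuals from each region, as in the comparison of 51 East Asian and 51 Southwest Asian dogs, keeps a larger sample from inflating the haplotype count.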

The pattern of genetic variation in the different populations can be studied in more detail in a minimum-spanning network (Fig. 2A and fig. S3). In clade A, East Asian haplotypes were distributed throughout the network, while for Europe and Southwest Asia, parts of the network, largely the same in the two populations, were empty. To further investigate the larger genetic variation in East Asia, and the possibility of an East Asian rather than Southwest Asian or European origin of the dog, we compared the eastern (East) and the western (West) parts of the world, defined here as the areas east and west of a line from the Himalayas to the Ural mountains. In clade A, 13 haplotypes were found in both East and West, while 35 were unique to East and 23 to West (Fig. 2B). Furthermore, in East, 19 of the unique haplotypes were at least two steps from a haplotype found in West, and 8 were at a distance of three or more steps, whereas in West only 3 unique haplotypes were two steps and not a single type three or more steps from haplotypes found in East. Importantly, East had a considerably larger proportion of individuals with the unique haplotypes (51.5%) than had West (28.1%), and the different geographic subregions showed the same pattern (table S3). Together, these data indicate that the haplotypes of clade A in the western part of the world originate from the introduction of a subset of East Asian types, from which the types unique to West have later developed. Notably, the haplotypes at least three steps from western types were from Thailand (n = 2), Cambodia (n = 1), Tibet (n = 2), China (n = 2), and Japan and Korea (n = 1), showing a large divergence throughout East Asia.
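The East/West comparison above comes down to set operations on haplotype labels plus a per-individual proportion. The sketch below shows that bookkeeping under stated assumptions: the function name and the toy labels are hypothetical, and the step distances between haplotypes, which require the network itself, are not computed.

```python
def east_west_summary(east, west):
    """Summarize haplotype sharing between two regions.

    east, west: lists holding one haplotype label per sampled individual.
    """
    east_types, west_types = set(east), set(west)
    unique_east = east_types - west_types
    unique_west = west_types - east_types
    return {
        "shared": east_types & west_types,
        "unique_east": unique_east,
        "unique_west": unique_west,
        # Proportion of individuals (not of haplotypes) carrying a unique type.
        "frac_individuals_unique_east": sum(h in unique_east for h in east) / len(east),
        "frac_individuals_unique_west": sum(h in unique_west for h in west) / len(west),
    }


# Toy data (hypothetical labels).
east = ["A1", "A1", "A2", "A3", "A3", "A4"]
west = ["A1", "A1", "A1", "A2", "A5"]
summary = east_west_summary(east, west)
print(sorted(summary["shared"]), summary["frac_individuals_unique_east"])
```

Counting individuals rather than haplotypes matters here: a region can hold many unique types that are each rare, so the proportion of individuals carrying them is the more conservative indicator of long-standing local diversity.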

Figure 2

Minimum-spanning networks showing genetic relationships among mtDNA dog haplotypes of phylogenetic clades A, B, and C. Haplotypes (circles) are separated by one mutational step, ignoring indels. Black dots are hypothetical intermediates. Uncolored squares are wolf haplotypes. (A) Haplotypes found in East Asia, Europe, and Southwest Asia are indicated in separate networks with orange, blue, and green, respectively. The sizes of colored circles are proportional to haplotype frequency in the respective populations. Small uncolored circles denote haplotypes not found in the regional population. Subclusters of clade A discussed in the main text, three in the East Asian and one in the European network, are marked by red lines. (B) Haplotypes shared between and unique to East and West, respectively. Circles denote haplotypes found in both East and West (white), unique to West (blue), unique to West and two steps from Eastern types (dark blue), unique to East (orange), unique to East and two steps from Western types (red), and unique to East and three or more steps from Western types (red with bold lining).

Similarly, a greater number of clade B haplotypes were found in East than in West. The central haplotype and two others were common to East and West, while seven types were unique to East but only three to West (Table 1 and Fig. 2B). Furthermore, the proportion of individuals with the unique haplotypes was much higher in East (41.2%) than in West (6.8%) (table S3), and the mean pairwise distance was larger in East Asia (0.93 substitutions, SD = 0.17) than in Europe (0.45, SD = 0.14) and Southwest Asia (0.36, SD = 0.11). Clade C showed less variation but resembled clades A and B in that West had only shared types, whereas East had two unique haplotypes (Table 1 and Fig. 2). In conclusion, >95% of all sequences belonged to the three phylogenetic clades A, B, and C, which were found universally at similar frequencies, indicating an origin from a common gene pool for all dog populations worldwide. The larger genetic variation in East Asia and the distribution of haplotypes in the different geographic regions indicate that clade A originated in East Asia and that the haplotypes in Europe and Southwest Asia derive from a subset of the East Asian types. Similarly, the larger genetic variation and the larger proportion of individuals with unique haplotypes in East Asia indicate an East Asian origin also for clade B, giving a total of >88% of the sequences probably deriving from East Asia; for clade C the pattern is less clear, but an East Asian origin for this clade is also possible.

The time of origin for the dog clades can be estimated from the mean genetic distance in each clade to the original wolf haplotype, and the mutation rate. The substitution rate was estimated at 7.1% (SD = 0.4%) per million years for the analyzed 582-bp region, from the mean genetic distance between the dog and wolf haplotypes and the coyote types in the tree (Fig. 1) (14), and the assumption of a divergence time between wolves and coyotes of 1 million years, on the basis of the fossil record (17).

In a domestication event with a subsequent population expansion, a starlike phylogeny, with the founder haplotype in the center and new haplotypes distributed radially, would be expected. Fu's Fs test (18) for clades A, B, and C in East Asia (−20.0, −6.6, and −0.50, respectively) showed a significant signal of population expansion for clades A and B (P < 0.01). The networks of clades B and C are starlike, indicating an origin from a single wolf haplotype (Fig. 2A). In contrast, clade A has a complicated pattern without an easily identifiable central node. A distance of up to 11 substitutional steps between haplotypes would indicate that clade A is older than clades B and C and derives from an initial domestication of wolves. However, instead of a single central node, there are several subclusters with starlike shape, suggesting that clade A may have originated from several wolf haplotypes. This data set does not provide the resolution necessary to determine the exact number of founding wolf haplotypes in clade A. The approximate age of clade A, assuming a single origin from wolf and a subsequent population expansion, is calculated from the mean pairwise distance between East Asian sequences (3.39 substitutions, SD = 0.13) and the mutation rate to 41,000 ± 4,000 years. If, instead, we assume several origins, we identify three reasonably defined subclusters that could be used to estimate the age of clade A (Fig. 2A); the mean genetic distances to their nodes (0.45, 0.65, and 1.07 substitutions with SD = 0.13, 0.09, 0.27, respectively) give estimates of 11,000 ± 4,000, 16,000 ± 3,000, and 26,000 ± 8,000 years, respectively. Assuming single wolf haplotypes as founders of clades B and C, the mean distances among East Asian sequences to the nodes (0.54 and 0.71 substitutions, SD = 0.08 and 0.10) give estimated ages of 13,000 ± 3,000 and 17,000 ± 3,000 years for clades B and C, respectively.
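The point estimates above are consistent with a simple molecular-clock conversion, sketched here under assumptions that are inferred from the reported numbers rather than stated in the text: the 7.1% rate is taken as substitutions per site per million years along one lineage, a mean distance to a founder node is divided by the rate directly, and a mean pairwise distance is halved first because both sequences in a pair have accumulated changes since their common ancestor. The function names are hypothetical, and the reported uncertainties are not reproduced.

```python
# Assumed reading of the reported rate: 7.1% per site per million years,
# applied to the 582-bp region analyzed in the text.
RATE_PER_SITE_PER_YEAR = 0.071 / 1_000_000
REGION_LENGTH_BP = 582
SUBS_PER_YEAR = RATE_PER_SITE_PER_YEAR * REGION_LENGTH_BP


def age_from_node_distance(mean_distance):
    """Age in years from the mean number of substitutions to a founder node."""
    return mean_distance / SUBS_PER_YEAR


def age_from_pairwise_distance(mean_pairwise):
    """Age in years from a mean pairwise distance (assumption: each of the
    two lineages in a pair contributes half of the observed distance)."""
    return mean_pairwise / (2 * SUBS_PER_YEAR)


def to_nearest_thousand(years):
    """Round an age to the nearest 1,000 years for comparison with the text."""
    return int(round(years, -3))


print("clade A, single origin:", to_nearest_thousand(age_from_pairwise_distance(3.39)))
for label, d in [("A subcluster 1", 0.45), ("A subcluster 2", 0.65),
                 ("A subcluster 3", 1.07), ("clade B", 0.54), ("clade C", 0.71)]:
    print(label, to_nearest_thousand(age_from_node_distance(d)))
```

Under these assumptions the estimates scale linearly with the assumed wolf-coyote divergence time: doubling it from 1 to 2 million years would halve the rate and double every age.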

Thus, our mtDNA data suggest a first origin of domestic dogs either ∼40,000 years ago, forming only clade A, or ∼15,000 years ago, possibly involving all three clades A, B, and C. However, the oldest subcluster of clade A in Europe (as determined from the mean genetic distance from haplotypes unique to the western part of the world to the nodal haplotype shared with East Asia; 0.39 substitutions, SD = 0.09) is estimated to be only 9,000 ± 3,000 years old (Fig. 2A). An origin of 40,000 years ago for clade A would therefore imply a long isolation of dogs in East Asia before they spread to the rest of the world. Circumstantial evidence therefore indicates a simultaneous origin in East Asia ∼15,000 years ago for clades A and B, and possibly also clade C.

In the context of the archaeological record, this seems to be a probable scenario. There is no certain evidence for domestic dogs in late Paleolithic China, but in the earliest Neolithic, finds are numerous, dating back to 7,500 yr B.P. (4, 19). Considering the relatively limited amount of archaeological work done in East Asia, the lack of late Paleolithic finds does not exclude a much earlier origin of domestic dogs in East Asia. The earliest Southwest Asian finds dated at ∼12,000 yr B.P. are from unspecified small canids (7, 8), and remains with typical dog morphology appear only by 9,000 yr B.P. (4, 11). The German find from 14,000 yr B.P. consists of a single jaw fragment (6), and there is a considerable temporal gap to later European finds, which appear by ∼9,000 yr B.P. (4, 5, 12). The earliest North American finds are dated at 8,500 yr B.P. (4, 20). An East Asian origin is supported by a morphological feature of the jaw diagnostic of domestic dogs and also found in some Chinese wolves but generally not in other wolves (4, 10).

In conclusion, the archaeological record cannot define the number of geographical origins or their locations, but suggests a date of 9,000 to 14,000 yr B.P., while our mtDNA data indicate a single origin of domestic dogs in East Asia ∼15,000 or 40,000 yr B.P. We conclude that a synthesis of available data points to an origin of the domestic dog in East Asia ∼15,000 yr B.P. In this event, clade A would have had several origins from wolf haplotypes, and the first domestication of wolves would not have been an isolated event, but rather a common practice in the human population in question.

Supporting Online Material

Material and Methods

Figs. S1 to S3

Tables S1 to S3

  • * To whom correspondence should be addressed. E-mail: savo{at}

  • Present address: Department of Biology, University of Konstanz, 78457 Konstanz, Germany.

