Linkage Disequilibrium and Recombination in Hominid Mitochondrial DNA


Science  24 Dec 1999:
Vol. 286, Issue 5449, pp. 2524-2525
DOI: 10.1126/science.286.5449.2524


The assumption that human mitochondrial DNA is inherited from one parent only and therefore does not recombine is questionable. Linkage disequilibrium in human and chimpanzee mitochondrial DNA declines as a function of the distance between sites. This pattern can be attributed to one mechanism only: recombination.

For many years, it has been accepted that mitochondria are inherited exclusively from the mother in mammals and that their inheritance is therefore clonal. This assumption has been used extensively to date events in human prehistory, including the age of our last common female ancestor, “Eve,” and the spread of Homo sapiens in Asia and Europe (1). However, mitochondria contain the enzymes necessary for homologous recombination (2), and there are at least two routes by which strict maternal inheritance of mitochondrial DNA (mtDNA) could be circumvented: (i) Paternal mitochondria enter the egg at fertilization (3, 4), and (ii) there are copies of mtDNA sequences in the nuclear genome (5) that could be transferred back to the mtDNA. Two reports have presented population genetic evidence of recombination in mtDNA (6, 7). Although doubts were raised about the quality of the data and the conclusions reached in one of these reports (8), subsequent analyses of newer data have reconfirmed the original result (9).

Recombination can also be detected by considering the relation between linkage disequilibrium (LD) and distance. As the distance between sites increases, the effect of recombination on LD should increase, whether recombination is a reciprocal or a nonreciprocal (gene conversion) process. Recombination should therefore manifest itself as a significant decline in LD with distance (10–13).

To measure disequilibrium, we used the Hill and Robertson measure (14)

$$ r^2 = \frac{(p_{AB}\,p_{ab} - p_{Ab}\,p_{aB})^2}{p_A\,p_a\,p_B\,p_b} \qquad (1) $$

where A and a represent alleles at one locus, B and b represent alleles at the other locus, p_XY is the frequency of the XY genotype, and p_X is the frequency of allele X. r² can range from 0 to 1, from complete linkage equilibrium to complete disequilibrium, respectively. We only calculated LD values for pairs of sites at which both alleles were segregating at a frequency greater than 10% (in no case were more than two different nucleotides present at a site). There were two reasons for this restriction: (i) Alleles at higher frequencies tend to be older and therefore more likely to show evidence of past recombination events, and (ii) values of r² = 1 often occur by chance for alleles with low frequencies (15).
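As an illustration only, the r² calculation and the 10% frequency filter described above could be sketched as follows. The function names and the representation of each individual as a string of nucleotides are assumptions made for this example, not the authors' code. The sketch uses D = p_AB − p_A p_B, which is algebraically equal to p_AB p_ab − p_Ab p_aB.

```python
from collections import Counter

def passes_frequency_filter(haplotypes, i, threshold=0.10):
    """True if site i has exactly two alleles, each at frequency > threshold."""
    counts = Counter(h[i] for h in haplotypes)
    n = len(haplotypes)
    return len(counts) == 2 and min(counts.values()) / n > threshold

def r_squared(haplotypes, i, j):
    """Hill-Robertson r^2 between sites i and j.

    haplotypes: list of equal-length strings, one per individual
    (a hypothetical input format). Returns None unless both sites
    carry exactly two alleles.
    """
    alleles_i = sorted({h[i] for h in haplotypes})
    alleles_j = sorted({h[j] for h in haplotypes})
    if len(alleles_i) != 2 or len(alleles_j) != 2:
        return None
    n = len(haplotypes)
    A, B = alleles_i[0], alleles_j[0]  # arbitrary reference alleles
    p_A = sum(h[i] == A for h in haplotypes) / n
    p_B = sum(h[j] == B for h in haplotypes) / n
    p_AB = sum(h[i] == A and h[j] == B for h in haplotypes) / n
    d = p_AB - p_A * p_B  # equals p_AB*p_ab - p_Ab*p_aB
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
```

With only the AB and ab haplotypes present, the two sites are in complete disequilibrium; with all four haplotypes at equal frequency, they are in equilibrium.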

We analyzed both sequence and restriction fragment length polymorphism (RFLP) data for complete human mitochondrial genomes. To test whether LD declines with distance, we calculated Pearson's correlation coefficient, ρ, between LD and the distance between sites. To assess significance, we randomized the positions of sites and then recalculated the value of ρ (11, 13, 16). This was repeated 5000 times; the significance is given as the proportion of random ρ values that are more negative than the observed value. For the complete human mitochondrial sequences, we restricted analysis to synonymous variation, because selection can generate LD and there is no evidence of selection on synonymous codon usage in human mitochondrial sequences (7). In the RFLP data, it is generally not possible to determine whether the loss of a site is due to a synonymous or nonsynonymous change, so all variants in protein, tRNA, and ribosomal RNA coding sequences were analyzed. The human mitochondrial genome is a circular molecule 16,569 base pairs (bp) in length, so the longest distance between pairs is 8284 bp.
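The randomization test described above can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the LD values are assumed to be supplied as a dictionary keyed by site-index pairs, positions are shuffled among sites while the LD values stay fixed, and distance is taken as the shorter way around the circular genome. All function and parameter names are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt

GENOME_LENGTH = 16569  # human mtDNA length in bp, from the text

def circular_distance(pos_a, pos_b, length=GENOME_LENGTH):
    """Shortest distance between two positions on a circular molecule."""
    d = abs(pos_a - pos_b)
    return min(d, length - d)

def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ld_distance_test(positions, ld, n_perm=5000, seed=0):
    """Return (observed rho, one-sided P) for the LD-distance correlation.

    positions: list of site positions.
    ld: dict mapping index pairs (i, j), i < j, to an LD value (assumed format).
    P is the proportion of shuffles whose rho is more negative than observed.
    """
    pairs = list(combinations(range(len(positions)), 2))
    ld_values = [ld[p] for p in pairs]

    def rho(pos):
        dists = [circular_distance(pos[i], pos[j]) for i, j in pairs]
        return pearson(dists, ld_values)

    observed = rho(positions)
    rng = random.Random(seed)
    shuffled = list(positions)
    more_negative = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign positions to sites at random
        if rho(shuffled) < observed:
            more_negative += 1
    return observed, more_negative / n_perm
```

The fixed seed is a convenience for reproducibility in this sketch; a degenerate input in which all LD values (or all distances) are identical would need separate handling.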

In humans, we have five independently collected data sets: 45 complete mtDNA sequences of diverse geographical origin (17) and four RFLP data sets spanning the whole genome, from 147 individuals from around the world (18), 86 Finnish and Swedish individuals (19), 153 Native Siberians (20), and 167 Native Americans (21). The relation between LD and the distance between sites, for four of these data sets, is shown in Fig. 1. In each case, there is an evident decline in LD with increasing distance, and in each case, the decline is significant. For example, Fig. 1A shows the 91 pairwise LD values for the 14 synonymous sites in the DNA sequence data set, plotted as a function of the distance between sites. The relation with distance is highly significant (P = 0.012). In the worldwide RFLP data set, there were only four sites segregating mutations at greater than 10%; there is a negative correlation between LD and distance (ρ = −0.365), but it is nonsignificant (P = 0.190). These patterns are consistent with those observed for nuclear sequences undergoing recombination (10–13) and with population genetic predictions for the decay of LD due to recombination (15, 22, 23).

Figure 1

The relation between LD and the distance between sites in the human mitochondrial genome for four data sets. (A) Synonymous variants from 45 complete mtDNA sequences from individuals of diverse origins: 18 Caucasian, 13 Japanese, 2 African, and 12 individuals of unknown origin (17). Sites analyzed are 4985, 6455, 7028, 9540, 10,873, 11,251, 11,299, 11,467, 12,372, 12,705, 13,617, 14,783, 15,043, and 15,301. In this data set, there were no sites with multiple alleles (sites with more than two nucleotides) segregating. RFLP data sets: (B) Swedish and Finnish individuals (19). Sites analyzed are 7025, 10,394, 12,308, 13,366, 15,606, and 15,925. (C) Native Siberian individuals (20). Sites analyzed are 1715, 5176, 7933, 8391, 10,394, 10,397, and 13,262. (D) Native American individuals (21). Sites analyzed are 663, 5176, 10,394, 10,397, and 13,262. In the Native American and Siberian data sets, RFLP sites 10,394 and 10,397 could be caused by a single mutation. Because they segregate at different frequencies, it seems likely that they are due to different mutations; in any case, the results remain qualitatively unchanged if either site is excluded (Siberians: excluding 10,394, ρ = −0.705, P = 0.001; excluding 10,397, ρ = −0.485, P = 0.026. Native Americans: excluding 10,394, ρ = −0.583, P = 0.089; excluding 10,397, ρ = −0.653, P = 0.063).

For chimpanzees (Pan troglodytes), we used synonymous variants from the reduced form of nicotinamide adenine dinucleotide dehydrogenase subunit 2 (ND2) locus and data from the control region (CR) from 16 individuals of the subspecies verus from western Africa (24). There are about 3800 bp between the two sequences. There were 45 polymorphic sites and 990 pairwise comparisons. Again, there was a significant decline in linkage disequilibrium as distance between pairs of sites increased (ρ = −0.147; P = 0.0076). The proportion of pairs of sites with LD values significantly different from zero, at the 5% level, was much higher within the ND2 and CR sequences (166 out of 593 pairwise comparisons) than between them (46 out of 396) (Fisher's exact test, P < 10⁻⁶).
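For readers who wish to check the within-versus-between comparison, a two-sided Fisher's exact test on the reported counts can be sketched with the standard library alone. This is an illustrative implementation, not the authors' code: it assumes the common two-sided convention of summing the probabilities of all tables no more likely than the observed one, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Sums hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Counts from the text: 166 of 593 significant pairs within ND2 and CR,
# 46 of 396 significant pairs between them.
p_value = fisher_exact_two_sided(166, 593 - 166, 46, 396 - 46)
```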

We have demonstrated that LD declines as a function of the distance between sites in humans and chimpanzees. This is expected if there is genetic recombination but hard to explain in any other way; sequencing errors would tend to obscure the effect, not generate it. The fact that we obtained significant evidence for recombination in five out of the six hominid data sets we analyzed makes it no longer reasonable to assert that mitochondria do not undergo genetic recombination, particularly because the test we used is not powerful even in a panmictic population (25) and patterns of LD with distance can easily be hidden by LD generated by population subdivision (26).

There are three possible routes by which recombination could occur: (i) paternal leakage, (ii) recombination with copies of mtDNA sequences in the nuclear genome, and (iii) recombination in heteroplasmic individuals. The last possibility is unlikely; not only would two mutations have to occur within one individual and be maintained in heteroplasmy until recombination had occurred, but all four haplotypes (that is, AB, Ab, aB, and ab) would have to have descendants in our sample for us to detect the recombination event. It is difficult to assess the relative likelihood of the other pathways; paternal leakage is known to occur (3, 4), although in mice, there are efficient mechanisms for destroying paternal mtDNA (4), and there are mtDNA pseudogenes in the nucleus (5).

Many inferences about the pattern and tempo of human evolution (1) and mtDNA evolution (27) have been based on the assumption of clonal inheritance. These inferences will now have to be reconsidered.

  • * To whom correspondence should be addressed. E-mail: a.c.eyre-walker{at}

