Technical Comments

Comment on “Late Pleistocene human skeleton and mtDNA link Paleoamericans and modern Native Americans”


Science  20 Feb 2015:
Vol. 347, Issue 6224, p. 835
DOI: 10.1126/science.1260617

Abstract

Chatters et al. (Reports, 16 May 2014, p. 750) reported the retrieval of DNA sequences from a 12,000- to 13,000-year-old human tooth discovered in an underwater cave on Mexico’s Yucatan Peninsula. They propose that this ancient human individual’s mitochondrial DNA (mtDNA) belongs to haplogroup D1. However, our analysis of postmortem damage patterns finds no evidence for an ancient origin of these sequences.

Until recently, evidence of long-term DNA preservation has largely been confined to remains from temperate climate zones or the permafrost. The study of Chatters et al. (1) appeared almost simultaneously with another report of ancient DNA preservation in the tropics by Gutiérrez-García et al. (2), who amplified and sequenced short fragments of mitochondrial DNA from Holocene and Late Pleistocene rodent jawbones excavated at a terrestrial cave also located on the Yucatan Peninsula. The sequences they retrieved differ from all living representatives of the same genus, lending credibility to their results. Chatters et al. also used polymerase chain reaction (PCR) to amplify fragments of mitochondrial DNA and obtained sequences diagnostic for human mitochondrial haplogroup D1. The authors argue that the absence of this haplogroup in the members of the excavation team and laboratory personnel excludes the possibility that these sequences derive from contemporary contamination. However, human DNA is ubiquitous, and the absence of contamination cannot be proven. For example, Native American DNA may have entered the bone through contaminated equipment or may be present in reagents used in the laboratory. Human contamination can occur sporadically and may go undetected in control reactions (3).

Fortunately, the limitations of PCR-based approaches have been overcome in recent years. By sequencing DNA libraries on high-throughput platforms, full-length molecule sequences can now be reconstructed. These sequences have revealed a characteristic accumulation of C-to-T substitutions at the 5′ end as well as G-to-A substitutions at the 3′ end of ancient DNA sequences (and variation of these patterns depending on the exact protocol used for sample preparation) (4). These substitutions are caused by deamination of cytosine to uracil in single-stranded DNA overhangs and are largely absent in sequences from younger contamination, providing a means for authenticating ancient sequences (5). The exact frequencies of damage-induced substitutions cannot be predicted from fossil age alone (6). However, deamination is known to proceed faster at higher temperatures (7). Substitution rates in the Yucatan sample should therefore exceed the 10% or more that have been detected in remains of younger age (500 to 6000 years) preserved in the temperate climate of Middle Europe (6, 8).
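The damage signal described above can be made concrete with a minimal Python sketch. It is an illustration only, not the software used for Fig. 1A: the function name `end_substitution_rates`, the `window` parameter, and the input format (gap-free reference/read pairs given in the orientation of the sequenced molecule) are assumptions introduced here.

```python
from collections import Counter


def end_substitution_rates(pairs, window=5):
    """Estimate damage-type substitution frequencies near the read ends.

    pairs: iterable of (reference, read) strings of equal length,
           gap-free and in the orientation of the sequenced molecule
           (an assumed, simplified input format).
    Returns (c_to_t, g_to_a). c_to_t[i] is the fraction of reference C
    positions read as T at distance i from the 5' end; g_to_a[i] is the
    fraction of reference G positions read as A at distance i from the
    3' end. Positions with no informative sites are reported as 0.0.
    """
    ct_hits, ct_sites = Counter(), Counter()
    ga_hits, ga_sites = Counter(), Counter()
    for ref, read in pairs:
        n = len(ref)
        for i in range(min(window, n)):
            # Distance i from the 5' end: look for C in the reference.
            if ref[i] == "C":
                ct_sites[i] += 1
                if read[i] == "T":
                    ct_hits[i] += 1
            # Distance i from the 3' end: look for G in the reference.
            j = n - 1 - i
            if ref[j] == "G":
                ga_sites[i] += 1
                if read[j] == "A":
                    ga_hits[i] += 1
    c_to_t = [ct_hits[i] / ct_sites[i] if ct_sites[i] else 0.0
              for i in range(window)]
    g_to_a = [ga_hits[i] / ga_sites[i] if ga_sites[i] else 0.0
              for i in range(window)]
    return c_to_t, g_to_a
```

In authentic ancient DNA, both lists would be expected to be highest at index 0 and to fall off toward the interior of the molecule, whereas recent contamination should give values close to the sequencing error rate throughout.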

Surprisingly, even though the authors of the paper have generated some additional data using library-based techniques and high-throughput sequencing, they did not report any analysis of DNA damage patterns in their article. Upon our request, the authors made the primary data of their mitochondrial enrichments and sequencing experiment available (SRR1312068). After downloading the data, we trimmed sequencing adaptors [using AdapterRemoval (9)], aligned the reads to the revised Cambridge mitochondrial reference genome with the Burrows-Wheeler Alignment tool (10) [“ancient” parameters (11)], and merged reads with identical start and end coordinates to remove PCR duplicates (using “bam-rmdup,” available at https://github.com/udo-stenzel/biohazard). Counting all differences to the reference genome along the sequences (Fig. 1A), we detect an increase of G-to-A substitutions toward the 3′ end and a smaller and more irregular pattern of C-to-T changes at the 5′ end of the sequences. These patterns are consistent with the presence of authentic ancient DNA. The depression of 5′ C-to-T substitutions can be explained by the use of a polymerase for library amplification that cannot copy across uracils [note that G-to-A substitutions arise through fill-in of uracil-containing 5′ overhangs during library preparation with a polymerase that bypasses uracils (4)].
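The duplicate-removal step can be sketched as follows. This is a simplified illustration of the idea of merging reads with identical start and end coordinates; it is not the implementation of “bam-rmdup,” and the tuple layout, the use of strand as part of the grouping key, and the choice of the most frequent sequence as representative are all assumptions made here.

```python
from collections import Counter, defaultdict


def collapse_duplicates(reads):
    """Collapse reads that share start, end, and strand into one entry.

    reads: iterable of (start, end, strand, sequence) tuples
           (a hypothetical, simplified record format).
    Returns a dict mapping (start, end, strand) to
    (representative_sequence, copy_number), where the representative
    is the most frequently observed sequence in the group.
    """
    groups = defaultdict(Counter)
    for start, end, strand, seq in reads:
        groups[(start, end, strand)][seq] += 1

    collapsed = {}
    for key, seqs in groups.items():
        representative, _ = seqs.most_common(1)[0]
        collapsed[key] = (representative, sum(seqs.values()))
    return collapsed
```

Keeping the copy number alongside each collapsed sequence matters for the later analysis, because the number of duplicates per unique molecule is what separates the two populations of sequences.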

Fig. 1 Sequences with substitutions characteristic for ancient DNA do not support haplogroup D.

(A) Ancient DNA substitution patterns in mitochondrial capture data from Chatters et al. (B) Sequences overlapping two positions diagnostic for haplogroup D and one position diagnostic for haplogroup C. Sequence alignments follow the visualization scheme of “samtools tview” (15), which depicts sequences aligned in reverse complement direction by commas and small letters. Putatively damage-derived substitutions are thus visible as G-to-A and C-to-T substitutions.

We next investigated whether reads that show patterns of ancient DNA damage carry substitutions in support of haplogroup D. This is true for only three out of 14 sequences that overlap two informative sites [C5178A and T16362C (12)] and that contain at least one G-to-A substitution within the last 10 positions in the alignment (Fig. 1B). These three sequences are of the maximum read length of 100 base pairs and exhibit several substitutions at their 3′ end that may be caused by unrecognized adaptor sequence or by sequencing error. Further inspection shows that they likely originate from a single molecule and were present in a total of 633 copies before duplicate removal. Interestingly, the remaining 11 sequences, which do not support the presence of haplogroup D, are present in fewer than 10 copies each, and two of these sequences share a variant indicative of haplogroup C at a neighboring position (C16327T) (Fig. 1B). We conclude that none of the sequences with putative ancient DNA damage support haplogroup D.
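The filtering logic of this paragraph can be sketched in Python. The sketch assumes gap-free alignments given in alignment orientation with 1-based start coordinates; the function names and the input format are hypothetical and do not reproduce the actual analysis scripts.

```python
def has_terminal_g_to_a(ref, read, window=10):
    """True if the read shows at least one G-to-A difference to the
    reference within the last `window` aligned positions."""
    tail = zip(ref[-window:], read[-window:])
    return any(r == "G" and q == "A" for r, q in tail)


def allele_at(read, read_start, site):
    """Return the read base covering 1-based reference position `site`,
    or None if the read does not overlap it (gap-free alignment assumed)."""
    offset = site - read_start
    if 0 <= offset < len(read):
        return read[offset]
    return None


def tally_support(alignments, site, derived):
    """Count putatively damaged reads for and against a diagnostic allele.

    alignments: iterable of (read_start, ref, read) tuples.
    site, derived: 1-based position and derived base, e.g. 5178 and "A"
                   for the C5178A variant.
    Returns (support, against), counting only reads that overlap the
    site and carry a terminal G-to-A substitution.
    """
    support = against = 0
    for read_start, ref, read in alignments:
        if not has_terminal_g_to_a(ref, read):
            continue
        base = allele_at(read, read_start, site)
        if base is None:
            continue
        if base == derived:
            support += 1
        else:
            against += 1
    return support, against
```

A call such as `tally_support(alignments, 5178, "A")` would return the two counts for one diagnostic site. A tally of this kind says nothing about whether the supporting reads are independent molecules, which is why the copy numbers before duplicate removal also need to be inspected.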

A histogram of the number of replicate sequences representing each unique mitochondrial sequence shows a bimodal distribution, with one population of sequences being highly replicated (more than 10,000 copies) and another much less so (fewer than 100 copies) (see Fig. 2A). In the supplementary materials of the paper, the authors report cross-contamination of the library with a different sample that was extracted in their laboratory during the same week. The observed copy number difference suggests that this contamination was introduced after initial amplification of the library. Indeed, sequences carrying a substitution indicative of haplogroup D (C5178A and T16362C) have a significantly higher copy number compared with sequences specific for haplogroup C (C16327T, T3552A, A9545G, A13263G, and T14318C) (Wilcoxon rank sum test, P = 0.0001). We also find that the low-copy-number sequences show patterns of ancient DNA damage, whereas high-copy-number sequences do not (Fig. 2C). The former are also significantly shorter (Fig. 2B).
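For readers unfamiliar with the test, a comparison of copy numbers between two groups of sequences can be sketched with a pure-Python Wilcoxon rank sum (Mann-Whitney U) test. This version uses the normal approximation with a tie correction and no continuity correction; it is an assumption-laden illustration and will not necessarily reproduce the P value reported above, which was computed on the real data with software not specified here.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    x, y: sequences of numbers, e.g. copy numbers of reads supporting
          two different haplogroups.
    Returns (u_x, p_value), where u_x is the Mann-Whitney U statistic
    for sample x.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2

    # Assign average ranks to tied values and accumulate the tie term.
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a shift
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area
    return u_x, p_value
```

Because the test uses only ranks, it is insensitive to the several orders of magnitude that separate the two copy-number populations, which makes it a reasonable choice for data as skewed as these.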

Fig. 2 Two distinct populations of sequences with different characteristics are present.

(A) Histogram of copy numbers of sequences and density plot for copy numbers supporting haplogroup C and D. Copy number is given on a log scale, and density is plotted relative to log10(copy number). (B) Scatter plot of copy number of sequences against read length. Values are significantly positively correlated (Spearman rank correlation, rho = 0.366, P < 2.2 × 10^−16). (C) Ancient DNA substitution frequencies in high-copy versus low-copy sequences.

Taken together, our analyses suggest at least two contamination events: first, contamination of the bone or sampling equipment with Native American DNA, which was amplified by PCR and recovered in the DNA library; and second, cross-contamination with library molecules from a different sample, which is indeed ancient and carries haplogroup C. In conclusion, we find no evidence for the preservation of authentic ancient sequences in the Yucatan sample. Instead, our results reinforce the notion that PCR-based approaches provide little power for verifying the authenticity of sequences obtained from ancient human samples (13). Sequence data generated with library-based approaches can provide positive evidence for the authenticity of ancient DNA if cross-contamination of libraries is avoided (14). Importantly, as demonstrated here, such data also allow for critical evaluation of published results by independent researchers. Thus, in our opinion, PCR-based methods should be abandoned when studying ancient human remains.

