Targeted Retrieval and Analysis of Five Neandertal mtDNA Genomes

See allHide authors and affiliations

Science  17 Jul 2009:
Vol. 325, Issue 5938, pp. 318-321
DOI: 10.1126/science.1174462


Analysis of Neandertal DNA holds great potential for investigating the population history of this group of hominins, but progress has been limited due to the rarity of samples and damaged state of the DNA. We present a method of targeted ancient DNA sequence retrieval that greatly reduces sample destruction and sequencing demands and use this method to reconstruct the complete mitochondrial DNA (mtDNA) genomes of five Neandertals from across their geographic range. We find that mtDNA genetic diversity in Neandertals that lived 38,000 to 70,000 years ago was approximately one-third of that in contemporary modern humans. Together with analyses of mtDNA protein evolution, these data suggest that the long-term effective population size of Neandertals was smaller than that of modern humans and extant great apes.

Currently, DNA sequences determined from multiple Neandertals are restricted to short fragments [120 to 360 base pairs (bp)] of the hypervariable regions (HVRs) of mitochondrial DNA (mtDNA) (1, 2). These data have demonstrated that Neandertal mtDNA is distinct from that of modern humans (3). However, collecting sequence data from the rest of the mtDNA genome has proven difficult due to numerous technological difficulties (4). Recently, the complete mtDNA genome sequence of a ~38,000-year-old Neandertal individual from Vindija Cave, Croatia, was determined by high-throughput shotgun sequencing from total DNA extract (5). A similar approach has been used to recover complete mtDNA sequences from permafrost-preserved mammoths and a human (6, 7). However, the amount of shotgun sequencing needed to retrieve complete mtDNA sequences is prohibitive for most ancient bone specimens due to the high fraction of environmental DNA that they contain. For example, only 0.001% of DNA sequences determined from typical well-preserved Neandertal specimens are derived from mtDNA (table S1). Thus, a simple shotgun approach would require hundreds or thousands of high-throughput pyrosequencing runs to recover a single Neandertal mitochondrial genome (table S1). Direct polymerase chain reaction (PCR) is also poorly suited for retrieving complete Neandertal mtDNA genomes, because DNA extracted from the fossils is so fragmented that hundreds of overlapping amplicons would be necessary, either requiring highly multiplexed primer mixes that present severe difficulties for avoiding modern human contamination, or many parallel amplification reactions that consume large amounts of precious ancient DNA extracts (8).

We have developed a method—primer extension capture (PEC)—that directly isolates specific DNA sequences from complex libraries of highly degraded DNA (Fig. 1). PEC uses 5′-biotinylated oligonucleotide primers and a DNA polymerase to capture specific target sequences from an adaptor-ligated DNA library. It combines the high specificity of PCR primers with the numerous advantages of a library sequencing approach, including immortalization through reamplification from adaptor priming sites (9) (fig. S1), contamination control with project-specific barcodes (5, 10), access to very short fragments predominant in ancient extracts (11), and quantification of the number of unique ancient DNA molecules, which is necessary to identify nucleotide misincorporations (1012).

Fig. 1

Primer extension capture (PEC). (i) 5′-Biotinylated oligonucleotide primers (PEC primers) are added to a 454 library [in which the A and B adaptor molecules carry a project-specific barcode (10)] and are allowed to anneal to their respective target sequences. (ii) A single Taq DNA polymerase extension step is performed, resulting in a double-stranded association between primer and target that includes the 5′ adaptor sequence. (iii) Excess PEC primers are removed by spin column purification, and the biotinylated primer:target duplexes are captured by streptavidin-coated magnetic beads. The beads are washed stringently above the melting temperature of the PEC primers, to ensure that templates upon which extension occurred will preferentially remain associated with the primers. (iv) Captured and washed targets are eluted from the beads, amplified with adaptor priming sites, and subjected either to a second round of extension and capture or directly to 454 emulsion PCR (full details are in the SOM).

We used PEC to recover the entire Neandertal mtDNA genome [supporting online material (SOM)] of five individuals from four sites across the geographic range of Neandertals (Fig. 2 and table S2). One individual (Vindija 33.25) from Vindija Cave, Croatia, is undated but was found in an older stratigraphic layer than the previously sequenced bone (5) (Vindija 33.16), which was dated to ~38,000 years before present (yr B.P.) (3). Two individuals (Feldhofer 1 and 2, the former being the Neandertal type specimen) come from Kleine Feldhofer Grotte, Neander Valley, Germany, and are dated to ~40,000 yr B.P. (13). One individual (Sidron 1253) comes from El Sidron Cave, Spain, and is dated to ~39,000 yr B.P. (14), and one (Mezmaiskaya 1) comes from Mezmaiskaya Cave, Russia, and is dated to 60,000 to 70,000 yr B.P. (15).

Fig. 2

mtDNA phylogeny of Neandertals and their close relatives. The top left panel gives the localities and estimated ages of the analyzed Neandertals. The phylogeny was estimated with a Bayesian approach (30) under a GTR+I+Γ model of sequence evolution. Modern humans are represented by the revised Cambridge reference sequence and a collection of 53 sequences sampled from major language groups. Nodes marked with an asterisk (*) have a posterior probability of 1.0. For clarity, support for major partitions within modern humans is not shown. Estimates of dN/dS for 12 concatenated protein-coding genes are indicated below the branch.

A Look at the Method

Science is working with the Journal of Visualized Experiments to provide online videos documenting methods underlying selected papers. In this video, author Adrian Briggs demonstrates the primer extension capture (PEC) technique for directly isolating specific DNA sequences from complex libraries of highly degraded DNA.

Have comments on this feature? Please let us know in a quick online survey.

We generated between 170,330 and 521,680 sequence reads per individual on the 454 FLX platform and processed them with a mapping assembly program (SOM), taking the Vindija 33.16 mtDNA sequence (5) as a reference. Between 18.2 and 40.2% of sequences were identified as mitochondrial, representing a raw target enrichment of 3640- to 80,400-fold (table S1). Nontarget sequences were enriched in sequence motifs from the 3′ ends of PEC primers (fig. S2), suggesting the occurrence of some false priming events. All five individuals showed continuous coverage across the entire mtDNA genome with an average unique read depth of 18.0 to 56.3× (fig. S3). As reported previously for Neandertal mtDNA (5), average fragment lengths were short (51.3 to 79.3 bp, fig. S4), coverage was positively correlated with GC content (fig. S5), and bases differing from the consensus were dominated by C/G→T/A substitutions (fig. S6), consistent with deamination of cytosine (10, 11, 16) clustered toward the ends of fragments (10, 11) (fig. S7). However, the rate of C/G→T/A misincorporations varied among the fossils and was negatively correlated with average fragment length (fig. S8), likely reflecting differential hydrolytic damage suffered by DNA in different depositional environments. To determine if sequencing coverage in each assembly was sufficient to overcome misincorporations and other sequencing errors, we calculated the proportion of reads at each position that matched the assembly consensus base (fig. S9). Apart from a single ambiguous position (see fig. S10 for resolution of this position), support across the 82,865 aligned positions (five 16,565-bp mtDNA genomes) was high (mean = 98.5%, minimum 60.0%). At all 43 positions with support lower than 75%, we observed a minor base consistent with deamination of cytosine. Sites that differed from the Neandertal reference had coverage (mean = 32.0×) and support (mean = 98.7%) similar to those that matched the reference (mean coverage = 36.5×; mean support = 98.5%). None of these sites had support lower than 78% (table S3), and their locations were not correlated with PEC primer sites (all P > 0.50, Fisher’s exact test; table S4). These data suggest that the five consensus sequences are unlikely to have been affected by sequencing or base damage errors, because such errors would tend to create sequence differences that would be poorly supported and located in regions of low coverage.

We quantified the level of modern human DNA contamination in each of our assemblies by counting Neandertal versus contaminating fragments at all positions (124 to 130 positions per individual) where the Neandertal was found to differ from a worldwide panel of 311 aligned modern human mtDNA genomes (5). From 876 to 2969 fragments overlapping these positions per Neandertal assembly, we estimated that modern human contamination ranged from 0.2 to 1.4% (table S5). At these levels, contamination is highly unlikely to affect the determination of consensus sequences. Finally, the PEC-derived Neandertal mtDNA genomes were compared to PCR-amplified HVR sequences from the same individuals (13, 17) (SOM and fig. S11). Aligned sequences were identical between the two methods for every individual except Feldhofer 1, where some previously postulated errors for PCR-derived positions (13) were confirmed as such by PEC.

Comparison among the six complete Neandertal mtDNA genomes revealed a total of 55 variable positions across 16,565 aligned nucleotides (Table 1). On average, the six Neandertal mtDNAs differ by 20.4 substitutions. We contrasted Neandertal mtDNA diversity with variation among modern humans as represented by the revised Cambridge reference sequence (18) and a previously published worldwide sample of 53 individuals (19). Variation among the Neandertals was approximately one-third of that estimated for modern humans worldwide, approximately half that of individuals from non-African populations, and similar to that of the nine Europeans in this sample (Table 1). When compared to a broader survey of 30 modern Europeans (Table 1 and SOM), Neandertals had 37% lower mtDNA diversity. Because the Neandertal sequences stem from several distinct time points spanning thousands of years, we used coalescent simulations (20, 21) to assess the magnitude of any bias introduced by the temporal sampling and found that it may cause up to a ~20% overestimate of Neandertal genetic diversity (table S6) relative to a sample drawn from a single point in time. Thus, the observation that mtDNA diversity is lower in Neandertals than in modern humans appears to be conservative with respect to sampling over time.

Table 1

Mitochondrial DNA variation in Neandertals and modern humans.

View this table:

Estimation of phylogenetic relationships confirmed the reciprocally monophyletic status of Neandertal and modern human mtDNAs (Fig. 2). Among the six Neandertal mtDNAs, the sequence from Mezmaiskaya is more divergent from the other five (44.4 mean pairwise differences) than the latter are among themselves (8.4 mean pairwise differences). As observed previously (5), the Neandertal branch in the phylogeny appears shorter than that of modern humans. Simulations show that this asymmetry is well within the expected range of stochastic variation given that the Neandertal mtDNA sequences are at least 38,000 years old (fig. S12).

With a Bayesian approach (22) calibrated with the fossil ages of the five dated Neandertal sequences and a human-chimpanzee divergence of ~6 million years (see SOM), we estimated the time to the most recent common Neandertal mtDNA ancestor to be ~109,800 yr B.P. [84,630 to 138,500 yr B.P., 95% highest posterior density (HPD); table S7]. This is ~80% of the mean coalescent time estimated for modern human mtDNA with the same approach (mean time to the most recent common ancestor = 136,100 yr B.P.; 94,930 to 178,700, 95% HPD). Assuming that positive selection has not influenced Neandertal mtDNA variation, a recent coalescent time despite a ~38,000-year truncation of the Neandertal lineage suggests that the effective population size of Neandertals (Ne) was small and probably included fewer than 3500 females (mean Ne = 1476; 268 to 3510, 95% HPD). The recovery of identical mtDNA sequences from Vindija in Croatia and the Neander Valley in Germany, which are more than 850 km apart, supports the inference of a small effective population size in Neandertals. Given the temporal sampling and a constant effective size of 1500, the probability of sampling at least two identical mtDNA genomes among six is reasonably high (P = 0.25). In contrast, only one pair of identical sequences was identified among the 30 Europeans, as expected given the recent population expansion of modern humans (23).

The observation that the most divergent mtDNA genome was retrieved from the oldest and easternmost individual analyzed (Mezmaiskaya 1) raises the possibility that genetic structure existed between Neandertals from different geographic regions and/or time points (1). Unaccounted-for population structure within the sample of Neandertals could inflate the estimate of effective population size (24). We therefore used PEC to retrieve a partial mtDNA genome (13,568 bp, SOM) from a second, more recent individual from Mezmaiskaya Cave [Mezmaiskaya 2, ~41,000 yr B.P. (15); table S2]. This mtDNA sequence unambiguously groups with the five complete mtDNAs from the western sites (fig. S13). Thus, no obvious phylogeographic structure can be discerned among the seven mtDNAs analyzed. Analysis of additional individuals from different times and locations will be necessary to examine broader patterns of temporal and/or geographic variation in Neandertal genetic diversity (1).

Low mtDNA diversity may reflect a low effective population size of Neandertals over a large part of their ~400,000-year history. Alternatively, it might be typical only of late Neandertals, who may have experienced a reduction in population size due to direct or indirect influences from modern humans expanding out of Africa. Of relevance for this question is the observation that the first complete Neandertal mtDNA genome had a higher rate of protein evolution (measured as the ratio of per site amino acid replacement to per site silent substitutions, dN/dS) than humans and great apes (5). This pattern is consistent with a long-term reduction in the effective population size of Neandertals, because purifying selection is expected to be less effective at purging weakly deleterious mutations in small populations (25). We tested this by partitioning dN/dS into polymorphic and fixed differences along both the Neandertal and modern human lineages. dN/dS for fixed differences was about twice as high along the Neandertal lineage (0.168) as along the modern human lineage (0.082; Fig. 2) or the lineages of chimpanzees and bonobos (dN/dS < 0.08; table S8). Furthermore, a likelihood model (26) in which dN/dS is higher in Neandertals versus the other lineages best explains patterns of protein evolution across all groups (fig. S14). Although this difference appears pronounced, our estimate is based on relatively few amino acid substitutions overall, and lineage-specific variation in dN/dS provides only an indirect assessment of long-term effective population size. Nonetheless, patterns of mtDNA protein evolution are consistent with the notion that the effective population size of Neandertals was small not only among late Neandertals but over a longer period of their existence. Clearly, the recovery of nuclear sequences from multiple Neandertals will be necessary to arrive at a more complete picture of their population history. Encouragingly, PEC and other targeted resequencing approaches (2729) now make this feasible.

Supporting Online Material

Materials and methods

Figs. S1 to S14

Tables S1 to S8


Appendix 1

References and Notes

  1. We thank P. Johnson, M. Knapp, A.-S. Malaspinas, M. Meyer, J. Sullivan, M. Slatkin, and anonymous reviewers for comments; the Croatian Academy of Sciences and Arts and the Berlin-Brandenburg Academy of Sciences for logistic and scientific support; and the Presidential Innovation Fund of the Max Planck Society for financial support. The government of the Principado de Asturias funded excavations at the El Sidron site. C.L.-F. was supported by the Spanish Ministry of Education and Science and J.M.G. by an NSF international postdoctoral fellowship (OISE-0754461). Sequences are deposited at the EBI (European Bioinformatics Institute) nucleotide database with the following accession numbers: Neandertal 1 (Feldhofer 1), FM865407; Neandertal 2 (Feldhofer 2), FM865408; Sidron 1253, FM865409; Vindija 33.25, FM865410; Mezmaiskaya 1, FM865411.
View Abstract

Navigate This Article