Molecular Coproscopy: Dung and Diet of the Extinct Ground Sloth Nothrotheriops shastensis


Science  17 Jul 1998:
Vol. 281, Issue 5375, pp. 402-406
DOI: 10.1126/science.281.5375.402


DNA from excrements can be amplified by means of the polymerase chain reaction. However, this has not been possible with ancient feces. Cross-links between reducing sugars and amino groups were shown to exist in a Pleistocene coprolite from Gypsum Cave, Nevada. A chemical agent, N-phenacylthiazolium bromide, that cleaves such cross-links made it possible to amplify DNA sequences. Analyses of these DNA sequences showed that the coprolite is derived from an extinct sloth, presumably the Shasta ground sloth Nothrotheriops shastensis. Plant DNA sequences from seven groups of plants were identified in the coprolite. The plant assemblage that formed part of the sloth's diet exists today at elevations about 800 meters higher than the cave.

The polymerase chain reaction (PCR) has opened up new possibilities in ecology, archaeology, and paleontology by making it possible to retrieve DNA sequences from several previously untapped sources (1). One such source is feces, from which amplified DNA sequences allow identification of the species or individual from which the droppings originated as well as aspects of the diet and parasitic load of the animal (2). Large amounts of ancient feces (coprolites) can be found in certain dry caves or rock shelters, for example in the southwestern United States (3). To investigate whether it may be possible to amplify DNA from such material, we have analyzed coprolites from Gypsum Cave, about 30 km east of Las Vegas, Nevada. These have been attributed to the Shasta ground sloth, Nothrotheriops shastensis (4), which became extinct about 11,000 years ago (5). Initial experiments showed that extraction protocols developed for contemporary fecal material did not yield amplifiable DNA from the Gypsum Cave coprolites (6). Therefore, to investigate the general state of macromolecular preservation in the coprolites, we performed chemical analyses.

Several samples were removed from a large fecal bolus. One was dated by accelerator mass spectrometry to 19,875 ± 215 years (Ua11835). Another sample was subjected to pyrolysis gas chromatography–mass spectrometry (Py-GC-MS). The pyrolysate (Fig. 1A) was dominated by products related to polysaccharides (cellulose and hemicellulose) and lignin. Minor pyrolysis products derived from proteins were also observed. Although some oxidation of lignin is evident, the relative abundance of polysaccharide products, and especially of the hemicellulose derivative, suggests a relatively high degree of chemical preservation of ligno-cellulose (7). The large amount of syringol derivatives demonstrates that angiosperm lignin dominates in the coprolite, whereas the presence of vinylphenol can be attributed to monocotyledonous lignin.

Figure 1

(A) Total reconstructed ion chromatogram of the pyrolysate (610°C for 10 s) from the coprolite. (B) Partial desorption head space GC-MS total ion chromatogram for the volatile compounds within the coprolite. Reverse triangles, pyrolysis products of polysaccharides; triangle, hemicellulose marker; diamonds, products derived from amino acids; G, 2-methoxyphenol (guaiacol); S, 2,6-dimethoxyphenol (syringol), where side chains are attached at position 4 of the aromatic ring; P0, phenol; P1, methylphenol; P2, dimethylphenol; open circles, alkanones; closed circles, n-alkanes. Maillard reaction products (alkylated pyrazines, furaldehyde, and furanone derivatives) are indicated by chemical structures. Asterisks indicate products (acetophenone, benzoic acid, phenol, benzaldehyde, 2-methoxyphenol, naphthalene) that may be derived from plastic vials used for sample storage or that represent degradation products of lignin. Several additional pyrazines observed on the chromatogram are not labeled. The analytical conditions for Py-GC-MS are as described in (27). Conditions used for head space GC-MS are as described in (8). Note that the GC conditions of pyrolysis and head space analyses are different. Thus, there is no direct correlation between distribution and elution time of similar components in (A) and (B).

The volatile components of another sample of the bolus were analyzed by head space GC-MS (8) (Fig. 1B). Alkylpyrazines, furanones, and furaldehydes, all products of the condensation of carbonyl groups of reducing sugars with primary amines (the Maillard reaction) (9), were detected. Specifically, the alkylpyrazines are formed through condensation of dicarbonyl compounds with amino acids (9, 10), and the furan products are indicative of advanced stages of the Maillard reaction, which may result in extensive cross-linking of macromolecules (9) such as proteins (11) and nucleic acids (12). Maillard products have been suggested to be a prominent component in ancient DNA extracts (13) and were recently found in ancient Egyptian plant remains (8).

Because these analyses indicated the presence of Maillard products, N-phenacylthiazolium bromide (PTB), a reagent that cleaves glucose-derived protein cross-links (14), was added to the extraction mixtures to release DNA that might be trapped within sugar-derived condensation products. Extractions were done with and without the addition of PTB (15), and amplifications of a 153–base pair (bp) DNA segment of the mitochondrial 12S ribosomal RNA (rRNA) gene were attempted (16). From all extracts with added PTB, PCR products were observed, whereas no PCR products were observed in the absence of PTB (Fig. 2). Furthermore, higher concentrations of PTB resulted in slightly stronger PCR products.

Figure 2

Agarose gel electrophoresis of a 153-bp (with primers) mitochondrial amplification for the 12S rRNA gene. Lanes: 1 and 12, 1-kb ladder (Gibco BRL, Bethesda, Maryland); 2 and 11, PCR blanks; 3 and 4, 1 and 10 mM PTB added before organic extraction; 5 and 6, 1 and 10 mM PTB added after organic extraction but before silica purification; 7, extraction without PTB and without silica purification (note lack of primer dimers indicating inhibition of the enzyme); 8, extraction without PTB; 9, blank extraction with PTB; 10, blank extraction without PTB. See (15) for details. Numbers on side indicate sizes (base pairs) of marker bands.

Four sets of primers (6, 17) were used to determine the nucleotide sequence of a 572-bp segment of the 12S rRNA gene (16). The PCR products ranged in size from 153 to 273 bp and showed an inverse relationship of amplification strength to length typical of ancient DNA (18). The products were cloned and a minimum of 10 clones were sequenced (19). To ensure the authenticity of the sequence information to the greatest extent possible, a second extract was prepared (20) and the same four fragments were amplified, cloned, and sequenced. The sequence determined from the coprolite was aligned to homologous sequences from representatives of each family of extant edentates—two-toed and three-toed sloths, anteaters and armadillos, and one extinct ground sloth, Mylodon darwinii, previously sequenced from bones and soft tissue remains (6). The dung sequence differs at 41 to 55 positions from the extant tree sloths and the mylodont sloth and at 88 positions from the other edentates. Phylogenetic analyses (Fig. 3) (21) confirm that the DNA sequence derived from the Gypsum Cave coprolite belongs to a sloth, presumably N. shastensis (Megatheriidae), whose bones were found in the cave (4).
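The pairwise difference counts quoted above come from comparing aligned sequences position by position. A minimal sketch of such a count follows, assuming equal-length aligned strings and assuming that columns with a gap in either sequence are skipped; the report does not state how gaps were treated. The sequences are toy data, not the 12S rRNA alignment, and the names `count_differences`, `taxon_x`, and `taxon_y` are hypothetical.

```python
def count_differences(seq_a, seq_b, gap="-"):
    """Count mismatched columns between two aligned sequences.

    Assumption: columns where either sequence has a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and gap not in (a, b)
    )


# Toy alignment (illustrative only, not data from the coprolite).
alignment = {
    "coprolite": "ACGTTGCA-T",
    "taxon_x": "ACGATGCAAT",
    "taxon_y": "TCGATGGA-T",
}

query = alignment["coprolite"]
diffs = {
    name: count_differences(query, seq)
    for name, seq in alignment.items()
    if name != "coprolite"
}
print(diffs)  # {'taxon_x': 1, 'taxon_y': 3}
```

Counting gapped columns as differences instead would be an equally defensible convention and would change the totals.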

Figure 3

Phylogenetic reconstruction of 12S rDNA sequences from the coprolite, from representatives of modern edentates, namely an anteater (Tamandua tetradactyla), armadillo (Cabassous unicinctus), two-toed sloth (Choloepus didactylus), and three-toed sloth (Bradypus variegatus), and from the extinct ground sloth Mylodon darwinii. The anteater is used as an outgroup. Numbers on internal branches refer to quartet puzzling support values (21).

To analyze the diet of the Gypsum Cave ground sloth, we treated DNA extracts with PTB and amplified a 183-bp fragment of the chloroplast gene encoding the large subunit of ribulose-bisphosphate carboxylase (rbcL) (16). We cloned the PCR product and sequenced the inserts of several clones. Ancient DNA contains chemical modifications that may cause substitutions in the amplified sequences, but such errors are unlikely to occur at the same position in sequences that are amplified independently (22). Therefore, we repeated the amplification and cloning twice with a different extract. We compared each unique insert with the approximately 2300 rbcL sequences deposited in the GenBank database (release 1.4.11; 24 November 1997) by means of the BLAST program (23) and noted the order and family of GenBank sequences displaying zero, one, and two differences from the clone. We classified clones with three or more differences as unassigned. We sequenced clones until 18 consecutive clones resulted in no new family assignments. In total, we sequenced 72 clones (Fig. 4).
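The binning and stopping rule described above can be sketched in a few lines. This is one reading of the text, not the authors' procedure: a clone is binned by its smallest difference count to any database sequence (zero, one, or two; three or more is unassigned), and sequencing stops once a run of consecutive clones adds no new family. The function names and the example calls are hypothetical, and a window of 3 is used in the example only to keep it short; the report uses 18.

```python
def bin_by_differences(min_differences):
    """Return 0, 1, or 2 for close matches, or 'unassigned' for 3 or more."""
    if min_differences < 0:
        raise ValueError("difference count cannot be negative")
    return min_differences if min_differences <= 2 else "unassigned"


def clones_until_stop(family_calls, window=18):
    """Return how many clones were sequenced when the stopping rule is met.

    family_calls: family name per clone in sequencing order, or None for
    an unassigned clone. Returns None if the rule is never met.
    """
    seen = set()
    run_without_new = 0
    for n, family in enumerate(family_calls, start=1):
        if family is not None and family not in seen:
            seen.add(family)
            run_without_new = 0
        else:
            run_without_new += 1
        if run_without_new >= window:
            return n
    return None


# Hypothetical call sequence; the real clone order is not reproduced here.
calls = ["Poaceae", "Liliaceae", "Poaceae", None, "Liliaceae", "Poaceae"]
print(bin_by_differences(1))               # 1
print(bin_by_differences(4))               # unassigned
print(clones_until_stop(calls, window=3))  # 5
```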

Figure 4

DNA sequences of clones from a fragment of the rbcL gene amplified from the coprolite. Dots indicate identity with the sequence on top of the alignment. Ambiguous bases are indicated by standard abbreviations and dashes indicate deletions; × indicates clones that match data bank sequences with three or more differences; j indicates clones that represent putative jumping PCR events. Numbers in the left column represent clone number, extract number, and PCR number.

Fifteen clones were found to be identical to GenBank sequences from the families Capparaceae, Poaceae, Liliaceae, and Euphorbiaceae (Table 1). Eleven clones were found to differ at one position from sequences derived from those four families. In addition, two clones differed at one position from Rubiaceae and Chenopodiaceae sequences. Finally, seven and nine clones, respectively, displayed two differences from sequences belonging to the families Capparaceae and Liliaceae, and two clones displayed two differences from sequences in the orders Lamiales and Scrophulariales, which are closely related and sometimes regarded as one order (24). Whenever a group of related sequences was assigned to a taxon within a single family, we considered that family as the putative source of the clones. When a group of related clones matched species within different families belonging to one order, we considered that order as the putative source of the clones. Finally, when clones matched data bank sequences from different orders, we considered the plant as unassigned, most likely because no member of that order has yet been sequenced. With these criteria, 46 of 72 clones could be assigned to taxa in the database. Of the remaining 26 clones, most are closely related to the groups of sequences described above and carry substitutions observed in only one clone or in a few clones from one and the same amplification. These are likely due to damage in the template. Three clones appear to represent recombination products between other sequences in the sample and are therefore likely to be artifacts created during the PCR—a phenomenon previously observed in ancient DNA (25). Thus, analysis of the rbcL sequences suggests that the coprolite contains the remains of at least seven plants ingested by the Pleistocene ground sloth. Five of these can be assigned to an order or a family. 
These are, in order of frequency among the clones: capers and mustards (Capparales), lilies and allies (Liliaceae including Agavaceae), grasses (Poaceae), borages and mints (Lamiales/Scrophulariales), and saltbushes (Chenopodiaceae). Two plants cannot currently be identified. To investigate whether other fragments of the rbcL gene yield different plant identifications, we amplified three additional rbcL gene fragments that partially overlap with the initial fragment and we sequenced a total of 116 clones. The three taxa that were observed in three or more clones in the first fragment (Capparales, Poaceae, Liliaceae) were also identified from these clones, whereas the taxa observed in only one or two clones were not seen. Furthermore, the families Vitaceae (grape), Hydrophyllaceae (“water leaf family”), and Malvaceae (mallows) were identified, each from single clones. Thus, whereas the sequencing of additional clones may reveal plants that are rare in the bolus, the identification of plants whose DNA dominates quantitatively in the coprolite appears to be achieved reliably with the 183-bp fragment (Fig. 4). Nevertheless, it may be advisable to verify identification of the dominating plants by using an additional set of primers because a particular primer may select against a particular plant species as a result of substitutions in the primer sites.
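The three assignment criteria stated above (one family, several families within one order, or several orders) amount to a small decision rule. The sketch below is an illustration under that reading; `assign_taxon` and the placeholder labels such as `order_A` and `family_A1` are hypothetical and stand in for real database hits.

```python
def assign_taxon(matches):
    """Apply the family / order / unassigned rule to a group of clones.

    matches: list of (order, family) pairs for the database sequences
    matched by one group of related clones.
    """
    if not matches:
        return ("unassigned", None)
    orders = {order for order, _ in matches}
    families = {family for _, family in matches}
    if len(families) == 1:
        # All hits fall within a single family.
        return ("family", next(iter(families)))
    if len(orders) == 1:
        # Several families, but all within one order.
        return ("order", next(iter(orders)))
    # Hits span different orders.
    return ("unassigned", None)


print(assign_taxon([("order_A", "family_A1"), ("order_A", "family_A1")]))
# ('family', 'family_A1')
print(assign_taxon([("order_A", "family_A1"), ("order_A", "family_A2")]))
# ('order', 'order_A')
print(assign_taxon([("order_A", "family_A1"), ("order_B", "family_B1")]))
# ('unassigned', None)
```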

Table 1

Number of clones matching families in the database at zero, one, or two differences. Numbers in parentheses after family names indicate number of genera in that family matching the assigned clones in the database. Sequence cluster designations refer to Fig. 4. Final column gives the assigned family or order [nomenclature according to (28)]. NID, no sequence in database matches clones.


Table 2 indicates the six families and two orders of plants identified from the coprolite. Of these, Capparales and Liliaceae make up 24 and 19%, respectively, of the 188 rbcL clones sequenced; the other five are rare in the bolus. It is interesting to note that all taxa identified on the basis of the DNA sequences are represented by contemporary genera in the Southwest. However, two taxa [Liliaceae (yucca and agave) and Vitaceae] (Table 2) do not occur in the vicinity of the cave today (elevation about 580 m). Of these, Liliaceae, most likely represented by Yucca spp. and Agave spp., is common in the sample and is the likely source of the monocot signal observed in the Py-GC-MS analysis. These genera, as well as all other taxa observed in the sample, are now common in high-elevation desert scrub (above about 1370 m) on the Spring Range, about 50 km west of Gypsum Cave, and on the Las Vegas Range, about 30 km north-northeast of the cave. At 19,875 ± 215 years before present (B.P.), this sample dates to the last glacial maximum and it is reasonable to assume that yucca, now found only at higher altitudes, would then have been common around the cave. The recovery of DNA attributable to the grape family (Vitaceae) is also of interest. The only representative of this family in the Mojave Desert is the wild grape (Vitis), an obligate hydrophile that occurs around springs and streams. The closest known paleosprings of relevant age are in Las Vegas Valley about 20 km to the southwest. Alternatively, Las Vegas Wash, located only 10 km south of the cave, may have experienced perennial flow during the glacial maximum. Thus, the ground sloth may have visited water sources a substantial distance from the cave.
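Several of the figures in the text can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in the report: the assigned-clone counts for the first rbcL fragment, the two clone totals, and the two elevations. The variable names and the grouping labels are illustrative.

```python
# Assigned clones from the first rbcL fragment, as listed in the text.
first_fragment_assigned = {
    "identical to a database sequence": 15,
    "one difference (four families)": 11,
    "one difference (Rubiaceae, Chenopodiaceae)": 2,
    "two differences (Capparaceae)": 7,
    "two differences (Liliaceae)": 9,
    "two differences (Lamiales/Scrophulariales)": 2,
}
assigned_total = sum(first_fragment_assigned.values())

# Clones from the first fragment plus the three additional fragments.
total_clones = 72 + 116

# Lower limit of the present-day plant assemblage minus the cave elevation (m).
elevation_gap_m = 1370 - 580

print(assigned_total)   # 46
print(total_clones)     # 188
print(elevation_gap_m)  # 790
```

The 790 m gap is consistent with the "about 800 meters higher" given in the abstract, and the 46 assigned clones match the count stated for the first fragment.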

Table 2

Plant families and orders found by molecular and macroscopic evidence and possible genera based on the low-elevation, glacial-age paleoecological record from southern Nevada (29). A partial list of the flora present at Gypsum Cave and on the Spring Range is given. Molecular evidence is given as the number of clones assigned to that taxon (total = 97). Macroscopic evidence indicates the presence of Agave utahensis and Yucca brevifolia (1), Poaceae (2), Boraginaceae (Lithospermum type) (3), Atriplex confertifolia (4), Ephedra sp. (5). Dashes indicate samples not found.


A macroscopic examination of the same bolus identified five plant species (Table 2), one of which was not seen by the molecular analysis. Conversely, four of the taxa identified from the DNA sequences were not seen in the morphological analyses. Of the plants seen by only one approach, most are rare in the sample and thus may have been missed because of stochastic effects. However, the notable exception is capers and mustard, which is the most frequent taxon among the clones sequenced (24%); yet it is not seen in the macroscopic analysis. This may be a result of the lack of macroscopically distinctive remains left in the bolus. It is doubtful that organs other than seeds of capers and mustard species would retain their morphological attributes after mastication and digestion. Thus, the molecular evidence may be able to detect plants that are difficult to identify morphologically. A further advantage of the molecular approach is that it uses defined criteria for identification of plants that can be criticized and improved, especially as the DNA sequence data banks increase in size. For example, the taxonomic identification would probably become more accurate if relevant species from the area were sequenced. An additional advantage of the molecular approach to scatology is that DNA sequences can be used to identify the defecating species and to study its phylogeny.

DNA amplification from the coprolite was possible only after sugar-derived cross-links had been resolved by PTB. To date, 18 coprolites that range in age between 10,900 and 31,410 B.P. and are derived from various extinct species have been extracted with and without PTB. Whereas none of these has yielded amplification products when PTB was not used, six (from Gypsum Cave, Rampart Cave, and Steven's Cave in the southwestern United States and from Ultima Esperanza Cave in southern Chile) have yielded animal DNA sequences when treated with PTB (26). If the chemical state of preservation of biomolecules in fossils is better understood, reagents such as PTB can hopefully be used to make additional types of samples available for DNA amplification.

* Present address: Shell E & P Technology Company, 3737 Bellaire Boulevard, Houston, TX 77025, USA.

