Genomic exploration of the diversity, ecology, and evolution of the archaeal domain of life


Science  11 Aug 2017:
Vol. 357, Issue 6351, eaaf3883
DOI: 10.1126/science.aaf3883

Archaeal diversity and evolution

Archaea are prokaryotes that make up a third branch of the tree of life. Knowledge of archaeal biological diversity and their role in evolution has rapidly expanded in the past decade. Despite the discovery of previously unknown groups and lineages, few lineages have been well studied. Spang et al. review the diversity of Archaea and their genomes, metabolomes, and history, which clarifies the biology and placement of recently discovered archaeal lineages.

Science, this issue p. eaaf3883

Structured Abstract


Archaea have been recognized as a major domain of life besides Bacteria and Eukarya for about 40 years. Much of the pioneering research on archaeal organisms was dedicated to studying cellular machineries governing basal cellular processes such as transcription and translation and to elucidating physiological characteristics of the few archaeal strains that could be cultured. Although environmental surveys had started to reveal a plethora of uncultured archaeal lineages during the 1990s, in-depth knowledge about their diversity, ecology, and evolution remained limited. Recent developments in the field of metagenomics and single-cell genomics, however, have allowed the reconstruction of archaeal genomes directly from environmental samples without prior cultivation. Altogether, the use of these cultivation-independent approaches has led to the discovery of a multitude of previously unidentified archaeal lineages, some of which represent completely new branches in the archaeal tree. In this Review, we provide an overview of the currently recognized archaeal diversity, summarize new findings on the metabolic potential of recently described archaeal lineages, and discuss these data in light of archaeal evolution.


Up until about a decade ago, all archaea were assigned to one of two major clades: the Crenarchaeota, which mostly comprise extreme thermophiles, or the Euryarchaeota, which mainly include methanogens and halophiles. Since then, our knowledge of archaeal diversity has increased rapidly. Now, Archaea include at least four major supergroups, the Euryarchaeota and the TACK, Asgard, and DPANN archaea, all of which comprise several different, potentially phylum-rank clades (see the figure). The lineages of these groups are not restricted to extreme habitats, as was once thought common for archaeal species; rather, archaea are widespread and occur in all conceivable environments on Earth, where they can make up substantial portions of the microbial biomass.

In line with their vast diversity, comparative genomics analyses reveal that Archaea are metabolically versatile and are characterized by different lifestyles. Recently discovered archaeal lineages include mesophiles and (hyper-)thermophiles, anaerobes and aerobes, autotrophs and heterotrophs, a large diversity of putative archaeal symbionts, as well as previously unknown acetogens and different groups of methanogens (see the figure). In fact, both the genetic potential to use the ancient Wood-Ljungdahl carbon fixation pathway and indications for methanogenic traits as well as the ability to anaerobically oxidize methane and other short hydrocarbons have now been found in various lineages outside the Euryarchaeota—the phylum comprising traditional methanogens and methane oxidizers. So far, little is known about the actual physiology of these archaea in their environmental niches or about their potential syntrophic relationships with other organisms, but recent findings highlight the importance and wide occurrence of these metabolic regimes in a wide diversity of archaea from anaerobic environments. Furthermore, these findings support hypotheses that suggest that all extant archaea evolved from an anaerobic autotrophic ancestor that used the Wood-Ljungdahl pathway and may have been able to obtain energy through methanogenesis.

Last, the investigation of informational processing and cellular machineries has revealed that genomes of Asgard archaea, which affiliate with eukaryotes in the tree of life (see the figure), encode proteins that they only share with eukaryotes. Excitingly, these proteins are functionally enriched for membrane bending, vesicular biogenesis, and trafficking activities, suggesting that eukaryotes evolved from an archaeal host that contained some key components that governed the emergence of eukaryotic cellular complexity after endosymbiosis. It will be exciting to determine the function of these proteins, which may be involved in species-species interactions in extant members of the Asgard archaea.


The wealth of novel archaeal genomic data represents a treasure trove for generating testable hypotheses on the metabolic and cellular features of these archaea and will thus help to unveil their thus far poorly characterized biology. Recent findings emphasize the importance of investigating members of the archaeal domain of life in order to obtain a more comprehensive view of microbial ecology, symbiosis, and metabolic interdependencies involving archaeal partners, and of evolution of life on Earth in regard to the deep roots of archaea as well as our microbial ancestry.

The archaeal tree of life.

Schematic phylogenetic tree including major archaeal groups for which genomic data are currently available. The currently recognized groups include the Euryarchaeota (green branches) and the TACK, Asgard, and DPANN archaea (tan, yellow, and pink branches, respectively). Eukaryotes (gray branch) are suggested to have emerged from the Asgard archaea upon endosymbiosis with an alphaproteobacterial partner (the mitochondrial endosymbiont). For each archaeal lineage, key characteristics regarding metabolism and lifestyle are depicted.


About 40 years ago, Archaea were recognized as a major prokaryotic domain of life besides Bacteria. Recently, cultivation-independent sequencing methods have produced a wealth of genomic data for previously unidentified archaeal lineages, several of which appear to represent newly revealed branches in the tree of life. Analyses of some recently obtained genomes have uncovered previously unknown metabolic traits and provided insights into the evolution of archaea and their relationship to eukaryotes. On the basis of our current understanding, much archaeal diversity still defies genomic exploration. Efforts to obtain and study genomes and enrichment cultures of uncultivated microbial lineages will likely further expand our knowledge about archaeal phylogenetic and metabolic diversity and their cell biology and ecological function.

This year will mark the 40th anniversary of the publication of Woese and Fox’s article in which Archaea were described as a primary division of life, on a par with Bacteria and Eukarya (1). Before this landmark study, all life had been assigned to either of two major groups of organisms, the eukaryotes or prokaryotes. Whereas eukaryotes contain internal compartments and a nucleus, prokaryotes included organisms without intracellular structures. On the basis of comparisons of the small subunit of the ribosomal RNA of a wide range of eukaryotes and prokaryotes, Woese and Fox documented that the latter was composed of two fundamentally different types of organisms and suggested a tripartite tree of life that included three major domains: the Bacteria, Archaea, and Eukarya (2). However, this proposal was not immediately accepted by the scientific community. First, it clashed with the then-accepted eukaryote-prokaryote dichotomy. Second, it was based on results obtained by molecular sequence data alone, which contrasted with established approaches to infer taxonomy by using morphological traits (3). The construction of phylogenetic trees was still in its infancy and primarily focused on protein sequence comparisons (4). Last, most of the identified Archaea were sampled from extreme environments [reviewed in (5)], leading to the presumption that Archaea just represented an extreme group of bacteria with minor evolutionary relevance.

However, further evidence for the distinctiveness of Archaea has subsequently been obtained. For example, archaeal cell walls lack peptidoglycan and are thus fundamentally different from those of Bacteria (6). Furthermore, archaeal DNA-dependent RNA polymerase was found to resemble that of eukaryotes (7). Also, it was found that in contrast to bacterial membranes, archaeal membranes comprised isoprene-based lipids attached to glycerol-1-phosphate via ether bonds (8). All these findings yielded mounting support for the uniqueness of Archaea, whose independent evolutionary status was widely accepted in 1996, when the first archaeal genome was published (9).

Yet, the study of the diversity and ecological relevance of the archaea was hampered by the lack of archaeal organisms in culture. This changed with the establishment of methods for the recovery of small-subunit ribosomal RNA (rRNA) sequence data directly from environmental samples, which revolutionized our understanding of microbial diversity and revealed that Archaea are present in most environments on Earth [reviewed in (10)]. Clearly, the previous notion that members of the Archaea are predominantly extremophiles had to be abandoned. More recently, the rise of cultivation-independent genomics approaches, such as metagenomics and single-cell genomics, has enabled the generation of genomic data from the uncultivated microbial majority (Fig. 1) (11, 12). Genome-resolved metagenomics has enabled the reconstruction of near-complete microbial genomes (also called “genomic bins”) from complex environmental samples, including many previously unknown archaeal genomes (13, 14). Here, we review how the use of these cultivation-independent genomics approaches has expanded our knowledge of the diversity, ecology, and evolution of Archaea.

Fig. 1 16S rRNA tree revealing the archaeal diversity.

What archaea are out there? This phylogenetic tree, based on an updated 16S rRNA gene data set (33), shows the currently recognized archaeal diversity and emphasizes our ignorance regarding the physiology, ecology, and evolution of the archaeal domain of life. Whereas genomes (shaded in dark teal) and metagenomes (shaded in pink) have recently been obtained for many of these archaeal groups, various archaeal lineages are represented only by 16S rRNA sequence data (light teal). Cultured representatives exist for only very few archaeal groups, and information regarding metabolic and physiological properties is also sparse. Sequences were retrieved from the National Center for Biotechnology Information nr database, aligned with MAFFT-LINSi (96), trimmed with trimal (97), and analyzed by using IQ-TREE (98). Branch support values above 85 are indicated by dots. Scale bar indicates the number of substitutions per site.

The expanding archaeal tree

The recent discovery of many archaeal lineages has changed the shape of the phylogenetic tree of Archaea and expanded our knowledge of their diversity. For over a decade (1990–2002), Euryarchaeota and Crenarchaeota were the only recognized archaeal phyla (Fig. 2A). Therefore, the majority of 16S rRNA gene sequences detected in environmental surveys were initially assigned to either of these two phyla (2). Between 2002 and 2011, however, several new archaeal phyla were proposed on the basis of phylogenetic and genomic analyses: Korarchaeota, Nanoarchaeota, and the ammonia-oxidizing Thaumarchaeota (Fig. 2B) (15–17). The archaeal candidate phylum Aigarchaeota was proposed in 2011 (18) and, together with Thaum-, Cren-, and Korarchaeota, comprises the archaeal “TACK” superphylum (or “Proteoarchaeota”) (Fig. 2B) (19, 20). During the past 5 years, our knowledge of archaeal diversity has increased further (Fig. 2C). Now, TACK archaea comprise three additional archaeal lineages of high taxonomic rank: Geoarchaeota (21, 22), Bathyarchaeota (23–25), and Verstraetearchaeota (26). Furthermore, two new archaeal superphyla, the Asgard superphylum (13, 27, 28) and the DPANN superphylum, were proposed (Fig. 2C and Box 1) (14, 29). Currently described members of the Asgard archaea belong to the phyla Loki-, Thor-, Odin-, and Heimdallarchaeota. DPANN archaea comprise at least nine different archaeal phylum-level clades (14), including the previously characterized Nanoarchaeota (16). Yet, it is debated whether all proposed DPANN lineages are monophyletic and form a deep-branching archaeal superphylum (20, 30) and whether Altiarchaea are part of the DPANN (31, 32). Last, various lineages affiliating with Euryarchaeota have been discovered (Fig. 1) (33), and genomic data have now been obtained for several of these (Figs. 2C and 3).

Fig. 2 The expanding archaeal diversity.

Schematic depiction of how the shape of the archaeal tree of life has expanded over the years, revealing the major impact of cultivation-independent genomics during the past 5 years. (A) Between 1990 and 2002, only two archaeal phyla were known (Euryarchaeota and Crenarchaeota). (B) Additional phyla were identified between 2002 and 2011, but the phylum-status of Nanoarchaeota was still controversial. (C) Since 2011, various additional archaeal lineages of high taxonomic rank have been discovered, propelled by advances in sequencing technologies and the use of metagenomic and single-cell genomic techniques.

Box 1

Glossary box

  • AAG: Archaea of the ancient archaeal group. Heimdallarchaeota include lineages previously assigned to AAG.

  • Amo: Ammonia monooxygenase (Box 2).

  • ANME: A paraphyletic group of anaerobic methane-oxidizing euryarchaeota (Box 2).

  • AOA: Ammonia-oxidizing archaea (Box 2).

  • AOM: Anaerobic oxidation of methane (Box 2).

  • ARMAN: Archaeal Richmond Mine acidophilic nanoorganisms (Fig. 1).

  • Autotrophy: Ability of an organism to produce complex organic compounds from inorganic compounds such as CO2.

  • Chemolithotrophy: Ability to conserve energy from the oxidation of inorganic compounds.

  • Chemoorganotrophy: Use of organic substrates for energy conservation.

  • Clade: Group of organisms that includes a common ancestor and all its descendants.

  • Contigs: Contiguous stretches of DNA of different length obtained upon assembly of sequence reads. These generally represent subregions of the genomic DNA of an organism.

  • DHVE: Refers to deep-sea hydrothermal vent euryarchaeota (Fig. 1).

  • DPANN: A proposed archaeal superphylum comprising Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanohaloarchaeota, and Nanoarchaeota. More recently, additional candidate phyla such as Woesearchaeota, Pacearchaeota, and Micrarchaeota and possibly the Altiarchaeota were suggested to be part of this group.

  • DSEG: Refers to deep-sea euryarchaeotal group (Fig. 1).

  • Environmental surveys: Profiling of microbial community composition via isolation of total nucleic acids from environmental samples and subsequent amplification, sequencing, and analysis of 16S and 18S rRNA genes. Amplification is based on polymerase chain reactions by using primer sequences that target conserved regions of 16S/18S rRNA genes.

  • Genomic bin: Composite genome comprising metagenomic contigs that are inferred to be derived from the same genome or species; also referred to as metagenome-assembled genome (MAG).

  • Halophile: An organism that is dependent on high sodium chloride concentrations for growth and survival. Adaptation to salt content varies between organisms (ranges from 0.3 to more than 5 M).

  • Heterotrophy: The requirement of complex organic compounds as source for cellular carbon.

  • Horizontal gene transfer (HGT): The lateral acquisition of genetic material from cells other than the parent cell, as opposed to vertical transmission from parent to offspring.

  • Hyperthermophile: An organism that is adapted to temperatures above 80°C.

  • LACA: Last archaeal common ancestor.

  • MBG-B: Archaea of the marine benthic group B (also DSAG), now known as Lokiarchaeota (Fig. 1).

  • MBG-D: Archaea of the marine benthic group D (Fig. 1).

  • Mcr: Methyl–coenzyme M reductase (Box 2).

  • MG-II and MG-III: Two distinct marine groups belonging to the Euryarchaeota (Fig. 1). MG-II archaea are now referred to as Thalassoarchaea.

  • MCG: Archaea of the Miscellaneous Crenarchaeotal Group, now referred to as Bathyarchaeota (Fig. 1).

  • Metagenomic binning: The clustering of metagenomics contigs into taxonomic bins based on similarities in nucleotide composition profiles and/or read coverage.

  • MSBL1: Archaea of the Mediterranean Sea Brine Lakes 1 group (Fig. 1).

  • Mesophile: An organism that thrives best at moderate temperatures between 20° and 45°C.

  • Metagenomics: The generation of DNA sequence data directly from environmental samples.

  • Methane seep: Locations at the seafloor that are characterized by the release of hydrocarbons dominated by methane.

  • Methylotrophy: A metabolic strategy in which organisms use organic compounds without C–C bonds as electron donors and carbon sources.

  • Monophyletic group: A set of organisms, which forms a clade and thus shares a common ancestor and includes all of its descendants.

  • Paraphyletic group: A set of organisms that shares a common ancestor but does not include all of that ancestor's descendant lineages and therefore does not form a clade.

  • SA1: Refers to a group of euryarchaeota, originally detected in brine-seawater interface samples from Shaban Deep, Red Sea.

  • TACK: A proposed archaeal superphylum comprising Thaumarchaeota, Aigarchaeota, Crenarchaeota, and Korarchaeota, in addition to the tentative phyla Geoarchaeota, Bathyarchaeota, and Verstraetearchaeota. TACK is also referred to as Proteoarchaeota.

  • Thermophile: Organisms that thrive best at temperatures between 45° and 80°C.

  • TMCG: Archaea from the terrestrial miscellaneous crenarchaeota group, now also referred to as Verstraetearchaeota (Fig. 1).

  • SAGMEG: Refers to the South-African Gold Mine Miscellaneous Euryarchaeal Group, now named Hadesarchaea (Fig. 1).

  • Syntrophy: A metabolic process in which the degradation of a substrate becomes energetically favorable through the cooperation of two different organisms: a product generated by one partner is consumed by the other partner.

  • WLP: Wood-Ljungdahl carbon fixation pathway (Box 2).

  • WSA2/Arc1: A group of archaea first identified in a clone library (“WSA”) from a methanogenic sulfate-reducing core (Fig. 1) and later detected in an anaerobic sludge digester and described as Arc1; now referred to as Methanofastidiosa.
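The glossary entry on metagenomic binning describes clustering contigs by nucleotide composition profiles and/or read coverage. The two signals can be illustrated with a minimal Python sketch. It is not the method of any specific binning tool: the function names, the tetranucleotide length `k=4`, and the thresholds `max_kmer_dist` and `max_cov_ratio` are all hypothetical choices made for illustration, and real binners typically add reverse-complement handling and proper clustering.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=4):
    """Normalized k-mer frequencies of a contig, in a fixed k-mer order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only unambiguous k-mers are counted; windows containing N are ignored.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def profile_distance(p, q):
    """Euclidean distance between two composition profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def same_bin(contig_a, contig_b, max_kmer_dist=0.05, max_cov_ratio=1.5):
    """Decide whether two contigs plausibly derive from the same genome.

    Each contig is a (sequence, mean_read_coverage) tuple. The thresholds
    are illustrative assumptions, not values from the Review.
    """
    (seq_a, cov_a), (seq_b, cov_b) = contig_a, contig_b
    if min(cov_a, cov_b) <= 0:
        return False
    dist = profile_distance(kmer_profile(seq_a), kmer_profile(seq_b))
    ratio = max(cov_a, cov_b) / min(cov_a, cov_b)
    return dist <= max_kmer_dist and ratio <= max_cov_ratio
```

Two contigs are grouped only when both composition and coverage agree, which mirrors the "and/or" in the definition in its strictest form; relaxing either threshold trades bin completeness against contamination.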

Box 2

Definitions and terminology for archaeal key metabolic strategies discussed in this review.

Metagenomic approaches have revealed that archaea are important players in the nitrogen and carbon cycles because they are capable of ammonia oxidation, methanogenesis, and methane oxidation. In the following, we introduce the key enzymes and terminology associated with these metabolic regimes as well as the Wood-Ljungdahl carbon fixation pathway (WLP) in archaea.

Aerobic ammonia oxidation is the oxidation of ammonia to nitrite and represents the first step of nitrification, a fundamental part of the nitrogen cycle [reviewed in (37)]. The key enzyme for the oxidation of ammonia is ammonia monooxygenase (Amo). A specific group of archaea, the ammonia-oxidizing archaea (AOA), which belong to the Thaumarchaeota, as well as several groups of ammonia-oxidizing bacteria (AOB), encode amo genes and are capable of aerobic ammonia oxidation.

Methanogenesis and anaerobic oxidation of methane (AOM) are the production or consumption, respectively, of methane with simultaneous energy conservation. These processes are globally important and specific to archaea (35, 100). Although most methanogens can produce methane by the reduction of carbon dioxide with hydrogen or formate, some can use alternative methylated substrates. However, about two-thirds of all methane is generated through acetoclastic methanogenesis, in which methane is generated from the splitting of acetate (100). Current estimates of AOM flux indicate that 7 to 25% of the global methane emissions may be consumed by “Anaerobic Methane Oxidizing Euryarchaeota” (ANME) through a reversal of the methanogenesis pathway, which also includes the key enzymatic complex referred to as methyl–coenzyme M reductase (Mcr) (39). Therefore, methanogens and ANME share several key metabolic enzymes, which may trace back to the earliest nodes in the tree of archaea (84).

The WLP [or acetyl–coenzyme A (CoA) pathway] represents a means for the fixation of two molecules of carbon dioxide to acetyl-CoA and/or acetate with the simultaneous generation of energy (101). Two molecules of CO2 are converted to a cofactor-bound methyl group and CO and subsequently combined to generate acetyl-CoA by the key enzyme complex carbon monoxide dehydrogenase/acetyl-CoA synthase (CODH/ACS) (102). The WLP has been suggested to represent the oldest carbon fixation pathway on Earth (103) and is used for the generation of acetate in acetogenic bacteria (101). In methanogenic archaea, the archaeal version of the WLP is used for autotrophic growth and methanogenesis by using acetate as substrate.
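As a compact reference for the metabolic regimes defined in this box, the net reactions can be written out. These are standard textbook stoichiometries added for orientation; they are not taken from the Review, and they omit cofactors, intermediates, and energetics.

```latex
% Hydrogenotrophic methanogenesis (CO2 reduction with H2)
\mathrm{CO_2 + 4\,H_2 \;\rightarrow\; CH_4 + 2\,H_2O}

% Acetoclastic methanogenesis (splitting of acetate)
\mathrm{CH_3COOH \;\rightarrow\; CH_4 + CO_2}

% Methyl-reducing methanogenesis (example: methanol with H2)
\mathrm{CH_3OH + H_2 \;\rightarrow\; CH_4 + H_2O}

% Anaerobic oxidation of methane coupled to sulfate reduction
\mathrm{CH_4 + SO_4^{2-} \;\rightarrow\; HCO_3^- + HS^- + H_2O}

% Acetogenesis via the Wood-Ljungdahl pathway (two CO2 fixed per acetate)
\mathrm{2\,CO_2 + 4\,H_2 \;\rightarrow\; CH_3COOH + 2\,H_2O}
```

The last reaction shows the two CO2 molecules entering the WLP, one ending up as the methyl group and one as the carboxyl group of acetate.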

Fig. 3 The genomic tree of Archaea and discussed key characteristics.

This phylogenetic tree shows the current diversity of archaea for which genomes (including single-cell and metagenome-assembled genomes) have been obtained. Furthermore, on the outer ring, this figure highlights key characteristics of Archaea (discussed in this Review) that are important for understanding the evolution of the physiology and metabolism in this domain of life. The tree is based on a maximum-likelihood analysis [IQ-TREE (98) with LG+C60+F, 1000 ultrafast bootstraps] of a concatenated alignment of at least 11 of 15 ribosomal proteins (14) with 2611 positions [MAFFT-LINSi (96), trimal (97)]. The final tree was rooted with the DPANN archaea. Additionally, (1) a strictly anaerobic haloarchaeon has recently been enriched; (2) Thermoplasmatales grow aerobically, but some members lack heme/copper-type cytochrome/quinol oxidases (99); and (3) one representative of Bathyarchaeota encodes reverse gyrase. Branch support values above 90 are indicated by dots. Scale bar indicates number of substitutions per site.
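The caption describes a concatenated alignment in which each genome contributes at least 11 of 15 ribosomal proteins. A minimal Python sketch of that bookkeeping step follows, assuming the per-protein alignments are already computed and trimmed. The function name, the dictionary layout, and the choice to pad missing markers with gap characters are assumptions for illustration; the caption does not specify how the authors handled missing proteins.

```python
def concatenate_markers(alignments, min_markers=11):
    """Concatenate per-marker alignments into one supermatrix.

    alignments: dict mapping marker name -> dict mapping taxon -> aligned
    sequence (all sequences within one marker must have equal length).
    Taxa with fewer than `min_markers` markers are dropped; markers missing
    from a retained taxon are padded with '-' (an illustrative assumption).
    """
    marker_lengths = {}
    for marker, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"marker {marker!r} is empty or not aligned")
        marker_lengths[marker] = lengths.pop()

    markers = sorted(alignments)  # fixed order so columns line up across taxa
    taxa = set().union(*(aln.keys() for aln in alignments.values()))

    supermatrix = {}
    for taxon in sorted(taxa):
        present = sum(1 for m in markers if taxon in alignments[m])
        if present < min_markers:
            continue  # too few markers: exclude this genome from the tree
        supermatrix[taxon] = "".join(
            alignments[m].get(taxon, "-" * marker_lengths[m]) for m in markers
        )
    return supermatrix
```

With 15 marker alignments as input and the default `min_markers=11`, the returned dictionary corresponds to the kind of matrix the caption describes; incomplete single-cell or metagenome-assembled genomes can still be placed in the tree as long as they pass the threshold.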

The current taxonomy of recognized archaeal groups is inconsistent because of the lack of clear genome-based guidelines for the classification of microorganisms (34). Future efforts that use well-defined criteria based on phylogenomic and comparative genomic frameworks will help to establish a robust taxonomy of the archaeal domain of life.

Metabolic potential of previously unidentified archaeal lineages

The availability of genomic data of uncultivated archaea has resulted in groundbreaking discoveries that provide a new perspective on the diversity of metabolic traits in Archaea. In particular, our knowledge on the role of archaea in the cycling of nitrogen and carbon has been extended (35). For instance, pioneering metagenomics studies revealed that Thaumarchaeota are key players of the global nitrogen cycle (Box 2) (36, 37). More recently, genome analyses of diverse uncultivated archaeal lineages (Fig. 1) [reviewed in (38)] have provided a first glimpse into the metabolic diversity of archaea and revealed among others the presence of genes involved in methanogenesis and/or anaerobic oxidation of methane (AOM) and the Wood-Ljungdahl carbon fixation pathway (WLP) (Box 2).

ANME and Syntrophoarchaea

ANME lineages (Box 2) grow on methane through reverse methanogenesis and comprise at least three distinct paraphyletic lineages (39). They have been found in a wide range of anoxic habitats (including methane seeps, hydrothermal vents, and marine water columns) but are particularly widespread in the sulfate-methane transition zone (SMTZ), which marks the transition between upper sulfate-rich and deeper methane-rich sediment layers. Because AOM yields very little energy, ANME often form syntrophic consortia with bacterial partners that serve as electron sinks and allow the coupling of methane oxidation to the reduction of sulfate (40), nitrate (41), or manganese and iron (42). Although the coupling of AOM to sulfate reduction may be mediated by direct electron transfer (43, 44), members of the ANME-2d (such as Methanoperedens) transfer nitrite to a bacterial partner during AOM coupled to nitrate reduction (45). Although ANME are slow-growing and challenging to cultivate, cultivation-independent approaches have provided insights into their metabolic potential (46–49). The enrichment and subsequent genome sequencing of organisms closely related to ANME-1 archaea, referred to as Syntrophoarchaea (Fig. 3), has shown that these archaea oxidize butane rather than methane (50). Butane oxidation is enabled through the channeling of electrons to sulfate-reducing partner bacteria and involves genes coding for the key enzyme of methanogenesis/methane oxidation (butyl coenzyme M reductase) (Fig. 4 and Box 2). This indicates that coenzyme M reductases may have a broader substrate spectrum than was assumed previously and that the mere presence of mcr genes in the genome of an uncultivated organism is not sufficient to infer its ability to engage in methane transformations.

Fig. 4 The complex evolutionary history of methyl–coenzyme M reductase (Mcr).

This phylogenetic tree depicts the diversity of the key enzyme (Mcr) for methanogenesis (taxa shaded in blue) and anaerobic methane oxidation (taxa shaded in green), which was recently also implicated in the oxidation of butane in Syntrophoarchaea (taxa shaded in lavender). It is based on maximum-likelihood analyses [IQ-TREE (98) with LG+C60+F, 1000 ultrafast bootstraps], which were performed on a concatenated McrA and B alignment of 981 positions [MAFFT-LINSi (96) combined with trimal (97)]. McrA and B subunits derived from the same gene cluster were concatenated. Branch support values above 85 are indicated with dots. Scale bar indicates number of substitutions per site.
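The tool chain named in the caption (MAFFT L-INS-i, trimal, IQ-TREE with LG+C60+F and 1000 ultrafast bootstraps) can be sketched as a sequence of command lines. The Python snippet below only assembles the argument lists and does not run anything. Several points are assumptions rather than details from the caption: the file names, the `-gappyout` trimming mode (the caption does not state which trimal mode was used), and the IQ-TREE 1.x style `-bb` option (IQ-TREE 2 uses `-B` instead). It also treats a single input file, whereas the caption describes aligning McrA and McrB and then concatenating them.

```python
def build_tree_commands(fasta, prefix, model="LG+C60+F", ufboot=1000):
    """Assemble align/trim/tree command lines for one protein FASTA file.

    Returns a list of (argv, stdout_path) pairs; stdout_path is the file
    that should receive the tool's standard output, or None.
    """
    aligned = f"{prefix}.aln.fasta"
    trimmed = f"{prefix}.trimmed.fasta"
    return [
        # MAFFT L-INS-i: local pairwise alignment with iterative refinement;
        # MAFFT writes the alignment to standard output.
        (["mafft", "--localpair", "--maxiterate", "1000", fasta], aligned),
        # trimal: remove poorly aligned columns (trimming mode is an assumption).
        (["trimal", "-in", aligned, "-out", trimmed, "-gappyout"], None),
        # IQ-TREE: maximum likelihood with the mixture model and ultrafast
        # bootstrap replicates stated in the caption.
        (["iqtree", "-s", trimmed, "-m", model, "-bb", str(ufboot)], None),
    ]
```

Each pair can be passed to `subprocess.run(argv, stdout=..., check=True)` on a system where the three tools are installed; keeping command construction separate from execution makes the parameters easy to inspect and log.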

Thermoplasmatales-related lineages

The Thermoplasmatales are characterized by an acidophilic and thermophilic lifestyle and chemolitho- or chemoorganoheterotrophic growth. Environmental surveys have revealed various archaeal groups that are distantly related to Thermoplasmatales (Fig. 1) and that together comprise the Thermoplasmata. These previously unidentified members occur in a wide range of environments, including acid mine drainage systems, anoxic marine sediments, black smokers, and hydrothermal vent fields, but also in intestinal tracts of animals and the marine water column, where they sometimes represent abundant community members (33). In agreement with their taxonomic diversity and widespread distribution, the various Thermoplasmata sublineages represent a metabolically versatile group of Archaea that includes anaerobic heterotrophs, organisms with the ability to respire oxygen, and previously unknown methanogens (Fig. 3). Recently identified Thermoplasmata lineages include the aerobic (photo-)heterotrophic Thalassoarchaea and MG-III archaea from oceanic waters (51, 52), the anaerobic heterotrophic MBG-D archaea from sediment environments (53, 54), and the Methanomassiliicoccales (Fig. 3) (55). In contrast to other Thermoplasmata, members of the Methanomassiliicoccales encode the key enzyme for methanogenesis (Fig. 4) and generate methane from the hydrogen-dependent conversion of methanol, methylated amines, and dimethyl sulfide (55, 56).

Deeply branching Euryarchaeota

Several recent studies have reported genomic data from deeply branching Euryarchaeota. One of these, the Hadesarchaea, represents a group of Archaea previously referred to as SAGMEG (Box 1) that thrives in subsurface environments, including marine and terrestrial sediments (33). Reconstructed genomes of these archaea revealed that they may be capable of anaerobic CO oxidation but lack key genes for methanogenesis and the WLP (Fig. 3) (57). MSBL1 (Mediterranean Sea Brine Lakes 1) archaea are closely related to Hadesarchaea, and single-cell genomes from brine pools of the Red Sea indicate that these archaea are sugar-fermenters capable of carbon fixation via the WLP or the reductive tricarboxylic acid cycle (58). Similar to Hadesarchaea, the MSBL1 archaea lack genes for methanogenesis (Fig. 3) (58).

MSBL1 and Hadesarchaea are sister groups to WSA2/Arc1, a candidate methanogen clade referred to as Methanofastidiosa that is distributed in sediments, contaminated groundwater, and bioreactors (Fig. 3) (59). Metabolic reconstructions from draft genomes of eight Methanofastidiosa lineages provided support for their potential for methanogenesis through methylated thiol reduction. However, these archaea lack key genes for the WLP and the ability to use more conventional substrates (Box 2), indicating a previously unknown but restricted methanogenic metabolism (Fig. 3) (59).

Last, Lazar et al. described the discovery of the Theionarchaea (Fig. 3), which appear to be widespread in various anaerobic sediments and subsurface environments (54) and are characterized by the presence of the WLP and the absence of genes for the Mcr complex (Fig. 3). Altogether, these findings highlight the patchy distribution of genes for methanogenesis and the WLP in different archaeal lineages (Fig. 3) (56).


Halobacteria (referred to as Haloarchaea hereafter) represent an archaeal clade characterized by extreme salt tolerance that has been suggested to be derived from methanogenic ancestors (60, 61). Massive amounts of gene influx from bacterial sources may have led to their drastic change in lifestyle (61) [but see also (62)]. Although most Haloarchaea are heterotrophs and have the ability to respire oxygen (Fig. 3), some members of this group display a facultative anaerobic lifestyle and can use alternative electron acceptors (such as nitrate or dimethyl sulfoxide) during anaerobic respiration or switch to fermentative growth [(63) and references therein]. The discovery of a strictly anaerobic haloarchaeon growing on acetate or pyruvate with the concomitant reduction of sulfur (63) indicates that within Haloarchaea, metabolic diversity may be higher than presumed before.

Furthermore, enrichment cultures and genomes of SA1 archaea revealed that this group represents methanogenic halophilic archaea that seem to branch basally to Haloarchaea (Fig. 1) (64). These archaea, referred to as Methanonatronarchaeia, have a methyl-reducing heterotrophic methanogenic lifestyle similar to that of Methanomassiliicoccales (64).

Previously unidentified members of the TACK superphylum

Marine subsurface environments are often dominated by members of Bathyarchaeota (formerly MCG), which comprises an extremely diverse archaeal clade (>17 different sublineages) that forms a sister-group to Thaum- and Aigarchaeota (Fig. 3) (25, 65, 66). Single-cell genomes (53) and genomic bins (23–25) from different subgroups of this phylum have revealed a highly variable metabolic potential. Interestingly, most members of this phylum appear to encode proteins with a potential role in acetate production using the archaeal version of the WLP (24, 25). Some subgroups appear able to ferment organic substrates and perform homoacetogenesis, which was previously thought to be restricted to Bacteria (24). Furthermore, analyses of genomic data indicate that some of these Archaea have the ability to grow heterotrophically on substrates such as peptides, cellulose, chitin, aromatic compounds (24, 25, 53, 66), and fatty acids (23) and therefore may be able to switch between heterotrophic and autotrophic lifestyles. Bathyarchaeota genomes reconstructed from a metagenome from coal-bed methane wells revealed the presence of the WLP as well as genes encoding Mcr subunits and genes for methanogenesis from methyl sulfides, methanol, and methylated amines, suggestive of methylotrophic methanogenesis (23). However, it remains to be determined whether these Bathyarchaeota are indeed capable of methanogenesis or AOM, or have the ability to anaerobically oxidize alternative substrates, as Syntrophoarchaea do with butane (50).

Verstraetearchaeota (formerly TMCG) (Fig. 1), whose members occur in anoxic environments with high methane fluxes, comprises a recently described archaeal phylum affiliating with the TACK superphylum (Fig. 3) (26). Remarkably, all five metagenomic bins obtained for this lineage encode a complete Mcr complex and the genetic repertoire to perform methanogenesis from methanol, methanethiol, and methylamine (26). Hence, the metabolic potential of these Archaea is similar to that of Methanofastidiosa and Methanomassiliicoccales, which is in agreement with the monophyletic clustering of McrAB of these groups (Fig. 4).

Asgard archaea

Asgard archaea currently comprise the Lokiarchaeota (27), Thorarchaeota (28), Heimdallarchaeota, and Odinarchaeota (13) and have provided new insights into eukaryogenesis. Lokiarchaeota, formerly known as MBG-B/DSAG, often represent abundant community members in deep-sea sediments (Fig. 1) (33, 67). In contrast, Thorarchaeota and Heimdallarchaeota (Fig. 1) seem to represent rare members of sediment microbial communities (13). Currently identified Odinarchaeota were found in sediments of hot springs or hydrothermal vent systems with temperatures ranging from about 60° to 70°C. Both Thor- and Lokiarchaeota encode key enzymes of the WLP and may be capable of acetogenesis (28, 68). Furthermore, these genomes show the potential for heterotrophic growth, suggesting that they are also able to use the WLP as an electron acceptor during fermentation of organic substrates (28). Interestingly, the genetic repertoire for the WLP appears to be absent from genomic bins of Odin- and Heimdallarchaeota (Fig. 3). Prospective analyses of the metabolic features of Asgard archaea will certainly reveal a more detailed picture of their functional potential and ecological role.

DPANN superphylum and Altiarchaea

The DPANN superphylum represents a phylogenetically diverse archaeal superphylum that includes environmental lineages such as DHVE-5, DHVE-6, and DSEG (Fig. 1) [(33) and references therein]. DPANN archaea occur in diverse habitats, ranging from hypersaline lakes, marine and lake waters, sediments, and acid mine drainage to hot springs (14, 16, 29, 31, 32, 69–74). Because of the small cell size of many members of this group, DPANN archaea have been overlooked by conventional analyses, and cultivated representatives are only available for Nanoarchaeota, which grow attached to a crenarchaeal host cell (16, 69). “ARMAN” archaea (the Parvarchaeota and Micrarchaeota) were among the first DPANN lineages to be explored at the genomic level (Fig. 3) (73). Although these organisms may be host-associated, their genomes revealed the potential to perform central steps in carbon metabolism, lipid degradation, and oxidative phosphorylation. Analyses of several near-complete genomes of Woese-, Pace-, and Aenigmarchaeota as well as Diapherotrites revealed that these have a reduced genomic repertoire and lack major anabolic and catabolic pathways (14, 29). These findings indicate that some of these archaea have a symbiotic and/or host-dependent lifestyle, whereas others may be capable of saccharolytic or fermentative growth (14). Members of the Nanohaloarchaeota are found in hypersaline lakes and also seem to be free-living and may have an aerobic heterotrophic or even photoheterotrophic lifestyle (70, 71) or grow anaerobically by fermentation (72). Last, the investigation of genomes of biofilm-forming Altiarchaea suggested that these archaea are able to grow autotrophically on carbon dioxide and potentially also on acetate, formate, or carbon monoxide (31, 32). Although Altiarchaea are nonmethanogenic, it was shown that their genomes encode a modified archaeal WLP (31).

Uncovering the evolution and deep roots of the archaea

The role of horizontal gene transfer in archaeal evolution

Horizontal gene transfer (HGT) represents the major source of gene gain and thus innovation in archaeal and bacterial genomes [reviewed in (75)]. HGT seems to have had a variable impact on the evolution of different archaeal lineages, being particularly prevalent in Haloarchaea, Thaumarchaeota, MG II/III, Sulfolobales, Methanosarcinales, and Methanomassiliicoccus (30, 61, 76–79). Gene acquisitions from bacteria have contributed to the adaptation of archaea to new environments and lifestyles, aiding the diversification of this domain of life (75, 77, 80). To what extent these gene acquisitions occurred over a short period of time or represent a continuous process is currently under debate (62, 77, 79). HGT rates seem to be asymmetrical, with bacteria-to-archaea transfers appearing to be at least five times more frequent than vice versa (77, 78). This observed asymmetry may be caused by the bacterial overrepresentation over archaea in most environments or by different mechanistic, selective, and adaptive biases (77, 80). However, methodological biases such as an underrepresentation of genomic data from archaea as compared with bacteria cannot currently be excluded. Improvements of methods to detect HGT as well as the inclusion of genomic data from previously unknown archaeal lineages will likely reveal a clearer picture of the prevalence and directionality of HGT between Archaea and Bacteria.

The root of the archaeal tree

The “coming of age” of the archaeal tree (Fig. 3) represents an opportunity to readdress long-standing questions about the early evolution of the Archaea and the nature of the last archaeal common ancestor (LACA). In this regard, it is essential to confidently place the root position in the archaeal tree and determine the phylogenetic position of DPANN archaea. Because the use of a distant outgroup to root phylogenetic trees can cause long-branch attraction artifacts, fast-evolving or compositionally biased sequences, such as those of DPANN lineages, could artificially emerge as a monophyletic basal group [(30, 81) and references therein].

Recent studies have reported varying positions of the root. For instance, the reconstruction of phylogenetic trees by using both conserved protein markers and 38 new markers placed the root between Euryarchaeota and the TACK superphylum (20). However, a two-step approach (based on only two domains of life at a time) suggested that the root of the archaeal tree was placed inside Euryarchaeota, between a clade comprising Thermococcales, Methanococcales, Methanobacteriales, Methanopyrales, and the TACK superphylum and a clade comprising the rest of the Euryarchaeota (81). Yet, both of these approaches relied on using a distant outgroup (Bacteria), which may have caused phylogenetic artifacts. Sophisticated Bayesian models to root the archaeal tree without the need for an outgroup retrieved a root between TACK archaea and a clade comprising Euryarchaeota/DPANN (82). However, the data set analyzed in this study was relatively small and comprised solely 16S and 23S rRNA gene sequences, which are known to be compositionally biased (83). A consensus unrooted archaeal multigene supertree, rooted with a newly developed model of genome evolution, placed the root between the DPANN superphylum and all other Archaea and supported the monophyly of Euryarchaeota (30). Future analyses using alternative phylogenomic approaches and more comprehensive data sets will likely provide a more definitive answer on this matter.

The nature of the last common ancestor of the archaea

Reconstructions of ancestral gene sets have indicated that LACA may have been an anaerobic chemolithoautotroph that used the WLP for carbon fixation and may have been able to obtain energy through methanogenesis [(30, 81, 84–86) and references therein]. The recent discovery of the key genes for the WLP (ACS/CODH) and methanogenesis (Mcr) in archaeal lineages outside of the Euryarchaeota (Fig. 3 and Box 2) and the patchy distribution of terminal oxidases in currently available genomes (Fig. 3) support this hypothesis [(56) and references therein]. Mcr is now known to be encoded by various Euryarchaeota lineages as well as by members of TACK, such as Bathyarchaeota and Verstraetearchaeota (Fig. 3) (23, 26). However, phylogenetic analyses of Mcr indicate that protein sequences may reflect functional similarity rather than phylogenetic relatedness of the respective species and implicate a complex evolutionary history of this gene cluster involving HGT and gene losses (Fig. 4) (50, 56, 64). For example, the bathyarchaeal Mcr homologs are most closely related to those present in Syntrophoarchaea (50), whereas those of Verstraetearchaeota, Methanomassiliicoccales, Methanonatronarchaeia, Methanofastidiosa, and ANME-1 seem to share a common ancestry, despite the distant evolutionary relationship of these lineages (Fig. 4) (23, 26, 64). Therefore, Mcr homologs in TACK may have been acquired horizontally from euryarchaeal lineages rather than representing an ancestral feature. In-depth analyses of the complete machinery of methanogenesis across different archaeal lineages, a solid rooted tree, and a robust placement of DPANN are needed to determine the likelihood of the presence of methanogenesis in LACA (30, 80). The WLP has now been detected in all major archaeal superphyla, including TACK and Asgard, Euryarchaeota, and putative deep-branching members of the DPANN (Altiarchaeales) (Fig. 3). Similar to Archaeoglobales, many of the lineages encoding the archaeal version of the WLP lack mcr genes (Figs. 3 and 4), indicating that the WLP is not necessarily linked to methane metabolism [reviewed in (56)]. The broad distribution of WLP-related genes in archaea is in agreement with hypotheses that suggest its presence in LACA (30, 84, 86).

Inferences of optimal growth temperatures have suggested that LACA was a thermophilic or hyperthermophilic organism (30, 87) and may have encoded reverse gyrase, a genetic marker for hyperthermophily (Fig. 3) (80, 85). Adaptations to moderate environments and to aerobic lifestyles likely occurred independently in different archaeal lineages and were aided by HGT from Bacteria (30, 80, 86). However, these hypotheses need to be evaluated further by taking into account alternative positions of the root and previously unidentified, potentially mesophilic archaeal groups such as the Bathyarchaeota, Verstraetearchaeota, DPANN, and Asgard archaea.

The archaeal ancestry of eukaryotes

The expansion of genomic data from archaea has forced us to rethink the evolutionary transitions of life, particularly with respect to the origin of eukaryotes. Archaea have featured prominently in various eukaryogenic hypotheses, in particular in symbiogenic scenarios that suggest that eukaryotes evolved from a merger of an archaeal host cell and an alphaproteobacterial endosymbiont [reviewed in (88)]. Early on, Archaea and eukaryotes were shown to share a common ancestry to the exclusion of Bacteria (“Three Domains tree”) (Fig. 5A) (2, 89). However, better genomic taxon-sampling combined with the use of more sophisticated phylogenomic tools [(19, 90) and references therein] yielded trees in which eukaryotes branched from within the archaeal domain of life (“Two Domains tree”) (Fig. 5B) (91). These analyses recovered a monophyletic clade comprising eukaryotes and the archaeal TACK superphylum, a topology that was in agreement with the presence of several eukaryotic signature proteins in TACK archaea (Fig. 5D) (19). Subsequently, the discovery of Asgard archaea has provided further evidence for symbiogenic scenarios and the origin of eukaryotes from an archaeal host (13, 27). First of all, phylogenomic analyses of marker gene data sets place Asgard as a sister group of eukaryotes (Fig. 5C) (13). Furthermore, genome analyses of these archaea revealed that they encode a multitude of proteins previously regarded as specific to eukaryotes (13, 27), including several components involved in membrane bending and vesicle-formation processes (Fig. 5D) (92, 93). Intriguingly, Thorarchaeota were shown to encode yet additional components of the eukaryotic membrane trafficking machinery, including proteins with Sec23/24 and with TRAPP domains (Fig. 5D). The identification of gene clusters consisting of β-propeller and α-solenoid domain proteins is particularly noteworthy (13) because these protein domains are the basic building blocks of eukaryotic coat proteins essential for vesicle biogenesis and transport [reviewed in (94)]. In light of these findings, it seems plausible that these key components in the archaeal ancestor of eukaryotes were involved in mediating the relationship with a syntrophic or symbiotic partner and provided the building blocks for the subsequent evolution of the trafficking machinery and compartmentalized nature of eukaryotic cells.

Fig. 5 Archaea and the changing tree of life.

(A to C) Schematic depiction of the relationship of Archaea, Bacteria, and Eukaryotes according to the (A) Three and (B) Two Domains topology, updated with (C) the Asgard archaea. The Three Domains tree suggested that Archaea, Bacteria, and Eukaryotes represent primary domains of life. In contrast, a Two Domains topology [(B) and (C)] is more compatible with the view that eukaryotes represent a secondary domain of life derived from a merger of two prokaryotes: an archaeal host and an alphaproteobacterial symbiont. (D) A schematic tree of the archaea and their relationship with the eukaryotes emphasizing the placement of eukaryotes with Asgard archaea and showing the distribution of eukaryotic signature proteins (ESPs) in the different lineages. The presence or absence pattern of ESPs is based on previous studies and was updated by taking into account publicly available metagenome-assembled genomes.

Summary and future perspectives

Advances in cultivation-independent genomics approaches have started to reveal the unprecedented genomic diversity of the archaeal domain of life. Novel genomic data have provided insights into the metabolic repertoire and evolution of members of the Archaea and their relationship with Eukaryotes. Recent estimates indicate that about half of the existing archaeal diversity still defies genomic characterization (Fig. 1) (95), suggesting that the coming years will witness additional major discoveries. Many of the unexplored archaeal lineages are hiding in little-investigated and low-biomass environments and may represent low-abundance members of microbial communities. Continued efforts in genome sequencing and culturing of these archaea will be crucial to study the role that archaea play in their natural habitats, their interactions with other organisms, and their function in biogeochemical cycles and Earth’s ecosystems. Last, these data will provide a treasure trove for analyses to identify the lifestyle and metabolic potential of the archaeal ancestor and hopefully to reconstruct life’s earliest evolutionary trajectories.

References and Notes

  1. Acknowledgments: We thank the anonymous reviewers for providing constructive comments that helped improve the clarity of this manuscript. We also acknowledge all those researchers that have contributed to this field and whose literature could not be cited because of space restrictions. The work in the T.J.G.E. laboratory is supported by grants from the European Research Council (ERC Starting Grant 310039-PUZZLE_CELL), the Swedish Foundation for Strategic Research (SSF-FFL5), and the Swedish Research Council (VR grant 2015-04959).