The Symbiodinium kawagutii genome illuminates dinoflagellate gene expression and coral symbiosis


Science  06 Nov 2015:
Vol. 350, Issue 6261, pp. 691-694
DOI: 10.1126/science.aad0408

Symbionts are adapted to work with corals

Many corals have formed mutualistic associations with dinoflagellate symbionts, which are thought to provide nutrients and other benefits. To examine the underlying genetics of this association, S. Lin et al. sequenced the genome of the endosymbiont dinoflagellate Symbiodinium kawagutii. The genome includes gene number expansions and encodes microRNAs that show complementarity to genes within the coral genome. Such microRNAs may be involved in regulating coral genes. Furthermore, coral and S. kawagutii appear to share homologs of genes encoding specific nutrient transporters. The findings shed light on how symbiosis is established and maintained between dinoflagellates and corals.

Science, this issue p. 691


Dinoflagellates are important components of marine ecosystems and essential coral symbionts, yet little is known about their genomes. We report here on the analysis of a high-quality assembly from the 1180-megabase genome of Symbiodinium kawagutii. We annotated protein-coding genes and identified Symbiodinium-specific gene families. No whole-genome duplication was observed, but instead we found active (retro)transposition and gene family expansion, especially in processes important for successful symbiosis with corals. We also documented genes potentially governing sexual reproduction and cyst formation, novel promoter elements, and a microRNA system potentially regulating gene expression in both symbiont and coral. We found biochemical complementarity between genomes of S. kawagutii and the anthozoan Acropora, indicative of host-symbiont coevolution, providing a resource for studying the molecular basis and evolution of coral symbiosis.

Dinoflagellates are alveolates, with the mostly parasitic apicomplexans as their closest relatives (fig. S1A). Members of the genus Symbiodinium are essential photosynthetic endosymbionts in coral reefs (1). Dinoflagellates show enigmatic genetic and cytological characteristics, including permanently condensed chromosomes and a high proportion of diverse methylated nucleotides, and often feature large nuclear genomes (up to 250 Gb) (2). We report a 0.935-Gbp assembly of the 1.18-Gbp genome of Symbiodinium kawagutii (figs. S1B and S2), a Clade F strain originally isolated from a Hawaiian reef ecosystem (3). A high-quality S. kawagutii genome assembly corresponding to ~80% of the genome was achieved from ~151-Gbp Illumina genome shotgun sequence (~130x genome coverage) (tables S1 to S4 and fig. S3). Genome annotation revealed 36,850 nuclear genes, with 68% occurring in families (1.69 genes per family) (table S5). Only ~9% (3280) of S. kawagutii genes were in tandem arrays (1279 clusters) (table S6), with 2 to 10 repeats (76% being ≤4 repeats) per array. The genome encodes the common metabolic pathways expected for typical photosynthetic eukaryotes (fig. S4 and table S7), and we found genes involved in sexual reproduction, cyst formation and germination, and telomere synthesis (table S8). The telomeric motif (TTTAGGG)n was identified at the ends of scaffolds and was also detected by fluorescence in situ hybridization (fig. S1B).
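The headline figures in this paragraph can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers quoted above; the variable names are illustrative and are not taken from the original analysis.

```python
# Values quoted in the text (Gbp = gigabase pairs).
genome_size_gbp = 1.18      # estimated genome size
assembly_size_gbp = 0.935   # assembled sequence
sequenced_gbp = 151.0       # Illumina shotgun sequence
total_genes = 36850         # annotated nuclear genes
tandem_genes = 3280         # genes found in tandem arrays

coverage = sequenced_gbp / genome_size_gbp            # fold coverage of the genome
assembled_fraction = assembly_size_gbp / genome_size_gbp
tandem_fraction = tandem_genes / total_genes

print(f"coverage: ~{coverage:.0f}x")
print(f"assembled: ~{assembled_fraction:.0%}")
print(f"genes in tandem arrays: ~{tandem_fraction:.1%}")
```

The computed values agree, after rounding, with the ~130x coverage, ~80% assembled fraction, and ~9% tandem-array share reported in the text.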

Globally, our analysis revealed extensive genomic innovation in dinoflagellates. A total of 25,112 gene families were clustered from the genomes of S. kawagutii and eight other species representing higher plants, chlorophytes, rhodophytes, diatoms, phaeophytes, alveolates, and cnidarians. S. kawagutii has 12,516 gene families, of which 7663 were gained in the ancestor of Symbiodinium (Fig. 1A and table S9). These genes were enriched in 62 metabolic gene ontologies (table S10). When the gene families were normalized to z scores to balance the effect of different total gene numbers, 96 gene families had shrunk (table S11) and 265 gene families had expanded in Symbiodinium (table S12). The LINE-1 reverse transcriptase (a retroelement) is the most highly expanded family.
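The text does not spell out how the z scores were computed. As a minimal sketch, assuming family sizes are standardized within each species, the example below shows why such a normalization balances differences in total gene number: a uniform doubling of every family leaves the z scores unchanged, whereas a family-specific expansion raises that family's score. All family names and counts are hypothetical and are not data from the paper.

```python
from statistics import mean, pstdev

def z_scores(family_sizes):
    """Z score of each family's size within one species (assumed scheme)."""
    values = list(family_sizes.values())
    mu, sigma = mean(values), pstdev(values)
    return {fam: (n - mu) / sigma for fam, n in family_sizes.items()}

# Hypothetical family sizes, not data from the paper.
small_genome = {"famA": 2, "famB": 4, "famC": 6, "famD": 8}
large_genome = {"famA": 4, "famB": 8, "famC": 12, "famD": 16}   # every family doubled
expanded_one = {"famA": 4, "famB": 8, "famC": 12, "famD": 40}   # only famD expanded

z_small = z_scores(small_genome)
z_large = z_scores(large_genome)
z_expanded = z_scores(expanded_one)

# Uniform doubling gives the same z score; a specific expansion gives a higher one.
print(z_small["famD"], z_large["famD"], z_expanded["famD"])
```

The helper assumes at least two distinct family sizes per species; identical sizes would give a zero standard deviation and would need separate handling.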

Fig. 1 Comparative genomic analysis between S. kawagutii and other eukaryotes.

(A) (Left) Predicted pattern of gain or loss of gene families across eukaryotes shown on a phylogenetic tree inferred from genome data. Numbers on branches indicate the number of gene families gained (+) or lost (–); those at the left of the nodes are bootstrap values supporting the tree topology. (Right) K-means clustering of gene families based on number of members. Columns represent gene families, and rows are species of eukaryote. (B) Synteny between regions of the genomes of S. kawagutii and S. minutum. (C) TTTG, TTTT, and (TATG)2 are the top three motifs enriched in the S. kawagutii upstream regions and are potential novel promoter elements. (D) Promoter enrichment scores for selected motifs as a function of distance upstream from the start codon.

Our synteny and homology analysis showed no evidence of whole-genome duplication, because little collinearity within the S. kawagutii genome was observed (table S13). Instead, the S. kawagutii genome shows evidence of transposon propagation, in particular long terminal repeat (LTR) retrotransposons and DNA transposons (table S14), which contributes to differences between S. kawagutii and S. minutum (Fig. 1B). Furthermore, protein domains linked to transposons (table S15) may lead to proliferation of these protein domains in the genome; in the case of cytosine methyltransferase, the resulting expansion of the gene family may help explain the extensive DNA methylation seen in dinoflagellates; in the case of retroelements such as reverse transcriptase and integrase, the expanded families may increase the frequency of transcript retrotransposition into the genome. In keeping with the latter, we found numerous genes with a full (62 genes) or partial (5506 genes) dinoflagellate spliced leader (DinoSL) (4) in their 5′ untranslated region (table S16). The 22-nucleotide (nt) DinoSL is trans-spliced to the 5′ end of all mRNAs, and its presence in the genome is a signature of retrotranscript insertion (5); this thus represents an efficient mechanism for gene family expansion. Last, horizontal gene transfer (HGT) may also contribute to S. kawagutii genome innovation. Conservatively, we found 56 potential HGT genes, 41 of which had best Basic Local Alignment Search Tool (BLAST) hits to marine bacteria (table S17 and fig. S5).

The partial genome (~41%) of S. minutum (6) allowed some comparative genomic studies. S. minutum and S. kawagutii have similar genome sizes and gene numbers (table S5) and show some genomic collinearity (Fig. 1B and table S18) and gene ontology profiles (table S19 and fig. S6A). MUMmer (Maximal Unique Matches) alignment data showed that 2.17% (20.4 Mb) of the S. kawagutii genome matched to S. minutum, and only 5.92% (36.5 Mb) of the S. minutum genome matched to S. kawagutii. This divergence was confirmed by the reciprocal mapping of their raw reads (fig. S6B and tables S5 and S20), implying that these two species are more diverged than usually assumed. Yet both genomes showed expansion of gene families involving cargo transport and stress responses (fig. S6C and tables S21 and S22), which may reflect the shared symbiotic lifestyles.

The transcriptional machinery of dinoflagellates does not contain the typical eukaryotic TATA box promoter element (2). Instead of a TATA-box binding protein (TBP), dinoflagellates express a TBP-like factor that has a stronger affinity to TTTT than to TATA (7). A global search of the S. kawagutii genome 1000-bp (base pair) region upstream of putative start codons revealed 564 conserved motifs, which were grouped into 108 clusters on the basis of sequence similarities (table S23). About 92% of these were located within 100 bp upstream of the start codon. The motifs with the most conserved positions are remnants of SL (Fig. 1C). Motifs TTTT and TTTG were found in the upstream regions of 34,524 and 35,348 genes, respectively (94% and 96% of the gene repertoire). Curiously, although both are part of the SL, the TTTG motif has a position consistent with that of the SL, whereas TTTT tends to be further upstream (Fig. 1D). This suggests that the TTTT may serve as a core promoter motif replacing the TATA box used by other eukaryotes. The TTTT is typically 30 bp upstream from a potential transcriptional start site (fig. S7). The next most highly enriched motif, (TATG)2, was associated with only 257 promoters and is thus more likely to be a binding target of specific regulators.
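The positional comparison of TTTT and TTTG described above rests on locating each motif relative to the start codon. The following is a minimal sketch of that bookkeeping, not the motif-discovery pipeline used in the study; the function name and the toy upstream sequences are hypothetical.

```python
import re

def motif_offsets(upstream, motif):
    """Distances (bp) from each motif start to the start codon.

    `upstream` is the sequence immediately 5' of the start codon, so its
    last base lies at distance 1. Overlapping matches are all reported.
    """
    # A zero-width lookahead lets finditer report overlapping occurrences.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [len(upstream) - m.start() for m in pattern.finditer(upstream)]

# Hypothetical upstream sequences, not taken from the genome.
upstream_regions = {
    "gene1": "ACGTTTTGCA" + "C" * 30 + "CCGTTTGAAC",
    "gene2": "GGGCTTTTAC" + "G" * 30 + "ATTTTGCAGC",
}

for gene, seq in upstream_regions.items():
    for motif in ("TTTT", "TTTG"):
        print(gene, motif, motif_offsets(seq, motif))
```

Collecting such offsets over all genes and binning them by distance would give a positional profile of the kind summarized in Fig. 1D, although the enrichment score used there is not reproduced here.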

Sequences from the genome and purified small RNAs predicted 367 and 354 mature microRNAs (miRNAs), respectively (3), with 102 of the latter (table S24) retained after stringent filtering and structural analysis (fig. S8A). We matched 255 of the genome-predicted miRNAs to 99 of the small RNA-based miRNAs. The mature miRNA candidates varied in length from 21 to 24 nt, with most (91.5%) containing 22 nt. Northern blot analysis revealed a decreased expression level for some miRNAs in cultures grown at 35°C instead of 25°C (Fig. 2A); of these, scaffold270_6017 is predicted to target heat shock proteins 90 and 70 (table S25), consistent with an expected up-regulation in translation of these thermal stress proteins. A bias toward uridine (U) (47.9%) and against guanine (G) (5.0%) was observed at the 5′-end ultimate nucleotide, as in Arabidopsis thaliana (8). Interestingly, of the 102 mature miRNA sequences, 49 were similar to animal miRNAs, 11 to plant miRNAs, and 1 to viral miRNAs (3). S. kawagutii thus has a miRNA reservoir dominated by miRNAs with considerable sequence identity to those found in animals (fig. S8B).
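The 5′-end nucleotide bias reported above is a simple frequency over the first base of each mature miRNA. As a minimal sketch, the function below tallies that frequency; the function name is hypothetical and the input sequences are toy examples for illustration only, not S. kawagutii miRNAs.

```python
from collections import Counter

def five_prime_bias(mirnas):
    """Fraction of sequences that start with each of A, C, G, and U."""
    counts = Counter(seq[0].upper() for seq in mirnas if seq)
    total = sum(counts.values())
    return {nt: counts.get(nt, 0) / total for nt in "ACGU"}

# Toy sequences for illustration only.
example_mirnas = [
    "UGAGGUAGUAGGUUGUAUAGUU",
    "UAGCUUAUCAGACUGAUGUUGA",
    "ACUGGACUUGGAGUCAGAAGGC",
    "UCGGACCAGGCUUCAUUCCCCU",
]

bias = five_prime_bias(example_mirnas)
print(bias)
```

Applied to the 102 retained miRNAs, a tally of this kind would yield the U and G fractions quoted in the text.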

Fig. 2 miRNA in S. kawagutii and potential target genes in S. kawagutii and coral.

(A) Northern blot analysis of Symbiodinium miRNAs. Most miRNAs are 22 nucleotides as assessed by the migration of DNA oligonucleotides. (B) A putative protein-protein interaction network of miRNA target genes. The network is a cluster of highly connected nodes (genes) showing miRNA-target interactions (red lines) and protein-protein interactions (gray lines). Node colors represent different KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO functional annotations (see table S28). (C) GO categorization of predicted miRNA target genes in both S. kawagutii and coral.

We identified 1 perfect (plant-type) and 6026 partial (animal-type) complementarity miRNA targets in the S. kawagutii genome. Among the genes with partial complementarity, 381 were potentially targets of miRNAs with higher expression levels (read counts > 1000), suggesting that these genes are more likely to be regulated by miRNAs. In total, 2557 of all the potential target genes were annotated with known functions (table S25) with enrichments in biological processes of carbohydrate metabolism, transcription regulation, and biosynthesis of amino acids and antibiotics (tables S26 and S27). miRNA targets are often clustered in networks of interacting proteins (3), in which some genes are targeted by many miRNAs and some miRNAs target multiple genes (Fig. 2B, fig. S9, and table S28). In addition, the S. kawagutii genome harbors small RNA-degrading nucleases 1 and 3, one of which is itself a miRNA target (table S25). Thus, the evidence for miRNA-based gene regulatory machinery is robust and extensive, complementary to the limited transcriptional regulation documented in dinoflagellates (2).

We identified a double-stranded RNA (dsRNA)–gated channel protein Systemic RNA Interference Deficiency–1 (SID-1) (9) required for systemic RNA interference in animals (10). SID-1 sequences in Symbiodinium and Cnidaria are similar, suggestive of horizontal gene transfer (fig. S10) (3). Pathogen-to-host miRNA transfer has been shown to silence host immunity genes in plants (11). We identified 1514 coral genes (table S29) (6.4% of the total number) as potential targets of S. kawagutii miRNAs; these had similar molecular functions [Gene Ontology (GO) slim category] as targets in S. kawagutii (Fig. 2C), which were enriched in GO categories related to protein modification and regulation of transcription and cell growth (table S30). This suggests that transferred miRNAs might regulate similar processes in symbiont and host.

Recognition of Symbiodinium by the host cells is mediated primarily through binding of Symbiodinium high-mannose glycans by lectins on the coral cell surface (12, 13) (Fig. 3). We found a glycan biosynthesis pathway in S. kawagutii lacking several enzymes catalyzing the final steps of the common glycan biosynthesis pathway (fig. S11). This altered pathway is predicted to produce a (GlcNAc)5(Man)5(Asn)1 glycan that carries abundant free mannose branches and terminal mannose-mannose units available for lectin binding, consistent with previous findings (14). However, the enzymes involved in mannose-rich glycan biosynthesis differ between S. kawagutii and S. minutum (table S31), suggesting that variations in the glycoprotein structure may tune the host recognition specificity.

Fig. 3 Schematic summary of host recognition mechanisms and cargo transport.

Small circles represent the genes from coral (sky blue fill) and from S. kawagutii (pale pink fill), with predicted miRNA target genes marked with red outlines. Only genes considered directly related to symbiosis are shown. All the S. kawagutii transporter families shown have members that are computationally predicted to be plasma membrane proteins (see table S35).

Other Symbiodinium genes may also be related to symbiosis (table S32). These include homologs of nodulation factors involved in establishing the symbiosis between legume and the nitrogen-fixing bacteria rhizobia, as well as cell surface proteins with a role in pathogen infection or host recognition. Furthermore, some S. kawagutii and S. minutum proteins share homology with Plasmodium falciparum proteins involved in the interaction between the parasite and its host (15).

To assess their role in symbiosis, we compared genes encoding transporters in S. kawagutii with those in Acropora digitifera, the only sequenced coral species (16) (Fig. 3 and table S33). Remarkably, nearly half of the numerous transporters (>300) in S. kawagutii are shared by this coral (table S34). Both dinoflagellate and coral genomes encode transporters of C (bicarbonate), N, P, and trace metals, as well as carbon-concentrating mechanism enzymes, and many of these are lacking in the two nonsymbiotic cnidarians Hydra magnipapillata and Nematostella vectensis (table S35). Most of the 49 carbonic anhydrase (CA) genes in the S. kawagutii genome encode enzymes predicted to be cytoplasmic, suggesting that cytoplasmic CA is critical for CO2 acquisition; only two δ-CAs are predicted to be localized at the plasma membrane and one β-CA in the thylakoids.

The S. kawagutii genome also contains the complete biosynthesis pathways of all the standard amino acids except lysine and histidine and could potentially supply nine of the amino acids that A. digitifera cannot produce (fig. S12 and table S35). Interestingly, the majority of the S. kawagutii transporters, as well as some of their coral cognates, are potential miRNA targets (red circles, Fig. 3).

Symbiodinium-to-coral translocation of photosynthates is critical for reef growth, with either glycerol (17) or glucose (18) translocated. The S. kawagutii genome contains 12 genes encoding glycerol-3-phosphate dehydrogenase, an enzyme essential for glycerol production in yeast (19), as well as a plasma membrane aquaporin with glycerol transport ability and low- and high-affinity glucose transporters (Fig. 3 and table S35). However, the A. digitifera genome (16) contains only the transporter for glucose and not for glycerol. This suggests that glucose can be exported to the coral cells, whereas glycerol is exported only to the symbiosome, possibly as an osmolyte (20).

S. kawagutii possesses a large ensemble of genes potentially conferring tolerance to thermal stress and ultraviolet irradiation, including expanded gene families encoding heat shock proteins and DNA repair/recombination proteins (table S22). There is also a large set of antioxidant genes, including the large thioredoxin gene family, a diverse set of genes for (Cu/Zn-, Mn/Fe-, Ni-dependent) superoxide dismutases (SOD), and ascorbate peroxidases (APx). We found a Ni-dependent SOD, rarely reported for marine algae, consistent with the abundant high-affinity nickel transporter genes in this species (fig. S6C), as well as six genes encoding xanthine dehydrogenase/oxidase, which catalyzes the oxidation of xanthine to uric acid in purine metabolism (table S35). Uric acid forms crystalline deposits in Symbiodinium that function as an N reserve (21), and it is a potent antioxidant.

Unexpectedly, the S. kawagutii genome lacks genes for the four major enzymes in the biosynthesis of the photoprotective mycosporine-like amino acids (MAAs): dehydroquinate synthase (DHQS), O-methyltransferase (O-MT), ATP-grasp, and nonribosomal peptide synthetase (NRPS). Their loss may thus represent a coevolution of S. kawagutii with its host, because all four genes are found in A. digitifera, one of which shows a close relationship with that in other dinoflagellates (fig. S13).

This study provides a portrait of a symbiotic dinoflagellate genome, with insights into genome evolution and regulation of gene expression in dinoflagellates and the molecular basis of coral-Symbiodinium symbiosis. Our results are a stepping-stone to understanding how the genetic complementarity between anthozoans and Symbiodinium can explain host specificity (1, 22) and to determining the molecular mechanisms responsible for coral bleaching.

Supplementary Materials

Materials and Methods

Figs. S1 to S13

Tables S1 to S35

References (23–56)

References and Notes

  1. Materials and methods are available as supplementary materials on Science Online.
  2. Acknowledgments: This work was supported by Natural Science Foundation of China grants K16110 and K16044 (to S.L.), U.S. National Science Foundation grant OCE-0854719 (to S.L. and H.Z.), U.S. NIH awards AI056034 and AI073806 (to D.A.C. and N.R.S.), National Science and Engineering Research Council of Canada grant 171382-03 (to D.M.), and various funds to BGI-Shenzhen, Shenzhen [State Key Laboratory of Agricultural Genomics, Guangdong Provincial Key Laboratory of core collection of crop genetic resources research and application (2011A091000047), Shenzhen Engineering Laboratory of Crop Molecular design breeding, and China National GeneBank-Shenzhen]. The genomic sequences and the annotated genes of S. kawagutii, as well as RNA sequencing data (unigenes), are available in our Symka Genome Database in Xiamen University:; these data have also been deposited into the National Center for Biotechnology Information Short Read Archive (SRA) under accession number SRA148697.