The 160-Kilobase Genome of the Bacterial Endosymbiont Carsonella


Science  13 Oct 2006:
Vol. 314, Issue 5797, p. 267
DOI: 10.1126/science.1134196


Previous studies have suggested that the minimal cellular genome could be as small as 400 kilobases. Here, we report the complete genome sequence of the psyllid symbiont Carsonella ruddii, which consists of a circular chromosome of 159,662 base pairs, averaging 16.5% GC content. It is by far the smallest and most AT-rich bacterial genome yet characterized. The genome has a high coding density (97%) with many overlapping genes and reduced gene length. Genes for translation and amino acid biosynthesis are relatively well represented, but numerous genes considered essential for life are missing, suggesting that Carsonella may have achieved organelle-like status.

Many bacterial lineages have evolved mutually obligate endosymbiotic associations with animal hosts. In such cases, the bacteria typically produce essential nutrients that are rare in the host diet, and the animal produces specialized cells (bacteriocytes) in which the bacteria are confined and, like organelles, continuously maintained through vertical transmission across host generations. These bacteriocyte-restricted bacteria have distinctive genomic features, including massive reduction in genome size and biased nucleotide base composition (1). However, all cases of genome reduction appear to reach limits of about 400 kb and about 20% GC, which are believed to be the lower limits for cellular organisms (Fig. 1).

Fig. 1.

Relationship between genome sizes and GC content of 358 complete genomes from Bacteria and Archaea: red indicates Carsonella; blue represents endosymbionts Buchnera, Blochmannia, Wigglesworthia, and Baumannia; yellow, other Bacteria; and green, Archaea. (Inset) A 4′,6′-diamidino-2-phenylindole–stained bacteriocyte of P. venusta. Tubular cells surrounding the host nucleus (center) are Carsonella.

Here, we present a genome that has evolved far beyond these limits. Carsonella ruddii (Fig. 1) is a bacteriocyte-associated γ-proteobacterial symbiont that appears to be present in all species of psyllids, which are phloem sap-feeding insects (2). We determined the complete genome sequence of C. ruddii strain Pv (Carsonella-Pv) of the hackberry petiole gall psyllid, Pachypsylla venusta, which has no other microbial symbionts (2). The genome of Carsonella-Pv is a single circular chromosome of 159,662 base pairs (bp), averaging 16.5% GC content. The assembly analysis, using a large excess of sequence data, did not reveal any other symbionts or plasmids. The genome size, which was further confirmed by long-range electrophoresis, is only about one-third that of the archaeal parasite Nanoarchaeum equitans (which has the smallest fully sequenced genome to date, at 491 kb) (3) and that of an unsequenced Buchnera strain (which has the smallest known bacterial genome, at about 450 kb) (4).
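The size comparison and the GC statistic above are simple ratios. The following is a minimal sketch, not the authors' analysis pipeline: the function names and the toy sequence are hypothetical, and only the three genome sizes are taken from the text.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def size_ratio(small_bp: int, large_bp: int) -> float:
    """Return the size of one genome as a fraction of another."""
    return small_bp / large_bp


# Genome sizes in base pairs, as reported in the text.
CARSONELLA_BP = 159_662
NANOARCHAEUM_BP = 491_000  # "491 kb"
BUCHNERA_BP = 450_000      # "about 450 kb"

if __name__ == "__main__":
    toy = "ATATATATGC"  # hypothetical 10-base sequence, not real Carsonella data
    print(f"toy GC content: {gc_content(toy):.1%}")
    print(f"vs N. equitans: {size_ratio(CARSONELLA_BP, NANOARCHAEUM_BP):.3f}")
    print(f"vs Buchnera:    {size_ratio(CARSONELLA_BP, BUCHNERA_BP):.3f}")
```

Both ratios fall close to one-third, which is consistent with the comparison made in the text.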

The genome has only 182 open reading frames (ORFs) (fig. S1A), which were classified according to the clusters of orthologous groups (COGs). Notably, more than half of the ORFs are devoted to only two categories, translation (34.6%) and amino acid metabolism (17.6%) (fig. S1B). In the latter, Carsonella retains many genes for biosynthesis of essential amino acids, as in Buchnera, the symbiont of aphids. Because both psyllids and aphids feed only on plant phloem sap, which is poor in essential amino acids, the similarity of gene repertoires in Carsonella and Buchnera is an intriguing example of convergence. A remarkable feature of the genome is the total loss of genes in numerous categories, including cell envelope biogenesis and the metabolism of nucleotides and lipids (fig. S1B).
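The category shares above can be checked with a little arithmetic. The sketch below back-calculates approximate ORF counts from the reported percentages; these counts are derived here for illustration and are not figures stated in the article, and the variable and function names are hypothetical.

```python
TOTAL_ORFS = 182  # reported in the text

# Percentage of ORFs per COG category, as reported in the text.
SHARES = {
    "translation": 34.6,
    "amino acid metabolism": 17.6,
}


def approx_count(percent: float, total: int) -> int:
    """Back-calculate an approximate ORF count from a percentage."""
    return round(percent / 100 * total)


# Approximate counts per category (an inference, not source data).
counts = {name: approx_count(pct, TOTAL_ORFS) for name, pct in SHARES.items()}
combined_share = sum(SHARES.values())

if __name__ == "__main__":
    for name, n in counts.items():
        print(f"{name}: about {n} of {TOTAL_ORFS} ORFs")
    print(f"combined share: {combined_share:.1f}%")
```

The combined share exceeds 50%, which is what "more than half" refers to.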

Another feature of this genome is an extremely high gene density. The protein-coding sequences and RNA genes (one 16S-23S-5S ribosomal RNA operon and 28 tRNA genes for all 20 amino acids) cover 97.3% of the genome, a gene density higher than that of any other known bacterial genome. This density is attributable to numerous overlapping genes. Of 182 ORFs, 164 (90%) overlap with at least one of the two adjacent ORFs, and the average length of all 132 overlaps is 10.7 bases. The majority (92%) are tandem overlaps on the same strand, all of which are out of frame. Moreover, the average length of Carsonella ORFs (826 bp) is notably shorter than that of other bacteria. Indeed, a comparison of 89 orthologous ORFs conserved in Carsonella and in seven bacteriocyte-restricted endosymbionts revealed that the average length of the ORFs in Carsonella is 17.8 to 18.4% shorter than the average ORF lengths of the other endosymbionts.
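Gene density and overlap statistics of this kind can be derived from a list of gene coordinates. The following is a minimal sketch under stated assumptions, not the method used in the study: coordinates are 1-based and inclusive, strand and reading frame are ignored, genes spanning the origin of the circular chromosome are not handled, and the toy coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive (assumed convention)


def adjacent_overlaps(genes: List[Interval]) -> List[int]:
    """Return overlap lengths between each gene and the next one by start position."""
    ordered = sorted(genes)
    lengths = []
    for (_, end1), (start2, end2) in zip(ordered, ordered[1:]):
        overlap = min(end1, end2) - start2 + 1
        if overlap > 0:
            lengths.append(overlap)
    return lengths


def coding_density(genes: List[Interval], genome_length: int) -> float:
    """Return the fraction of genome positions covered by at least one gene."""
    covered = 0
    current_end = 0  # last position already counted
    for start, end in sorted(genes):
        if end <= current_end:
            continue  # gene lies entirely within already-counted sequence
        covered += end - max(start, current_end + 1) + 1
        current_end = end
    return covered / genome_length


if __name__ == "__main__":
    # Hypothetical toy annotation on a 300-bp genome, not Carsonella data.
    toy_genes = [(1, 100), (91, 200), (205, 300)]
    overlaps = adjacent_overlaps(toy_genes)
    print("overlap lengths:", overlaps)
    print("mean overlap:", sum(overlaps) / len(overlaps))
    print(f"coding density: {coding_density(toy_genes, 300):.1%}")
```

Counting each covered position only once matters here: summing gene lengths directly would count overlapping bases twice and overstate the density.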

This genome is by far the most streamlined studied to date. Its gene inventory seems insufficient for most biological processes that appear to be essential for bacterial life, and possibly the host bacteriocyte compensates. Although some psyllids possess additional secondary endosymbionts that might be a source of specialized gene products, no other symbionts are present in P. venusta, based on several lines of evidence [e.g., (2)]. The genome also lacks many genes for bacterium-specific processes. One of several possible explanations for the absence of these genes is that, as in the case of organelles (5), some genes were transferred from the genome of a Carsonella ancestor to the genome of a psyllid ancestor and are now expressed under control of the host nucleus.

Supporting Online Material

Materials and Methods

Fig. S1


