Research Article

The Composite Genome of the Legume Symbiont Sinorhizobium meliloti


Science  27 Jul 2001:
Vol. 293, Issue 5530, pp. 668-672
DOI: 10.1126/science.1060966


The scarcity of usable nitrogen frequently limits plant growth. A tight metabolic association with rhizobial bacteria allows legumes to obtain nitrogen compounds by bacterial reduction of dinitrogen (N2) to ammonium (NH4+). We present here the annotated DNA sequence of the α-proteobacterium Sinorhizobium meliloti, the symbiont of alfalfa. The tripartite 6.7-megabase (Mb) genome comprises a 3.65-Mb chromosome, and 1.35-Mb pSymA and 1.68-Mb pSymB megaplasmids. Genome sequence analysis indicates that all three elements contribute, in varying degrees, to symbiosis and reveals how this genome may have emerged during evolution. The genome sequence will be useful in understanding the dynamics of interkingdom associations and of life in soil environments.

Symbiotic nitrogen fixation is profoundly important for the environment. Most plants assimilate mineral nitrogen only from soil or added fertilizer. An alternative source powered by photosynthesis, rhizobia-legume symbioses provide a major source of fixed nitrogen. Evolution in diverse legumes of high protein content in seeds (e.g., soybean) and leaves (e.g., alfalfa) may reflect the ability of many plants in this taxon to obtain nitrogen from bacterial symbionts while growing in poor soils. Improved understanding of the rhizobia-legume symbiosis has implications for sustainable agriculture and ecosystem function. Sinorhizobium meliloti, the symbiont of alfalfa, is a focus of research both because of the symbiosis and because, as an α-proteobacterium, it is closely related to bacterial plant and animal pathogens including Agrobacterium and Brucella. Rhizobia infect roots and induce nodules, specialized organs where bacterial endosymbionts fix nitrogen within the plant cytoplasm. The bacteria and plant exchange signals during nodule development and establish an intimate metabolic exchange of bacterial fixed nitrogen for plant carbon compounds. We understand some symbiotic mechanisms, but how the microbe stimulates nodule organogenesis, how it invades the plant without triggering host defenses, and how and why the bacterium fixes nitrogen for the host rather than for its own metabolism attract considerable interest. Furthermore, for symbiosis to be a successful habit, the bacteria must maintain their populations in the soil and establish themselves competitively in the rhizosphere through adaptations that are little understood.

Sinorhizobium meliloti has been the subject of extensive genetic, biochemical, and metabolic research; this knowledge provides a solid foundation for genomic experimentation. We report here the complete and fully annotated nucleotide sequence of the S. meliloti strain 1021 genome (1), and an integrative analysis of the biology implied by the sequence (2). We also present the first global comparison between two rhizobial genomes, the S. meliloti genome and the recently reported Mesorhizobium loti genome (3). In addition to these two complete genomes, the 536-kb symbiotic plasmid of Rhizobium sp. NGR234 (4) and a 410-kb region of the chromosome of Bradyrhizobium japonicum (5) have been sequenced and annotated. Methods and detailed analyses of the S. meliloti chromosome, pSymA, and pSymB are reported concurrently (6–8).

General Features of the Genome

Main features of the genome are listed in Table 1. The S. meliloti genome consists of three replicons: one large replicon of 3.65 Mb and two smaller replicons, pSymA and pSymB, of 1.35 and 1.68 Mb, respectively (Fig. 1).

Figure 1

Linear representation of the S. meliloti genome (strain 1021). Each replicon is drawn to scale. First line: Coordinates relative to the sequence on Web site (2) (in megabases). Second line: Distribution of genes according to direction of transcription (+ strand above) and functional categories (blue, small molecule metabolism; green, macromolecule metabolism; orange, structural elements; yellow, cell processes; red, elements of external origin; pink, not classified regulators; gray, conserved hypothetical and unknown/hypothetical). Third line: Distribution of IS and phage-related sequences. Fourth line: Recently duplicated genes (those with at least 90% nucleotide identity over their entire length) including rrn operons (green arrows). Duplications of differently named genes are matched by color. Because of space constraints, the SMa0753/SMa0758 duplication at 0.41 Mb is not shown. Loci where clusters of genes are reiterated are indicated by an asterisk. Fifth line: GC% variation along the replicons with mean value as a red dotted line.

Table 1

General features of the S. meliloti strain 1021 genome.


Although one of the largest bacterial genomes (6.7 Mb) sequenced to date, the S. meliloti genome is somewhat smaller than the 7.6-Mb M. loti genome. We predict 6204 protein-coding genes from the S. meliloti genome sequence, compared with 6752 for M. loti (3). A function could be postulated for 59.7% of S. meliloti genes on the basis of database comparisons, whereas 8.2% of the S. meliloti gene products had no database match. The proportion of orphan genes was significantly higher on the megaplasmids than on the chromosome, with 11.5% on pSymA and 12.3% on pSymB (Table 1).
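The figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only numbers stated in the text; the variable and function names are illustrative, and the gene counts derived from percentages are approximations, because the published percentages are rounded.

```python
# Replicon sizes in megabases and the predicted gene total, as quoted in the text.
replicon_mb = {"chromosome": 3.65, "pSymA": 1.35, "pSymB": 1.68}
predicted_genes = 6204

# Total genome size and the fraction contributed by each replicon.
total_mb = sum(replicon_mb.values())
share = {name: size / total_mb for name, size in replicon_mb.items()}

# Average spacing: kilobases of sequence per predicted protein-coding gene.
kb_per_gene = (total_mb * 1000) / predicted_genes


def count_from_percent(percent, total=predicted_genes):
    """Approximate gene count implied by a rounded percentage of the total."""
    return round(total * percent / 100)


# Approximate counts implied by the percentages in the text.
with_postulated_function = count_from_percent(59.7)
without_database_match = count_from_percent(8.2)
```

The three replicon sizes sum to about 6.68 Mb, in line with the rounded 6.7-Mb total, and the average works out to roughly one predicted gene per kilobase.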

Contrary to expectations (9–11), the genome of S. meliloti is not highly reiterated. A limited number of genes appear to be recently duplicated, including several symbiotic genes (Fig. 1). However, the S. meliloti genome contains many ancient duplications, because 42% (2589) of S. meliloti genes belong to 548 paralogous families, ranging from 2 to 134 genes per family (2, 12). This high level of paralogy suggests that genome size has been little constrained during S. meliloti evolution, facilitating the acquisition of new adaptive functions for life in the soil and for symbiosis. This is illustrated by the rich set of transport and regulatory functions (see below).

Insertion sequence (IS) elements and phage sequences compose 2.2% of the S. meliloti genome, but their distribution varies (Table 1). Overall abundance is higher on pSymA, especially near symbiotic genes (Fig. 1), a feature similar to other rhizobial symbiotic plasmids and regions (3–5,13). This provides additional evidence that symbiotic regions are prone to DNA rearrangements (3). Twenty-one types of IS are identified on the S. meliloti genome: four are chromosome-specific, four are pSymA-specific, and one is pSymB-specific [see (2) for additional data on IS].

Replication, transfer, and maintenance of pSym megaplasmids. The unusual size of the megaplasmids raises the question of whether they are plasmids or chromosomes. pSymA and pSymB share plasmid features with Rhizobium sp. pNGR234a and Agrobacterium Ti and Ri plasmids: repABC genes were identified by sequence similarity, and a linked putative origin of replication was inferred from GC skew analysis. pSymA contains putative conjugative transfer genes (traACDG) and a putative oriT sequence, but lacks the traIRMBF and trbDJKLFH genes found on other rhizobial plasmids. Transfer experiments are required to determine whether pSymA is a transferable plasmid. pSymB lacks transfer genes, except for a paralog of the pSymA traA and oriT. The lower G + C content of pSymA (60.4%) compared with the other two replicons (Table 1) and its strikingly distinct codon usage [see Web site (2)] suggest an alien origin for pSymA.
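As an illustration of the two sequence statistics mentioned in this paragraph, the sketch below computes G + C content and a windowed GC skew, (G - C)/(G + C), together with its running sum. It is not the procedure used for the S. meliloti replicons, which is described in the companion reports; the function names, the window size, and the convention that a putative origin lies near the minimum of the cumulative skew (a pattern seen in many, but not all, bacterial replicons) are assumptions made for this example.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_skew_profile(seq, window=1000):
    """Per-window GC skew, (G - C) / (G + C), and its running sum.

    Uses non-overlapping windows; a trailing partial window is ignored.
    """
    seq = seq.upper()
    skews, cumulative, running = [], [], 0.0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        skews.append(skew)
        running += skew
        cumulative.append(running)
    return skews, cumulative


def candidate_origin(seq, window=1000):
    """Coordinate at which the cumulative skew reaches its minimum.

    Assumption: the leading strand is G-rich, so the cumulative skew
    falls until the origin and rises after it.
    """
    _, cumulative = gc_skew_profile(seq, window)
    if not cumulative:
        raise ValueError("sequence is shorter than one window")
    lowest = min(range(len(cumulative)), key=cumulative.__getitem__)
    return (lowest + 1) * window  # end of the window holding the minimum


# Toy sequence: a C-rich half followed by a G-rich half.
toy = "CCCA" * 5 + "GGGA" * 5
toy_origin = candidate_origin(toy, window=4)
```

On the toy sequence the cumulative skew falls across the C-rich half and rises across the G-rich half, so the candidate position is the junction between them. On a real replicon the window would be far larger, and any predicted origin would need support from linked features such as the repABC genes.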

No essential gene could be predicted on pSymA, consistent with previous data (14). However, essential genes are present on pSymB, including the arginine tRNA, Arg tRNA CCG; the minCDE cell division genes, which may also be essential; and two candidate genes for asparagine synthesis (asn), one of which should be required for growth in minimal medium (8). Therefore pSymA is clearly plasmidlike, whereas pSymB has several chromosomal features.

Transport functions. Genes encoding transport systems constitute the largest (12%) class of genes in the S. meliloti genome. Most of these are ABC transporters (Fig. 2), as is the case in other bacterial genomes. Their relative abundance is particularly high (17.4%) on pSymB, where almost all are predicted to be import systems (8). Thus, pSymB plays a prominent role in importing small molecules. Rht transporters (hydroxylated amino acid efflux proteins) are unexpectedly abundant (12 members) in S. meliloti. No phosphoenolpyruvate sugar phosphotransferase (PTS) transport system was found, implying that sugars are transported and subsequently phosphorylated by cytoplasmic sugar kinases that are encoded by the chromosome and pSymB.

Figure 2

Distribution of transport and regulatory proteins by replicon. pSymA is blue, pSymB is yellow, chromosome is green.

Regulatory proteins. Regulatory genes make up a substantial fraction (8.7%) of the S. meliloti genome, especially the megaplasmids (Table 1). The LysR family (86 members) predominates, particularly on pSymA (Fig. 2). GntR regulators are more frequently found on megaplasmids, whereas the AsnC family is more common on the chromosome. Thus, each replicon has a distinct regulatory gene profile.

With only seven members, σ54-dependent transcriptional regulators constitute a small family in S. meliloti. A single “quorum-sensing” system (SMc00168, SMc00170) was found. We identified 36 response regulators and 37 histidine kinases, but no serine-threonine kinases. Thus far, S. meliloti encodes the most nucleotide cyclases (26 members) of any bacterial genome [see (7) for a detailed analysis]. We identified 14 putative RNA polymerase sigma factor genes, most belonging to the extracytoplasmic function (ECF) subfamily. Like Caulobacter crescentus (15) and M. loti (3), S. meliloti lacks an rpoS gene.

Bacterial adhesion and surface structural elements. How rhizobia adhere to plant root hairs is poorly understood. We identified one putative adhesin (SMc01708), two agglutinin-like genes (SMc00638 and SMc00639), and an ABC transporter (SMa0950 to SMa0953) resembling the attA1A2BC attachment genes of A. tumefaciens (16). Two previously unknown pili were postulated: a type T pilus system similar to the virB-encoded type IV system of Agrobacterium [see (6)], and one strikingly similar to the Caulobacter crescentus pilus (17), encoded by two sets of homologous genes (pilA/cpa) located on the chromosome and pSymA. Sinorhizobium meliloti lacks a type III secretion system, unlike Rhizobium NGR234 (4), M. loti (3), and B. japonicum (5). Therefore, use of type III secretion systems to infect plant cells is not a universal strategy among rhizobia and instead may play a role in host-specificity (18).

Sinorhizobium meliloti surface polysaccharides, including exopolysaccharides (EPSs), lipopolysaccharides (LPSs), capsular polysaccharides (CPSs), and cyclic β-glucans, encoded mainly by the chromosome and pSymB, are crucial for successful plant infection, possibly by suppressing plant defense responses (19). As many as 12% of pSymB genes may be involved in polysaccharide biosynthesis. We identified two new loci on the chromosome (7) and nine on pSymB (8). It will be interesting to find out the roles that these surface modification genes play in the interaction between the bacterium, the plants with which it comes in contact, and different soil environments.

Nodulation. In S. meliloti, nodulation genes required for the synthesis and export of Nod factors are located on pSymA. Our analysis sheds new light on the possible origin of these genes in S. meliloti.

We found two highly conserved duplications of nod genes in the genome (Fig. 1). nodM is 99% identical at the nucleotide sequence level to glmS, encoding d-glucosamine synthetase, which suggests that nodM emerged recently from duplication of the housekeeping chromosomal glmS. Each megaplasmid carries a copy of nodPQ, which is 99% conserved at the nucleotide level (9) and is involved in the activation of sulfate to 3′-phosphoadenosine 5′-phosphosulfate for sulfation of Nod factors in S. meliloti. Vestiges of an IS element next to the pSymA copy of nodPQ (20) suggest that this copy arose from transposition of an ancestral pSymB copy. In addition to these duplications, we discovered that nodG is a paralog of the housekeeping chromosomal fabG. Overall, sequence analysis suggests that the S. meliloti nod genes have two distinct origins: horizontal gene transfer, mediated by import of pSymA from an unknown bacterium, and resident gene duplication.

Nitrogen fixation and nitrogen metabolism. Nitrogen metabolism is a prominent feature encoded by the S. meliloti genome, particularly pSymA. Whereas nitrogenase synthesis and activity require up to 20 nif genes in Klebsiella pneumoniae, only nine nif genes are found in the S. meliloti genome (nifA, nifB, nifHDKE, nifX, nifN, and nifS). Except for a likely nifS ortholog on the chromosome and a possible nifV gene (SMc02546), all of these genes are located on pSymA. Although nifQ, nifZ, and nifW are found in Rhizobium sp. NGR234 (4) and M. loti (3), no homologs were found in S. meliloti.

Besides nitrogen fixation genes, pSymA carries glutamate dehydrogenase (gdhA), a full subset of genes necessary for denitrification (nos, nor, and nap), and nitrate transport genes. The chromosome bears the known ntrBC, glnB, glnA, and glnT genes; an alanine dehydrogenase (ald); the ammonium transporter amtB; the regulatory proteins ntrXY, glnE, glnK, and glnD; the GOGAT glutamate synthase system gltBD; and three previously unknown glutamine synthetase homologs. pSymB encodes a nitrate reductase (narB), two nitrate transporters (nrtA, SMb20436), and a single glutamine synthetase glnII.

Energy metabolism in relation to symbiosis. Sinorhizobium meliloti is an aerobic bacterium that must generate high levels of energy to support nitrogen fixation in the low-oxygen environment of the nodule. A previously characterized cytochrome c oxidase of the cbb3 type with high affinity for oxygen is encoded by two sets of duplicated fixNOQP genes on pSymA. Analysis revealed an additional, less-conserved copy of the fixNOQP cluster. Both pSymA and the chromosome carry a large NADH-ubiquinone dehydrogenase gene cluster that may enhance energy metabolism in symbiosis, possibly along with the fixNOQP-encoded cbb3 oxidase. pSymA also encodes two formate dehydrogenases (6).

Comparison of the S. meliloti genome to other rhizobial genomes. We compared (21) the predicted protein content of the S. meliloti genome with that of the recently sequenced M. loti genome (Fig. 3). Several conclusions emerged from this comparison: (i) Thirty-five percent of M. loti genes have no ortholog in S. meliloti; (ii) the genetic information carried by pSymA or pSymB in S. meliloti is dispersed in the M. loti genome; (iii) the M. loti MAFF303099 symbiotic island contains, besides nodulation and nitrogen fixation genes, genes that have no ortholog in the S. meliloti genome. Similarly, a high proportion (54%) of the 536-kb Rhizobium sp. NGR234 symbiotic plasmid genes have no ortholog in S. meliloti and those which do are distributed over the three S. meliloti replicons [see figure on Web site (2)]. Altogether these observations indicate that rhizobia, despite their taxonomic relatedness and symbiotic habit, differ significantly in gene content and organization. It is not known whether different isolates of a particular species will likewise show a high degree of genetic diversity. Further work will be needed to determine whether conserved and varying genes relate to adaptations for particular plant rhizospheres, for other environmental conditions, or for other adaptations not yet defined.

Figure 3

Comparison of M. loti and S. meliloti predicted proteins. The M. loti genome from bp 1 to 7 Mb is distributed along the x axis. In any given window along the x axis, the proportion within that window that has a significant match [see (21)] in the S. meliloti genome is displayed, and the color indicates the location of the match: blue for pSymA, yellow for pSymB, and green for the chromosome. White represents the proportion that has no global match to S. meliloti. Arrows indicate the M. loti symbiotic island.
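The windowed summary described in this caption can be sketched as follows. This is only an illustration of the bookkeeping, not the method of reference (21): the function name, the input layout (one record per M. loti gene giving a 0-based start coordinate and the S. meliloti replicon of its match, or None when there is no match), and the toy records are all hypothetical.

```python
from collections import Counter


def window_match_proportions(genes, window_bp, genome_bp):
    """Per-window proportions of genes by location of their match.

    genes: iterable of (start_bp, match), where start_bp is a 0-based
    coordinate below genome_bp and match is a replicon name or None.
    Returns one dict per window mapping each category to its proportion.
    """
    n_windows = -(-genome_bp // window_bp)  # ceiling division
    counts = [Counter() for _ in range(n_windows)]
    for start_bp, match in genes:
        counts[start_bp // window_bp][match or "none"] += 1
    proportions = []
    for window_counts in counts:
        total = sum(window_counts.values())
        proportions.append(
            {label: n / total for label, n in window_counts.items()}
            if total else {}
        )
    return proportions


# Hypothetical records for a 200-bp toy genome split into 100-bp windows.
toy_genes = [
    (10, "chromosome"),
    (20, "pSymA"),
    (30, None),
    (40, "chromosome"),
    (150, "pSymB"),
]
toy_profile = window_match_proportions(toy_genes, window_bp=100, genome_bp=200)
```

Each dict in the result corresponds to one column of a stacked plot of this kind, with the "none" category playing the role of the white region.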

Conclusion and Perspectives

Determination of the S. meliloti 1021 genome sequence shows that it has a composite architecture, consisting of three replicons with distinctive structural and functional features. We interpret this as a consequence of its recent emergence. Both structural and gene function analyses are consistent with the hypothesis that the two megaplasmids were acquired separately by an ancestor whose genome consisted of a single chromosome. pSymA was acquired more recently, in evolutionary terms, as indicated by its distinctive GC% and codon usage, its paucity of Rhizobium-specific intergenic mosaic elements (RIMEs) and ABC elements, and the specificity of its IS content. pSymB acquisition probably preceded that of pSymA or may have resulted from a chromosomal excision event. However, distinct features of pSymB, including gene specialization, low abundance of IS elements, and a high proportion of orphan genes (Table 1), argue against a chromosomal origin for pSymB.

It is tempting to speculate how acquisition of the megaplasmids by the ancestral rhizobium widened its metabolic capacities and environmental adaptability. The chromosome of S. meliloti is that of a typical aerobic, heterotrophic bacterium. Acquisition of pSymB considerably extended the metabolic capabilities of the microbe by allowing it to metabolize a large variety of small compounds encountered in the soil or in the plant rhizosphere. An increased capacity in synthesizing polysaccharides may also have significantly improved the colonization potential of these microbes. Finally, acquisition of pSymA led to the emergence of nodulation, as well as the bacterium's capacity to colonize the low-oxygen environment of the nodule. pSymA also expanded the capacity to metabolize nitrogen compounds under a variety of chemical forms, including molecular dinitrogen. Such speculation may offer new perspectives for microbial evolution and for identifying the origins of the rhizobium-legume symbiosis. The complete S. meliloti genome sequence and its detailed annotation create opportunities for an expanded analysis of symbiotic nitrogen fixation by allowing researchers to focus on specific metabolic and regulatory circuits. Functional analyses of the S. meliloti genome will lead to further insights in understanding this and other rhizobium-legume symbioses.

  • Present address: Institut Curie, 26 rue d'Ulm, 75005 Paris, France.

  • Present address: Exelixis, Inc., 170 Harbor Way, Post Office Box 511, South San Francisco, CA 94083–0511, USA.

  • § Present address: Incyte Genomics, 3160 Porter Drive, Palo Alto, CA 94304, USA.

  • || Present address: Département de Biologie Moléculaire Sciences 2, Université de Genève, Geneva, Switzerland 1211.

  • Present address: Institute of BioAgricultural Sciences, Academia Sinica, Nankang, Taipei, Taiwan 11529.

