Report

Complete Genome Sequence of Neisseria meningitidis Serogroup B Strain MC58


Science  10 Mar 2000:
Vol. 287, Issue 5459, pp. 1809-1815
DOI: 10.1126/science.287.5459.1809

Abstract

The 2,272,351–base pair genome of Neisseria meningitidis strain MC58 (serogroup B), a causative agent of meningitis and septicemia, contains 2158 predicted coding regions, 1158 (53.7%) of which were assigned a biological role. Three major islands of horizontal DNA transfer were identified; two of these contain genes encoding proteins involved in pathogenicity, and the third island contains coding sequences only for hypothetical proteins. Insights into the commensal and virulence behavior of N. meningitidis can be gleaned from the genome, in which sequences for structural proteins of the pilus are clustered and several coding regions unique to serogroup B capsular polysaccharide synthesis can be identified. Finally, N. meningitidis contains more genes that undergo phase variation than any pathogen studied to date, a mechanism that controls their expression and contributes to the evasion of the host immune system.

Neisseria meningitidis (meningococcus), a Gram-negative β-Proteobacterium (a class that includes Bordetella, Burkholderia, Kingella, and Methylomonas), is a cause of life-threatening invasive bacterial infections, especially in young infants. The major diseases caused by N. meningitidis, meningitis and septicemia, are a significant public health problem and are responsible for deaths and disability through epidemics in sub-Saharan Africa, and sporadic cases that are prevalent in many countries worldwide (1). There are five pathogenic N. meningitidis serogroups (A, B, C, Y, and W135) as determined by capsular polysaccharide typing (2). Some disease control has been achieved through vaccination, but its impact has been limited. The pattern of serogroup B disease is typically hyperendemic or sporadic and contrasts with the classically epidemic nature of the serogroup A disease. Strains of serogroup B are a particular problem because they are a major cause of invasive disease in Europe and the United States (3), and there is currently no effective vaccine. Sequencing the genome of strain MC58 [a serogroup B strain isolated from a case of invasive infection (4)] provides an efficient means of acquiring data relevant to the detailed molecular characterization of this pathogen. Preliminary comparison of its genome sequence to that of the unannotated genome sequence of the serogroup A strain Z2491 (5), as well as comparison of the gene complement of strain MC58 with that of Haemophilus influenzae (6), another pathogen responsible for meningitis and the first for which complete genome sequence data were available, also provides an opportunity to define a common subset of genes that may be responsible for the pathogenesis of this disease.

The complete genome sequence (GenBank accession number AE002098) was obtained by the random shotgun sequencing strategy (7). N. meningitidis strain MC58 has a genome size of 2,272,351 base pairs (bp) with an average G+C content of 51.5%. Base pair 1 of the chromosome was assigned within the putative origin of replication that was determined by the presence of a cluster of DnaA boxes, oligomer-skew (8), and G-C skew (9) analyses. The approach to genome annotation was novel in that it combined the results of open reading frame (ORF) prediction with whole-genome homology searches (10). In addition, experiments on a subset of ORFs showed that the products of 85 of these were potentially located on the surface of the meningococcus (11).
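As an illustration of the G-C skew analysis mentioned above, the following minimal Python sketch computes G+C content and a windowed cumulative skew curve, whose minimum is often taken as an estimate of the replication origin. The function names, the window size, and the use of the curve minimum are assumptions made for illustration; the exact procedure used for strain MC58 is given in the cited notes (8, 9), not here.

```python
def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return gc / acgt if acgt else 0.0


def cumulative_gc_skew(seq: str, window: int = 1000) -> list:
    """Running sum of (G - C) / (G + C) over consecutive, non-overlapping windows."""
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        total += (g - c) / (g + c) if (g + c) else 0.0
        curve.append(total)
    return curve


def skew_minimum_position(seq: str, window: int = 1000) -> int:
    """Coordinate (end of window) at which the cumulative skew is lowest.

    Assumption: the minimum of the cumulative curve is used as the
    candidate origin; this is a common convention, not the paper's stated rule.
    """
    curve = cumulative_gc_skew(seq, window)
    if not curve:
        return 0
    idx = min(range(len(curve)), key=curve.__getitem__)
    return min((idx + 1) * window, len(seq))
```

On a real chromosome the sequence would be read from the GenBank record (AE002098), and the candidate position would still need to be confirmed by independent evidence such as the DnaA box cluster.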

The genome contains four ribosomal RNA (rRNA) operons (16S-23S-5S) and 59 tRNAs with specificity for all 20 amino acids. The 2158 ORFs identified [Fig. 1; Web figure 1 and table 1 (12)] represent 83% of the genome, with an average size of 874 bp. Biological roles were assigned to 1158 ORFs (53.7%) with similarity to proteins of known function according to the classification scheme adapted from Riley (13). Three hundred and forty-five (16.0%) predicted coding sequences matched gene products of unknown function from other species, and 532 (24.7%) had no database match (www.tigr.org/tdb/mdb/mdb.html).
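The summary figures in this paragraph can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers quoted in the text; the variable names are illustrative.

```python
GENOME_BP = 2_272_351      # genome size in base pairs
N_ORFS = 2158              # predicted coding regions
MEAN_ORF_BP = 874          # average ORF size in base pairs

# Fraction of the genome covered by ORFs (ignores any overlap between ORFs).
coding_fraction = N_ORFS * MEAN_ORF_BP / GENOME_BP

# Percentages of ORFs in each annotation class, as quoted in the text.
assigned_role_pct = round(100 * 1158 / N_ORFS, 1)
conserved_unknown_pct = round(100 * 345 / N_ORFS, 1)
no_match_pct = round(100 * 532 / N_ORFS, 1)

print(f"coding fraction: {coding_fraction:.1%}")
print(assigned_role_pct, conserved_unknown_pct, no_match_pct)
```

The product of ORF count and mean ORF size, divided by the genome length, reproduces the stated 83% coding fraction, and the three class percentages match those quoted.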

Figure 1

Circular representation of the N. meningitidis strain MC58 genome. Outer circle, predicted coding regions on the plus strand color-coded by role categories [Web figure 1 and table 1 (12)]: salmon, amino acid biosynthesis; light blue, biosynthesis of cofactors, prosthetic groups, and carriers; light green, cell envelope; red, cellular processes; brown, central intermediary metabolism; yellow, DNA metabolism; green, energy metabolism; purple, fatty acid and phospholipid metabolism; pink, protein fate and synthesis; orange, purines, pyrimidines, nucleosides, and nucleotides; blue, regulatory functions; gray, transcription; teal, transport and binding proteins; black, hypothetical and conserved hypothetical proteins. Second circle, predicted coding regions on the minus strand color-coded by role categories. Third circle, predicted coding regions on the plus strand color-coded by function in virulence (Tables 1 and 2): red, acquisition; cyan, colonization; magenta, evasion; yellow, toxins; green, unknown; dark green, putative phase-variable genes; blue, candidate vaccine genes that were expressed in vitro and whose products are potentially located on the surface of the meningococcus (11); black, hypothetical and conserved hypothetical proteins. Fourth circle, predicted coding regions on the minus strand color-coded by function in virulence. Fifth circle, Neisseria USSs. Sixth circle, N. gonorrhoeae inverted repeats (33). Seventh circle, atypical nucleotide composition curve [the dinucleotide signatures analysis (29) is shown]. Eighth circle, percent G+C curve. Ninth circle, tRNAs. Tenth circle, ribosomal RNAs.

We have identified 234 families of proteins in strain MC58 (14), containing a total of 678 proteins (32% of the total number of ORFs). The largest family consists of 24 adenosine 5′-triphosphate (ATP)–binding subunits of ABC transporters. The extent of potential recent gene duplications in strain MC58 was estimated by identification of ORFs that are more similar to other ORFs within the strain MC58 genome than to ORFs from other genomes (15). This analysis revealed that out of 678 proteins present in families, 105 may have evolved through a process that involved duplication. Most of these are predicted to be involved in pathogenicity functions or transposition activities, or are of unknown function.

Strain MC58 contains 22 intact and 29 remnant insertion sequences (ISs). Most of the latter have transposase genes that contain multiple frame shifts, large deletions and/or premature termination codons, or lack associated inverted repeats. The IS families represented are IS3, IS5, IS30, and IS110 (16), and two groups belong to as yet unclassified families; one of these is related to IS1016 (17), and the second was first identified in Synechocystis sp. (18). The majority of putative functional ISs are closely related to IS4351 (19) (IS30 family), whereas the other putative functional ISs are related to IS1106A3 (20) (IS5 family). Three homologs of the pilin-associated invertases of N. gonorrhoeae (21) and Moraxella lacunata (22), which are closely related to transposases, are present. Two of these genes, pivNM-1A and pivNM-1B, are identical to one another and are similar to the N. gonorrhoeae sequence (42.4% identity), whereas the remaining gene, pivNM-2, is slightly more divergent (38.8% identity).

Neisserial DNA uptake signal sequences (USSs) for DNA transformation (23) play a role in recognition of homospecific DNA during transformation. In strain MC58, a total of 1910 USSs (5′-GCCGTCTGAA-3′) distributed throughout the genome have been identified, with an approximately equal number on each strand. Only four USSs are expected on the basis of genome sequence length and base composition. Forty-six percent of the USSs are found as inverted pairs located 3′ of ORFs, which could form stem-loop structures and function as transcriptional terminators.
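The expected number of USSs quoted above can be approximated with a simple independent-base model. The Python sketch below is an illustration under stated assumptions: bases are treated as independent, G and C are assumed equally frequent (as are A and T), both strands of the circular chromosome are counted, and the function name is hypothetical. The model used in the original analysis may differ in detail.

```python
GENOME_BP = 2_272_351       # genome size from the text
GC_FRACTION = 0.515         # average G+C content from the text
USS = "GCCGTCTGAA"          # 10-bp uptake signal sequence from the text
OBSERVED_USS = 1910         # observed count from the text


def expected_occurrences(motif: str, genome_bp: int, gc_fraction: float,
                         strands: int = 2) -> float:
    """Expected motif count under an independent-base composition model."""
    p_gc = gc_fraction / 2          # probability of G, and of C
    p_at = (1 - gc_fraction) / 2    # probability of A, and of T
    prob = 1.0
    for base in motif.upper():
        if base in "GC":
            prob *= p_gc
        elif base in "AT":
            prob *= p_at
        else:
            raise ValueError(f"unexpected base: {base!r}")
    # A circular chromosome offers one start position per base on each strand.
    return prob * genome_bp * strands


expected = expected_occurrences(USS, GENOME_BP, GC_FRACTION)
enrichment = OBSERVED_USS / expected
print(f"expected: {expected:.2f}, observed/expected: {enrichment:.0f}")
```

Under these assumptions the expectation falls between four and five occurrences, in line with the figure quoted in the text, so the 1910 observed copies represent an enrichment of several hundred fold.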

Molecular phylogenetic studies have indicated that the Neisseria genus belongs to the β subgroup of the Proteobacteria. The β-Proteobacteria are closely related to the γ-Proteobacteria, which include Vibrio, Haemophilus, and Escherichia. The availability of complete genome sequences from several proteobacterial species (H. influenzae, Helicobacter pylori, Escherichia coli, Rickettsia prowazekii, and Campylobacter jejuni) allows an evaluation of the position of Neisseria within the Proteobacteria and a better understanding of recent evolutionary events in the Neisseria lineage. Seventy-one percent of the ORFs in strain MC58 are most similar (24) to ORFs from γ-Proteobacteria (E. coli or H. influenzae), indicating that most of the genome is proteobacterial in nature and supporting the hypothesis that the β and γ subgroups share a recent common ancestor. Since the divergence of the β and γ subgroups, there must have been extensive gene loss in the H. influenzae lineage and/or gene addition and duplication in the E. coli lineage that has led to the much larger genome size in E. coli. Relative to the total number of genes in each genome, N. meningitidis is more similar to H. influenzae than to E. coli (Fig. 2). This could be due to parallel loss of genes in N. meningitidis and H. influenzae and/or to exchanges of genes between these lineages.

Figure 2

Comparison of the N. meningitidis strain MC58 ORFs to those of other completely sequenced organisms. The sequences of all proteins from each completely sequenced genome were retrieved from the National Center for Biotechnology Information, TIGR, and the Caenorhabditis elegans (wormpep16) databases. All N. meningitidis ORFs were searched against the ORFs from all other genomes with FASTA3 (24). The number of N. meningitidis ORFs whose highest similarity (p < 10⁻⁵) is to an ORF from a given species is shown in proportion to the total number of ORFs in that species. Abbreviations: AQUAE, Aquifex aeolicus; DEIRA, Deinococcus radiodurans; BACSU, Bacillus subtilis; MYCGE, Mycoplasma genitalium; MYCPN, Mycoplasma pneumoniae; MYCTU, Mycobacterium tuberculosis; BORBU, Borrelia burgdorferi; TREPA, Treponema pallidum; CHLPN, Chlamydia pneumoniae; CHLTR, Chlamydia trachomatis; ECOLI, Escherichia coli; HAEIN, Haemophilus influenzae; RICPR, Rickettsia prowazekii; HELPY, Helicobacter pylori; CAMJE, Campylobacter jejuni; SYNSP, Synechocystis sp.; THEMA, Thermotoga maritima; AERPE, Aeropyrum pernix; ARCFU, Archaeoglobus fulgidus; METJA, Methanococcus jannaschii; METTH, Methanobacterium thermoautotrophicum; PYRHO, Pyrococcus horikoshii; PYRAB, Pyrococcus abyssi; CELEG, Caenorhabditis elegans; YEAST, Saccharomyces cerevisiae.

Of the 2158 ORFs in serogroup B strain MC58, 1968 (91.2%) are similar (BLASTN P < 10⁻¹⁰) to ORFs in the serogroup A strain Z2491 (5) and are likely orthologs. Most of the 190 ORFs without similarity are hypothetical proteins. Comparison between the serogroup B strain MC58 and the serogroup A strain Z2491 genome sequences reveals a major inversion of 955 kb. The scale of this inversion is substantially greater than has been previously described in nonenteric bacteria (25) and also differs from the homologous recombination associated with rRNA operons in enteric bacteria (26). The other main differences between these two strains reside in the regions described below.

Lateral transfer of DNA between species is well documented and is often associated with the evolution of pathogenicity (27). Regions of DNA that have been obtained by lateral gene transfer are often characterized by atypical DNA composition relative to the rest of a genome (28). Therefore, percent G+C, dinucleotide signatures, and χ² analyses (29) were used to identify such regions. Three major regions of atypical nucleotide composition were identified in the strain MC58 genome; these have been designated as putative islands of horizontally transferred DNA (IHTs) (Figs. 1 and 3). IHT-A consists of two subregions: IHT-A1 (NMB0066 to NMB0074) contains the genes of the serogroup B capsulation cluster and an adenine rRNA methylase. IHT-A2 (NMB0091 to NMB0100) contains two disrupted ORFs with similarity to an ABC transporter and a secreted protein, and eight hypothetical proteins and is flanked by two disrupted copies of IS1016 (17). IHT-B contains 24 hypothetical proteins (NMB0498 to NMB0521). IHT-C contains 30 ORFs (NMB1746 to NMB1775), several of which encode proteins that may have a role in virulence. These include genes encoding three toxin/toxin-related homologs; a protein known to be immunogenic (30); one intact and three fragmented proteins previously associated with bacteriophage (31); a protein similar to a virulence-associated protein from H. pylori (32); two different, apparently intact transposases that do not flank the region; and 19 hypothetical proteins. The transposases do not form a composite transposon based on their location. One transposase is similar to that of IS4351 (19) (of which there are 14 copies in the genome), and the other is pivNM-2. IHT-B and IHT-C are devoid of neisserial USSs, and all three IHTs lack an inverted repeat identified in N. gonorrhoeae (33) that is present in 163 copies in strain MC58, supporting their foreign origin (Fig. 1).
The three IHTs in strain MC58 do not have the classical characteristics of “pathogenicity islands” (34) since transposase genes flank only IHT-A2, other regions are not flanked by inverted repeats, and none is adjacent to tRNA genes.
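To make the idea of a dinucleotide signature concrete, the Python sketch below computes a relative-abundance profile (observed dinucleotide frequency divided by the product of the component base frequencies, pooled over both strands) and the mean absolute difference between two such profiles. This formulation is an assumption chosen for illustration; the precise statistics applied to strain MC58 are those of the cited analysis (29), and the function names here are hypothetical.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASES = "ACGT"


def dinucleotide_signature(seq: str) -> dict:
    """Relative abundance f(XY) / (f(X) * f(Y)) for all 16 dinucleotides.

    Counts are pooled over the sequence and its reverse complement.
    """
    seq = seq.upper()
    strands = (seq, seq.translate(COMPLEMENT)[::-1])
    mono, di = Counter(), Counter()
    for strand in strands:
        mono.update(base for base in strand if base in BASES)
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair[0] in BASES and pair[1] in BASES:
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    signature = {}
    for x, y in product(BASES, repeat=2):
        if n_mono == 0 or n_di == 0 or mono[x] == 0 or mono[y] == 0:
            signature[x + y] = 0.0
            continue
        f_x, f_y = mono[x] / n_mono, mono[y] / n_mono
        signature[x + y] = (di[x + y] / n_di) / (f_x * f_y)
    return signature


def signature_distance(region: str, reference: str) -> float:
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a = dinucleotide_signature(region)
    sig_b = dinucleotide_signature(reference)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16
```

In a genome scan, each sliding window would be compared against the whole-genome signature, and windows with unusually large distances would be flagged as candidate regions of atypical composition.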

Figure 3

Structure of the putative islands of horizontally transferred DNA (IHTs) in the N. meningitidis strain MC58 genome. Empty boxes are hypothetical proteins and striped boxes are conserved hypothetical proteins. IHT-A1: NMB0066, adenine rRNA methylase ErmC; NMB0067 to NMB0070, capsule biosynthesis proteins SiaD, SiaC, SiaB, and SynX; NMB0071 to NMB0074, capsule export proteins CtrA, CtrB, CtrC, and CtrD. IHT-A2: NMB0097 and NMB0098, disrupted secreted protein and ABC transporter. IHT-C: NMB1747, tspB protein; NMB1750, PivNM-2; NMB1751, NMB1769, and NMB1770, transposases; NMB1753 and NMB1754, bacteriophage-related proteins; NMB1762, NMB1763, and NMB1768, toxin/toxin-related homologs.

Differences in IHTs between N. meningitidis serogroup A and serogroup B strains could suggest a different complement of putative virulence determinants. A comparison to the genome sequence of the serogroup A strain Z2491 (5) reveals that it contains only one IHT and that it is not present in the strain MC58 genome. Conversely, none of the strain MC58 IHTs are present in the strain Z2491 genome. In contrast with IHT-A1, the genes of the serogroup A capsulation cluster do not generate the same IHT signature, suggesting that the different component genes have been in Neisseria for a longer evolutionary time.

A crucial factor in the commensal and pathogenic behavior of N. meningitidis is its capacity to obtain and synthesize nutrients essential for its survival [Web figure 2 (12)]. Genome analysis supports the observation that maltose and glucose are the only two sugars utilized for energy. The uptake and initial degradation of maltose may be accomplished by a system similar to that described for Lactobacillus sanfranciscensis (35). In this bacterium, import of maltose is mediated by a maltose/H⁺ symport, rather than the ABC transporter (MalEFGK) or PEP:PTS system (MalX) characterized in E. coli and many other bacteria. Metabolism of maltose to glucose proceeds by a maltose phosphorylase (NMB0390) distinct from other α-glucosidases. This gene is associated with sequences that encode proteins involved in both sugar metabolism (NMB0389 and NMB0391) and sugar transport (NMB0388), suggesting coordinate regulation of maltose catabolism with substrate availability.

Degradation of glucose, the amino acids serine, proline, and glycine, and the organic acids acetate, gluconate, glutamate, lactate, malate, oxaloacetate, and pyruvate is accomplished by way of an intact tricarboxylic acid (TCA) cycle, previously identified in biochemical studies of N. meningitidis (36), as well as the pentose phosphate and Entner-Doudoroff pathways. N. gonorrhoeae demonstrates metabolic diversity under different growth conditions. This bacterium alters glucose metabolism in response to changes of environmental pH (37), with glucose being channeled through glycolysis, the Entner-Doudoroff pathway, or the pentose phosphate pathway, depending on pH fluctuations in the growth medium.

The binding of iron by a number of high-affinity binding proteins in the human body constrains its availability (38), and successful pathogens have evolved mechanisms to acquire iron from the host (39). Genome analysis reveals that N. meningitidis has a large number of systems for scavenging iron, including previously recognized hemoglobin, transferrin, and lactoferrin binding proteins (38), as well as additional systems for iron acquisition, including siderophore acceptor and utilization homologs whose functions have not yet been investigated.

Although both H. influenzae and N. meningitidis reside in the nasopharynx and can cause meningitis, there are metabolic differences between these organisms identified in previous biochemical characterizations (40) and extended by genome analysis. H. influenzae lacks an intact TCA cycle and the Entner-Doudoroff pathway, has far fewer systems available for the import of iron, and has a smaller percentage of genes devoted to electron transport systems than N. meningitidis. H. influenzae does, however, have a larger number of transporters devoted to the import of amino acids and carbohydrates including glucose, xylose, ribose, fucose, mannonate, and galactose, suggesting that H. influenzae derives more energy through substrate-level phosphorylation than through elaborate electron-transfer systems. The significance of the metabolic differences between N. meningitidis and H. influenzae is not immediately apparent but may, in part, be responsible for the differential abilities of these two organisms to cause disease under varying physiological conditions in the human host.

When the meningococcus causes invasive disease in a susceptible individual, the process involves invasion of the respiratory tract epithelia and the underlying endothelia of the microvascular system, followed by systemic dissemination through the bloodstream. Efficient replication of meningococci and the elaboration of cell wall and other microbial molecules, such as lipopolysaccharide (LPS) or peptidoglycan, excite inflammation in the host tissues. The outcome is life-threatening septicemia and metastatic spread to the meninges and cerebrospinal fluid (meningitis).

The genes in N. meningitidis known to be related to pathogenicity were classified into several major functional categories (Table 1). Genes encoding proteins involved in colonization of the human respiratory tract epithelium are present in strain MC58. These include genes for the type IV pilus, Opa proteins (four genes), and Opc. Pili are crucial to the niche specialization of the meningococcus because they mediate tropism specifically for human respiratory tract epithelia (41). Antigenic variation of pilus proteins is important to this function because it generates the polymorphisms that facilitate the capacity to colonize different individuals and niches within each host and to evade clearance mechanisms (42). In both the serogroup B strain MC58 and the serogroup A strain Z2491, the transcriptionally silent partial pilin-coding sequences (eight cassettes) and the expressed gene encoding the complete pilin protein are grouped together in a single locus. Recombination within these cassettes, by a process probably involving associated DNA repeat sequences, generates the antigenic variation of the polymerized pilin proteins. The clustered organization of sequences in N. meningitidis contrasts with that found in the closely related pathogen N. gonorrhoeae, in which the silent and expressed sequences are distributed throughout the genome (43). In the meningococcus, there is a single Sma/Cla repeat downstream of this locus, similar to that documented in N. gonorrhoeae (44), but the intergenic regions of the meningococcus differ in the number and distribution of RS1, RS2, and RS3 repeats (43). In particular, the intergenic region contains a relatively large number of RS3 repeats arranged as inverted pairs. These findings suggest that N. meningitidis and N. gonorrhoeae use different mechanisms and sequence substrates for generating antigenic variation of the pilus.

Table 1

Putative pathogenicity genes in N. meningitidis strain MC58, listed by functional category. Systematic gene names start with NMB (N. meningitidis serogroup B), followed by a number indicating the gene position with regard to the origin of replication.


In silico analysis of the genome sequence facilitates the identification of genes involved in surface interactions and virulence, based on specific searches for homologs of those that have been previously characterized in other pathogenic species. These include three large related proteins that contain homologies to adhesins, as well as toxin motifs and an additional type IV pilin protein (NMB0547) that is unrelated to the pilE/pilS system. This protein is distinct from the neisserial type II pilus protein previously recognized as an alternative pilus system in N. meningitidis (45).

The capsular polysaccharide and LPS that impede clearance and killing of meningococci by phagocytes are critical in the pathogenesis of invasive infection and have been previously characterized (46). In addition to the genes required for synthesis of the polysialic acid capsule of strain MC58 present in IHT-A1, there are differences compared to the serogroup A sequence in the adjacent capsule exporter genes (47). Whereas CtrA and CtrD share >95% identity in their deduced amino acids to their serogroup A homologs, the ctrB and ctrC genes, organized in an operon between ctrA and ctrD, encode proteins that have <85% amino acid identity with their homologs in the serogroup A genome. CtrB and CtrC are thought to be membrane-associated components of the capsule export apparatus (47). Such divergence may reflect specificity for the serogroup B (polysialic acid) or serogroup A (poly-2-acetamido-2-deoxy-D-mannopyranosyl phosphate) capsular polysaccharides.

Another mechanism that contributes to the evasion of the host immune system in meningococcus is phase variation that controls gene expression. Indeed, most of the recognized host-interactive factors are phase variable (48); these have been called “contingency loci” (49). The genetic basis for this variation depends on the evolution of iterative DNA motifs, especially homopolymeric tracts, to effect reversible, high-frequency molecular switching through slippage-like mechanisms (49). Although the mutation rate at these loci is high in all strains of N. meningitidis, pathogenic strains may be hypermutable at these loci due in part to genetic defects in the DNA mismatch repair process (50). The genome sequence shows that strain MC58 has a defect in the Dam methylase similar to those found in the hypermutable strains. The repeats typically associated with phase variation and their contexts have been analyzed (51) to reveal all potentially phase-variable genes in strain MC58 (Table 2). All genes previously recognized to be phase variable in meningococci were identified in strain MC58. The repertoire differs between strain MC58 and other reported strains, e.g., in the iron acquisition and LPS biosynthetic genes. A number of novel phase-variable surface proteins, restriction modification systems, and genes not previously associated with repeat-mediated phase variation, e.g., toxin secretion systems, have been identified, as well as several hypothetical proteins. Genome analysis shows that strain MC58 has a far greater number of putative phase-variable genes than have been recognized in other organisms studied to date. In particular, H. influenzae contains only 14 identified phase-variable genes, most of which are involved in virulence.
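A first-pass screen for the homopolymeric tracts described above can be sketched with a regular expression. The Python example below is only illustrative: the function names and the minimum tract length of 7 are assumptions, and the published analysis (51) also considered other repeat types and the sequence context of each repeat.

```python
import re


def find_homopolymer_tracts(seq: str, min_len: int = 7) -> list:
    """Return (start, base, length) for each run of one base of at least min_len.

    Coordinates are 0-based. The default threshold is an illustrative assumption.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One base followed by at least (min_len - 1) further copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), m.end() - m.start())
            for m in pattern.finditer(seq.upper())]


def shifts_reading_frame(length_change: int) -> bool:
    """A gain or loss of repeat units within an ORF shifts the frame
    unless the change in length is a multiple of three."""
    return length_change % 3 != 0


tracts = find_homopolymer_tracts("ATGCCCCCCCCGTAAAT")
print(tracts)  # [(3, 'C', 8)]
```

Tracts located within or immediately upstream of coding regions would then be candidates for slippage-mediated switching, since a single-base gain or loss in such a tract changes the reading frame or the spacing of promoter elements.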

Table 2

Potentially phase-variable genes in N. meningitidis strain MC58, listed by functional category.


The complete genome sequence of N. meningitidis strain MC58 provides a new starting point for the study of the pathogenesis of meningitis. The identification of a large number of putative virulence determinants that were not previously identified in meningococci and the unprecedented number of phase-variable genes provide new insights into potential mechanisms of the pathogenesis of invasive disease. On the basis of the genome sequence of strain MC58, several long-awaited candidates for vaccination against the serogroup B meningococcus have been identified and characterized experimentally (11). The development of a vaccine against H. influenzae significantly reduced the incidence of infections caused by this organism. A vaccine against the serogroup B meningococcus holds similar promise for dramatically reducing the morbidity and mortality of Neisseria-mediated meningitis.

  • * To whom correspondence should be addressed. E-mail: tettelin{at}tigr.org

  • Present address: Celera Genomics, 45 West Gude Drive, Rockville, MD 20850, USA.

