Research Article

Design and synthesis of a minimal bacterial genome

Science  25 Mar 2016:
Vol. 351, Issue 6280, aad6253
DOI: 10.1126/science.aad6253
  1. Four design-build-test cycles produced JCVI-syn3.0.

    (A) The cycle for genome design, building by means of synthesis and cloning in yeast, and testing for viability by means of genome transplantation. After each cycle, gene essentiality is reevaluated by global transposon mutagenesis. (B) Comparison of JCVI-syn1.0 (outer blue circle) with JCVI-syn3.0 (inner red circle), showing the division of each into eight segments. The red bars inside the outer circle indicate regions that are retained in JCVI-syn3.0. (C) A cluster of JCVI-syn3.0 cells, showing spherical structures of varying sizes (scale bar, 200 nm).

  2. Fig. 1 The JCVI DBT cycle for bacterial genomes.

    At each cycle, the genome is built as a centromeric plasmid in yeast, then tested by transplantation of the genome into an M. capricolum recipient. In this study, our main design objective was genome minimization. Starting from syn1.0, we designed a reduced genome by removing nonessential genes, as judged by global Tn5 gene disruption. Each of eight reduced segments was tested in the context of a seven-eighths syn1.0 genome and in combination with other reduced segments. At each cycle, gene essentiality was reevaluated by Tn5 mutagenesis of the smallest viable assembly of reduced and syn1.0 segments that gave robust growth.

  3. Fig. 2 Strategy for whole-genome synthesis.

    Overlapping oligonucleotides (oligos) were designed, chemically synthesized, and assembled into 1.4-kbp fragments (red). After error correction and PCR amplification, five fragments were assembled into 7-kbp cassettes (blue). Cassettes were sequence-verified and then assembled in yeast to generate one-eighth molecules (green). The eight one-eighth molecules were amplified by rolling circle amplification (RCA) and then assembled in yeast to generate the complete genome (orange).
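
The size hierarchy in this caption can be checked with a little arithmetic. The sketch below is only a rough estimate: it assumes nominal sizes, ignores the overlaps between adjacent pieces, and treats the eight segments as equal in length, none of which the caption states. The genome size is the syn3.0 figure quoted in the Fig. 4 caption; the function name `assembly_plan` is hypothetical.

```python
import math

FRAGMENT_BP = 1_400          # nominal fragment size from the caption
FRAGMENTS_PER_CASSETTE = 5   # five fragments per cassette
SEGMENTS = 8                 # one-eighth molecules per genome


def assembly_plan(genome_bp):
    """Estimate piece counts for a hierarchical assembly.

    Assumption: equal-sized segments and no overlaps, so the counts
    are approximate and not the authors' actual numbers.
    """
    cassette_bp = FRAGMENT_BP * FRAGMENTS_PER_CASSETTE
    segment_bp = genome_bp / SEGMENTS
    cassettes_per_segment = math.ceil(segment_bp / cassette_bp)
    return {
        "cassette_bp": cassette_bp,
        "segment_bp": segment_bp,
        "cassettes_per_segment": cassettes_per_segment,
        "total_cassettes": cassettes_per_segment * SEGMENTS,
    }


plan = assembly_plan(531_560)  # syn3.0 size quoted in the Fig. 4 caption
print(plan)
```

Five 1.4-kbp fragments give the 7-kbp cassette size stated in the caption. The per-segment cassette count is an estimate under the equal-segment assumption.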

  4. Fig. 3 Classification of gene essentiality by transposon mutagenesis.

    (A) Examples of the three gene classifications, based on Tn5 mutagenesis data. The region of syn1.0 from sequence coordinates 166,735 to 170,077 is shown. The gene MMSYN1_0128 (lime arrow) has many Tn5 inserts at passage 0 (P0; black triangles) and is an i-gene (quasi-essential). The next gene, MMSYN1_0129 (light blue arrow), has no inserts and is an e-gene (essential). The last gene, MMSYN1_0130 (gray arrow), has inserts at both P0 (black triangles) and passage 4 (P4; magenta triangles) and is an n-gene (nonessential). Intergenic regions are indicated by black lines. (B) The number of syn1.0 genes in each Tn5-mutagenesis classification group. The n- and in-genes are candidates for deletion in reduced genome designs.
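
The three classes illustrated in panel (A) can be expressed as a simple decision rule. The sketch below is a minimal illustration, assuming that only the presence or absence of P0 and P4 inserts matters. The caption gives no numerical thresholds, and the intermediate in-gene group mentioned in panel (B) is not modeled. The function name and the insert counts in the usage lines are hypothetical.

```python
def classify_gene(p0_inserts, p4_inserts):
    """Classify a gene from Tn5 insert counts (illustrative rule only).

    e: no inserts at either passage (essential)
    i: inserts at P0 but none at P4 (quasi-essential)
    n: inserts persist at P4 (nonessential)
    Assumption: any nonzero count is treated as "has inserts".
    """
    if p0_inserts < 0 or p4_inserts < 0:
        raise ValueError("insert counts must be non-negative")
    if p0_inserts == 0 and p4_inserts == 0:
        return "e"
    if p4_inserts == 0:
        return "i"
    return "n"


# Hypothetical counts that mimic the three genes shown in panel (A).
examples = {
    "MMSYN1_0128": (12, 0),
    "MMSYN1_0129": (0, 0),
    "MMSYN1_0130": (9, 4),
}
for gene, (p0, p4) in examples.items():
    print(gene, classify_gene(p0, p4))
```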

  5. Fig. 4 The three DBT cycles involved in building syn3.0.

    This detailed map shows syn1.0 genes that were deleted or added back in the various DBT cycles leading from syn1.0 to syn2.0 and finally to syn3.0 (compare with fig. S7). The long brown arrows indicate the eight NotI assembly segments. Blue arrows represent genes that were retained throughout the process. Genes that were deleted in both syn2.0 and syn3.0 are shown in yellow. Green arrows (slightly offset) represent genes that were added back. The original RGD1.0 design was not viable, but a combination of syn1.0 segments 1, 3, 4, and 5 and designed segments 2, 6, 7, and 8 produced a viable cell, referred to as RGD2678. Addition of the genes shown in green resulted in syn2.0, which has eight designed segments. Additional deletions, shown in magenta, produced syn3.0 (531,560 bp, 473 genes). The directions of the arrows correspond to the directions of transcription and translation.

  6. Fig. 5 Map of proteins in syn3.0 and homologs found in other organisms.

    Searches using BLASTP software were performed for all syn3.0 protein-coding genes against a panel of 14 organisms ranging from non-Mycoides mycoplasmas to Archaea. An E-value of 1e−5 was used as the similarity cutoff. From left to right, five classes (equivalog, 232 genes; probable, 58 genes; putative, 34 genes; generic, 84 genes; and unknown, 65 genes) proceed from nearly complete certainty about a gene’s activity (equivalog) to no functional information (unknown). White space indicates no homologs to syn3.0 in that organism.

  7. Fig. 6 Partition of genes into four major functional groups.

    Syn3.0 has 473 genes. Of these, 79 have no assigned functional category (Table 1). The remainder can be assigned to four major functional groups: (i) expression of genome information (195 genes); (ii) preservation of genome information (34 genes); (iii) cell membrane structure and function (84 genes); and (iv) cytosolic metabolism (81 genes). The percentage of genes in each group is indicated.
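
The percentages referred to in this caption follow directly from the gene counts. The short sketch below recomputes them and checks that the four groups plus the unassigned genes account for all 473 genes. The dictionary and function names are illustrative, not taken from the paper.

```python
TOTAL_GENES = 473

# Gene counts as stated in the caption.
GROUPS = {
    "expression of genome information": 195,
    "preservation of genome information": 34,
    "cell membrane structure and function": 84,
    "cytosolic metabolism": 81,
    "unassigned": 79,
}


def percentages(groups, total):
    """Return each group's share of the genome, in percent (1 decimal)."""
    return {name: round(100 * count / total, 1) for name, count in groups.items()}


# The counts should partition the whole genome.
assert sum(GROUPS.values()) == TOTAL_GENES

shares = percentages(GROUPS, TOTAL_GENES)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

With these counts the shares are roughly 41%, 7%, 18%, 17%, and 17%.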

  8. Fig. 7 Comparison of syn1.0 and syn3.0 growth features.

    (A) Cells derived from 0.2 μm–filtered liquid cultures were diluted and plated on agar medium to compare colony size and morphology after 96 hours (scale bars, 1.0 mm). (B) Growth rates in liquid static culture were determined using a fluorescent measure (relative fluorescent units, RFU) of double-stranded DNA accumulation over time (minutes) to calculate doubling times (td). Coefficients of determination (R²) are shown. (C) Native cell morphology in liquid culture was imaged in wet mount preparations by means of differential interference contrast microscopy (scale bars, 10 μm). Arrowheads indicate assorted forms of segmented filaments (white) or large vesicles (black). (D) Scanning electron microscopy of syn1.0 and syn3.0 (scale bars, 1 μm). The picture on the right shows a variety of the structures observed in syn3.0 cultures.
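
One common way to obtain a doubling time and a coefficient of determination from a curve like the one in panel (B) is a least-squares line through ln(RFU) versus time. The sketch below shows that approach under the assumption of purely exponential growth. The caption does not describe the authors' fitting procedure, so this is not necessarily their method, and the data are synthetic with an arbitrary 120-minute doubling time.

```python
import math


def doubling_time(minutes, rfu):
    """Fit ln(RFU) = intercept + slope * t and return (td, R^2).

    td = ln(2) / slope, with R^2 computed on the log scale.
    Assumes every reading comes from the exponential growth phase.
    """
    ys = [math.log(v) for v in rfu]
    n = len(minutes)
    mean_x = sum(minutes) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in minutes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(minutes, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(minutes, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.log(2) / slope, 1 - ss_res / ss_tot


# Synthetic readings (not data from the paper): signal doubles every 120 min.
times = list(range(0, 600, 60))
signal = [100 * 2 ** (t / 120) for t in times]
td, r2 = doubling_time(times, signal)
print(f"td = {td:.1f} min, R^2 = {r2:.4f}")
```

Real fluorescence readings include lag and stationary phases, so in practice only the log-linear portion of the curve would be passed to such a fit.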

  9. Fig. 8 Reorganization of gene order in segment 2.

    Genes involved in the same process were grouped together in the design for “modularized” segment 2. At the far left, the gene order of syn1.0 segment 2 is indicated. Genes deleted in syn3.0 are indicated by faint gray lines. Retained genes are indicated by colored lines matching the functional categories to which they belong, which are shown on the right. Each line connects the position of the corresponding gene in syn1.0 with its position in the modularized segment 2. Black lines represent intergenic sequences containing promoters or transcriptional terminators.

  10. Fig. 9 Gene content and codon usage principles, tested using the DBT cycle.

    (A) Secondary structure of the modified rrs gene that was successfully incorporated into the syn3.0 genome; this gene carried M. capricolum mutations and had its h39 (inset) swapped with that of E. coli. Positions with nucleotide changes are indicated by red arrows, and E. coli numbering is used to indicate the position of M. capricolum mutations. (B) The sequences of the essential genes era, recO, and glyS were modified in three different ways: using the M. mycoides codon adaptation index (CAI) with TGG encoding tryptophan, the E. coli CAI with TGG encoding tryptophan, or the E. coli CAI with TGA encoding tryptophan. GC content of the wild-type and modified genes is noted. The JCat codon adaptation tool (www.jcat.de) was used to optimize the three open reading frames, with the exception of the overlapping gene fragment. Green and purple indicate wild-type and codon-optimized sequences, respectively.
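
Two quantities in panel (B) are easy to illustrate: the GC content of a sequence and the choice of tryptophan codon. In the mycoplasma genetic code TGA encodes tryptophan rather than a stop, so an in-frame TGA can be swapped for TGG (or the reverse) without changing the protein. The sketch below is a toy illustration only. It does not reproduce JCat or any CAI-based optimization, and the function names and the short sequence are hypothetical.

```python
TRP_CODONS = ("TGA", "TGG")  # both read as tryptophan in the mycoplasma code


def gc_content(seq):
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def recode_tryptophan(orf, codon="TGG"):
    """Rewrite every in-frame tryptophan codon of an ORF as `codon`."""
    orf = orf.upper()
    if codon not in TRP_CODONS:
        raise ValueError("codon must be TGA or TGG")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return "".join(codon if c in TRP_CODONS else c for c in codons)


toy_orf = "ATGTGAAAATGGTAA"  # hypothetical: Met-Trp-Lys-Trp-stop
as_tgg = recode_tryptophan(toy_orf, "TGG")
print(as_tgg, f"{gc_content(toy_orf):.1f}% -> {gc_content(as_tgg):.1f}% GC")
```

Switching TGA to TGG raises GC content slightly, which is one reason the caption reports GC content alongside each recoding scheme.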