Assembly principles and structure of a 6.5-MDa bacterial microcompartment shell


Science  23 Jun 2017:
Vol. 356, Issue 6344, pp. 1293-1297
DOI: 10.1126/science.aan3289

How to make a protein-based nanocontainer

Bacterial microcompartments are to bacteria what membrane-bound organelles are to eukaryotic cells. They are specialized subcellular compartments for colocalizing enzymes to enhance reaction rates, protect sensitive proteins, and sequester toxic intermediates. Sutter et al. determined the atomic-resolution structure of a complete 6.5-megadalton bacterial microcompartment shell. The shell is composed of hundreds of copies of five distinct proteins that form hexamers, pentamers, and three types of trimers. The assembly principles revealed by the structure provide the basis to rationally manipulate self-assembly in native and engineered systems and could help, for example, in the design of subcellular nanoreactors.

Science, this issue p. 1293


Many bacteria contain primitive organelles composed entirely of protein. These bacterial microcompartments share a common architecture of an enzymatic core encapsulated in a selectively permeable protein shell; prominent examples include the carboxysome for CO2 fixation and catabolic microcompartments found in many pathogenic microbes. The shell sequesters enzymatic reactions from the cytosol, analogous to the lipid-based membrane of eukaryotic organelles. Despite available structural information for single building blocks, the principles of shell assembly have remained elusive. We present the crystal structure of an intact shell from Haliangium ochraceum, revealing the basic principles of bacterial microcompartment shell construction. Given the conservation among shell proteins of all bacterial microcompartments, these principles apply to functionally diverse organelles and can inform the design and engineering of shells with new functionalities.

Bacterial microcompartments (BMCs) are large, proteinaceous shells encapsulating enzymes. The first discovered, carboxysomes, enhance carbon fixation (1). The BMC shell is a singular example of a primitive, conserved yet functionally diverse bioarchitecture. Recent bioinformatic surveys of bacterial genomes have revealed the presence of genes encoding shell proteins in 23 different bacterial phyla, encapsulating segments of functionally diverse metabolic pathways (2). The major components of BMC shells are cyclic hexamers with a pronounced concave-versus-convex sidedness (3). These proteins, referred to as BMC-H, contain a single BMC (pfam00936) domain (Fig. 1A, blue). A derivative of BMC-H proteins, BMC-T, is a fusion of two BMC domains forming trimers or pseudohexamers (Fig. 1A, green). Some members of the BMC-T family are known to form tightly appressed, stacked dimers of trimers, containing a central cavity (4, 5) (Fig. 1A, BMC-T2 and BMC-T3). BMC-P proteins belong to pfam03319; they are structurally unrelated to the BMC/pfam00936 domain and form pentamers shaped like a truncated pyramid (6) (Fig. 1A, yellow). Despite detailed structural knowledge of the individual shell components, the architectural principles governing shell self-assembly are unknown.

Fig. 1 Overview of the components and overall structure of the BMC shell.

(A) Surface representation and dimensions of a side view (top row) and of the concave face (bottom row) of the structures of hexameric BMC-H (blue), trimeric BMC-T (green), and pentameric BMC-P (yellow) proteins that constitute the shell. The BMC-T2 and BMC-T3 proteins each consist of two closely appressed pseudohexamers. The BMC-P structure was extracted from the whole-shell structure and BMC-H and BMC-T1 from previously determined crystal structures (Protein Data Bank 5DJB and 5DIH, respectively). BMC-T2 and BMC-T3 are crystal structures determined in this study. (B) SDS-PAGE of purified H. ochraceum BMC shells. MM, molecular mass. (C) Overview of the 8.7 Å cryo-EM structure colored by shell protein. (D) Surface representation of the crystal structure with a color gradient by distance from center (light to dark from inside to outside) (left) and cross section through the center (right). (E) Close-up of the icosahedral asymmetric unit (dashed line), with symmetry axes indicated with solid symbols and pseudo threefold symmetry with open triangles. Only one stack is shown for the BMC-T protein.

Using a recombinant system containing all of the facet proteins (one BMC-H and three BMC-T paralogs) and one of the three BMC-P proteins of the myxobacterium Haliangium ochraceum BMC (Fig. 1, A and B) (7), we produced homogeneous 40-nm BMC shells with a molecular mass of 6.5 MDa. We crystallized a complete closed particle and determined its structure to a resolution of 3.5 Å [CC1/2 (8) of 26%, table S1]. A cryo–electron microscopy (cryo-EM) map at a resolution of 8.7 Å (Fig. 1C) was used to place individual structures and phase the crystallographic data. To facilitate the interpretation of our data, we also determined the crystal structures of the pseudohexameric BMC-T2 and BMC-T3 proteins (table S1).

The coexpressed shell proteins self-assemble into a pseudo T = 9 icosahedral shell (designated pseudo because not all subunits are identical), with a diameter of ~400 Å (Fig. 1D), where T represents the triangulation number. The shell consists of 12 BMC-P pentamers at the vertices; the facets are formed by 60 BMC-H hexamers enclosing 20 BMC-T pseudohexamers of the three different paralogous types (Fig. 1A, green). This stoichiometry is in agreement with what we observe for purified shells on SDS–polyacrylamide gel electrophoresis (SDS-PAGE) (Fig. 1B) and previous analyses (7). The icosahedral asymmetric unit consists of one BMC-P chain, six BMC-H chains, and one BMC-T chain (two chains for the double-stacking type) (Fig. 1E). Model building was facilitated by the available high-resolution structures of the hexamer (9), the pseudohexamers [(10) and this work], and the 30-fold noncrystallographic symmetry, which collectively resulted in good model fit and geometry (for sample electron density, see fig. S1A). Because three different proteins can occupy the BMC-T positions, this density is representative of a mixture. Owing to the structural similarity among all three BMC-T proteins, we can confidently place a protein model (we chose BMC-T2 because of overall fit). The resulting shell facets consist of a single layer with a thickness of 20 to 30 Å, with one of the trimers of BMC-T2 and BMC-T3 protruding to the outside (Fig. 1D). The complete shell structure answers the fundamental questions of whether the shell is single or double layered, how stacked pseudohexamers are accommodated, and how the individual subunits are oriented. For the pentamers, the broader side (the base of the pyramid) faces outward. In the facets, the concave sides of BMC-H and BMC-T1 (pseudo) hexamers (containing the N and C termini) face outward. Likewise, the lower trimers of the double-stacking BMC-T2 and BMC-T3 pseudohexamers are in the same (concave-out) orientation but, owing to a circular permutation, their N and C termini face the inside. Given that the outside of the structure provides the interface with cytosolic metabolism, knowledge of the location of the polypeptide termini and the sidedness of the shell proteins is crucial for understanding and manipulating the function of BMCs in their native context, as well as for engineering synthetic microcompartments.
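The stoichiometry quoted above can be checked against the standard bookkeeping for an icosahedral shell of triangulation number T (12 pentamers, 10(T − 1) hexameric positions, 60T quasi-equivalent subunit positions). The following is a minimal sketch of that check, not part of the reported analysis; the function and variable names are illustrative, and each BMC-T chain is assumed to contribute two BMC domains to the single shell layer, as described in the text.

```python
def icosahedral_counts(t_number: int) -> dict:
    """Capsomer bookkeeping for an icosahedral shell of triangulation number T."""
    return {
        "pentamers": 12,                           # one per icosahedral vertex
        "hexamer_positions": 10 * (t_number - 1),  # hexameric or pseudohexameric sites
        "domain_positions": 60 * t_number,         # quasi-equivalent subunit positions
    }


counts = icosahedral_counts(9)

# Values stated in the text for the H. ochraceum shell.
bmc_t = 20                                   # pseudohexamers (trimers of two-domain chains)
bmc_h = counts["hexamer_positions"] - bmc_t  # remaining hexameric sites are BMC-H

# Domains in the single shell layer:
# 5 per pentamer, 6 per hexamer, 3 chains x 2 BMC domains per pseudohexamer.
domains = counts["pentamers"] * 5 + bmc_h * 6 + bmc_t * 3 * 2

# Asymmetric unit: 1 BMC-P chain + 6 BMC-H chains + 1 BMC-T chain (2 domains).
asym_unit_domains = 1 + 6 + 2

print(bmc_h, domains, asym_unit_domains * 60)
```

The count covers only the layer that forms the shell; the second, protruding trimer of the double-stacking BMC-T2 and BMC-T3 proteins adds chains but no further lattice positions.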

There are four distinct interfaces in the intact shell (Fig. 2): two different hexamer-hexamer interactions (Fig. 2, A and B), the hexamer-pentamer interaction (Fig. 2C), and the hexamer-pseudohexamer interaction (Fig. 2D). The hexamers connecting pentamers between two vertices of the intact shell (Fig. 2A) are in a side-by-side, planar orientation, whereas the hexamers surrounding the pentamers (Fig. 2B) are tilted by 30°. Considering the high structural conservation among all hexamer and pentamer proteins (fig. S2, A and B), these orientations are likely universal among BMCs. The hexamer in the shell is slightly compressed on the edge adjoining the pentamer, as revealed by superimposing it on the structure of the hexamer determined in isolation (9) [fig. S2C and Fig. 1D, where the edge facing the pentamer bulges outward (darker color)]. This distortion illustrates why computationally modeling such a large, multiprotein complex on the basis of individual crystal structures would likely fail to result in an accurate model.

Fig. 2 Overview of the four distinct interfaces between the pentamer, hexamers, and pseudohexamers.

Structures are shown in cartoon view (surface view as gray background), with a pictogram showing their location on the shell. (A) Coplanar hexamer-hexamer interface connecting two pentamer vertices. (B) Hexamer-hexamer interface as observed surrounding the pentamer. (C) Hexamer-pentamer interface. (D) Hexamer-pseudohexamer interface.

Structurally, the pseudohexameric BMC-T proteins are slightly more compact than the BMC-H hexamers, with the BMC domains folded relatively inward on the concave side (fig. S2D). Placing hexamers in these positions would require substantial deformation to enable them to be accommodated. BMC-T pseudohexamers contain two copies of the BMC domain, and in our structure, one domain interacts with the coplanar hexamer-hexamer corner and the other with the corner where the two hexamers join at a 30° angle (Fig. 1E). Because the two domains are decoupled on a genetic level, their primary structures have evolved separately so that each domain can fulfill distinct interface roles. Indeed, all characterized BMCs contain at least one BMC-T–type protein; in almost all genomes encoding BMCs, including those of unknown function, a gene for a BMC-T protein is present (2), underscoring their structural importance.

The specific residues involved in the interactions among hexamers and pentamers are located in distinct, conserved patches distributed across the primary structure (Fig. 3A). Highly conserved pentamer residues that are involved in intersubunit interactions (Figs. 3A and 4A and fig. S3) include S13, the GAGxGE motif (residues 48 to 53, where “x” represents any amino acid), and the I-V/I-D motif (residues 81 to 83). On the hexamer, the KAA motif at position 25 to 27 and the PRPH motif at position 77 to 80 play central roles in forming the interface with the pentamer (Figs. 3B and 4A and fig. S4). Hexamer residues 49 to 51 (D/E-T/V-A/G/S) are located at the corner between the pentamer and two hexamers. The conservation of a small amino acid at position 51 is crucial; large residues there would likely preclude shell formation. Overall, shape complementarity governs the hexamer-pentamer interactions; there are few salt bridges and hydrogen bonds.

Fig. 3 Sequence alignment of BMC-H and BMC-P of representative species.

Sequence alignment of representative BMC-H (A) and BMC-P (B) (selected to correspond to characterized, functionally diverse BMCs with available crystal structures for the isolated subunits) with residue numbering adjusted to correspond to the H. ochraceum sequences. Interfacing residues are marked by yellow pentagons for pentamer interactions and blue hexagons for hexamer interactions. Conserved residues are colored according to physical properties (brown, hydrophobic; gray, proline or glycine; red, positively charged; blue, negatively charged; and green, polar). Sequence conservation logos of the combined representative types are below, with each amino acid colored individually; height of letters corresponds to relative frequency at each position. Additional details for each type are shown in figs. S3 and S4. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.

Fig. 4 Detailed view of the BMC-H–BMC-P and the two different BMC-H interfaces as viewed from the outside.

(A) Pentamer-hexamer interface, with pentamer residues in yellow, hexamer residues in blue with conservation indicated with asterisk(s), different chains indicated by color shading, and hydrogen bonds indicated by dashed lines. Pictograms show interface location in the context of the shell. (B) Angled hexamer-hexamer interface. (C) Coplanar hexamer-hexamer interface. Red shading highlights the interlocking of residue R78 with the adjacent hexamer.

For the hexamer-hexamer interface (Fig. 4B), the KAA and PRPH motifs of complementing chains account for most of the interacting surface area. The lysines of the KAA motif are arranged in an antiparallel manner, creating a flat interaction surface with hydrogen bonds between the ε-amino group and the backbone oxygens of the opposite lysine and R78 (Fig. 4B). The coplanar hexamer-hexamer interface maintains the KAA-PRPH motif interactions but contains an additional structural interdigitation between hexamers: The R78 side chain of the PRPH motif inserts in a pocket between the H80 side chain and backbone oxygens of V24, A27, and V29 of the adjacent hexamer (Fig. 4C and fig. S1B), creating an interlock. This was previously observed as a crystal-packing interaction in the structure of the α-carboxysomal BMC-H protein CsoS1A (11), an additional indication of the general structural conservation of the interactions across evolutionarily distant shell proteins (fig. S4).

The specific side chains influencing the interaction between the BMC-H hexamer and the BMC-T pseudohexamers are more enigmatic. The ability of three different BMC-T proteins to occupy the same position in the shell indicates a tolerance for a variety of side-chain interactions. The only universally conserved residue is the antiparallel lysine corresponding to the KAA motif in hexamers (figs. S5 and S6). Notably, all three BMC-Ts are able to occupy equivalent positions in the shell despite considerable sequence divergence, suggesting that in the BMC-H–BMC-T interfaces, the specific interactions mediating assembly are based primarily on shape complementarity.

The surface view of the intact shell (Figs. 1D and 2) shows that it is tightly packed; the only conduits to the interior of the shell are the pores formed at the cyclic symmetry axes of the hexamers and pseudohexamers. The largest channel to the interior is formed by the BMC-T proteins; the pore across the trimer within the facet is at least 5 Å wide with the potential to be larger owing to the flexibility of the loops surrounding the pore. The crystal structure of isolated BMC-T3 has both trimer pores closed, whereas in the crystal structure of the isolated BMC-T2, one pore is open and the other closed, as has been observed before for carboxysome proteins (4). This arrangement is reminiscent of the alternate access model of some transmembrane transporters of eukaryotic organelles [e.g., BtuCD-type adenosine triphosphate (ATP)–binding cassette (ABC) transporters (12)].

Using the interactions we see in our structure and the same set of hexamers, pseudohexamers, and pentamers, we can model larger compartments (T = 36, diameter 720 Å) than we have experimentally observed by only slightly changing the angles between hexamers and pseudohexamers while maintaining the coplanar hexamer-hexamer contacts (fig. S7). The extent of the facets is likely dictated by the interactions between different combinations of distinct BMC domains (i.e., the two different domains in each BMC-T paralog and the BMC-H), whereas the pentamer could prime the structure for an overall icosahedral shape. The subunits in the BMC-T positions thereby influence the curvature and the final size of the compartment. This differs from previous hypothetical models that proposed specific proteins in forming edges (13, 14). Although the particles appear to have edges in some views and in micrographs (figs. S7B and S9A), the curvature is distributed over the whole shell; larger BMCs effectively have less curvature per subunit. Accordingly, the structure that we have determined describes scalable principles for constructing a range of shell sizes, likely corresponding to the variation in shell sizes observed in BMCs in their native hosts, which range from 55 to 600 nm (15, 16).
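The statement that larger shells have less curvature per subunit can be illustrated with a simple average. For any closed convex polyhedron the total angular defect is 720° (Descartes' theorem), and an icosahedral shell of triangulation number T has 10T + 2 capsomers. The sketch below spreads the total defect evenly over all capsomers; this even averaging is a simplifying assumption made here for illustration, not the analysis used in the structure determination, and the function names are hypothetical.

```python
TOTAL_ANGULAR_DEFECT_DEG = 720.0  # Descartes' theorem for a closed convex polyhedron


def capsomer_count(t_number: int) -> int:
    """Total capsomers: 12 pentamers plus 10(T - 1) hexameric positions."""
    return 10 * t_number + 2


def mean_defect_per_capsomer_deg(t_number: int) -> float:
    """Average angular defect per capsomer, assuming an even distribution."""
    return TOTAL_ANGULAR_DEFECT_DEG / capsomer_count(t_number)


for t_number in (9, 36):
    print(t_number, capsomer_count(t_number),
          round(mean_defect_per_capsomer_deg(t_number), 2))
```

Under this averaging, the modeled T = 36 shell has roughly four times as many capsomers as the T = 9 shell and correspondingly about a quarter of the curvature per capsomer.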

The presence of structurally redundant building blocks suggests that the multiplicity is related to function, not structure—for example, to provide a range of conduits (i.e., differing in size and charge at the cyclic symmetry axes) for different metabolites (substrates and products) to cross the same shell. A second function would be to provide distinct patches on the interior surface to anchor and spatially organize the encapsulated enzymes. When we model the shell with the different BMC-Ts, an electrostatic (inside) surface view shows different regions that could be involved in specific interactions with the cargo proteins (fig. S8). The distinct convex binding surfaces of the different shell proteins could serve to position the encapsulated enzymes to channel substrates and products between enzymes, as well as across the shell.

Our model of the basic architecture of the bacterial microcompartment shell likely applies to functionally diverse organelles found across the bacterial kingdom; it also can inform rational design of engineered microcompartments. For the BMC shell described here, on the basis of an inner diameter of 290 Å and assuming a typical protein density, there is space for approximately 150 copies of a 60-kDa enzyme in the interior, ample volume in which to localize multiple enzymes. Targeting could be achieved by using specific encapsulation peptides found associated with the native cargo proteins (7, 17) or could be engineered by using the structure of the inner surface as a guide. The overall structure of the BMC shell invites comparisons to viral capsids and their engineered functions; however, BMC shells offer an additional structural and functional feature: selective permeability. Collectively, the atomic-resolution model of a BMC shell reveals the construction principles of the membranes of these primitive, protein-based organelles that can be applied to understanding and manipulating their native and engineered functions.
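The interior capacity estimate can be reproduced approximately with a back-of-the-envelope calculation. The sketch below treats the lumen as a sphere of 290 Å diameter and converts volume to protein mass using an assumed protein density of 1.35 g/cm³; that density and the packing fraction parameter are assumptions introduced here, since the text specifies only "a typical protein density", and the function names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # daltons per gram


def interior_volume_A3(inner_diameter_A: float) -> float:
    """Volume of a sphere with the given diameter, in cubic angstroms."""
    return 4.0 / 3.0 * math.pi * (inner_diameter_A / 2.0) ** 3


def max_copies(inner_diameter_A: float = 290.0,
               enzyme_mass_Da: float = 60_000.0,
               density_g_cm3: float = 1.35,     # assumed protein density
               packing_fraction: float = 1.0):  # assumed fraction of lumen filled
    """Rough number of enzyme copies that fit in the shell interior."""
    da_per_A3 = density_g_cm3 * AVOGADRO * 1e-24  # g/cm^3 -> Da/A^3
    total_mass_Da = interior_volume_A3(inner_diameter_A) * da_per_A3 * packing_fraction
    return total_mass_Da / enzyme_mass_Da


print(round(max_copies()))                       # completely filled lumen
print(round(max_copies(packing_fraction=0.87)))  # partially filled lumen
```

With these assumptions a completely filled lumen gives an upper bound somewhat above the approximately 150 copies quoted in the text; a packing fraction a little below 0.9 brings the estimate close to that figure. The difference presumably reflects the density or packing allowance used in the original estimate, which is not stated.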

Supplementary Materials

Materials and Methods

Figs. S1 to S9

Table S1

References (18–34)

References and Notes

Acknowledgments: This work was supported by the National Institutes of Health–National Institute of Allergy and Infectious Diseases grant 1R01AI114975-01 and the U.S. DOE, Office of Science, Office of Basic Energy Sciences under contract no. DE-FG02-91ER20021. The Advanced Light Source is supported by the U.S. DOE, Director, Office of Science, Office of Basic Energy Sciences under contract no. DE-AC02-05CH11231. B.G. was supported by an advanced postdoctoral mobility fellowship from the Swiss National Science Foundation (project P300PA_160983). Use of the Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, is supported by the U.S. DOE, Office of Science, Office of Basic Energy Sciences under contract no. DE-AC02-76SF00515. We thank B. Paasch and J. Zarzycki for their contributions to the BMC-T3 structure determination. We thank E. Nogales for providing access to the electron microscopy facility at University of California, Berkeley, and the Adams lab at Lawrence Berkeley National Lab for use of the crystallization robot. M.S. and C.A.K. are inventors on patent application 62509553 submitted by Berkeley National Laboratory that covers strategies for scaling the shell-protein system described in this work. The cryo-EM map of the complete shell has been deposited at the Electron Microscopy Data Bank (EMDB) with accession code EMD-8747. The x-ray crystallographic coordinates and structure-factor files have been deposited in the Protein Data Bank (PDB) under the following accession numbers: 5V74 (complete shell), 5V75 (BMC-T2), and 5V76 (BMC-T3).
