2.2 Å resolution cryo-EM structure of β-galactosidase in complex with a cell-permeant inhibitor


Science  05 Jun 2015:
Vol. 348, Issue 6239, pp. 1147-1151
DOI: 10.1126/science.aab1576

Pushing the limits of electron microscopy

Recent advances in cryo–electron microscopy (cryo-EM) allow structures of large macromolecules to be determined at near-atomic resolution. So far, though, resolutions approaching 2 Å, where features key to drug design are revealed, remain the province of x-ray crystallography. Bartesaghi et al. achieved a resolution of 2.2 Å for a 465-kD ligand-bound protein complex using cryo-EM. The density map is detailed enough to show close to 800 water molecules, magnesium and sodium ions, and precise side-chain conformations. These results bring routine use of cryo-EM in rational drug design a step closer.

Science, this issue p. 1147


Cryo–electron microscopy (cryo-EM) is rapidly emerging as a powerful tool for protein structure determination at high resolution. Here we report the structure of a complex between Escherichia coli β-galactosidase and the cell-permeant inhibitor phenylethyl β-d-thiogalactopyranoside (PETG), determined by cryo-EM at an average resolution of ~2.2 angstroms (Å). Besides the PETG ligand, we identified densities in the map for ~800 water molecules and for magnesium and sodium ions. Although continued advances in detector technology are likely to further enhance resolution, our findings demonstrate that preparation of specimens of adequate quality and intrinsic protein flexibility, rather than imaging or image-processing technologies, now represent the major bottlenecks to routinely achieving resolutions close to 2 Å using single-particle cryo-EM.

Icosahedral viruses were the first biological assemblies whose structures were determined at near-atomic resolution using cryo–electron microscopy (cryo-EM) combined with methods for image averaging (1–10). Over the past 2 years, structures for a variety of nonviral assemblies have been reported using cryo-EM at resolutions between ~2.8 and ~4.5 Å (11–20). Four of these instances have been of complexes with sizes below 1 MD: the 700-kD proteasome at 3.3 Å (16) and 2.8 Å resolution (17), the 465-kD Escherichia coli β-galactosidase (β-Gal) at 3.2 Å resolution (18), the 440-kD anthrax protective antigen pore at 2.9 Å (19), and the 300-kD TrpV1 ion channel at 3.4 Å resolution (20). Because these structures are of complexes that are dispersed in the aqueous phase, the peripheral regions of the proteins are less ordered and are at lower resolution than the more central regions; nevertheless, most side-chain densities are clearly delineated in the well-ordered regions of the maps. In crystallographically determined structures of proteins at resolutions of 2.3 Å or better, features such as protein-ligand hydrogen bonding, salt bridges, and location of key structured water molecules can be ascertained with a high degree of confidence (21). There is great potential for the use of cryo-EM methods in applications such as drug discovery and development if similar resolutions could be achieved without crystallization. Whether there are fundamental limitations with currently available methods for specimen preparation, microscope hardware, inelastic scattering from the ice layer, inaccuracies in microscope alignment, detector technology, data collection procedures, or image processing software to achieve resolutions approaching 2 Å is a question that remains unanswered in the current context of rapid advances in the cryo-EM field (22). This is especially relevant for smaller protein complexes (<1 MD) with low symmetry, where the errors in alignment of the projection images make the analysis more challenging than for larger or more symmetric complexes such as ribosomes and ordered viruses (23).

We recently reported the structure of E. coli β-Gal at 3.2 Å resolution (18). Comparing the cryo-EM–derived structure with that derived from x-ray crystallography, we identified regions such as the periphery of the protein and crystal contact zones where there were measurable deviations between crystal and solution structures. To test whether we could further improve map resolution, we explored a range of experimental conditions including variations in specimen preparation, imaging, and steps in data processing (see supplementary materials and methods). We analyzed the structure of β-Gal bound to phenylethyl β-d-thiogalactopyranoside (PETG), a potent inhibitor that blocks enzyme activity by replacing the oxygen in the O-glycosidic bond with a sulfur atom. Although no crystal structure is available for the complex formed between E. coli β-Gal and PETG, a crystal structure is available for PETG bound to Trichoderma reesei β-Gal (24). There is, however, very little sequence similarity (sequence identity of 12.8% determined by Clustal 12.1) between the two variants, with the T. reesei variant displaying a completely different fold and crystallizing as a monomer instead of a tetramer (fig. S1).

Cryo-EM images recorded from plunge-frozen specimens of the β-Gal–PETG complex and the corresponding radially averaged power spectra were analyzed to select images displaying signal at high resolution (fig. S2, A to C). For each recorded image, we also assessed the extent of movement during the course of the ~8-s exposure (fig. S2D). From a data set of 1487 images that displayed detectable signal in the power spectra extending beyond 3 Å and that had low amounts of beam-induced movement during the exposure, we extracted 93,686 molecular images using automated particle-selection procedures with a Gaussian disk as a template. We used various combinations and subsets of the frames collected from each region and iteratively evaluated their contribution to map quality (fig. S3A). The final map, which we assessed as having the highest overall map quality, was obtained using the information in the images collected from ~12 electrons (e−)/Å2 of each exposure (Fig. 1A). We estimate the overall average resolution of the map to be ~2.2 Å, using both the 0.143 Fourier shell correlation (FSC) criterion and the resolution at which an FSC obtained between the experimental cryo-EM map and the map computed from the map-derived cryo-EM atomic model has a value of 0.5 (fig. S3B). This was further supported by visual inspection of map quality (movie S1). The 2.2 Å mean resolution of our map indicates that some regions, such as at the periphery, are at lower resolution than 2.2 Å, whereas other regions closer to the center are at higher resolution, displaying features consistent with electron density maps from x-ray structures determined at resolutions of ~2 Å (fig. S4).
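As an illustration of how a resolution estimate is read off an FSC curve with the 0.143 or 0.5 criterion, the following is a minimal sketch rather than the procedure used in this work. It assumes the FSC has already been computed per resolution shell, and it finds the first threshold crossing by linear interpolation. The function name and the example curve are hypothetical and are not data from this study.

```python
from typing import Optional, Sequence


def resolution_at_threshold(
    freqs: Sequence[float],
    fsc: Sequence[float],
    threshold: float = 0.143,
) -> Optional[float]:
    """Return the resolution (in Å) at which the FSC first falls below `threshold`.

    freqs: spatial frequencies in 1/Å, strictly positive and increasing.
    fsc:   FSC value for each shell, same length as `freqs`.
    The crossing point is located by linear interpolation between the two
    neighbouring shells. Returns None if the curve never drops below the
    threshold.
    """
    if len(freqs) != len(fsc):
        raise ValueError("freqs and fsc must have the same length")
    if any(f <= 0 for f in freqs):
        raise ValueError("spatial frequencies must be positive")

    for i in range(1, len(freqs)):
        above, below = fsc[i - 1], fsc[i]
        if above >= threshold > below:
            # Fraction of the way from shell i-1 to shell i where FSC == threshold.
            t = (above - threshold) / (above - below)
            f_cross = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


if __name__ == "__main__":
    # Made-up curve for illustration only (not from the paper).
    example_freqs = [0.1, 0.2, 0.3, 0.4, 0.5]
    example_fsc = [1.0, 0.95, 0.8, 0.4, 0.05]
    print(resolution_at_threshold(example_freqs, example_fsc, 0.143))
    print(resolution_at_threshold(example_freqs, example_fsc, 0.5))
```

The same function covers both criteria mentioned above: a threshold of 0.143 for the half-map FSC and 0.5 for the map-versus-model FSC.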

Fig. 1 Cryo-EM density map of the β-Gal–PETG complex at 2.2 Å resolution.

(A) Surface representation of the density map of one of the four protomers in the tetrameric complex. (B to D) Visualization of selected map regions showing delineation of secondary structural elements, amino acid densities, and carbonyl moieties (indicated by asterisks). The density for Phe627 is thinned out in the center of the aromatic ring, revealing the presence of a “hole” in the ring, a feature typically observed in structures determined by x-ray crystallography at resolutions of ~2 Å.

An overview of the density map for one of the four equivalent chains in the β-Gal complex and densities for regions from different portions of the molecule are shown in Fig. 1. The path of the polypeptide chain is well delineated, enabling placement of the sequence into the density map (Fig. 1, B to D). Examination of the map also shows clear densities for several backbone carbonyl groups and several ordered water molecules in the structure. We identified 194 densities in each protomer where we could place water molecules with confidence based on the shape of the local density, map value, and location at the right distance range for hydrogen bonding to polar groups in the vicinity. In the majority of instances, these water molecules are at locations also identified in the 1.7 Å crystal structure of β-Gal (Protein Data Bank ID 1DP0), providing independent validation of their assignment. Selected examples of tightly bound water molecules in the structure are shown in Fig. 2, illustrating instances where they are present in connected chains, coordinated to multiple polar residues, coordinated to the polypeptide backbone, or coordinated to the Mg2+ ion in the active site. The fact that water molecules can be placed with confidence in a structure of a 465-kD complex determined by single-particle cryo-EM is an exciting advance that bodes well for the use of cryo-EM in drug-discovery applications.
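The water-placement criteria described above (map value and a hydrogen-bonding distance to nearby polar groups) can be sketched as a simple filter. This is an assumption-laden illustration, not the procedure used for this structure: the density threshold, the 2.4 to 3.5 Å distance window, and the function name are hypothetical, and the shape-of-density criterion is not modeled.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def plausible_waters(
    peaks: Sequence[Tuple[Point, float]],
    polar_atoms: Sequence[Point],
    min_density: float = 1.5,                      # hypothetical map-value cutoff
    hbond_range: Tuple[float, float] = (2.4, 3.5),  # hypothetical window, in Å
) -> List[Point]:
    """Keep candidate peaks that satisfy two of the criteria named in the text.

    peaks:       (coordinates, map value) for each candidate density peak.
    polar_atoms: coordinates of polar N/O atoms in the model.
    A peak is kept if its map value is at least `min_density` and its nearest
    polar atom lies within `hbond_range` (not too close, not too far).
    """
    lo, hi = hbond_range
    kept = []
    for xyz, value in peaks:
        if value < min_density or not polar_atoms:
            continue
        nearest = min(math.dist(xyz, atom) for atom in polar_atoms)
        if lo <= nearest <= hi:
            kept.append(xyz)
    return kept


# Consistency check on the counts reported in the text:
# 194 waters per protomer in a tetramer gives 776, in line with "~800".
WATERS_PER_PROTOMER = 194
PROTOMERS = 4
TOTAL_WATERS = WATERS_PER_PROTOMER * PROTOMERS
```

In practice such a filter would only produce candidates; as noted above, agreement with waters in the 1.7 Å crystal structure provides the independent validation.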

Fig. 2 Visualization of tightly bound water molecules in the structure of the β-Gal–PETG complex.

(A to F) Selected examples of densities for water molecules (highlighted in yellow) hydrogen bonded in pearl-string–like chains (A), connected to the polypeptide backbone and multiple amino acid side chains [(B) to (E)], or interacting with the Mg2+ ion in the active site (F).

The cryo-EM map includes density for PETG in the active site (Fig. 3A). The key catalytic residues Glu461 (general base for acid catalysis), coordinating to the Mg2+ ion, and Glu537 (nucleophile) are in close proximity to the ligand, with density for a structured water molecule visible in the binding pocket. In addition, other residues that stabilize the binding of the inhibitor (Asn102, Asp201, Met502, Tyr503, His540, Asp568, Phe601, Val795, and Trp999) also show appropriate steric dispositions, as displayed in Fig. 3B. There are substantial differences in the orientation and location of the PETG molecule in the T. reesei enzyme (Fig. 3C), with rotation of the benzyl moiety around the S–C-7 single bond by almost 180°. This is perhaps not unexpected, given that there is very little overall structural similarity between the enzymes from these two species (fig. S1) and the pattern of residues that are involved in H-bonding to the ligand is also different (Fig. 3, B and D). Stereochemical parameters of the key conserved catalytic residues (Glu200 and Glu298) in the T. reesei complex are different from those in the E. coli complex, as is the general distribution of nonpolar residues that stabilize the inhibitor in the pocket, establishing the value of direct determination of the actual structure of the ligand in a protein complex, even when a related structure is available.

Fig. 3 Active-site structure in PETG-liganded E. coli β-Gal.

(A) Uncorrected cryo-EM density map showing density for PETG, an associated water molecule, and six of the amino acids that line the binding pocket. (B) Plot of distances of various parts of PETG to residues in the vicinity of β-Gal from E. coli, determined using LIGPLOT. (C) Superposition of the ligand binding pocket structures in β-Gal from E. coli (light blue, determined by cryo-EM at 2.2 Å resolution) and T. reesei (green, determined by x-ray crystallography at 1.4 Å resolution), illustrating the differences in protein and ligand structures. (Inset) Comparison between the corresponding configurations of PETG. (D) Plot of distances of various parts of PETG to amino acids in the vicinity of β-Gal from T. reesei, determined using LIGPLOT.

In Fig. 4, we show examples of the densities observed for each of the 20 standard amino acids, where the level of detail at which individual C, N, and O atoms are observed is consistent with maps derived from x-ray crystallography at nominal resolutions of ~2 Å (fig. S4). However, in contrast to a 2Fobs − Fcalc map obtained by x-ray crystallography, both phases and amplitudes of cryo-EM density maps are derived experimentally from the images, eliminating the need to assign phases derived from the atomic model, as is customarily done in x-ray crystallography. As a consequence, density contours in cryo-EM maps are subject to inaccuracies from a number of resolution-lowering distortions (instances of which are visible in Fig. 4A) that can arise at various stages of data collection and processing. Factors that can contribute to distortions include inaccuracies in determination of the contrast transfer function for each image, errors in orientation determination during refinement, distinct patterns of radiation damage in each of the molecular images used for reconstruction, and the changes introduced from applying a uniform temperature factor correction to scale the map. Despite these distortions, which appear to be random, the overall shape of the residues can nevertheless be distinguished clearly. As more structures are determined at these higher resolutions, it is possible that there may be enough statistical basis to study these distortions quantitatively, and perhaps exploit patterns that may emerge from this analysis to improve refinement strategies to achieve even higher resolution.

Fig. 4 Illustration of map quality at the level of amino acids.

(A) Visualization of map density for examples of each of the 20 standard amino acids, which are grouped into neutral (nonpolar and polar), basic, and acidic categories. (B and C) Illustration of contours of densities for multiple Ile residues (B) and front, tilted, and edge views for Tyr552 (C). In each case, the density contours are consistent with the 2.2 Å resolution we report.

The data collection schemes currently used in cryo-EM with direct electron detectors enable the use of numerous combinations in which the dose can be fractionated during the exposure, as well as a number of ways in which different subsets of the frames collected for each exposure can be combined to generate a three-dimensional (3D) reconstruction. The map we present in Figs. 1 to 4 was obtained from a subset that excludes the very early portion of the exposure, uses the next 12 e−/Å2, and excludes the latter part of the exposure. In the course of our studies, we analyzed many different maps constructed by using different subsets of the exposure. The highest-resolution features, such as holes in the rings of the aromatic residues (Fig. 4 and fig. S4), were better resolved in maps constructed using the interval of the exposure containing the highest-resolution information (fig. S3A).
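The idea of selecting a dose window from a fractionated exposure can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used here: the per-frame dose, the number of frames, and the amount of dose skipped at the start are hypothetical values, and only the ~12 e−/Å2 window comes from the text.

```python
from typing import List, Sequence


def frames_in_dose_window(
    dose_per_frame: Sequence[float],
    skip_dose: float,
    use_dose: float,
) -> List[int]:
    """Return indices of frames lying entirely inside a cumulative-dose window.

    dose_per_frame: dose delivered in each frame, in e-/Å^2.
    skip_dose:      cumulative dose to discard at the start of the exposure.
    use_dose:       width of the dose window to keep after `skip_dose`.
    A frame is selected only if its whole dose interval falls within
    [skip_dose, skip_dose + use_dose]; later frames are excluded.
    """
    eps = 1e-9  # tolerance for floating-point accumulation
    selected = []
    start = 0.0
    for index, dose in enumerate(dose_per_frame):
        end = start + dose
        if start >= skip_dose - eps and end <= skip_dose + use_dose + eps:
            selected.append(index)
        start = end
    return selected


if __name__ == "__main__":
    # Hypothetical exposure: 30 frames at 1.5 e-/Å^2 each, skipping the first 3 e-/Å^2.
    frames = frames_in_dose_window([1.5] * 30, skip_dose=3.0, use_dose=12.0)
    print(frames)
```

Varying `skip_dose` and `use_dose` mirrors the kind of comparison described above, in which maps built from different subsets of the exposure are evaluated against each other.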

Based on our present analysis and its comparison with our earlier cryo-EM structure of β-Gal at 3.2 Å resolution, we can now articulate our best understanding of all of the changes that we introduced that enabled us to improve the resolution to ~2.2 Å. Perhaps the most important is the much more careful selection of regions where the ice was thin enough to obtain the highest detectable signals yet thick enough to allow a spread of orientations (as judged by the distribution of orientations assigned to each molecular image used to construct the final map). Second, the use of a lower dose rate to minimize the effects of coincidence loss of the detector and the use of a finer pixel size resulted in improved image contrast and maximization of amplitudes at low resolution (25), which allowed us to go closer to focus and still be able to correctly pick and align particles. Third, we carried out 3D classification throughout the iterative refinement cycle, which we did not do in the case of the structure at 3.2 Å resolution. Finally, we believe that the use of nearest-neighbor interpolation during motion correction, coupled with better-quality data, allowed improved recovery of higher-resolution information in the final reconstruction.

X-ray crystallographic methods have led to the deposition of almost 95,000 atomic-resolution protein and protein-nucleic acid structures over the past few decades. There have been impressive advances in speed and resolution, as well as in the development of highly automated workflows over the years. Relative to those of the x-ray field, cryo-EM methods are still in an early phase of development, with only ~30 deposited models for coordinates derived from electron microscopic analysis at near-atomic resolution. The recent progress by many groups worldwide suggests that this number will increase rapidly and will extend to specimens that may not be easily amenable to crystallization. Our demonstration here that the structure of a ligand-protein complex can be determined in the solution phase at resolutions close to 2 Å suggests that cryo-EM is positioned to become an indispensable tool in structural biology and for drug-discovery applications.

Supplementary Materials

Materials and Methods

Figs. S1 to S4

References (26–34)

Movie S1

References and Notes

  1. Acknowledgments: This research was supported by funds from the Intramural Research Program of the NIH, Center for Cancer Research, National Cancer Institute, and the Intramural AIDS Targeted Antiviral Program. We thank R. Mueller and J. Cometa for technical assistance with electron microscopy, P. Mooney and F. Ulmer for advice and assistance with optimizing detector performance, K. Podolsky for assistance with data collection, J.-J. Fernandez for providing the code to run the TOMOCTFFIND program, and V. Falconieri for assistance in preparation of the figures and the supplementary movie. This study used the high-performance computational capabilities of the Biowulf Linux cluster at the NIH. The density map and refined atomic model have been deposited with the Electron Microscopy Data Bank (accession number EMD-2984) and the Protein Data Bank (entry code 5a1a), respectively.