Report

# Structure of SARS Coronavirus Spike Receptor-Binding Domain Complexed with Receptor


Science  16 Sep 2005:
Vol. 309, Issue 5742, pp. 1864-1868
DOI: 10.1126/science.1116480

## Abstract

The spike protein (S) of SARS coronavirus (SARS-CoV) attaches the virus to its cellular receptor, angiotensin-converting enzyme 2 (ACE2). A defined receptor-binding domain (RBD) on S mediates this interaction. The crystal structure at 2.9 angstrom resolution of the RBD bound with the peptidase domain of human ACE2 shows that the RBD presents a gently concave surface, which cradles the N-terminal lobe of the peptidase. The atomic details at the interface between the two proteins clarify the importance of residue changes that facilitate efficient cross-species infection and human-to-human transmission. The structure of the RBD suggests ways to make truncated disulfide-stabilized RBD variants for use in the design of coronavirus vaccines.

The SARS coronavirus (SARS-CoV) is the agent of severe acute respiratory syndrome, which emerged as a serious epidemic in 2002 to 2003, with over 8,000 infected cases and a fatality rate of ∼10% (1-4). Coronaviruses, which are large, enveloped, positive-strand RNA viruses, infect a variety of mammalian and avian species and can cause upper respiratory, gastrointestinal, and central nervous system diseases (5). The large spike protein (S) on the virion surface mediates both cell attachment and membrane fusion (5). In the case of several avian and mammalian coronaviruses, S is cleaved by furin or a related protease into S1 and S2; the former bears the receptor attachment site; the latter, the fusion activity. The structures of refolded heptad-repeat fragments of S2 from the mouse hepatitis coronavirus (MHV) and from SARS-CoV (6-8) confirm earlier predictions (4) that the postfusion conformation has the trimer-of-hairpins organization characteristic of “class 1” fusion proteins, such as those of HIV, influenza virus, and Ebola virus (9). S on mature SARS-CoV virions does not appear to be cleaved, and the sequence that aligns with the MHV cleavage site lacks the essential residues for furin susceptibility (3, 4, 10, 11). We therefore refer to the S1 and S2 “regions” (12), which contain 666 and 583 amino acid residues, respectively (Fig. 1A).

Coronaviruses exploit a wide variety of cellular receptors (5). SARS-CoV and another human coronavirus, HCoV-NL63, both use as their receptor a cell-surface zinc peptidase, angiotensin-converting enzyme 2 (ACE2) (13, 14). The crystal structure of the ACE2 ectodomain (15) shows a claw-like N-terminal peptidase domain, with the active site at the base of a deep groove, and a C-terminal “collectrin” domain. A fragment of the S1 region, residues 318 to 510, is sufficient for tight binding to the peptidase domain of ACE2 (11, 16, 17). This fragment, the receptor-binding domain (RBD), is the critical determinant of virus-receptor interaction and thus of viral host range and tropism (18). SARS-CoV isolated from patients during the 2002–2003 epidemic, and also from milder sporadic cases in 2003 to 2004, appears to derive from a nearly identical virus circulating in palm civets and raccoon dogs (19, 20). Changes in just a few residues in the RBD can lead to efficient cross-species transmission (18, 20). The RBD also includes important viral-neutralizing epitopes (21-23), and it may be sufficient to raise a protective antibody response in inoculated animals.

We expressed the SARS-CoV spike protein RBD, residues 306 to 575, in Sf9 cells and purified the fragment (24). Brief treatment with chymotrypsin yielded a shorter fragment, residues 306 to 527. Soluble ACE2, residues 19 to 615, was expressed in Sf9 cells and purified as described in (24). The two components were mixed, and the complex was purified by size-exclusion chromatography on Superdex 200 (Amersham Biosciences, Piscataway, NJ). Crystals in space group P2₁ (a = 82.3 Å, b = 119.4 Å, c = 113.2 Å, β = 91.2°), with two complexes per asymmetric unit, were grown at room temperature from a mother liquor containing 24% polyethylene glycol 6000, 150 mM NaCl, 100 mM Tris at pH 8.2, and 10% ethylene glycol. We determined the structure of the ACE2/SARS-CoV RBD complex by molecular replacement with ACE2 as the search model, and we refined it at 2.9 Å resolution (24). The final model contains residues 19 to 615 of the N-terminal peptidase domain of human ACE2 and residues 323 to 502 (except for 376 to 381) of the RBD, as well as glycans N-linked to ACE2 residues 53, 90, 322, and 546 and to RBD residue 330, and 65 solvent molecules. The Rfree is 27.5% and Rwork is 22.1% (see table S1 for definitions).
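The reported cell parameters allow a quick plausibility check of the crystal packing. The sketch below computes the monoclinic cell volume and a rough Matthews coefficient. The protein mass is an assumption (about 110 Da per residue for the crystallized fragments, glycans ignored), as is the conventional 1.23 Å³/Da factor used for the solvent estimate; none of these derived numbers are given in the text.

```python
import math

# Unit-cell parameters reported in the text (monoclinic, space group P2_1).
a, b, c = 82.3, 119.4, 113.2      # cell edges in angstroms
beta_deg = 91.2                   # monoclinic angle in degrees

# Monoclinic cell volume: V = a * b * c * sin(beta).
volume = a * b * c * math.sin(math.radians(beta_deg))

# P2_1 has two asymmetric units per cell; the text reports
# two complexes per asymmetric unit.
complexes_per_cell = 2 * 2

# ASSUMPTION: mass estimated as ~110 Da per residue, glycans ignored.
residues_ace2 = 615 - 19 + 1      # soluble ACE2, residues 19 to 615
residues_rbd = 527 - 306 + 1      # chymotryptic RBD fragment, residues 306 to 527
mass_da = 110.0 * (residues_ace2 + residues_rbd)

# Matthews coefficient (cubic angstroms per dalton) and the usual
# solvent-fraction estimate, 1 - 1.23 / V_M.
matthews = volume / (complexes_per_cell * mass_da)
solvent_fraction = 1.0 - 1.23 / matthews

print(f"cell volume      = {volume:,.0f} A^3")
print(f"Matthews coeff.  = {matthews:.2f} A^3/Da")
print(f"solvent fraction = {solvent_fraction:.2f}")
```

Under these assumptions the packing density comes out in the range commonly seen for protein crystals, consistent with two complexes per asymmetric unit.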

The ACE2 peptidase domain has two lobes that close toward each other after substrate engagement (15). In one of the two complexes in the asymmetric unit of our crystals, ACE2 is fully open; in the other, it is slightly closed (fig. S1). The SARS-CoV S protein contacts the tip of one lobe of ACE2 (Fig. 1). It does not contact the other lobe, nor does it occlude the peptidase active site. Binding of the spike protein to ACE2 is not altered by the addition of a specific ACE2 inhibitor, which is expected to favor the closed state (18). Thus, both structural and biochemical data indicate that viral attachment is unaffected by the open-to-closed transition.

The RBD contains two subdomains (Fig. 1): a core and an extended loop. The core is a five-stranded anti-parallel β sheet (β1 to β4 and β7), with three short connecting α helices (αA to αC). There are nine cysteines in the chymotryptic fragment. Disulfide bonds connect cysteines 323 to 348, 366 to 419, and 467 to 474. The remaining cysteines are disordered but two (378 and 511) are in the same neighborhood and could form a disulfide in the recombinant fragment, even if they have other partners in the intact S protein. The extended loop subdomain lies at one edge of the core; it presents a gently concave outer surface formed by a two-stranded β sheet (β5 and β6). The base of this concavity cradles the N-terminal helix of ACE2; a ridge to one side of it, which is reinforced by the Cys467–Cys474 disulfide bridge, contacts the loops between ACE2 helices α2 and α3; a ridge to the other side inserts between a short ACE2 helix (residues 329 to 333) and a β hairpin at ACE2 residue 353 (Fig. 1C). Residues 445 to 460 of the RBD anchor the entire receptor-binding loop to the core of the RBD. We refer to this loop (residues 424 to 494), which makes all the contacts with ACE2, as the receptor-binding motif (RBM).

The RBM surface is complementary to the receptor tip, with about 1700 Å² of buried surface at the interface (Fig. 2A and fig. S2), consistent with their high affinity (dissociation constant Kd ∼ 10⁻⁸ M) (18, 21). A total of 18 residues of the receptor contact 14 residues of the viral spike protein (Table 1). Networks of hydrophilic interactions, which occur largely among amino acid side chains, predominate. Six RBM residues at this interface are tyrosines, which present both a polar hydroxyl group and a hydrophobic aromatic ring (Fig. 2B).
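As a rough guide to what this affinity means energetically, the standard binding free energy follows from the dissociation constant. The worked value below assumes a temperature of 298 K and a 1 M standard state, neither of which is stated in the text:

$$
\Delta G^\circ = RT \ln K_d \approx (0.593\ \text{kcal mol}^{-1}) \times \ln\!\left(10^{-8}\right) \approx -10.9\ \text{kcal mol}^{-1}
$$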

Table 1.

Contacts between ACE2 and SARS-CoV RBD. Residues in ACE2 that contact the RBD are listed by their position (numbers across the top of each column) and by their single-letter identity (36) in the palm-civet, mouse, rat, and human receptors. The residues they contact in the structure described here and their position numbers in the spike proteins from human isolates are shown at the bottom of each column.

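| ACE2 position | 24 | 27 | 31 | 34 | 37 | 38 | 41 | 42 | 45 | 79 | 82 | 83 | 90 | 325 | 329 | 330 | 353 | 354 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Civet ACE2 | L | T | T | Y | Q | E | Y | Q | V | L | T | Y | D | Q | E | N | K | G |
| Mouse ACE2 | N | T | N | Q | E | D | Y | Q | L | T | S | F | T | Q | A | N | H | G |
| Rat ACE2 | K | S | K | Q | E | D | Y | Q | L | I | N | F | N | P | T | N | H | G |
| Human ACE2 | Q | T | K | H | E | D | Y | Q | L | L | M | Y | N | Q | E | N | K | G |
| Human SARS-CoV RBD contact | N473 | Y475 | Y475 | Y440 | Y491 | Y436 | Y484 | Y436 | Y484 | L472 | L472 | N473 | T402 | R426 | R426 | T486 | G488 | Y491 |

Additional contacting RBD residues from the continuation of the bottom row (their column assignment is not recoverable here): Y442, N479, T486, Y484, Y475, T487, G488, T487, Y491.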
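The residue identities in Table 1 can be compared programmatically to see where each animal receptor departs from human ACE2 at the contact positions. The following is a minimal sketch; the sequences are transcribed from the table, while the variable and function names are illustrative only.

```python
# ACE2 contact positions and residue identities transcribed from Table 1.
positions = [24, 27, 31, 34, 37, 38, 41, 42, 45,
             79, 82, 83, 90, 325, 329, 330, 353, 354]

ace2 = {
    "civet": "LTTYQEYQVLTYDQENKG",
    "mouse": "NTNQEDYQLTSFTQANHG",
    "rat":   "KSKQEDYQLINFNPTNHG",
    "human": "QTKHEDYQLLMYNQENKG",
}


def differences_from_human(species):
    """Map each contact position that differs from human ACE2
    to a (human residue, species residue) pair."""
    return {
        pos: (h, s)
        for pos, h, s in zip(positions, ace2["human"], ace2[species])
        if h != s
    }


for species in ("civet", "mouse", "rat"):
    diffs = differences_from_human(species)
    summary = ", ".join(f"{h}{pos}{s}" for pos, (h, s) in diffs.items())
    print(f"{species}: {len(diffs)} differences ({summary})")
```

The output highlights positions 82 and 353, which differ between the human and rat receptors, and shows that civet ACE2 keeps lysine at 353.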

Coronaviruses are classified in three groups (5); SARS-CoV belongs to group 2 (fig. S3). Spike-protein sequences from several members of group 2 lead us to expect that all have rather similar structures, including the RBD core (fig. S3). The SARS-CoV RBM is substantially shorter than are the corresponding regions in several other group-2 viral spike proteins, however, and it has no evident sequence similarity to the others (fig. S3). Thus, this extended loop is probably a hypervariable decoration of an otherwise-conserved domain. In the case of MHV, the receptor (murine carcinoembryonic antigen cell adhesion molecule 1a, or CEACAM1a) (25, 26) makes contact not with the extended-loop subdomain (nor, indeed, with any part of the domain homologous to the SARS-CoV RBD), but rather with structures in the N-terminal region of the spike protein (27). Receptors and receptor-binding regions of other group-2 coronaviruses have not been identified. The group-1 human coronavirus 229E receptor is aminopeptidase N; the corresponding RBD on its spike protein is known (28).

The SARS-CoV appears to derive from a cross-species infection with a coronavirus isolated from palm civets (19, 20). S-gene sequences from civet and human specimens obtained during the 2002-to-2003 epidemic show that their RBDs differ at only four positions, residues 344, 360, 479, and 487, but the human viral spike protein binds the human receptor 10³ to 10⁴ times more tightly than does its civet spike counterpart (18). Residues 344 and 360 are far from the binding interface in the complex described here, and mutation to the corresponding civet CoV residues does not affect affinity or infectivity (18). The critical changes are therefore at positions 479 and 487, both of which lie in the RBD-receptor contact (Figs. 1 and 3 and Table 1).

The changes at these two positions are relatively subtle. In most viral sequences from palm-civet specimens, residue 479 is lysine and 487 is serine, whereas in SARS-CoV sequences from the 2002–2003 epidemic, these residues are asparagine and threonine, respectively. The presence of lysine at 479 reduces affinity for human but not for civet ACE2; serine at 487 reduces affinity for both receptors (18). Position 479 lies opposite the ACE2 N-terminal helix (α1), on which several residues differ in identity between civet and human (Table 1). Some civet coronavirus sequences have asparagine at position 479, and the difference does not appear to be critical for binding to the civet receptor (18). At position 487 in the spike protein, replacing threonine (SARS-CoV) with serine (civet viral sequences) would remove the threonine methyl group, which lies in a hydrophobic pocket bounded by atoms in the side chains of Tyr41 and Lys353 on the receptor and Tyr484 in the RBM (Fig. 3C). This pocket appears to be relatively inflexible. A main-chain hydrogen bond (carbonyl of ACE2 Lys353 to amide of RBD Gly488) fixes the relative positions of receptor and spike protein quite precisely. Moreover, the Thr487 rotamer is determined by a hydrogen bond from Oγ to the main-chain carbonyl of Tyr484; the aliphatic part of the Lys353 side chain is sandwiched between the rings of ACE2 Tyr41 and RBD Tyr491, and its charged ε-amino group is neutralized by ACE2 Asp38. Mutation to serine would thus leave a hard-to-fill van der Waals hole; indeed, a mutation in which Thr487 is replaced by Ser in the human RBD decreases affinity for human ACE2 by more than 20-fold (18). Civet ACE2 is essentially identical to human ACE2 at all the relevant positions in the vicinity of this interaction; like the human receptor, it appears to bind RBDs with threonine at 487 more tightly than those with serine (18). All of the more than 100 S-protein sequences obtained during the 2002–2003 SARS epidemic have threonine at this position, whereas all 14 such sequences from palm-civet and raccoon-dog isolates have serine (29, 30).
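The affinity ratios quoted above can be put on an energy scale with the relation ΔΔG = RT ln(ratio). The sketch below is illustrative only: the temperature of 298.15 K is an assumption, not a condition reported in the text, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15      # ASSUMPTION: temperature is not stated in the text


def ddg_from_fold_change(fold):
    """Free-energy difference (kcal/mol) corresponding to a fold change
    in binding affinity, using ddG = R * T * ln(fold)."""
    return R_KCAL * T_KELVIN * math.log(fold)


# Fold changes quoted in the text: the human versus civet spike binding
# human ACE2 (10^3 to 10^4), and the Thr487-to-Ser mutation (more than 20).
for label, fold in [("human vs civet spike, lower bound", 1e3),
                    ("human vs civet spike, upper bound", 1e4),
                    ("Thr487 -> Ser, lower bound", 20.0)]:
    print(f"{label}: {ddg_from_fold_change(fold):.1f} kcal/mol")
```

On this scale, the more than 20-fold loss from removing a single methyl group accounts for a substantial share of the total difference between the civet and human spike proteins.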

Viruses from sporadic SARS cases during 2003 to 2004, each of which was an independent cross-species event from which no human-to-human transmission occurred, all had asparagine at 479 and serine at 487 (29, 30). It is therefore plausible that a key factor determining severity (and possibly human-to-human transmission) is the presence or absence of a γ-methyl group on the 487 side chain. The 2003–2004 sequences differed, however, at two other RBD positions from those sequences obtained during the epidemic of the previous winter: Leu472 had changed to proline and Asp480 to glycine. Inspection of the model suggests that the leucine-to-proline change might have contributed to attenuation, by reducing the spike-receptor contact surface (Fig. 3A). A similar rationale is harder to find for the aspartate-to-glycine substitution, because the aspartyl side chain projects into solution, and mutation of this residue to alanine has no effect on RBD binding to ACE2 (16).

Two other species differences are worth noting. Rat ACE2 does not support infection by SARS-CoV, and mouse ACE2 does so only inefficiently (30). At position 82, where the human receptor has a methionine, the rat protein has a glycosylated asparagine; the glycan would sterically disrupt a hydrophobic contact between Met82 and Leu472 in the RBM (Fig. 3A). At position 353, where the human receptor has a lysine critical for the contact with Thr487 in the RBM (Fig. 3B), the rat receptor has histidine. Mouse ACE2 also has histidine at 353, but it does not have a glycosylation site at 82. It thus bears one but not both of the differences that render rat ACE2 inactive as a receptor, and mutation of His353 to lysine in mouse ACE2 allows high-level infection of murine cells by SARS-CoV (30).

The residues singled out for description in the preceding paragraphs are not, of course, the only ones critical for the tight complementarity of the SARS-CoV RBD and human (or palm civet) ACE2. They are simply the positions at which there are differences among isolates and receptors important for binding and entry. Other species might in principle harbor variants of the same virus that would require changes at different positions to be able to infect human cells, and other changes in the civet virus might permit cross-species infection even in the absence of the serine-to-threonine mutation at position 487. The structure might allow one to recognize such changes in future animal isolates. For example, the human receptor (but not the civet receptor) bears an N-linked glycan at position 90. Mutation of Asn90 to eliminate the glycan enhances S-protein–mediated binding and infection of human cells by pseudotyped lentiviruses (18). The glycan faces a loop in the RBD containing residues 399 to 412. Changes in this loop that reduce likely interference with the glycan might have the same enhancing effects as does elimination of the glycan on the receptor or mutation of Ser487 to threonine on the S protein.

Neutralizing antibodies against SARS-CoV recognize epitopes in the RBD (21-23). For example, a high-affinity recombinant human monoclonal antibody, 80R, which is sensitive to mutation within the RBM, inhibits viral entry by blocking association of virus and receptor (21, 31). The soluble SARS-CoV RBD is therefore of potential use as an immunogen (23, 32). In the structure described here, the interface of the RBD with the receptor is very well defined, but the opposite face of the RBD is more disordered. The latter surface would interact with the rest of the spike protein, and it indeed contains the N and C termini of the RBD fragment as well as the disordered loop, residues 376 to 381. Thus, this face of the protein could be modified in various ways in the molecular engineering of a candidate vaccine. The loop from 376 to 381 could probably be shortened and the disordered cysteines removed; other disulfides could be introduced to add stability; and the C-terminal segment could be used to link the RBD to an oligomeric core. Of the 23 glycosylation sites on S, three are in the RBD. Only one (Asn330) is sufficiently ordered in our structure to show even a single sugar, and all are well separated from the RBM. Glycosylation is therefore unlikely to interfere with potential neutralizing epitopes within the RBD; introduction of new glycosylation sites could in principle “focus” the antigenicity of a candidate immunogen.

Supporting Online Material

Materials and Methods

Figs. S1 to S4

Table S1

References
