X-ray crystal structures of native HIV-1 capsid protein reveal conformational variability


Science  03 Jul 2015:
Vol. 349, Issue 6243, pp. 99-103
DOI: 10.1126/science.aaa5936

Retroviral capsids in their native form

Capsid proteins of retroviruses form protective lattices around viral RNA molecules. The precise molecular details of how individual, full-length capsid proteins assemble to shield the viral genome, however, are not well understood. Obal et al. and Gres et al. now report high-resolution crystal structures of the full-length capsid proteins from bovine leukemia virus and HIV-1, respectively. The two studies complement each other to reveal the dynamic nature of capsid protein assembly and how individual capsid proteins interact in the lattice. The findings may have relevance for drug design.

Science, this issue p. 95; see also p. 99


The detailed molecular interactions between native HIV-1 capsid protein (CA) hexamers that shield the viral genome and proteins have been elusive. We report crystal structures describing interactions between CA monomers related by sixfold symmetry within hexamers (intrahexamer) and threefold and twofold symmetry between neighboring hexamers (interhexamer). The structures describe how CA builds hexagonal lattices, the foundation of mature capsids. Lattice structure depends on an adaptable hydration layer modulating interactions among CA molecules. Disruption of this layer alters interhexamer interfaces, highlighting an inherent structural variability. A CA-targeting antiviral affects capsid stability by binding across CA molecules and subtly altering interhexamer interfaces remote to the ligand-binding site. Inherent structural plasticity, hydration layer rearrangement, and effector binding affect capsid stability and have functional implications for the retroviral life cycle.

The mature capsid of HIV-1 is formed from a single capsid protein (CA) containing N-terminal (CANTD, residues 1 to 145) and C-terminal (CACTD, residues 150 to 231) domains connected by a flexible linker region (15). The capsid contains ~250 CA hexamers and 12 CA pentamers. CA hexamers comprise six CANTDs held together by CANTD-CANTD and CANTD-CACTD contacts between adjacent CAs, whereas the six CACTDs engage in interhexamer interactions (15). CA-CA interactions affect capsid structural integrity and infectivity (1–3, 6–11). Following viral entry, the capsid undergoes controlled disassembly (uncoating), which seems coordinated with reverse transcription (7, 9). Antivirals targeting CA (12–17) include PF-3450074 (PF74), which has a bimodal mechanism of action (18–21). At lower concentrations (nanomolar to ~2 μM), it competes with binding of host factors CPSF6 and NUP153, affecting nuclear entry. At higher concentrations, PF74 blocks uncoating and reverse transcription (18–23). Crystal structures of PF74 with CANTD (CANTD-PF74) (14) or cross-linked CA hexamers (CAXL-PF74) (18, 19) have shown that PF74 binds at the same site as CPSF6 and NUP153. However, the structural mechanism by which therapeutically relevant high concentrations of PF74 affect uncoating remains incompletely defined.

Cryo–electron microscopy (cryo-EM) studies have helped build informative models of the mature HIV-1 capsid (2, 3, 24). X-ray and nuclear magnetic resonance (NMR) structures of CA domains or CA monomers or dimers have described some interactions at interfaces (15, 16, 25, 26). Structures of cross-linked CA (CAXL) elucidated the CA interactions in hexamers and pentamers (1, 3). However, engineered mutations left interhexamer interactions that govern virus uncoating and assembly insufficiently described.

We crystallized native full-length CA (Fig. 1A) and solved its structure in space group P6 with one molecule per asymmetric unit (Fig. 1B and table S1; see also supplementary materials and methods). CA subunits from neighboring hexamers are related by two- and threefold crystallographic symmetry (Fig. 1, C and D). The native CA structure is in general agreement with the 9 Å cryo-EM maps of the flattened CA hexagonal lattice (24) (fig. S1A) and tubes (2) (fig. S1B). The native CA fold is also in agreement with crystal and NMR structures of full-length CA (1, 3–5, 24), CANTD, and CACTD (12, 14, 16, 25, 27, 28). Key interactions between CANTD and CACTD that are likely to stabilize the capsid are intrahexamer (CANTD-CANTD, CANTD-CACTD) contacts between adjacent CAs around the sixfold axis, interhexamer (CACTD-CACTD) contacts at the two- and threefold axes (Fig. 1D) (1, 2), and intrasubunit (CANTD-CACTD) contacts.

Fig. 1 Crystal structure of native CA.

(A) Secondary structure and ribbon diagram of native CA. The CANTD comprises β hairpin (residues 1 to 13, brown), CypA-BL (85 to 93, light blue), and seven α helices: H1 (17 to 30, yellow); H2 (36 to 43, black); H3 (49 to 57, purple); and H4 (63 to 83), H5 (101 to 104), H6 (111 to 119), and H7 (126 to 145) in gray. The CACTD comprises the 3₁₀ helix (residues 150 to 152, green) and four α helices: H8 (161 to 173, gray), H9 (179 to 192, orange), H10 (196 to 205, blue), and H11 (211 to 217, pink). (B) Application of sixfold crystallographic symmetry generates the native CA hexagonal lattice. A single native CA molecule is shown in surface view representation (pink, CANTD; red, CACTD). (C) The six native CA subunits in a hexamer are related by sixfold crystallographic symmetry (yellow hexagon); CA subunits from neighboring hexamers are related by twofold (orange ovals) and threefold (pink triangles) crystallographic symmetry, shown at the interhexamer interfaces. (D) Orthogonal views of three native CA hexamers colored as in (A). The hexamers are stabilized by interactions at the sixfold (brown, β hairpin; yellow, H1; black, H2; purple, H3), twofold (green, 3₁₀; orange, H9), and threefold (blue, H10; pink, H11) interfaces. (E) Interhexamer interactions at the twofold interface. Interpretable electron density is now observed for all residues at the twofold interface of native CA (2.4 Å; 2Fobs − Fcalc; σ = 1.2), including residues 176 to 187, which were previously disordered in cross-linked hexamer structures.

The interhexamer interactions at the twofold axis are clearly defined in native CA (Fig. 1E). They involve multiple residues and water molecules (Fig. 2A). These contacts are reminiscent of, but different from, those in CACTD structures (fig. S2A and table S2) (16, 25). They also differ considerably from the original cross-linked hexameric [CAXL, Protein Data Bank identification number (PDB ID): 3H47] (fig. S2B and table S2) and other CA structures containing W184→A184 (W184A) and M185A (29) at the dimerization interface (1, 18, 19).

Fig. 2 Interhexamer interactions at the twofold and threefold interfaces.

Stereo views of CACTD regions that are related by twofold [(A) and (C)] or threefold [(B) and (D)] symmetry from native CA [(A) and (B)] and dCA [(C) and (D)]. The twofold interface comprises helices H9 and 3₁₀ in (A) and (C). The threefold interface comprises helices H10 and H11 in (B) and (D). Ordered water molecules (spheres) at the twofold (A) and threefold (B) interfaces of native CA are modeled in 2.4 Å simulated annealing omit Fobs − Fcalc electron density maps at σ = 2.5 (green mesh). B-factors of refined waters were between 30 and 50 Å² and matched well with those of interacting atoms from S149, E175, Q176, W184, I201, and A204. Helices are shown in cartoon representation; black dashed lines between residues (in sticks) indicate that they are within ~4 Å of each other. No water molecules are present at the twofold and threefold interfaces of dCA [(C) and (D)].

Cryo-EM and modeling studies indicate that interhexamer interactions are hydrophobic (2, 25). The higher-resolution native CA structure reveals details of these contacts (Fig. 2B and table S3) that include hydrophilic, water-mediated interactions at the two- and threefold interfaces (Fig. 2, A and B). These waters engage in H-bond interactions with either side chains of conserved residues (S149, E175, and W184 at the twofold interface) or main-chain carbonyls (Q176 at the twofold interface; I201 and A204 at the threefold interface) (Fig. 2, A and B). Thus, the closest intersubunit distance at the threefold interface of native CA is ~6 Å versus ~15 Å in CAXL (fig. S3, A and C). Interhexamer interface waters contribute to capsid stabilization. Of 450 assigned waters per hexamer (75 per monomer), 30 are at the two- and threefold interfaces (three waters for each of the six twofold interfaces and two waters for each of the six threefold interfaces per hexamer). As there are ~250 hexamers per capsid, thousands of water molecules at the interhexamer interfaces should substantially contribute to capsid stability.
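The water tally in the preceding paragraph is simple multiplication, and a minimal Python sketch makes the arithmetic explicit. The constant and function names are illustrative only; every number is taken from the text. The capsid-wide figure uses the approximate hexamer count (~250) and ignores pentamers and lattice edges, so it should be read as an order-of-magnitude estimate rather than a measured value.

```python
# Back-of-the-envelope tally of ordered waters, using only numbers quoted in the text.
WATERS_PER_MONOMER = 75
MONOMERS_PER_HEXAMER = 6
TWOFOLD_INTERFACES_PER_HEXAMER = 6
THREEFOLD_INTERFACES_PER_HEXAMER = 6
WATERS_PER_TWOFOLD_INTERFACE = 3
WATERS_PER_THREEFOLD_INTERFACE = 2
APPROX_HEXAMERS_PER_CAPSID = 250  # "~250" in the text; pentamers are ignored here


def waters_per_hexamer() -> int:
    """All assigned waters in one hexamer."""
    return WATERS_PER_MONOMER * MONOMERS_PER_HEXAMER


def interface_waters_per_hexamer() -> int:
    """Waters located at the two- and threefold interhexamer interfaces."""
    twofold = TWOFOLD_INTERFACES_PER_HEXAMER * WATERS_PER_TWOFOLD_INTERFACE
    threefold = THREEFOLD_INTERFACES_PER_HEXAMER * WATERS_PER_THREEFOLD_INTERFACE
    return twofold + threefold


def interface_waters_per_capsid(n_hexamers: int = APPROX_HEXAMERS_PER_CAPSID) -> int:
    """Rough capsid-wide estimate: per-hexamer interface waters times hexamer count."""
    return interface_waters_per_hexamer() * n_hexamers


if __name__ == "__main__":
    print(waters_per_hexamer())            # 450
    print(interface_waters_per_hexamer())  # 30
    print(interface_waters_per_capsid())   # 7500
```

Under these assumptions the estimate comes to roughly 7500 interface waters per capsid, consistent with the "thousands" stated above.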

To investigate the role of this hydration layer, we used controlled dehydration (30) that alters the water content of crystals. CA crystal dehydration (dCA) contracted the unit cell dimensions by ~3 to 6% (table S1). Superposition of native CA on dCA reveals conformational rearrangement in the relative orientation of CANTD and CACTD, imparted by a hingelike motion with the linker region as a pivot point (fig. S4). These changes correlate with packing differences at the two- and threefold interfaces (Fig. 2, A versus C and B versus D, and fig. S3, A versus B). Hence, dCA hexamers (fig. S3B) pack more tightly (closest distance ~3 Å) than CAXL (fig. S3C) and native CA (fig. S3A) hexamers, creating up to 22 and 15 additional contacts in the dCA two- and threefold interfaces (tables S2 and S3). Solvent-accessible area calculations reveal that dCA has ~150 and ~200% more buried surface than native CA and CAXL at the twofold interface and ~400% more than CA at the threefold interface (table S4). The threefold interface interactions are between K203-T216 and K203-A217 (main-chain atoms) and A204-L205, G206-T216, and P207-E213 (side chains) (table S3 and Fig. 2D). Sequence alignment (fig. S5) shows that almost all two- and threefold residues are entirely conserved (205, 206, 213, 216; 207 is highly conserved). Consistent with our data, A204 substitutions result in noninfectious virions with unstable and/or abnormal cores (2, 7). We also observe variations in the CANTD-CANTD and CANTD-CACTD intrahexamer contacts (table S5) that alter H-bond networks and water-mediated or hydrophobic interactions (Fig. 3A). Mutagenesis studies also confirm the importance of interface residues for core morphology and stability, DNA synthesis, and infectivity (tables S2 and S3) (1, 6–11, 26, 31).
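To give a feel for what a ~3 to 6% contraction of cell edges means for cell volume, the following Python sketch uses the standard hexagonal-cell volume formula, V = (√3/2)·a²·c, which applies to space group P6. The function name and the choice of shrinkage values are assumptions for illustration; the actual per-axis values are in table S1 and are not reproduced here, so the two cases shown only bracket the quoted range.

```python
def hexagonal_volume_ratio(shrink_a: float, shrink_c: float) -> float:
    """Return V_new / V_old for a hexagonal cell, where V = (sqrt(3)/2) * a**2 * c.

    shrink_a and shrink_c are fractional contractions of the a (= b) and c
    edges, e.g. 0.03 for a 3% contraction. The constant prefactor cancels.
    """
    for value in (shrink_a, shrink_c):
        if not 0.0 <= value < 1.0:
            raise ValueError("fractional contraction must be in [0, 1)")
    return (1.0 - shrink_a) ** 2 * (1.0 - shrink_c)


if __name__ == "__main__":
    # Hypothetical bracketing cases: every edge shrinks by 3%, or by 6%.
    for shrink in (0.03, 0.06):
        loss = 1.0 - hexagonal_volume_ratio(shrink, shrink)
        print(f"{shrink:.0%} edge contraction -> {loss * 100:.1f}% volume loss")
```

Under these bracketing assumptions the cell volume would fall by roughly 9 to 17%, which helps explain why a modest change in edge lengths can expel interface waters.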

Fig. 3 Changes at the intrahexamer interfaces.

(A) Stereo view of intrahexamer CANTD-CANTD and CANTD-CACTD intersubunit interfaces. Two neighboring CA subunits are shown as cartoons outlined for clarity (CANTDs in lighter colors than the corresponding CACTDs). Sites of varying interactions among native CA, dCA, CAXL, and CAPF74 are marked in blue and orange for CANTD-CANTD interfaces or brown and pink for CANTD-CACTD interfaces. (B) Intrasubunit rearrangement linked to changes at the twofold interface. Enlarged stereo view of the boxed region shows changes in the position of H9 helices in neighboring subunits (marked with prime symbols). Least-squares superposition (residues 143 to 174 and 192 to 219) of dCA (cyan, CANTD; darker blue, CACTD) on native CA (pink, CANTD; red, CACTD) is shown. Crystal dehydration results in a slight extension of helix H8 (small blue arrow), interaction of R143 with the main-chain E175 carbonyl instead of Q176, and repositioning of the helix H9 (black arrow). In dCA, W184′ from the H9′ helix (light blue for dCA and light pink for CA) forms a hydrogen bond with main-chain E175 carbonyl from a neighboring subunit, whereas in native CA, W184′ interacts with Q176 and the side chain of E175 through water-mediated contacts. Moreover, in dCA, R143 also interacts with E187′ and T188′ from the neighboring subunit, thus becoming a part of the twofold interface. Black dashed lines connect residues or waters that interact through hydrogen bonds.

A notable difference between native CA and dCA involves interactions of R143 in CANTD with E175 or Q176 in CACTD helix H8 within the same subunit (Fig. 3, A and B). The R143-Q176 interaction in native CA is replaced with R143-E175 in dCA. This change repositions helix H9, leads to the loss of the W184-bound water, and alters the H9-H9′ twofold interhexamer interactions (Fig. 3B). Alternative interactions of R143 with CACTD were observed in engineered unliganded (PDB ID: 3H4E) or ligand-complexed structures (PDB ID: 4U0E) crystallized in orthorhombic (1) or hexagonal space groups, respectively (18). This finding, along with virological studies (7, 18), highlights the functional importance of R143.

Capsid plasticity may allow therapeutic intervention by stabilizing nonproductive uncoating and assembly intermediates or disrupting the interhexamer hydration layer. Residues at interhexamer interfaces are conserved among clades (7, 9, 32), providing a high genetic fragility that makes CA an attractive antiviral target (9). Among >20 CA-targeting compounds, PF74 is the most potent antiviral (12–17). To address how PF74 alters interhexamer interactions affecting capsid stability and reverse transcription, we solved the crystal structure of PF74 with native CA (CAPF74). PF74 binds across two CA monomers in CAPF74, similar to recent cross-linked structures (18, 19) (Fig. 4A and fig. S6A) but with the indole moiety arranged differently than in CANTD-PF74 (Fig. 4, B and C). In addition to the interactions with CANTD helices H3, H4, H5, and H7 observed in CANTD-PF74 (14), PF74 also contacts CACTD H8 and H9 from a neighboring subunit in CAXL-PF74 and CAPF74 (18, 19). In CAXL-PF74, H9 mutations W184A and M185A leave part of the H9-H9′ twofold interface disordered, and there are no interactions at the neighboring threefold region. These interactions are now observed in CAPF74. We also observed differences in PF74 contacts with CACTD residues at or near dimerization helix H9 (Y169 and K182 in CAPF74 versus Q176, S178, and E179 in CAXL-PF74) (Fig. 4, B and C) (18, 19). PF74 binding results in subtle structural changes at the remote three- and twofold regions (Fig. 4D and fig. S6, B and C). These variations are reminiscent of allosteric changes in the CACTD complex with antiviral peptide CAI binding near the CACTD-CACTD interface (16). Specifically, we observe changes in hydrophobic and water-mediated interactions at the CAPF74 twofold region that affect the buried surface area (tables S2 and S4). Also, changes of the CAPF74 H10 helices (root mean square deviation ~1 Å with native CA) lead to their convergence at the threefold region, with the distance between A204 main-chain oxygens changing from 5.5 to 3.6 Å. In turn, this change leads to a displacement of the water molecules seen in native CA (Fig. 4D and fig. S6C) and a 150% increase in buried surface area at the threefold interface of CAPF74 (tables S3 and S4). Hence, the CAPF74 structure provides insights into the mechanism by which high concentrations of PF74 change intra- and interhexamer interactions and affect core stability (tables S2, S3, and S5).
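The two quantities quoted above, an interatomic distance and a root mean square deviation (RMSD), can be computed from atomic coordinates with a few lines of Python. This is a minimal sketch, not the procedure used for the reported values: it assumes the two coordinate sets are already superposed and paired atom by atom, performs no least-squares fitting, and the coordinates in the usage example are made-up placeholders rather than values from any deposited structure.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]  # x, y, z in angstroms


def distance(p: Coord, q: Coord) -> float:
    """Euclidean distance between two atoms."""
    return math.dist(p, q)


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """RMSD between two pre-superposed, equally long, paired coordinate lists."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))


if __name__ == "__main__":
    # Placeholder coordinates for illustration only.
    print(distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))  # 5.0
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(reference, shifted))  # 1.0
```

With real data, the inputs would be matched atoms (for example, the main-chain atoms of helix H10) read from the two coordinate files after alignment on a common set of residues.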

Fig. 4 Effects of PF74 on HIV-1 CA structure.

(A) PF74 binding at the CANTD-CACTD interfaces of neighboring subunits within a CA hexamer. Top view of CAPF74 hexamers is shown (side view is shown in fig. S6A). There is one PF74 molecule bound to every CA subunit at sites that are distant to the threefold interface (black triangle). CANTDs and the corresponding CACTDs are colored by the same colors (light and dark, respectively). (B) Close-up stereo view of the PF74 binding site [small box in (A)]. CANTD and a CACTD of a neighboring subunit bind PF74 (light blue and purple ribbons). H9 is omitted for clarity. PF74 is modeled in a 2.7 Å simulated annealing omit map (σ = 2.5). Black dashed lines indicate interactions within ~4 Å, and red dashed lines indicate H-bond interactions. (C) PF74 conformations in CAPF74 (blue), CAXL-PF74 (PDB ID: 4U0E; PF74 in gray), and CANTD-PF74 (PDB ID: 2XDE; PF74 in magenta). PF74 in PDB ID 4QNB is almost identical to 4U0E and is omitted for clarity. (D) Stereo view of the threefold interface of CAPF74 superposed onto native CA (aligned on residues 1 to 219). Helices H10 of CAPF74 and CA are in blue and yellow, respectively. CA waters are shown as yellow spheres; no waters were present in CAPF74.

Obal et al. report extensive structural heterogeneity of CA monomers in the bovine leukemia virus CA hexamers (33). Similarly, comparison of the present native CA, dCA, and CAPF74 HIV-1 structures also suggests a CA conformational plasticity, indicating that capsid structural variability is a common feature among multiple retroviruses. Variability in inter- and intrahexamer contacts is also proposed in HIV-1 capsid molecular models (2, 24).

The nature of CA structural variability goes beyond mere side-chain rearrangement. A likely key structural determinant important for the stabilization of variable interfaces is the presence of structured waters observed at strategic regions (1), including the two- and threefold interfaces (Fig. 2, A and B). This hydration layer could function as an extension of the CA structure, contributing to surface complementarity among flexible CAs. It is likely adaptable and facilitates nearly isoenergetic structural rearrangements contributing to quasi-equivalent structural variability.

Similar to previous reports (1), we observed changes in the relative orientation of CANTD and CACTD (~10° rotation in fig. S4) that may cause a tilt among neighboring hexamers and contribute to the surface curvature of the capsid. Different interhexamer tilt angles would be anticipated at the narrow and broad ends of the capsid or between laterally or longitudinally positioned hexamers. Given the asymmetric nature of capsid, it is likely that no two CAs are identical in an HIV-1 capsid (2, 24). Hence, CA plasticity allows a wide range of conformations that contribute to the structural variability of the HIV-1 asymmetric core.

CA pliability allows interactions of capsids with multiple host factors. For HIV-1, these include TRIM5α, CPSF6, MxB, cyclophilin, NUP153, and NUP358 [reviewed in (34, 35)]. Such diverse interactions afford a functional versatility reminiscent of a Swiss Army knife. Moreover, the variability in subunit distances at the threefold interface (~3 to 15 Å) (fig. S3) may affect permeability to deoxynucleoside triphosphates (footprint diameter ≤10 Å). This may be relevant to reverse transcription, although nucleotide trafficking may also occur through imperfections in the malleable core structure. Extensive variability may become an Achilles heel for HIV by providing opportunities for pocket targeting, core destabilization, or stabilization of nonproductive structural intermediates.

Supplementary Materials

Materials and Methods

Figs. S1 to S6

Tables S1 to S5

References (36–50)

References and Notes

  1. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.
  2. Acknowledgments: We thank J. Nix of Advanced Light Source (ALS) beamline 4.2.2 for assistance with data collection. ALS is supported by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under contract DE-AC02-05CH11231. Single-wavelength anomalous diffraction phasing and initial model building were carried out at the workshop entitled “CCP4/APS School in Macromolecular Crystallography: From Data Collection to Structure Refinement and Beyond” at the Argonne National Laboratory in June 2014. We also thank all the lecturers of the workshop and the staff of Advanced Photon Source Sector 23 (GM/CA-CAT) for helpful discussions regarding data collection, processing, refinement, and validation strategies. We also thank C. Tang for providing a CA-expressing plasmid. The data presented in this manuscript are tabulated in the main paper and in the supplementary materials. Final coordinates and structure factors have been deposited in the Protein Data Bank (PDB) and are available under accession codes 4XFX (CA), 4XFY (dCA), and 4XFZ (CAPF74). This work was supported in whole or in part by NIH grants AI112417, AI120860, GM103368, AI076119, AI099284, and AI100890 (S.G.S.) and GM066087 (O.P.).