Structural basis for strand-transfer inhibitor binding to HIV intasomes


Science  14 Feb 2020:
Vol. 367, Issue 6479, pp. 810-814
DOI: 10.1126/science.aay8015

Strengths and weaknesses of an HIV drug

Retroviruses replicate by inserting a copy of their RNA, which has been reverse transcribed into DNA, into the host genome. This process involves the intasome, a nucleoprotein complex comprising copies of the viral integrase bound at the ends of the viral DNA. HIV integrase strand-transfer inhibitors (INSTIs) stop HIV from replicating by blocking the viral integrase and are widely used in HIV treatment. Cook et al. describe structures of second-generation inhibitors bound to the simian immunodeficiency virus (SIV) intasome and to an intasome with integrase mutations known to cause drug resistance. Passos et al. describe the structures of the HIV intasome bound to a second-generation inhibitor and to developmental compounds that are promising drug leads. These structures show how mutations can cause subtle changes in the active site that affect drug binding, show the basis for the higher activity of later-generation inhibitors, and may guide development of better drugs.

Science, this issue p. 806, p. 810


The HIV intasome is a large nucleoprotein assembly that mediates the integration of a DNA copy of the viral genome into host chromatin. Intasomes are targeted by the latest generation of antiretroviral drugs, integrase strand-transfer inhibitors (INSTIs). Challenges associated with lentiviral intasome biochemistry have hindered high-resolution structural studies of how INSTIs bind to their native drug target. Here, we present high-resolution cryo–electron microscopy structures of HIV intasomes bound to the latest generation of INSTIs. These structures highlight how small changes in the integrase active site can have notable implications for drug binding and design and provide mechanistic insights into why a leading INSTI retains efficacy against a broad spectrum of drug-resistant variants. The data have implications for expanding effective treatments available for HIV-infected individuals.

HIV currently infects ~40 million people worldwide. The virus’s ability to integrate a viral DNA (vDNA) copy of its RNA genome into host chromatin, leading to the establishment of a permanent and irreversible infection of the target cell (and any progeny cells), is the central challenge in developing a cure (1). Integration, catalyzed by the viral integrase (IN) protein, is essential for retroviral replication and results in the covalent linkage of vDNA to the host genome (2, 3). Proper integration depends on the formation of a large oligomeric nucleoprotein complex containing viral IN assembled on the ends of vDNA, commonly referred to as an intasome (4–9). All intasomes contain multimeric IN bound to vDNA ends, but they are characterized by distinct oligomeric configurations and domain arrangements.

Intasome assembly and catalysis proceed through a multistep process that involves several distinct intermediates (fig. S1). The catalytically competent cleaved synaptic complex (CSC) intasome, which contains free 3′-OH ends, is the specific target of the IN strand-transfer inhibitors (INSTIs), a group of drugs that bind to both the active site of HIV IN and the ends of vDNA, thereby blocking catalysis. Treatment with INSTIs, which are a key component of combined antiretroviral therapy, leads to a rapid decrease in viral load in patients. INSTIs are generally well tolerated, and the second-generation drugs do not readily select for resistance (10–13). They are used in the recommended first-line combination therapies for treating HIV-infected patients and are prime candidates for future development (14, 15).

The prototype foamy virus (PFV) intasome has been used as a model system to understand INSTI binding (6, 16–19). However, this system has limitations. PFV and HIV INs share only ~25% sequence identity in the catalytic core domain (CCD) (6), and many of the sites where drug-resistance mutations occur in HIV IN are not conserved in PFV IN. Moreover, minor changes in the structure of an INSTI can profoundly affect its ability to inhibit mutant forms of HIV (19, 20). Thus, understanding how INSTIs interact with HIV intasomes, their natural target, at a molecular level is needed to overcome drug resistance and to guide development of improved inhibitors.

We established conditions for assembling, purifying, and structurally characterizing HIV CSC intasomes. Previously, we have shown that fusion of the small protein Sso7d to the N-terminal domain (NTD) of HIV IN improves its solubility and facilitates assembly and purification of strand-transfer complex intasomes (4, 21). We further optimized conditions required for CSC formation and purification and showed that these complexes are biochemically active for concerted integration (fig. S2). We used a tilted cryo–electron microscopy (cryo-EM) data collection strategy to alleviate the effects of preferential specimen orientation on cryo-EM grids (22), which allowed us to collect data on the apo form of the HIV CSC intasome. The cryo-EM reconstruction of the HIV CSC intasome reveals a twofold symmetric dodecameric molecular assembly of IN. The highest resolution (~2.7 Å) resides within the core containing the two catalytic sites and the ends of vDNA (fig. S3 and table S1).

Lentiviral intasomes have a large degree of heterogeneity and vary in size depending on the protein and biochemical conditions, forming tetramers, dodecamers, hexadecamers, and proto-intasome stacks (figs. S4 and S5). The basic underlying unit, the conserved intasome core (CIC), resembles—but is not identical to—the tetrameric PFV intasome. The CIC is composed of two IN dimers, each of which binds one vDNA end and a C-terminal domain (CTD) from a neighboring protomer (23). In the cryo-EM reconstruction, four fully defined IN protomers, two CTDs from flanking protomers, and two additional CTDs from distal subunits are clearly resolved (Fig. 1A); these were used to build an atomic model (Fig. 1B). With the exception of the additional CTDs from distal subunits, which are not conserved in other retroviral species, the resolved regions constitute the intasome CIC.

Fig. 1 Cryo-EM structure of the HIV intasome core.

(A and B) Cryo-EM reconstruction (A) and corresponding atomic model (B) of the HIV CIC, colored by protomer (red and yellow CTDs from distal protomers are not part of the CIC but are conserved among lentiviral intasomes). The two catalytic sites are indicated by dashed squares. (C) Close-up of the HIV intasome active site, colored by root mean square deviation from the corresponding region in the PFV intasome (PDB 3L2Q). IN residues that are frequently mutated in patient-derived clinical samples in response to second-generation INSTI treatment are indicated (11, 12). Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.

Each of the two active sites in an HIV intasome contains the catalytic residues Asp64, Asp116, and Glu152, forming the prototypical DDE motif present in many nucleases, transposases, and other INs (24). The regions near the active sites of the PFV and HIV intasomes are similar because many of the residues participate in substrate binding and catalysis. However, farther from the active sites, the structures diverge (Fig. 1C and figs. S6 and S7). The largest differences reside in the synaptic CTD from the flanking protomer, specifically the region around the loop spanning HIV IN Arg228-Lys236. The corresponding loop in PFV IN has four additional residues and assumes a distinct configuration. Clinically relevant drug-resistance mutations occur within regions of HIV IN where the amino acid sequences between the two orthologs diverge (11, 12).
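The per-residue divergence described above (and used for the coloring in Fig. 1C) can be illustrated with a minimal sketch. It assumes the two structures have already been superposed on a common core and that atoms are matched one to one; the function name, the dictionary layout, and the toy coordinates are hypothetical and are not taken from the deposited models or from the authors' workflow.

```python
import math

def per_residue_rmsd(coords_a, coords_b):
    """Return {residue_label: RMSD in Å} for two pre-superposed structures.

    coords_a, coords_b: dict mapping a residue label to a list of (x, y, z)
    tuples, with atoms listed in the same order in both structures.
    Residues lacking a one-to-one atom match are skipped.
    """
    result = {}
    for label, atoms_a in coords_a.items():
        atoms_b = coords_b.get(label)
        if not atoms_a or atoms_b is None or len(atoms_b) != len(atoms_a):
            continue
        squared = sum(math.dist(p, q) ** 2 for p, q in zip(atoms_a, atoms_b))
        result[label] = math.sqrt(squared / len(atoms_a))
    return result

# Toy coordinates (hypothetical): "core" is identical in both structures,
# "loop" is displaced by the vector (3, 4, 0), i.e. by 5 Å per atom.
structure_a = {"core": [(0, 0, 0), (1, 0, 0)], "loop": [(0, 0, 0), (1, 1, 1)]}
structure_b = {"core": [(0, 0, 0), (1, 0, 0)], "loop": [(3, 4, 0), (4, 5, 1)]}
deviations = per_residue_rmsd(structure_a, structure_b)
```

In this toy case the "core" residue has zero deviation and the "loop" residue deviates by 5 Å, mirroring the qualitative pattern of a conserved active site flanked by divergent regions.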

To better understand how INSTIs interact with HIV intasomes, we assembled the complex with bictegravir (BIC), a leading second-generation INSTI and the most broadly potent of all clinically approved INSTIs (25). We also examined the binding of additional compounds—named 4f, 4d, and 4c, which contain a distinct chelating core (Fig. 2A)—whose development was motivated by the need to further improve potency against drug-resistant variants (19, 20). Currently, 4d is a leading drug candidate that shows improved efficacy over all clinically used and developmental compounds against the known drug-resistant variants (25, 26) (fig. S8). Intasomes were coassembled and copurified with INSTIs, and we verified their inhibitory activity (fig. S9). The cryo-EM structures of INSTI-bound CSCs extend to a comparable ~2.6 to 2.7 Å resolution near the active site, which allows the derivation of atomic models (figs. S10 to S12 and table S1).

Fig. 2 Structural basis of INSTI binding to HIV intasomes.

(A) Chemical structures of the compounds used in this study, including the leading clinical drug BIC and developmental inhibitors 4f, 4d, and 4c [nomenclature based on previously reported work (19)]. Halogenated phenyl groups are shown in blue and the metal-chelating heteroatoms are in red. (B and C) Binding modes are depicted for (B) BIC or (C) 4f (pink), 4d (light blue), and 4c (green) in the HIV intasome active site. (D) Superimposed binding modes of BIC and 4d. The terminal adenine base of vDNA and all water molecules are omitted for clarity.

INSTIs bind HIV CSCs within a well-defined pocket, formed by the interface between two IN protomers and vDNA. Several important pharmacophores characterize the binding of all INSTIs (Fig. 2, B and C). First, three central electronegative heteroatoms chelate two Mg2+ cofactors within the active site of IN. A halogenated benzyl moiety appended to the core by a short linker then displaces and substitutes for the 3′ terminal adenosine of processed vDNA, making a π-stacking interaction with the base of the penultimate cytosine. The displaced adenosine can adopt multiple rotameric conformations (17), only one of which contributes to INSTI binding by stacking on the central ring of the INSTI core (fig. S13). Removing the adenosine from the end of vDNA increases INSTI dissociation (27). The nature of the INSTI core and its substituents modulates its binding and helps to determine its spatial orientation within the active site. For example, the core naphthyridine ring of the 4c, 4d, and 4f compounds binds closer to the Mg2+ ions than the chelating core of BIC (Fig. 2, C and D). These naphthyridine compounds position their 6-substituents within a constriction formed by the side chain of Tyr143 and the backbone of Asn117. Fifteen of the most commonly found mutations that cause resistance in HIV IN are located within 10 Å of an INSTI core; however, only six are conserved between HIV IN and PFV IN (table S2). Small chemical modifications can markedly affect drug potency, as demonstrated previously for compounds targeting reverse transcriptase (28) or protease (29, 30). Thus, it is important to understand all interactions at the molecular level.
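The count of resistance sites within 10 Å of an INSTI core, and the subset conserved in PFV IN, amounts to a distance-cutoff selection followed by a set intersection. The sketch below shows that logic under stated assumptions: the function names, residue labels, and coordinates are hypothetical placeholders, and the actual residue lists are those reported in table S2.

```python
import math

def residues_within(ligand_atoms, residue_atoms, cutoff=10.0):
    """Return sorted labels of residues with any atom within `cutoff` Å
    of any ligand atom.

    ligand_atoms: list of (x, y, z) tuples.
    residue_atoms: dict mapping a residue label to a list of (x, y, z) tuples.
    """
    near = set()
    for label, atoms in residue_atoms.items():
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_atoms):
            near.add(label)
    return sorted(near)

def conserved_subset(near_residues, conserved_labels):
    """Keep only the nearby residues that are also conserved in the ortholog."""
    return [r for r in near_residues if r in conserved_labels]

# Toy input (hypothetical labels and coordinates, not real IN residues).
ligand = [(0.0, 0.0, 0.0)]
residues = {
    "resA": [(3.0, 0.0, 0.0)],
    "resB": [(20.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    "resC": [(10.5, 0.0, 0.0)],
}
nearby = residues_within(ligand, residues)
shared = conserved_subset(nearby, {"resB", "resC"})
```

Here "resA" and "resB" fall inside the cutoff, and only "resB" is also in the conserved set.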

One strategy for developing inhibitors with broad potency against rapidly evolving enzyme targets is based on the concept of filling the substrate envelope (29). The rationale is that if inhibitory compounds bind entirely within a conserved consensus volume occupied by an enzyme’s natural substrates, this limits the ability of the virus to evolve changes in the target enzyme that allow it to discriminate between its normal substrates and synthetic inhibitors. The concept was originally used to guide the development of protease inhibitors and resulted in compounds with broad potency against viral-resistant variants (31). We extended the substrate envelope hypothesis to the development of INSTIs; however, the structural models initially used were based on PFV intasomes (19). The cryo-EM structures of HIV intasomes with bound INSTIs reveal key differences in the substrate binding region. For example, although the chelating naphthyridine core of 4f binds to PFV and HIV intasomes similarly, the 6-substituted sulfonyl benzyl moiety, which is key to the potency of the compound (19, 20), adopts distinct configurations for the different intasomes (Fig. 3, A to C). In compound 4c, the 6-substitution is an n-pentanol chain. When bound to the HIV CSC, the pentanol group of 4c adopts an extended configuration and makes contacts with HIV IN that are distinct from interactions that the pentanol substituent of 4c makes with PFV IN (Fig. 3, D to F) (19, 26). Compound 4d, which is more potent than 4c (fig. S8), adopts a similar extended configuration (Fig. 3F). Therefore, the differences in INSTI configuration are induced by the nature of the IN to which they bind. The simplest explanation for these differences is that multiple minor variations in the amino acids that surround the bound INSTI and DNA substrates affect the binding of the compound in the active site. These compounds mimic aspects of bound forms of vDNA and tDNA substrates, residing within the substrate envelope (fig. S14).
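The substrate envelope idea can be made concrete with a coarse proxy: the fraction of inhibitor atoms whose centers fall inside a volume traced out by substrate atoms. The sketch below models the envelope as a union of spheres around substrate atom positions. The function name, the 2.0 Å radius, and the coordinates are illustrative assumptions only; published envelope analyses typically rely on volume overlap rather than this atom-center test, and nothing here reproduces the analysis in fig. S14.

```python
import math

def fraction_inside_envelope(inhibitor_atoms, substrate_atoms, radius=2.0):
    """Fraction of inhibitor atom centers lying within `radius` Å of any
    substrate atom, i.e. inside a union-of-spheres envelope.

    Both inputs are lists of (x, y, z) tuples in a shared coordinate frame.
    Returns 0.0 for an empty inhibitor.
    """
    if not inhibitor_atoms:
        return 0.0
    inside = sum(
        1
        for atom in inhibitor_atoms
        if any(math.dist(atom, s) <= radius for s in substrate_atoms)
    )
    return inside / len(inhibitor_atoms)

# Toy coordinates (hypothetical): three of four inhibitor atoms sit inside
# the envelope defined by two substrate atoms; one protrudes far outside.
substrate = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
inhibitor = [(0.5, 0.0, 0.0), (3.0, 1.0, 0.0), (10.0, 0.0, 0.0), (0.0, 0.0, 1.9)]
coverage = fraction_inside_envelope(inhibitor, substrate)
```

A value near 1 would indicate a compound contained within the envelope, whereas protruding substituents lower the fraction and mark positions where resistance mutations could, in principle, discriminate inhibitor from substrate.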

Fig. 3 INSTIs can bind differently to PFV and HIV intasomes.

(A and B) Compound 4f bound to the (A) HIV (pink) and (B) PFV (gray) intasome. (C) Overlay of compound 4f binding modes. (D and E) Compound 4c, containing a 6-pentanol substituent, bound to the (D) HIV (green) and (E) PFV (gray, PDB 5FRN) intasome. (F) Overlay of compound 4c binding modes. Compound 4d, containing a 6-hexanol substituent, is also shown in its binding mode to the HIV (light blue) intasome. In (A), (B), (D), and (E), intasome active sites are shown as surface views, with labeled residues. R231 is poorly ordered in the map and is, therefore, displayed as an Ala stub. The terminal adenine is removed for clarity.

We were particularly interested in understanding why 4d is, in general, more broadly effective against resistant mutants than other INSTIs (fig. S8). The high-resolution maps revealed a complex and dynamic network of water molecules surrounding bound INSTIs (fig. S15). The binding sites of many water molecules appear to be conserved, occupying similar positions in the unliganded and INSTI-bound CSC structures. However, some water molecules are displaced or shifted as a consequence of INSTI binding; others are found only when INSTIs are bound, which suggests that the conformational changes induced by the binding stabilize their position. To simplify the analysis, INSTI interactions and water molecules can be subdivided by their relative positions with respect to the plane formed by the Mg2+-coordinating ligand scaffolds (above, in-plane, and below the plane), as depicted in Fig. 4. The naphthyridine cores are engaged from above by the purine ring of the 3′-adenosine via a π-stacking interaction. This helps to stabilize a hydrogen bonding network involving the phosphate and N1 nitrogen of the adenine on one end and four water molecules in the cavity delimited by His67, Glu92, Asn120, and Ser119 on the other end. In-plane, the presence of the amino group at the 4-position of the naphthyridine core was previously shown to impart a >10-fold increase in potency (20). This improved efficacy appears to be due to (i) formation of an intramolecular hydrogen bond with the halobenzylamide oxygen, which stabilizes its planar conformation, and (ii) electronic and/or inductive effects on the aromatic core increasing the metal coordination strength and electrostatic potential over the ring (i.e., stronger π-stacking) (fig. S16 and supplementary note 1). Below the plane, the R1 substituent points toward the bulk solvent, and the positioning of its long chain displaces loosely bound water molecules. Displacement of the solvent should be entropically advantageous. In turn, the location of one of the displaced water molecules closely matches the location of the hydroxyl moiety of 4d, providing additional enthalpic gain. This observation helps explain why the 6-hexanol side chain of 4d imparts this derivative with superior potency against resistant viral variants (sometimes up to ~10-fold) compared with very similar compounds in which the lengths of the side chain are shorter (propanol or pentanol) or longer (octanol) (19, 26). Finally, there are three tightly bound water molecules underneath the DDE motif, reaching toward the backbone of Asn117 and Tyr143 and projecting toward the bulk solvent. These bound water molecules can be exploited for the development of improved compounds.
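To give a sense of scale for the entropic and enthalpic contributions discussed above, a fold change in affinity can be converted to a free-energy difference with the standard relation ΔΔG = RT ln(fold). The sketch below applies it under a strong assumption that is not made in the text: that a potency ratio measured in antiviral assays tracks the ratio of equilibrium dissociation constants. The function name and the default temperature of 298.15 K are choices made for this illustration.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)

def fold_change_to_ddg(fold, temperature_k=298.15):
    """Free-energy difference (kcal/mol) equivalent to a `fold` change in
    binding affinity, using ddG = R * T * ln(fold).

    Assumes the fold change reflects a ratio of dissociation constants,
    which potency measurements in cells need not satisfy.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold)

ddg_tenfold = fold_change_to_ddg(10.0)
```

Under these assumptions, a ~10-fold difference corresponds to roughly 1.4 kcal/mol, which is on the order of a single hydrogen bond or a few displaced water molecules, consistent with the picture that small solvent rearrangements can account for the observed potency differences.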

Fig. 4 Interactions of naphthyridine-based INSTIs and HIV intasomes.

Schematic representation that recapitulates the receptor molecular environment and the water (W) networks with which the naphthyridine scaffold ligands interact when coordinating the Mg2+ ions. The scheme summarizes interactions by their locations with respect to the metal coordination plane of the naphthyridine scaffold (above, in-plane, or below). For clarity, the two water molecules coordinating the Mg2+ ions from above are not shown.

Within the substrate envelope, differences in geometry of the catalytic pockets, their overall volume, and the locations of bound water molecules, among other features, all matter for understanding INSTI interactions. The current work highlights how small changes in the active site modulate drug binding and have implications for drug design. Structures of wild-type and mutant HIV intasomes bound to INSTIs should improve our understanding of resistance mechanisms and lead to the development of better drugs to be used in combination antiretroviral therapy for targeting viral escape mutants.

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S16

Tables S1 and S2

References (32–48)

References and Notes

Acknowledgments: The authors acknowledge B. Anderson at The Scripps Research Institute for help with EM data collection, P. Baldwin at Salk for assistance with the local computational infrastructure, T. Grant at Janelia Research Campus for providing the beam-tilt refinement program, and V. Dandey at the National Resource for Automated Molecular Microscopy (NRAMM) for early work identifying conditions for sample vitrification. Funding: NRAMM is supported by a grant from the National Institute of General Medical Sciences (9 P41 GM103310) from the NIH. Molecular graphics and analyses were performed with the UCSF Chimera package (supported by NIH P41 GM103331). This work was supported by NIH grants R01 AI136680 and R01 AI146017 (to D.L.), R01 GM069832 (to S.F.), and U54 AI150472 (to D.L. and S.F.) and by the Intramural Programs of the National Institute of Diabetes and Digestive Diseases (R.C.), the National Cancer Institute (X.Z.Z., T.R.B., S.J.S., and S.H.H.), and the Intramural AIDS Targeted Antiviral Program (IATAP) of the NIH. Author contributions: D.O.P. collected and processed cryo-EM data. M.L. assembled and purified intasomes and performed biochemical assays. I.K.J., D.O.P., and D.L. built and refined atomic models. X.Z.Z. prepared the INSTIs. R.Y. purified IN. Y.J. assisted with sample vitrification and data collection. S.J.S. determined the effects of mutations in IN on the potency of INSTIs. S.F. and D.S.-M. performed computational calculations and helped with the chemical and structural analysis of the models. S.H.H., T.R.B., R.C., and D.L. supervised experiments. D.L., D.O.P., and M.L. conceived the study. D.L., D.O.P., and I.K.J. wrote the manuscript with help from all authors. Competing interests: X.Z.Z., S.J.S., S.H.H., and T.R.B. are inventors on provisional patent applications U.S. 9,676,771 and U.S. 10,208,035 held by the National Cancer Institute. 
Data and materials availability: The cryo-EM maps and atomic models have been deposited into the Electron Microscopy Data Bank and Protein Data Bank under the following accession codes: CSCAPO (EMD-20481 and 6PUT); CSCBIC (EMD-20483 and 6PUW); CSC4d (EMD-20484 and 6PUY); CSC4f (EMD-20485 and 6PUZ); and CSC4c (EMD-21038 and 6V3K). The inhibitors 4c, 4d, and 4f are available from T.R.B. or S.H.H. under a material transfer agreement with the National Cancer Institute.
