Cryo-EM structures and atomic model of the HIV-1 strand transfer complex intasome


Science  06 Jan 2017:
Vol. 355, Issue 6320, pp. 89-92
DOI: 10.1126/science.aah5163

High-resolution insights into the intasome

An essential step in the life cycle of lentiviruses such as HIV-1 is the integration of viral DNA into the host genome, which establishes a permanent infection of the host cell. The viral integrase enzyme catalyzes this process and is a major drug target. During viral integration, integrase binds the ends of viral DNA, forming a higher-order structure called the intasome. Passos et al. and Ballandras-Colas et al. used cryo–electron microscopy to solve the structures of the intasomes from HIV-1 and maedi-visna virus (ovine lentivirus), respectively. These structures reveal how integrase self-associates to form a functional intasome and help resolve previous conflicting models of intasome assembly.

Science, this issue p. 89, p. 93


Like all retroviruses, HIV-1 irreversibly inserts a viral DNA (vDNA) copy of its RNA genome into host target DNA (tDNA). The intasome, a higher-order nucleoprotein complex composed of viral integrase (IN) and the ends of linear vDNA, mediates integration. Productive integration into host chromatin results in the formation of the strand transfer complex (STC) containing catalytically joined vDNA and tDNA. HIV-1 intasomes have been refractory to high-resolution structural studies. We used a soluble IN fusion protein to facilitate structural studies, through which we present a high-resolution cryo–electron microscopy (cryo-EM) structure of the core tetrameric HIV-1 STC and a higher-order form that adopts carboxyl-terminal domain rearrangements. The distinct STC structures highlight how HIV-1 can use the common retroviral intasome core architecture to accommodate different IN domain modules for assembly.

Catalytic integration of a viral DNA (vDNA) copy of an RNA genome into host target DNA (tDNA) represents the hallmark characteristic of all retroviruses, including HIV-1. Integration establishes a permanent infection in host cells and enables the newly inserted provirus to be replicated and transcribed in parallel with other genes of the host organism (1). This critical step in the HIV-1 replication cycle represents one of the underlying difficulties in combating the HIV/AIDS pandemic. Integration is catalyzed by the viral integrase (IN) protein, which oligomerizes into a higher-order stable synaptic complex (SSC) containing the two vDNA ends. Following cleavage of the GT dinucleotide from both 3′ vDNA ends and nuclear entry, cleaved SSCs engage tDNA and catalyze irreversible DNA strand transfer into host chromatin to form the strand transfer complex (STC) (2). Both retroviral SSCs and the postcatalytic STCs are collectively called intasomes. Clinically exploited HIV-1 IN strand transfer inhibitors (INSTIs) selectively bind cleaved SSCs and interfere with the formation of the STC. Therefore, high-resolution structures of key integration intermediate nucleoprotein complexes are required to further our understanding of the mechanisms of action of INSTIs and the evolution of drug-resistant HIV-1 phenotypes (3). The mechanism of HIV-1 DNA integration has been extensively studied at the biochemical and cellular level, but progress with structural studies of nucleoprotein reaction intermediates has been slow; only structures of domains of HIV-1 IN are currently available (4–8), although intasome structures have been determined for related retroviruses (9–12) and predicted for HIV-1 through homology modeling (13, 14).

Structural studies of HIV-1 intasomes have been challenging, owing to the tendency of HIV-1 IN protein and assembled intasomes to aggregate. Fusion of the DNA binding protein Sso7d to the N terminus of IN results in a protein that is hyperactive in vitro, has markedly improved solubility properties, and retains activity in vivo when incorporated into HIV-1 virions (15). We therefore used Sso7d-IN to assemble HIV-1 intasomes for structural studies. STC intasomes were assembled on branched DNA, mimicking the product of DNA integration (fig. S1A) using the strategy previously described for prototype foamy virus (PFV) (16) and Rous sarcoma virus (RSV) (12) intasomes. HIV-1 intasomes were first purified by Ni-affinity and anion-exchange chromatography (fig. S1B). Analytical ultracentrifugation after anion-exchange chromatography indicated the presence of the tetrameric STC as well as larger discrete species (fig. S2), in agreement with previous studies (17–19). An additional gel-filtration step before cryo–electron microscopy (cryo-EM) structural analysis yielded a preparation that was mainly tetrameric but also included larger species, as evidenced by the broad and asymmetric peak shape (fig. S1C).

Tetrameric HIV-1 STCs are relatively small by cryo-EM standards (~200 kDa) and require high salt and glycerol to prevent aggregation, factors that negatively affect image contrast of individual particles. To overcome these problems, we employed a high-dose imaging strategy and an exposure filter that accounts for the effects of radiation damage while maximizing low-frequency contrast (20). Single-particle classification and refinement of exposure-filtered images produced a density map resolved to ~3.5 to 4.5 Å, with the highest-resolution information characterizing the STC core, in and around the active site (figs. S3 and S4). This enabled derivation of a molecular model of an HIV-1 STC, which contained four IN protomers arranged with twofold symmetry around the product of DNA strand transfer (Fig. 1 and table S1).

Fig. 1 HIV-1 STC intasome structure.

(A) Cryo-EM reconstruction of the STC, segmented by IN protomers (red, green, yellow, and blue) and product DNA components (dark and light gray). (B) Atomic model derived from the cryo-EM density, colored as in (A). (C) Segmented cryo-EM density and (D) asymmetric subunit of the atomic model, colored by IN domain: NTD, green; CCD, beige; NTD-CCD linker, blue; CTD, purple.

The tetrameric HIV-1 STC intasome is a dimer of dimers with a similar overall architecture to PFV intasomes (Fig. 1, A and B). Each protomer contains an N-terminal domain (NTD), a catalytic core domain (CCD), and a C-terminal domain (CTD). The inner protomers wrap their three functionally relevant domains around a pair of vDNA ends and dock onto tDNA, bringing two vDNA 3′-OH groups into proximity to catalyze concerted integration and form the STC. The inner protomers also make most of the contacts with vDNA and tDNA. The outer IN CTD (CTDouter) adopts a retracted configuration in the HIV-1 intasome, contributing partially to vDNA binding and positioning itself in proximity to the inner CTD (CTDinner) (Fig. 1, C and D). The outer NTDs, as well as all Sso7d fusion domains, are disordered in the cryo-EM density.

Retroviral intasomes recognize and cut target sites with a characteristic 4- to 6-bp spacing, generating equivalently sized target-site duplications (TSDs) flanking either end of the integrated proviral DNA (2). To a large extent, the TSD sizes map onto the retroviral phylogenetic tree (21), although the precise TSD spacing can differ within an individual genus (22). In all of the available STC structures, the target DNA is substantially distorted from B form (fig. S5), resulting in a 4-bp TSD for PFV, which has the shortest distance between the active sites, and 5 and 6 bps for HIV-1 and RSV, respectively, which have a longer spacing.
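
The link between the spacing of the two strand-transfer points and the size of the resulting duplication can be illustrated with a short sketch. It is a deliberately simplified sequence-level model, not a description of the structures: it assumes that the two vDNA ends join the target strands at positions staggered by a fixed number of base pairs and that the single-stranded gaps left on either side are filled in afterward, so the staggered window appears on both flanks of the provirus. The function names, the example target sequence, and the placeholder provirus are all hypothetical; only the 4-, 5-, and 6-bp spacings for PFV, HIV-1, and RSV come from the text.

```python
# Simplified model of target-site duplication (TSD) formation.
# Assumption: staggered strand transfer followed by gap fill-in duplicates
# the bases lying between the two joining points.

# TSD sizes quoted in the text for the available STC structures.
TSD_SIZE = {"PFV": 4, "HIV-1": 5, "RSV": 6}


def target_site_duplication(target: str, site: int, stagger: int) -> str:
    """Return the target bases that end up duplicated."""
    if site < 0 or stagger < 0 or site + stagger > len(target):
        raise ValueError("staggered window falls outside the target sequence")
    return target[site:site + stagger]


def integrate(target: str, provirus: str, site: int, stagger: int) -> str:
    """Return the top strand after integration and gap repair.

    target   -- top strand of the target DNA, 5' to 3'
    provirus -- top strand of the inserted viral DNA (placeholder here)
    site     -- index of the first base of the staggered window
    stagger  -- spacing in bp between the two strand-transfer points
    """
    window = target_site_duplication(target, site, stagger)
    upstream = target[:site] + window           # left flank ends with the window
    downstream = window + target[site + stagger:]  # right flank starts with it
    return upstream + provirus + downstream


if __name__ == "__main__":
    target = "GATTACAGGCTTAACG"   # hypothetical target sequence
    provirus = "tgnnnnca"         # hypothetical placeholder for the provirus
    for virus, size in TSD_SIZE.items():
        print(virus, size, integrate(target, provirus, 5, size))
```

In this model the product is always longer than target plus provirus by exactly the stagger, which is why the TSD size reports directly on the spacing between the two active sites.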

The STC model substantiates and rationalizes much of the existing in vitro and in vivo data pertaining to HIV-1 IN residues involved in function, inhibitor binding, and mechanisms of drug resistance. In this regard, Fig. 2 and table S2 present a comprehensive analysis of predicted electrostatic protein-DNA and interdomain interactions within these core components. Notably, the table includes all of the residues that were experimentally shown to affect IN function, but it also provides additional details that were not captured by a homology model (13). Residues that interact with vDNA are distributed throughout the IN protein, whereas residues interacting with tDNA are largely clustered in the CCD (Fig. 2A). Multiple residues are also involved in interdomain interactions (Fig. 2B). Several specific DNA-binding residues (displayed in Fig. 2C and fig. S6) deserve particular mention. A cluster of basic residues—including K46, which was not identified by a homology model, and K156, K159, and K160—is inserted into the vDNA minor groove next to the active site. The relevance of K46 is addressed below, whereas K156, K159, and K160 play various roles in vDNA binding, sequence specificity, and catalysis (13, 23, 24). R231, the only non-CCD residue that strongly interacts with tDNA (but also with vDNA through the other protomer), has previously been shown to affect nucleotide preferences within the target site, although the magnitude of this effect in HIV-1 IN mutants is considerably lower than analogous IN mutations in PFV (25). The weaker interaction between R231 of HIV-1 with target-site nucleotides—as compared with R329 of PFV, which contains a longer loop that can accommodate subtle structural changes—helps to explain the phenotype. Substitutions of HIV-1 IN S119 (25), similarly to analogous changes in RSV and PFV INs (10, 26), alter target-site nucleotide specificity by perturbing interactions with tDNA. 
The model also provides important guidance for rationally improving clinically relevant inhibitors. Specifically, several residues around the vicinity of the active site, especially R231, are positioned differently in HIV-1 compared with PFV, which has been used as a model system to study mechanism of INSTI action (fig. S7). Slight differences in the active site can be exploited to facilitate rational inhibitor design. Collectively, the current model provides a composite platform for both understanding IN function and elucidating modes of action of INSTIs.

Fig. 2 Network of interactions within the HIV-1 STC intasome.

(A) Map of IN residues predicted to be involved in electrostatic protein-DNA interactions within the STC intasome structure. All three domains and the NTD-CCD linker participate in interactions with vDNA (pink), whereas tDNA interactions (in blue) are mostly restricted to the CCD. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr. (B) Predicted interdomain H-bond interactions within the STC. Residues designated below the domain schematic refer to the interacting domain and are colored accordingly. (C) Close-up views of selected regions involved in DNA interactions. For comparing HIV-1 R231 with prototype foamy virus (PFV) R329, the two structures were aligned to tDNA. For all panels, the protein color scheme is as in Fig. 1, C and D.

To gain a more thorough understanding of the heterogeneous STC data and improve regions of density outside of the core subunits, we employed a multistep classification approach that revealed larger species containing flanking IN dimers (fig. S4) positioned in the trans configuration, similarly to RSV and mouse mammary tumor virus (MMTV) (11, 12). We then included the IN-binding domain (IBD) of LEDGF/p75 in the STC preparation, based on the rationale that IBD preferentially binds and stabilizes multimeric IN (27, 28), and performed a cryo-EM reconstruction of IBD-bound STCs (STCIBD). The resulting data contained a larger proportion of higher-order assemblies (fig. S8) but was also affected by substantial compositional heterogeneity; a cryo-EM reconstruction of the largest and best-resolved species clearly revealed 12 IN protomers within the map, with residual density contributed by a fraction of particles (Fig. 3, A and B, and figs. S9 to S11). IN can purify as tetramers from cells (29), and tetrameric INs constitute a portion of the higher-order assemblies (Fig. 3C; see also below). It is therefore likely that the heterogeneous density corresponds to additional IN protomers that may collectively constitute a hexadecamer (or tetramer of tetramers). Possibly, the Sso7d fusion, which improves IN solubility (15), affects the assembly of fully formed higher-order species by mildly disrupting interprotomer associations (fig. S12). The higher-order assemblies utilize many of the principles underlying multimerization of IN protein in the absence of DNA (Fig. 3, D to G, and fig. S13). For example, the isolated tetramer from each asymmetric unit contains positionally conserved CCDs and NTDs that were previously observed within a two-domain HIV-1 NTD-CCD (INNTD-CCD) structure (7) and Maedi-Visna virus INNTD-CCD bound to IBD (Fig. 3D) (28). Individual dimers therein are also consistent with an HIV-2 INNTD-CCD structure (30) (Fig. 3E), whereas two of the CTDs interact in a manner identical to a nuclear magnetic resonance structure of a CTD dimer (Fig. 3F). Finally, the core CTDs adopt a configuration much like the two-domain INCCD-CTD (8) (Fig. 3G). The latter demonstrates an intriguing aspect of IN structure: Whereas tetrameric HIV intasomes adopt a domain configuration much like PFV, higher-order intasomes reorganize their CTDs, utilizing them to form an interprotomer CTD-CTD interface and to engage vDNA but replacing their respective positions with additional CTDs donated by outer IN protomers (Fig. 3H). These alternative domain arrangements preserve the positional integrity of the catalytically competent intasome and demonstrate the structural plasticity of HIV-1 IN.

Fig. 3 HIV-1 STC intasomes form higher-order oligomers through distinct mechanisms of assembly.

(A) Cryo-EM density map of IBD-bound STCs (STCIBD). Densities are segmented either by IN protomers (inner core, light blue; outer core, dark blue) or IN dimers (yellow and green). The IBD is shown in red. (B) Higher-order STC model assembled by rigid-body–docking individual domain components, colored as in (A). The higher-order STC (left) is shown side by side with the tetrameric STC from Fig. 1 (right). (C) Model as in (B), colored by IN tetramers (28). The circled regions contain poorly resolved density that may harbor additional IN dimers. (D to G) Structural comparison of higher-order STCs assembled through rigid-body docking of individual domains with prior multidomain IN structures. The structural components of higher-order STCs are colored as in (A) and (B), whereas the Protein Data Bank (PDB) structures used for comparison are in gray. Comparisons include: (D) MVV INNTD-CCD tetramer (PDB ID: 3HPH, IBD has been omitted for clarity; the circled NTD arises from an IN protomer on the opposite side of vDNA), (E) HIV-2 INNTD-CCD dimer bound to IBD (PDB ID: 3F9K), (F) HIV-1 CTD dimer (PDB ID: 1IHV), and (G) HIV-1 INCCD-CTD dimer (PDB ID: 1EX4). In all panels [(D) to (G)], structural schematics above highlight the corresponding location within visible dodecameric intasome density. (H) Conformational rearrangement within the core CCD-CTD dimer between (left) the tetrameric STC and (center) a higher-order STC, both overlaid on respective filtered experimental EM density. At right, the rearranged higher-order dimer is displayed in the context of additional CTDs and vDNA within the asymmetric unit. The “synaptic” position is required to form the conserved intasome core interface present in all retroviral intasomes.

HIV-1 intasomes assembled at lower protein and DNA concentrations than those used in our cryo-EM study were reported to be tetrameric (17–19). To test the relevance of the higher-order assemblies, we carefully selected IN residues that were predicted to disrupt formation of these species but not the core tetramers. The most obvious candidates resided in the CTD-CTD interface (residues L242, I257, and V259), which are solvent-exposed within tetrameric intasomes. Several other residues, including K14, E35, K240, K244, and R269, were predicted to be more relevant in the context of higher-order oligomers, although we cannot completely exclude their involvement in tetrameric intasomes (fig. S14, A to E). The selected residues were substituted in the context of both Sso7d and WT (NL4-3) INs, and the mutant proteins were assayed for concerted integration activity. Similar results were observed in the presence and absence of the Sso7d fusion protein, suggesting that Sso7d does not alter the nature of functional complexes, although it may influence their relative abundance. The selected mutants, especially those in the CTD-CTD interface, affected strand transfer activity to various extents (fig. S14, F and G). Furthermore, mildly disrupting or deleting many of the residues within the CCD-CTD linker region (amino acids ~206 to 220), which is disordered in tetrameric intasomes (and thus would not be expected to play a major role) but is completely helical in higher-order assemblies, impaired catalytic activity (fig. S14H). In addition, we tested the importance of select residues that have not been examined previously (31–33) for virus replication (fig. S15). IN substitutions adversely affected virus replication, with >10-fold reductions observed for E212→K212 (E212K), K240E, and I257D mutations, and relatively less detrimental effects (twofold) seen for E35K. 
A mutation of K46, a previously unidentified vDNA binding residue, was also included in functional assays but was relevant for all oligomeric species. Whereas the K46A substitution did not detectably affect viral growth (34), the K46E substitution reduced virus replication about fivefold and substantially reduced strand transfer activity in vitro. These data suggest that the higher-order HIV-1 intasomes are functionally relevant for efficient catalysis. However, because point mutations can affect protein structures and/or the intasome assembly pathway in unexpected ways, further systematic studies will be required to delineate the (likely pleiotropic) effects of single-site substitutions. The simplest explanation for the distinct structures is that tetrameric intasomes, containing intact core domains, illustrate the minimal form upon which higher-order complexes are built, although their exact relevance is not clear. They may represent minimally active species or, alternatively, may serve as structural scaffolds for higher-order assembly within the pre-integration complex (PIC) or during PIC nuclear import. Further work, especially in the context of IN dynamics, will be required to unravel the role(s) of the tetrameric and higher-order forms in vivo.

Retroviruses are closely related evolutionarily and would be expected to utilize similar nucleoprotein structures for DNA integration. In this regard, the structure of the PFV intasome (9, 10) presented a conundrum. The length of the linkers between the domains of PFV IN is longer than in most retroviral INs, and many retroviral INs have linkers that are too short to form a tetrameric intasome that is analogous to the PFV structure (11). HIV-1 IN has linker lengths that are intermediate between PFV and MMTV or RSV (fig. S16A). The recent structures of MMTV (11) and RSV (12) intasomes show that these viruses overcome this problem by assembling intasomes with the same set of positionally conserved domains in contact with DNA, but for MMTV and RSV, two of the CTDs are contributed by an additional pair of flanking dimers in an octameric arrangement. Whereas PFV intasomes assemble tetramers, MMTV and RSV intasomes assemble octamers, and HIV-1 intasomes can apparently form a range of oligomeric configurations (fig. S16, B and C). The finding that HIV-1 IN can assemble intasomes in different ways while preserving the spatial arrangement of the key set of domains required for catalysis suggests that the evolutionary jump between retroviruses that assemble tetrameric and higher-order intasomes may not be as great as it initially appears.

The higher-order HIV-1 STC described here is very similar to a hexadecameric Maedi-visna virus intasome assembled using the LEDGF/p75 cofactor in an accompanying paper (35).

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S16

Tables S1 and S2

References (36–56)

References and Notes

  1. Acknowledgments: D.L. acknowledges support from NIH grant P50 GM103368 and the Leona M. and Harry B. Helmsley Charitable Trust grant 2012-PG-MED002. R.C. is supported by the Intramural Program of the National Institute of Diabetes and Digestive Diseases of the NIH and by the Intramural AIDS Targeted Antiviral Program of the Office of the Director of the NIH. These studies were also partly supported by NIH grant R01 AI062520 to M.K. Molecular graphics and analyses were performed with the University of California, San Francisco, Chimera package (supported by NIH grant P41 GM103331). We thank B. Anderson and J.-C. Ducom for help with EM data collection and network infrastructure, F. Dwyer for computational support, G. Lander and M. Herzik for help with ensemble refinements, and A. Engelman and M. Gellert for critical review of the manuscript. The data presented in this manuscript are tabulated in the main paper and in the supplementary materials. The EM maps of STC and STCIBD are deposited into the Electron Microscopy Data Bank under accession codes EMD-8481 and EMD-8483, respectively. The STC model is deposited into the Protein Data Bank under ID 5U1C. The STC model ensemble and the composite model of the higher-order STCIBD oligomers are available upon request.