A supramolecular assembly mediates lentiviral DNA integration


Science  06 Jan 2017:
Vol. 355, Issue 6320, pp. 93-95
DOI: 10.1126/science.aah7002

High-resolution insights into the intasome

An essential step in the life cycle of lentiviruses such as HIV-1 is the integration of viral DNA into the host genome, which establishes a permanent infection of the host cell. The viral integrase enzyme catalyzes this process and is a major drug target. During viral integration, integrase binds the ends of viral DNA, forming a higher-order structure called the intasome. Passos et al. and Ballandras-Colas et al. used cryo–electron microscopy to solve the structures of the intasomes from HIV-1 and maedi-visna virus (ovine lentivirus), respectively. These structures reveal how integrase self-associates to form a functional intasome and help resolve previous conflicting models of intasome assembly.

Science, this issue p. 89, p. 93


Retroviral integrase (IN) functions within the intasome nucleoprotein complex to catalyze insertion of viral DNA into cellular chromatin. Using cryo–electron microscopy, we now visualize the functional maedi-visna lentivirus intasome at 4.9 angstrom resolution. The intasome comprises a homo-hexadecamer of IN with a tetramer-of-tetramers architecture featuring eight structurally distinct types of IN protomers supporting two catalytically competent subunits. The conserved intasomal core, previously observed in simpler retroviral systems, is formed between two IN tetramers, with a pair of C-terminal domains from flanking tetramers completing the synaptic interface. Our results explain how HIV-1 IN, which self-associates into higher-order multimers, can form a functional intasome, reconcile the bulk of early HIV-1 IN biochemical and structural data, and provide a lentiviral platform for design of HIV-1 IN inhibitors.

Integrase (IN) acts on the ends of the linear double-stranded viral DNA (vDNA) molecule produced by reverse transcription of the retroviral RNA genome. Initially, IN catalyzes 3′ processing to expose 3′-hydroxyl groups attached to invariant CA dinucleotides at the vDNA ends. After entry into the nuclear compartment, IN inserts the processed vDNA 3′ termini across the major groove of chromosomal target DNA by using the 3′-hydroxyls as nucleophiles in the strand transfer reaction. These events take place within the intasome, a stable synaptic complex comprising a multimer of IN assembled on vDNA ends (1). Characterization of prototype foamy virus (PFV, belonging to the spumavirus genus); Rous sarcoma virus (RSV, an α-retrovirus); and mouse mammary tumor virus (MMTV, a β-retrovirus) intasomes illuminated the conserved intasome core (CIC) structure composed of, minimally, a pair of IN dimers, as in the case of the PFV intasome (2, 3), or decorated by flanking IN dimers in RSV (4) and MMTV (5). The architecture of the intasome from lentiviruses, the genus that includes HIV-1 and HIV-2 along with highly pathogenic animal viruses, has remained elusive.

Unfavorable biochemical properties of HIV-1 IN necessitate the use of hyperactive and/or solubilizing mutations (6–8), which, by their nature, dramatically change the properties of the protein. Taking a more holistic approach, we sought to identify a lentiviral IN that is amenable to structural studies as a wild-type protein. We discovered that the IN from maedi-visna virus (MVV), an ovine lentivirus, displays robust strand-transfer activity when supplied with oligonucleotides mimicking the vDNA ends in the presence of the common lentiviral integration host factor LEDGF (9, 10) (fig. S1). MVV IN assembled into a functional nucleoprotein complex that could be isolated by size-exclusion chromatography (fig. S2A). In the presence of the essential Mg²⁺ cofactor, the purified nucleoprotein complex catalyzed strand transfer and could be inhibited by the HIV-1 IN strand-transfer inhibitor (INSTI) dolutegravir (11) (fig. S2B). Sequence analysis of reaction products ascertained that they were formed by full-site integration (coordinated insertion of pairs of vDNA ends across the major groove in target DNA), leading to short duplications of target DNA sequences (fig. S2C). To confirm that the most commonly observed duplication size, 6 base pairs (bp), is representative of MVV integration, we sequenced 2526 unique integration sites in primary sheep cells infected with pathogenic MVV and compared them with in vitro integration sites obtained with purified intasomes and deproteinized sheep or bacterial plasmid DNA. Aligning the three sets of integration site sequences revealed symmetric and highly similar sequence preferences that are fully consistent with integration of vDNA ends across 6 bp in target DNA (fig. S3). As expected for a lentivirus (9), MVV displayed a strong preference for transcription units, with 70.2% of integration sites found within predicted sheep genes, compared with 43.7% in the in vitro generated sample (P < 10⁻¹⁵⁰).
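The link between staggered strand transfer and target-site duplication can be illustrated with a minimal sketch. The function below is hypothetical and is not part of the study's analysis: it treats the target as a plain string, assumes the two vDNA ends join opposite strands at positions offset by the stagger, and assumes that host gap repair copies the intervening bases on both flanks. The target sequence and the `[vDNA]` placeholder are invented for illustration.

```python
def integrate(target: str, provirus: str, site: int, stagger: int = 6) -> str:
    """Return the top-strand sequence after full-site integration and gap repair.

    Illustrative model only: the two vDNA 3' ends join opposite target strands
    at positions `stagger` bp apart, so the bases between the two joins are
    single-stranded on each flank and are filled in by repair. The result is a
    direct repeat of length `stagger` on either side of the inserted DNA.
    """
    if not 0 <= site <= len(target) - stagger:
        raise ValueError("integration site must leave room for the stagger")
    duplicated = target[site:site + stagger]   # bases between the two joins
    upstream = target[:site]
    downstream = target[site + stagger:]
    return upstream + duplicated + provirus + duplicated + downstream


target = "AAACCCGGGTTTACGTACGT"   # made-up target DNA
product = integrate(target, "[vDNA]", site=6)
print(product)
```

In this toy model the six bases between the two joining positions appear on both sides of the insert, which is the signature that the sequenced reaction products and integration sites are described as showing.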

Inspection of the intasome by negative-stain electron microscopy (EM) revealed a planar, two-fold symmetric assembly measuring over 20 nm in the widest dimension (fig. S4), which is much larger than any of the previously characterized retroviral intasomes. To determine its structure, we acquired images of single particles in vitreous ice using a transmission electron microscope equipped with a direct detector. The final structure was refined by using a data set of 94,283 single particles to an overall resolution of 4.94 Å, with local resolution varying from 9 Å in the periphery of the structure to ~4 Å throughout the core region (Fig. 1 and figs. S5 to S7). A crystal structure spanning the N-terminal and the catalytic core domains (NTD and CCD) of MVV IN is available (12). In addition, we determined two crystal structures spanning the MVV IN C-terminal domain (CTD, table S1). Sixteen MVV IN subunits and two double-stranded DNA oligonucleotides representing the synapsed vDNA ends could be unambiguously placed in the electron density map (Fig. 1A, fig. S8, and movie S1), consistent with the observed molecular mass of ~0.5 MDa for the complex (fig. S2D).
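The consistency between the fitted stoichiometry and the measured mass can be checked with back-of-the-envelope arithmetic. The sketch below takes only the subunit and DNA counts from the text; the monomer mass, the oligonucleotide length, and the per-base-pair mass are assumptions chosen for illustration and are not values reported here.

```python
# Counts taken from the text.
IN_SUBUNITS = 16
VDNA_DUPLEXES = 2

# Assumptions for illustration only (not stated in the text).
IN_MONOMER_KDA = 32.0   # approximate mass of one IN monomer
DUPLEX_BP = 30          # hypothetical length of each vDNA oligonucleotide duplex
KDA_PER_BP = 0.65       # rule-of-thumb average mass of one DNA base pair


def intasome_mass_kda(n_in: int = IN_SUBUNITS, n_dna: int = VDNA_DUPLEXES) -> float:
    """Sum the protein and DNA contributions to the complex mass in kDa."""
    protein = n_in * IN_MONOMER_KDA
    dna = n_dna * DUPLEX_BP * KDA_PER_BP
    return protein + dna


mass = intasome_mass_kda()
print(f"16 IN + 2 vDNA: {mass / 1000:.2f} MDa")
print(f" 8 IN + 2 vDNA: {intasome_mass_kda(n_in=8) / 1000:.2f} MDa")
```

Under these assumptions a hexadecamer lands near 0.5 MDa, whereas an octamer would come out at roughly 0.3 MDa, so the measured mass is easier to reconcile with sixteen subunits than with eight.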

Fig. 1 Cryo-EM reconstruction of the MVV intasome.

(A) Fitted intasome model color-coded to highlight IN subunits including 12 NTDs, 16 CCDs, and 14 CTDs. Molecules of vDNA in dark gray are surrounded by core tetramers I and II (colored in green, light green, sky blue, and blue), and flanking tetramers III and IV (red, yellow, pink, and purple). (B and C) Views of the map in two alternative orientations. The CIC structure is highlighted with a black outline in (B).

The intasome represents a tetramer of tetramers, each a pair of imperfectly symmetric IN dimers with CCD-CTD linkers in extended α-helical configurations that are similar to the HIV-1 IN dimer observed in crystals (7) (figs. S9 and S10A). Although intasome formation required the presence of LEDGF, only traces of the host factor remained after purification by two-stage chromatography (fig. S2A). Consequently, no density could be attributed to LEDGF in the structure. It is possible that the remaining LEDGF molecules are distributed over 16 possible binding positions on the intasome.

The CIC, analogous to those found in other retroviral systems (2, 4, 5), is located at the center of the assembly (Figs. 1B and 2 and fig. S7C), and each of the four MVV IN tetramers is involved in its formation: Core tetramers I and II contribute a CCD dimer each, and flanking tetramers III and IV provide a pair of synaptic CTDs that join the halves of the CIC structure. About 20 bp of each vDNA end are well defined in the electron density. The vDNA ends pass through the CIC structure approaching each other at an angle of 60°, with their terminal base pairs separated by the IN CCD α4 helix (fig. S10C). Each recessed 3′ vDNA end is placed in the active site of a catalytic IN subunit (chains A and I), and the complementary nontransferred strand is threaded between the CCD and the synaptic CTD (fig. S10C). The catalytic subunits intertwine by exchanging a pair of NTDs, with CCD-NTD linkers crossing the synaptic interface and contacting vDNA minor grooves (Fig. 2, fig. S10B, and movie S1).

Fig. 2 CIC structure in MVV and previously characterized synaptic complexes.

The CIC in each structure is shown in color with the remainder in gray; yellow CTDs indicate domains donated by flanking IN subunits.

A layer of CTDs bridges the flanking and core tetramers of the intasome (fig. S8). Four pairs of CTDs belonging to the inner- and outermost IN chains of each lobe stack to form dimers nearly identical to those observed in MVV IN crystals and formed by the isolated HIV-1 CTD in solution (13) (fig. S11). The β1-β2 loops of the CTD dimers from tetramers I and II insert into minor grooves of vDNA (fig. S10D) close to the end engaged by the active site of the opposing main tetramer (Fig. 1A). The interactions made across the synaptic interface likely ensure that a stable intasome forms only when the enzyme engages both vDNA ends. The NTDs from the inner and one of the flanking IN chains interact with the vDNA backbone (fig. S10D), forming an interface previously observed in crystals of the HIV-1 IN NTD-CCD construct (14) (fig. S12).

Because of the role of DNA in synaptic interface formation, retroviral INs and closely related DNA transposases tend to assemble into functional multimers only in the presence of their DNA substrates (2, 4, 5, 15, 16). The intasome would thus be expected to contain a multiple of the minimal multimeric species found in solution, and this conjecture holds true for characterized intasomes containing tetramers of monomers (PFV) or dimers (RSV and MMTV) (2, 4, 5). HIV-1 IN forms higher-order multimers in the absence of vDNA (12, 17–19), and cross-linked HIV-1 IN tetramers are functional in vitro (20). Similarly, MVV IN also forms tetramers and higher-order multimers in solution (fig. S2E). Thus, our structure explains how lentiviral INs, which are highly prone to self-associate, combine into the CIC structure. In light of the remarkable differences between intasome structures, it would be of interest to compare quantitative proteomes of retroviral genera, although the number of IN molecules carried by the virus is unlikely to be limiting (21, 22).

The structural basis for α- and β-retroviral intasomes to comprise more than the minimal IN dimer-of-dimers architecture is relatively short IN CCD-CTD linkers (4, 5), which prevent the CTDs of the core subunits from inserting into the synaptic interface. In HIV-1 and MVV IN, the CCD-CTD linkers assume α-helical conformations (7) (figs. S9B and S11B), which likewise make it impossible for core tetramer subunits to provide the synaptic CTDs. Although the linker region is the least conserved among lentiviral INs, it is invariably predicted to form an extended α helix (fig. S9C); this argues for conservation of the higher-order state of IN within lentiviral intasomes. The high stoichiometry of IN within the lentiviral intasome may help explain the notoriously pleiotropic phenotypes of HIV-1 IN mutant viruses (23). Because the two-fold symmetric assembly contains eight structurally distinct IN subunits, each IN residue could have as many as eight distinct functions. The CTD plays the most functionally diverse roles within the intasome, contributing to intra- and intertetramer interactions, as well as DNA binding.
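The subunit bookkeeping behind these statements can be written out as simple arithmetic. The sketch below uses only numbers given in the text and in the Fig. 1 legend (four tetramers, two-fold symmetry, and 12 NTDs, 16 CCDs, and 14 CTDs in the fitted model). The variable names are illustrative, and reading the missing domains as unmodeled rather than absent is an assumption.

```python
TETRAMERS = 4
SUBUNITS_PER_TETRAMER = 4
SYMMETRY_ORDER = 2  # the assembly is two-fold symmetric

subunits = TETRAMERS * SUBUNITS_PER_TETRAMER           # total IN chains
distinct_environments = subunits // SYMMETRY_ORDER     # symmetry-unique chains

# Domains placed in the fitted model, per the Fig. 1 legend.
modeled = {"NTD": 12, "CCD": 16, "CTD": 14}

# Each chain has one domain of each type, so any shortfall relative to the
# chain count is a domain that was not placed in the map (assumed unmodeled).
unmodeled = {domain: subunits - count for domain, count in modeled.items()}

print(subunits, distinct_environments, unmodeled)
```

Sixteen chains related by a two-fold axis give eight symmetry-unique environments, which is where the upper bound of eight distinct roles per residue comes from.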

To visualize how the lentiviral intasome engages target DNA, we determined a cryo-EM structure of the MVV strand transfer complex to 8.6 Å resolution (Fig. 3A and fig. S13). In agreement with the analogous PFV and RSV structures (3, 4), target DNA binds between the halves of the CIC structure. The synaptic CTDs insert their β1-β2 loops into expanded major grooves, which contributes to target DNA bending (Fig. 3B). Inspection of the surface potential distribution on the target DNA side of the complex highlighted several patches of positive charge, each corresponding to the cleft at the IN CCD dimerization interface (6) (Fig. 3B), which was recently implicated in noncatalytic interactions with nucleosomal DNA in the PFV system (24). Lentiviral integration is exquisitely selective toward highly active and gene-rich genomic loci, a property that is explained, at least in part, by the direct interaction between IN and chromatin-associated LEDGF (9). The MVV intasome structure is compatible with binding as many as 16 molecules of the host factor (fig. S14). The ability to form such supermultivalent interactions may help the viral integration machinery locate chromatin highly enriched in LEDGF and possibly other marks associated with transcriptional activity. The MVV intasome system described here should be applicable to studies of HIV-1 INSTIs (fig. S2B). Moreover, the complexity of the lentiviral intasome, presenting multiple IN-IN interfaces, may be exploitable in anti-HIV/AIDS drug development. A highly similar structure, assembled with a hyperactive construct of HIV-1 IN albeit missing some of the flanking tetramer subunits, is described by Passos et al. (25) in this issue.

Fig. 3 Target DNA binding and surface electrostatic potential distribution.

(A) Cryo-EM reconstruction of the MVV strand transfer complex at 8.6 Å resolution viewed in two orientations. Protein, vDNA, and target DNA are shown in white, dark gray, and maroon, respectively. (B) Pseudo-atomic model of the strand transfer complex with the protein portion of the structure in space-fill mode and colored by charge.

Supplementary Materials

Materials and Methods

Figs. S1 to S14

Tables S1 and S2

Movie S1

References (26–50)

References and Notes

  1. Acknowledgments: We thank G. Schoehn for help with preliminary cryo-EM; the staff of the Diamond beamlines I04 and I04-1 for assistance with x-ray data collection; P. Afonine for advice on real-space refinement; L. Collinson, R. Carzaniga, T. Pape, P. Walker, and A. Purkiss for EM, x-ray crystallography, and software support; L. Heck for expert assistance with the computer cluster; and G. Maertens for comments on the manuscript. Data presented in this manuscript are tabulated in the main paper and in the supplementary materials. The cryo-EM maps, pseudo-atomic models, and x-ray structures were deposited with the Protein Data Bank and the EM Data Bank with accessions EMD-4138, EMD-4139, PDB-5M0Q, PDB-5M0R, PDB-5LLJ, and PDB-5T3A; the integration site sequencing data were deposited with the National Center for Biotechnology Information, NIH, GEO repository under accession GSE87786. This work was supported by NIH grants GM082251 and AI070042 (P.C. and A.N.E.), The Francis Crick Institute (P.C. and A.C.), The Wellcome Trust (A.K.), and Icelandic Research Fund (V.A. and S.R.J.). We acknowledge the use of the Durham University DiRAC Data Centric system, which is supported by capital grants ST/K00042X/1, ST/K00087X/1, and ST/K003267/1 from the U.K. Department for Business, Innovation, and Skills. The Division of Structural Biology Particle Imaging Center EM Facility at University of Oxford was founded by The Wellcome Trust Joint Infrastructure Fund (JIF) award 060208/Z/00/Z and equipment grant 093305/Z/10/Z.