Report

Structure of the Vesicular Stomatitis Virus Nucleoprotein-RNA Complex


Science  21 Jul 2006:
Vol. 313, Issue 5785, pp. 357-360
DOI: 10.1126/science.1126953

This article has a correction.

Abstract

Vesicular stomatitis virus is a negative-stranded RNA virus. Its nucleoprotein (N) binds the viral genomic RNA and is involved in multiple functions including transcription, replication, and assembly. We have determined a 2.9 angstrom structure of a complex containing 10 molecules of the N protein and 90 bases of RNA. The RNA is tightly sequestered in a cavity at the interface between two lobes of the N protein. This serves to protect the RNA in the absence of polynucleotide synthesis. For the RNA to be accessed, some conformational change in the N protein should be necessary.

The viral genomic RNA (vRNA) of negative-stranded RNA viruses (NSRVs), such as vesicular stomatitis virus (VSV), does not exist as naked RNA but rather as a ribonucleoprotein (RNP) complex in which the RNA is encapsidated by the nucleocapsid (N) protein. The RNP, rather than the naked vRNA, is the active template for transcription and replication (1, 2). NSRVs include some of the most dangerous human pathogens, such as Ebola, rabies, avian influenza, and measles viruses.

VSV is an enveloped, nonsegmented NSRV that belongs to the family Rhabdoviridae. The 11,161-nucleotide (nt) genome of VSV comprises five genes, which are the nucleocapsid (N), the phosphoprotein (P), the matrix (M), the glycoprotein (G), and the large subunit of the polymerase (L) (3). The entire RNP of VSV contains an estimated 1258 molecules of the N protein, each of which is bound to nine bases of RNA (4). Chemical probing studies have indicated that the N protein primarily binds to the ribose-phosphate backbone of the RNA (5). The large polymerase subunit (L) and the phosphoprotein (P) are the two essential viral components in the polymerase (1, 6). Viral transcription and replication by VSV are distinct processes that are defined in part by the level of the N protein in the cell. The N protein is initially in complex with the P protein, which prevents the concentration-dependent aggregation of N. This keeps the N protein in an encapsidation-competent form (7).
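The stoichiometry quoted above can be cross-checked with simple arithmetic. The sketch below is only an illustration, assuming that the whole genome is covered at exactly nine nucleotides per N protein; the function name `n_copies_for` is hypothetical, and the text does not say how the cited estimate of 1258 copies was obtained.

```python
# Figures quoted in the text.
GENOME_LENGTH_NT = 11_161   # length of the VSV genome in nucleotides
NT_PER_N = 9                # nucleotides bound by each N protein
REPORTED_N_COPIES = 1258    # estimate cited in the text (ref. 4)


def n_copies_for(length_nt: int, nt_per_n: int = NT_PER_N) -> float:
    """Number of N protomers needed to cover length_nt nucleotides.

    Assumes uniform coverage with no gaps (an idealization).
    """
    return length_nt / nt_per_n


expected = n_copies_for(GENOME_LENGTH_NT)
difference = REPORTED_N_COPIES - expected

print(f"N copies expected from genome length: {expected:.0f}")
print(f"Difference from reported estimate: {difference:.0f} "
      f"({difference / expected:.1%})")
```

Under this idealization the two figures agree to within about 2%, consistent with nine nucleotides per N protein across the full-length RNP.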

In addition to protection of the vRNA, the N protein complexed to RNA as the RNP has the structural role of forming the helical core of the assembled virion. The RNP in association with the matrix (M) protein condenses the vRNA into a compact bundle that is ready for packaging into the mature virus. Before being packaged into the progeny virion, the RNP exists in several states in the cell, including an undulating ribbon, a loosely coiled helix, and a tightly coiled helix that is usually found at the termini of the nucleocapsids (4, 8–10). The transition from the states observed in the cell to the helix observed in the intact virion is not well understood.

The inherent flexibility of the RNP has made it difficult to determine high-resolution structures by electron microscopy (EM) or x-ray crystallography. Recently, however, low-resolution EM reconstructions of ringlike structures composed of oligomeric N proteins in complex with RNA have been reported for rabies virus (11), influenza virus (12), and VSV (13). These structures showed that distinct N protein molecules are assembled on a single-strand RNA as parallel blocks. The RNA is bound in the center of the N protein ring.

We have previously reported the production of oligomers of a VSV N/P protein complex bound to RNA from an Escherichia coli expression system in which the N protein is concomitantly expressed with the P protein (14). After dissociation of the P protein, we obtained a single oligomeric species of the N protein that has a ringlike morphology and is bound to RNA (14). The size of the N-RNA complex was determined by size-exclusion chromatography and analytical ultracentrifugation analysis to be consistent with 10 copies of N molecules bound to a 90-nt strand of RNA (14). This N protein–RNA complex was crystallized, and diffraction data were collected (15).

The x-ray crystal structure of this VSV RNP-like complex (RLC) confirms the stoichiometry and reveals the details of RNA binding and protection by the N protein. The RNA is tightly bound in a cavity at the interface between two lobes in the N protein with nine nucleotides associated with each N molecule. The RNA adopts a unique conformation in which some of the bases are facing toward and others away from the N protein. The structure of the RLC also shows an extensive network of interactions between neighboring N molecules where each monomer contacts three neighboring N protein molecules. The vRNA may need to be unsequestered from the N protein for transcription and replication.

The crystallographic asymmetric unit is composed of half of the ring (Fig. 1A). The second half is generated by the crystallographic two-fold on which the particle sits. Each monomer contains amino acid residues 2 to 422, a nine-nt strand of RNA, and three uranyl ions. The N protein has two lobes containing mainly α helices, which come together to form a cavity that accommodates the RNA (Fig. 1B). The structure commences with two short antiparallel β strands situated on an arm that extends downward to the core of the N-terminal lobe. The core of the N-terminal lobe consists of seven α helices and four β strands. The positions of the individual secondary-structure elements are described in the SOM Text and fig. S1. The C-terminal lobe begins at residue Ser220 and contains eight α helices. Between helices α12 and α13, a loop (Ser340-Val375) extends to interact with an adjacent molecule in the N protein ring. This extension rests upon the upper surface of the adjacent C-terminal lobe before eventually turning back, after residue Val350, toward its own C-terminal lobe. In three of the five N protein monomers, some residues on the extended loop, ranging from five to eight in each monomer, are disordered.

Fig. 1.

The structure of the RNP complex of VSV. (A) Overall structure of the decamer of the N protein–RNA complex. The decamer has a near 10-fold rotational symmetry; alternating monomers within the ring are colored red and blue. The 90-nt strand RNA is represented by a yellow tube following along the ribose-phosphate backbone. The C-terminal lobe of the boxed monomer has been removed. (B) Ribbon representation of the N protein monomer bound with a 9-nt strand of RNA shown in ball-and-stick model. The N protein has two lobes, N-terminal and C-terminal, colored green and yellow, respectively. The view is from the inside of the ring looking out. Ribbon drawings throughout the manuscript were illustrated with Ribbons (17).

Functionally, the N protein has multiple roles with regard to its interaction with RNA. RNA is a structural element that contributes to the stability of the assembled nucleocapsid. In the N-RNA complex structure, a cavity about 10 Å wide and 20 Å deep is formed at the junction of the two lobes of the N protein to enclose the RNA (Fig. 2A, fig. S2). Of the nine RNA bases that contact the N protein, bases 1 to 4 and base 6 point toward the solvent side of the cavity (Fig. 2, A and B), whereas bases 5, 7, and 8 face the protein interior. The final nucleotide, base 9, extends to join with nucleotide 1 from the adjacent N molecule. The overall structure of the RNA forms two individual quasi-helical structures (bases 1 to 4 and bases 5, 7, and 8) that are split by a bulge in the RNA at base 6. The bases have been modeled as the pyrimidine U. However, many of the bases have a density that is slightly larger, which suggests that the position could accommodate a purine base.
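The nine-nucleotide repeat described above can be written down as a small lookup table. The following is a minimal sketch, not part of the reported analysis: the function `locate` and the `register_start` parameter are hypothetical, and the assumption that the repeat begins at the first nucleotide of the RNA is made only for illustration.

```python
NT_PER_N = 9  # nucleotides bound by each N protein

# Orientation of each position within one 9-nt repeat, as described in the text.
ORIENTATION = {
    1: "solvent", 2: "solvent", 3: "solvent", 4: "solvent",  # first quasi-helix
    5: "interior",                                           # second quasi-helix
    6: "solvent (bulge)",                                    # splits the two quasi-helices
    7: "interior", 8: "interior",                            # second quasi-helix
    9: "linker to next N",                                   # joins nucleotide 1 of the adjacent N
}


def locate(nt_index: int, register_start: int = 1):
    """Map a 1-based nucleotide index to (N protomer, position in repeat, orientation).

    register_start is the nucleotide at which the first repeat begins;
    its default of 1 is an assumption, not a result from the structure.
    """
    offset = nt_index - register_start
    if offset < 0:
        raise ValueError("nucleotide lies before the start of the register")
    protomer = offset // NT_PER_N + 1
    position = offset % NT_PER_N + 1
    return protomer, position, ORIENTATION[position]


# Nucleotide 14 falls on the second N protomer, at position 5 of its repeat.
print(locate(14))
```

For the 90-nt RNA of the decamer, `locate(90)` falls on the tenth protomer at position 9, the nucleotide that links to the adjacent N molecule.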

Fig. 2.

The RNA binding cavity. (A) The side chains of positively charged residues that interact with the phosphate groups in the backbone of the single-stranded RNA are labeled and are shown in ball-and-stick representation. The nine nucleotides that bind to one monomer are shown in the same style, with the pyrimidine bases shaded in cyan. Lobe colors and orientation are as in Fig. 1B. (B) A top view of RNA bases stacking in the two structural motifs. Bases are numbered 1 to 9 [right to left with respect to the view in (A), corresponding to the 5′ to 3′ direction]. Bases 1 to 4 along with Tyr215 from a neighboring N molecule on the 5′ end of the RNA stack to form a structural motif similar to a half duplex of RNA and are facing away from the interior of the protein cavity. Nucleotides 5, 7, and 8 also have a base-stacking arrangement but are rotated to face the inside of the protein. The N-terminal lobe is represented by the surface contour (green), whereas the C-terminal lobe is removed to reveal the RNA content. The image was generated with PyMOL (18).

The interior of the N protein cavity is mainly hydrophobic, reflecting its role to accommodate the RNA bases from the second quasi-helix. However, the region that is occupied by the first quasi-helix displays many positively charged and polar residues that interact with the negatively charged RNA backbone (Fig. 2A). Residues whose side chains are involved in binding to the phosphate groups of the RNA are contributed by both lobes of the N protein. Residues from the N-terminal lobe include Arg143, Arg146, and Lys155, whereas the C-terminal lobe donates residues Lys286, Arg317, and Arg408 (Fig. 2A). A sequence alignment of several members of the family Rhabdoviridae shows that four of the six residues are conserved (Arg143, Arg146, Lys286, and Arg408) and one (Arg317) is only partially conserved (fig. S3). In addition, Arg214 and Arg312, two residues that are found in the RNA binding cavity but that do not make contact with the RNA in VSV, are conserved among all six viruses. Arg312 is involved in stabilizing the N protein–to–N protein interaction. Arg214 is adjacent to the RNA but is bonded to Asp199 by a salt-bridge. Perhaps with some rearrangement of its side-chain rotamer, Arg214 could be involved in RNA binding. Tyr215, which stacks against nucleotide 1 of the RNA to extend the first quasi-helix (Fig. 2B), is conservatively substituted in five of the six viruses as an aromatic residue, Phe or Trp. Each of these residues could stack in a similar way to what is observed in the structure.

The assembled RNP is composed of more than 1200 copies of the N protein bound to the genomic RNA. In this polymeric state, the monomeric N protein must interact with adjacent N protein molecules to maintain the protective stability of the nucleocapsid. The most substantial interaction (1954 Å²) occurs at the side-to-side interface between the neighboring N protein molecules with three-fourths of the contact between the C-terminal lobes. This interface is primarily hydrophobic, but with a number of electrostatic and polar interactions. Between the neighboring C-terminal lobes, the side chains of Tyr324 and Tyr415 form a hydrophobic patch that interacts with the hydrophobic portion of the side chain of Arg309 from the neighboring C-terminal lobe. The positive charge of the Arg309 side chain is neutralized by the side chain of Glu419 so that both side chains are buried in the hydrophobic interface. Between the N-terminal lobes, the hydrophobic interaction is mainly between a patch consisting of the side chain of Met166 and part of Lys207, and a shallow pocket formed by the side chain of Val184, part of Asn187, and part of Asp188 from the neighboring N-terminal lobe. There are three additional contacts made by the regions extended from the N monomer core that are clearly discernible in the RLC structure spanning four neighboring N molecules (labeled I, II, and III in Fig. 3). Each monomer has an extended N-terminal arm that interacts with the C-terminal lobe of the monomer to the left when viewed from the outside of the ring (I). The C-terminal lobe has an extended loop that interacts with the C-terminal lobe of the molecule to the right (II). Contact III is between the N-terminal arm of molecule 1 and the extended loop of molecule 2′ (Fig. 3). The network of contacts among the four molecules would be repetitive throughout the N-RNA complex.

Such complexity implies that the orderly assembly of the N protein on the nascent RNA during replication requires the N protein to be delivered and correctly oriented in the replication site. New N molecules that are added to the growing RNP that contains the nascent vRNA must be kept in an RNA binding–competent conformation by chaperone factors such as the viral P protein before latching onto N molecules previously bound to the vRNA. Upon assembling on the nascent chain of newly synthesized RNA during replication, the N protein is tightly wrapped around the vRNA. The hinge between the two lobes of the N protein observed in the crystal structure may provide the necessary flexibility to allow the N protein to adopt alternative conformational states at different stages of RNP assembly. The sequestering of the RNA within the RLC structure is consistent with data showing that RNA bound to the N protein is resistant to ribonuclease as well as base-catalyzed hydrolysis (5, 14, 16).

Fig. 3.

Association of the N protein. Four N molecules of the N-RNA ring (colored yellow, green, red, and blue) are shown from the outside-in view. The C terminus is distal to the view and is not visible from this orientation. A single N protein monomer has unique contacts, labeled I, II, and III, with three other monomers within the ring in addition to the side-by-side contacts between the neighboring molecules. Each monomer has an N-terminal and C-terminal lobe (top and bottom, respectively). The molecule in red is designated as molecule 1, and the N molecules along the direction of 5′→3′ of the vRNA in the RNP are sequentially labeled 2. The N molecules on the opposite side are labeled 1′, 2′, accordingly. The extended arm at the N terminus of the N-terminal lobe in molecule 1 makes contacts with the C-terminal lobe of molecule 1′ at contact I. The C-terminal lobe has an extended loop that interacts with the C-terminal lobe of the molecule to the right (Contact II). Contact III is between the N-terminal arm of molecule 1 and the extended loop of molecule 2′. RNA is not displayed in this illustration.

The radius (80 Å) of the decameric N-RNA ring is substantially smaller than that (245 Å) of the RNP superhelix in the virus, as determined by EM (10). The N protein–vRNA complex in the RNP may be required to have considerable flexibility in order to act like a coil being wound up in the superhelical core. The hydrophobic interfaces between neighboring N molecules could tolerate rigid body rotations that may expand the RLC. Nucleotides 1 to 8 should be associated with the N protein as a rigid body during the structural changes, on the basis of their tight binding with the N protein. Nucleotide 9 and the hinge between the two lobes of the N protein may be the points of notable rotation. We estimate that about 38.5 maneuvered N-RNA complex units are required to complete one round of the RNP structure with a radius of 245 Å.
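To give a sense of how much rotation between neighbors the expansion would involve, the per-subunit angular step can be compared for the two assemblies. This is a rough sketch under a simplifying assumption that is not taken from the text: one turn of the superhelix is treated as a flat closed circle, ignoring helical pitch. It does not reproduce how the 38.5-unit estimate was derived.

```python
RING_UNITS = 10              # N-RNA units in the crystallized decamer
HELIX_UNITS_PER_TURN = 38.5  # estimate quoted in the text for the RNP superhelix


def angular_step_deg(units_per_turn: float) -> float:
    """Rotation about the central axis from one unit to the next, in degrees.

    Treats one turn as a closed circle (assumption: helical pitch ignored).
    """
    return 360.0 / units_per_turn


ring_step = angular_step_deg(RING_UNITS)
helix_step = angular_step_deg(HELIX_UNITS_PER_TURN)
opening_per_interface = ring_step - helix_step

print(f"Step in decamer ring:  {ring_step:.1f} deg per unit")
print(f"Step in superhelix:    {helix_step:.1f} deg per unit")
print(f"Change per interface:  {opening_per_interface:.1f} deg")
```

On these assumptions each N-N interface would need to open by roughly 27 degrees relative to the ring, which gives a scale for the rigid-body rotations discussed above.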

The sequence of the vRNA in the RNP must be read by the polymerase complex during RNA synthesis. There are three possible mechanisms for copying the RNA sequence: (i) The vRNA is completely exposed in the RNP, so Watson-Crick base-pairing can occur without any change of the RNP structure. (ii) The vRNA is completely dissociated from the RNP, so it serves as a template like a naked RNA molecule. (iii) A pronounced structural change occurs in the RNP to allow the sequence of the vRNA to be read by the polymerase complex without disrupting the integrity of the RNP. The conformational arrangement of the RNA in the N protein as revealed by our structure suggests that Watson-Crick base-pairing could not occur when the N protein is closed on the vRNA. Bases in positions 5, 7, and 8 are completely shielded by the N protein such that their backbone conformation is held rigidly by the N protein, thus preventing the formation of an RNA duplex. The second possibility is also unlikely because the RNP remains intact after one round of RNA synthesis and may be used as the template again. Thus, it is likely that the vRNA is temporarily dissociated from the N protein within the active polymerase complex.

Supporting Online Material

www.sciencemag.org/cgi/content/full/1126953/DC1

Materials and Methods

SOM Text

Figs. S1 to S3

Table S1

References and Notes