Structural basis for RNA replication by the hepatitis C virus polymerase

Science  13 Feb 2015:
Vol. 347, Issue 6223, pp. 771-775
DOI: 10.1126/science.1259210

A view of the HCV polymerase at work

More than 3% of the world's population is infected with hepatitis C virus (HCV), a predisposing factor for life-threatening liver diseases such as cirrhosis and cancer. HCV encodes a polymerase called NS5B that catalyzes replication of the viral RNA genome. Drugs inhibiting NS5B have shown impressive antiviral activity in recent clinical trials. Appleby et al. (see the Perspective by Bressanelli) reveal the inner workings of HCV RNA replication by analyzing crystal structures of stalled NS5B polymerase ternary complexes during the initiation and elongation of RNA synthesis. They also define the way in which sofosbuvir, a drug with potent clinical efficacy, interacts with the NS5B active site.

Science, this issue p. 771; see also p. 715


Nucleotide analog inhibitors have shown clinical success in the treatment of hepatitis C virus (HCV) infection, despite an incomplete mechanistic understanding of NS5B, the viral RNA-dependent RNA polymerase. Here we study the details of HCV RNA replication by determining crystal structures of stalled polymerase ternary complexes with enzymes, RNA templates, RNA primers, incoming nucleotides, and catalytic metal ions during both primed initiation and elongation of RNA synthesis. Our analysis revealed that highly conserved active-site residues in NS5B position the primer for in-line attack on the incoming nucleotide. A β loop and a C-terminal membrane–anchoring linker occlude the active-site cavity in the apo state, retract in the primed initiation assembly to enforce replication of the HCV genome from the 3′ terminus, and vacate the active-site cavity during elongation. We investigated the incorporation of nucleotide analog inhibitors, including the clinically active metabolite formed by sofosbuvir, to elucidate key molecular interactions in the active site.

Hepatitis C virus (HCV) is a positive-sense, single-stranded RNA virus of the family Flaviviridae and genus Hepacivirus and is the cause of hepatitis C in humans (1). Long-term infection with HCV can lead to end-stage liver disease, including hepatocellular carcinoma and cirrhosis, making hepatitis C the leading cause of liver transplantation in the United States (2). Direct-acting antiviral drugs were approved in 2011, but they exhibited limited efficacy and had the potential for adverse side effects (3). The catalytic core of the viral replication complex, the NS5B RNA-dependent RNA polymerase (RdRp), supports a staggering rate of viral production, estimated to be 1.3 × 10^12 virions produced per day in each infected patient (4). Because the NS5B polymerase active site is highly conserved, nucleotide analog inhibitors offer advantages over other classes of HCV drugs, including activity across different viral genotypes and a high barrier to the development of resistance (5, 6). The nucleotide prodrug sofosbuvir was recently approved for combination treatment of chronic HCV (7, 8).

One substantial obstacle for the rapid discovery of effective nucleotide-based drugs for HCV was the lack of molecular detail concerning substrate recognition during replication. NS5B contains several noncanonical polymerase elements, including a C-terminal membrane anchoring tail and a thumb domain β-loop insertion (9–11), that are implicated in RNA synthesis initiation (12). To gain insight into the mechanism of HCV RNA replication and its inhibition by nucleotide analog inhibitors, we determined atomic-resolution ternary structures of NS5B in both primed initiation and elongation states.

Because traditional approaches failed to yield ternary complexes (see the supplementary materials), we prepared multiple stalled enzyme-RNA-nucleotide ternary complex structures containing several designed features. First, we used NS5B from the JFH-1 genotype 2a isolate of HCV, which is extraordinarily efficient at RNA synthesis (13). Second, we exploited a conformational stabilization strategy that had been developed for structural analysis of G protein–coupled receptors (14). We hypothesized that a triple resistance NS5B mutant isolated under selective pressure of a guanosine analog inhibitor that exhibits 1.5 times the initiation activity of the wild type (15) might stabilize a specific conformational state along the initiation pathway. Indeed, this triple mutant exhibits a substantial structural rearrangement of the polymerase (15), which is consistent with the structural rearrangement observed in binary complexes of a β-loop deletion mutant bound to primer-template RNA (16). The triple mutant was able to incorporate native and nucleotide analog inhibitors with the RNA samples used in structure determination (fig. S1). The use of nucleotide diphosphate substrates rather than nucleotide triphosphates (fig. S2) generates stalled polymerase complexes in a catalytically relevant conformation. Ternary complexes could be obtained only with Mn2+, which lowers the Michaelis constant (Km) of the initiating nucleotide (17) and increases the activity of NS5B 20-fold relative to Mg2+ (18), and only with a nucleotide/Mn2+/double-stranded RNA ratio of 1.0/0.6/0.2. These approaches designed to stabilize the incoming nucleotide allowed for soaking experiments targeting several distinct assemblies.
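The soaking ratio quoted above can be scaled to a working concentration with a few lines of arithmetic. The sketch below is illustrative only: it assumes the 1.0/0.6/0.2 ratio is molar, and the function name `soak_concentrations` and the 5 mM example value are hypothetical choices, not values taken from the paper or its supplementary materials.

```python
# Relative amounts reported in the text: nucleotide / Mn2+ / double-stranded RNA.
SOAK_RATIO = {"nucleotide": 1.0, "Mn2+": 0.6, "dsRNA": 0.2}


def soak_concentrations(nucleotide_mM):
    """Scale the reported ratio to a chosen nucleotide concentration (mM).

    Assumption: the ratio is molar. The input concentration is supplied
    by the caller and is not a value from the paper.
    """
    if nucleotide_mM <= 0:
        raise ValueError("nucleotide concentration must be positive")
    scale = nucleotide_mM / SOAK_RATIO["nucleotide"]
    return {name: part * scale for name, part in SOAK_RATIO.items()}


if __name__ == "__main__":
    # 5 mM is an illustrative value only.
    for name, conc in soak_concentrations(5.0).items():
        print(f"{name}: {conc:.2f} mM")
```

Keeping the ratio in a single dictionary makes it explicit that Mn2+ and RNA are both substoichiometric relative to the nucleotide, which is the condition the text reports as necessary for obtaining ternary complexes.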

Hepatitis C virus NS5B initiates RNA synthesis by a primer-independent mechanism. Two slow steps in the catalytic pathway have been identified, including the formation of an initial dinucleotide primer and the transition from the dinucleotide-primed state to a rapid, processive elongation state (19, 20). We obtained crystal structures of ternary complexes containing NS5B, two Mn2+ ions, an RNA template, and a dinucleotide primer by soaking nucleic acid and manganese ions into a previously elucidated apo crystal form (15) (see supplementary materials). Stalled ternary structures were obtained either with a 5′-pGG RNA primer and an incoming adenosine diphosphate (ADP) (2.2 Å resolution), cytidine diphosphate (CDP) (2.5 Å), or uridine diphosphate (UDP) (2.0 Å) (Fig. 1) or with a 5′-pCC RNA primer and an incoming UDP (2.15 Å) or guanosine diphosphate (GDP) (2.8 Å) (fig. S3 and tables S1 to S3). These results show that primed assemblies form via both purine and pyrimidine dinucleotide primers, with all possible natural ribonucleotides as incoming substrates.

Fig. 1 NS5B primed initiation assembly.

(A) Overall structure of the stalled NS5B ternary primed initiation complex. The protein is represented by ribbons and colored by subdomain (pink, fingers; light blue, palm; pale green, thumb), with the β loop highlighted in yellow and the position of the last visible residue at the C terminus labeled “C.” The RNA template (5′-UACC; cyan carbons), 5′-pGG dinucleotide primer (magenta carbons), and incoming UDP nucleotide (green carbons) are represented by sticks and colored according to atom type (red, oxygen; blue, nitrogen; orange, phosphate). The two catalytic Mn2+ ions are shown as purple spheres. (B) ϕ6 RdRp de novo initiation assembly, with the priming nucleotide colored by atom type with brown carbons, the incoming nucleotide colored by atom type with magenta carbons, and an active-site loop shown in yellow. (C) Primed initiation assembly of HCV, which is one catalytic step after the de novo initiation assembly. Thus, the dinucleotide primer is colored as in (B) after catalysis, with the next incoming nucleotide colored by atom type with green carbons. The 2|Fo| – |Fc| electron density map is contoured at 1σ and superimposed on the refined ligand and β-loop atoms.

The 5′-untranslated region of the HCV genome contains an internal ribosomal entry site, necessitating replication at the exact 3′ end of the viral genome before copyback of the (–) strand into a (+) strand. The NS5B thumb domain β-loop insertion has been proposed to position the 3′ terminus of the genomic template during initiation (12), yet the β loop and the C-terminal membrane–anchoring linker appear in a conformation too deep within the active site to be compatible with binding to the RNA template and incoming nucleotides (9–11). In the five primed initiation ternary assemblies presented here, the fingertips and thumb domains have undergone substantial rearrangements to accommodate the nucleic acid (fig. S4). Furthermore, the β loop retracted 5 Å relative to several apo HCV polymerase genotype 2a structures (13, 15), providing space for RNA replication initiation (Fig. 1A) (fig. S4). These structures reveal molecular details of a platform for RNA synthesis in which the tip of the β loop now buttresses the end of the short RNA duplex. A similar global arrangement has been observed previously in the de novo initiation assembly of RNA bacteriophage ϕ6 RdRp (21) (Fig. 1B), which illustrates the catalytic event preceding the primed initiation state of HCV (Fig. 1C). These β-loop interactions appear critical for setting the register to ensure that the polymerase initiates transcription at the 3′ end of the viral genome. In general, the β-loop residues exhibit increased temperature factors and contain weaker electron density compared with the apo enzyme, indicating that the β loop starts to become disordered during primed initiation (fig. S5). In addition to retraction of the β loop, the C terminus vacated the active-site cavity now occupied by nucleic acid and appeared disordered beyond residue T552 (22) in the primed assemblies.
These movements generate space to accommodate only two Watson-Crick pairs upstream of the incoming nucleotide, suggesting that further conformational changes are required to accommodate additional phosphotransfer and translocation events. This includes, at a minimum, an opening of the thumb domain via reorientation of the β loop. Thus, these structures demonstrate the polymerase assembly before the second slow step in RNA replication (19).

Mutational analysis of NS5B revealed R386 of the primer grip motif to be important for dinucleotide-initiated RNA synthesis (12), and the primed-state assemblies show that both R386 and R394 of the primer grip helix form salt bridges with the 5′-phosphate of the dinucleotide primer (Fig. 2A). The conserved catalytic residues D220, D318, and D319 coordinate the two catalytic Mn2+ ions, which in turn coordinate the α and β phosphates of the incoming nucleotide. Conserved basic residues R48 and R158 coordinate the α and β phosphates opposite the metal ions. The incoming nucleotide forms a Watson-Crick interaction with the pairing residue of the template strand, which packs against conserved hydrophobic residues I160 and Y162 (Fig. 2B). The 3′-hydroxyl of the dinucleotide primer forms an inner-sphere coordination with a catalytic metal ion in an in-line conformation with the scissile bond of the incoming nucleotide, nearly identical to that observed for the Norwalk virus RdRp (23) (Fig. 2C). Thus, the nucleotide diphosphates exhibit enzymatically competent conformations consistent with the common polymerase mechanism (24) (Fig. 2D).
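The in-line geometry described above can be checked numerically once atomic coordinates are in hand. The following is a minimal sketch, not the authors' analysis: it measures the angle formed by the primer 3′-oxygen, the α phosphorus of the incoming nucleotide, and the oxygen bridging the α and β phosphates (the leaving group), which approaches 180° for in-line attack. The function names, the 150° and 4.0 Å cutoffs, and the example coordinates are all hypothetical and chosen only for illustration.

```python
import math


def angle_deg(a, vertex, c):
    """Return the angle a-vertex-c in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(c, vertex)]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("atoms must not coincide with the vertex")
    cos_theta = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def is_in_line(o3_prime, p_alpha, o_bridge, min_angle=150.0, max_distance=4.0):
    """Heuristic test for in-line attack geometry.

    o3_prime: primer 3'-hydroxyl oxygen (nucleophile)
    p_alpha:  alpha phosphorus of the incoming nucleotide
    o_bridge: oxygen bridging the alpha and beta phosphates (leaving group)
    The cutoffs are illustrative assumptions, not values from the paper.
    """
    close_enough = math.dist(o3_prime, p_alpha) <= max_distance
    aligned = angle_deg(o3_prime, p_alpha, o_bridge) >= min_angle
    return close_enough and aligned


if __name__ == "__main__":
    # Made-up coordinates in angstroms, not taken from any deposited structure.
    o3 = (0.0, 0.0, 0.0)
    pa = (3.5, 0.0, 0.0)
    ob = (5.1, 0.2, 0.0)
    print(f"O3'-P-O angle: {angle_deg(o3, pa, ob):.1f} degrees")
    print("in-line:", is_in_line(o3, pa, ob))
```

To apply this to a real complex, the three coordinates would be read from the corresponding atoms of a deposited structure; the atom naming in those files should be checked rather than assumed.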

Fig. 2 Recognition of the incoming nucleotide.

(A) Stereoscopic view of the NS5B active site during primed initiation. Select protein atoms are represented by sticks and are colored by atom type with gray carbons, except for the β-loop residues, which are highlighted in yellow. RNA bases are labeled according to standard polymerase numbering conventions. Protein-ligand hydrogen bonds are shown as gray dashed lines, whereas base-pair hydrogen bonds are shown as red dashed lines. The proposed path of in-line attack by the 3′-hydroxyl on the α phosphate of the incoming nucleotide is illustrated by a green dashed line. (B) Close-up view of the RNA template binding site. (C) Comparison of the 3′ end of RNA primers, metal ions, and incoming nucleotides from the Norwalk virus polymerase ternary complex containing cytidine triphosphate (CTP) as the incoming nucleotide [PDB ID: 3BSO (23)] and the NS5B ternary complex containing UDP as the incoming nucleotide. The Norwalk structure atoms are colored according to atom type with yellow carbons and gold metal ions, whereas the NS5B ternary complex atoms are colored by atom type with green carbons and purple metal ions. (D) Common chemical mechanism of polymerases (24). (E) Molecular mechanism for recognizing ribonucleotide substrates. Protein atoms of the apo enzyme are colored with gray carbons, whereas protein atoms of the substrate complex are colored with yellow carbons. Dashed lines represent the hydrogen bonding network formed upon binding to an incoming ribonucleotide.

In HIV-1 reverse transcriptase, Y115 provides specificity for deoxynucleotide triphosphates by serving as a steric gate to prevent the binding of ribonucleotide triphosphates (rNTPs) (25), and it was predicted that the structurally homologous residue in HCV, conserved D225, would be involved in recognition of the 2′-hydroxyl of incoming rNTPs (9–11). In the NS5B ternary complexes, the main chain of conserved S282 flips, allowing its side chain to swing out and hydrogen bond with the 2′-hydroxyl of the substrate and the carboxylic acid of D225, which moves away from the nucleotide substrate during binding (Fig. 2E). The 2′-hydroxyl of the incoming ribonucleotide also forms a direct hydrogen bond with the side chain amine of N291 on the opposite face of the ribose ring. This network of hydrogen bonds, together with complementary base-pairing to the template, provides the structural basis for recognition of the correct ribonucleotide substrate.

Crystal structures of a β-loop deletion construct of the HCV NS5B polymerase were solved as apo (2.5 Å resolution) or via soaking (see supplementary materials and methods) with a symmetrical RNA primer-template pair (16) and an incoming UDP (2.8 Å), CDP (2.75 Å), ADP (2.7 Å), or GDP (2.9 Å) (Fig. 3, figs. S6 to S8, and tables S4 to S6). These ternary complexes probably represent the highly processive elongation phase of viral genome replication after the transition from the primed state in the second slow step of polymerization (19). These high-resolution elongation state structures were obtained via soaking into the same crystal form as the triple-mutant structures with the intact β loop but could only be obtained with a construct containing both the triple mutant (15) and the β-loop deletion (16) (see supplementary materials). Overall, there is excellent overlap between the catalytic residues, the 3′ end of the primer, and the incoming nucleotide when comparing the elongation complexes with the primed initiation assemblies, including the same in-line conformation of the 3′-hydroxyl of the primer with the scissile bond. The thumb domain moved away from the palm and fingers domains by an additional 1.5 Å for similar Cα atoms, demonstrating a slightly more relaxed state of the polymerase during elongation. In addition, the C-terminal residues downstream of A534 have evacuated the RNA binding groove and become disordered, preventing overlap with the template strand. Thus, these structures provide further evidence for concerted movements of the β loop, the thumb domain, and the C terminus once RNA elongation begins. Moreover, they provide the structural basis for the hypothesis that these elements provide a “swinging gate” that allows the polymerase to initiate at the terminus of the RNA genome and then transition to a processive elongation state, thereby replicating the complete genome (12).

Fig. 3 NS5B elongation assembly and 2′ modified nucleotide inhibitor recognition.

(A) Overall structure of the stalled NS5B ternary elongation complex. The protein is represented by ribbons and is colored and labeled as in Fig. 1A. The sequence and position of the truncated β loop (16) are highlighted in yellow. The self-complementary RNA is depicted with the template strand in cyan and the primer strand in magenta and numbered according to convention; residues lacking electron density are depicted in gray. The incoming UDP nucleotide is represented in green. The two catalytic Mn2+ ions are shown as purple spheres. (B) Close-up view of the active site. Substrates are colored as in (A), and select protein residues are represented by sticks, colored by beige carbons, and labeled accordingly. The hydrogen bonding network involved in 2′-hydroxyl recognition of the incoming nucleotide is indicated with dashed lines. (C to E) Close-in views comparing the active sites with (C) UDP (gray carbons), (D) 2′-OH/2′-CH3-UDP (yellow carbons), or (E) 2′-F/2′-CH3-UDP (diphosphate metabolite of sofosbuvir; brown carbons). The hydrogen bond networks involved in recognizing the 2′-hydroxyl of ribonucleotide substrates are shown by dashed lines. Binding of 2′-F/2′-CH3-UDP reveals a disruption in the normal hydrogen bonding pattern observed for natural nucleotide substrates and 2′-OH/2′-CH3–containing analogs.

The crystal structures presented here lead us to propose a model of the structural events involved in HCV genome replication (Fig. 4). At the outset, the β loop and the C-terminal membrane–anchoring linker are buried within the encircled active-site cavity. In the first of two slow steps in HCV RNA replication (19), the 3′ end of the viral RNA template and the incoming nucleotides enter the active site, possibly with accompanying conformational changes, and the initial phosphoryl transfer step generates a dinucleotide primer. This de novo initiation step immediately precedes the primed initiation assembly captured here. At this early stage, the complex remains unstable, which may account for the observed large quantity of two- to four-nucleotide-long abortive transcripts (26, 27). As the dinucleotide primer is extended by another one to three nucleotides, the build-up of tension displaces the β loop and the remaining C-terminal residues, further opening the cavity and allowing the RNA duplex to exit during the second slow step in replication (19). With both the β-hairpin loop and the C terminus expelled from the active-site cavity, the polymerase transitions into the highly processive elongation mode also captured here.

Fig. 4 Model of HCV replication by NS5B.

Schematic of representative steps during RNA synthesis by HCV NS5B RdRp. In the apo form, a portion of the RNA binding groove is occluded by both the C terminus (blue line) and the β loop (yellow). During de novo initiation, the RNA template and incoming and priming nucleotides enter the active site. Catalysis results in the formation of an initial dinucleotide primer (first slow step). The next nucleotide is incorporated into the dinucleotide primed initiation assembly. De novo initiation and primed initiation are often referred to collectively as “initiation,” although they are distinct states. Further conformational changes result in movement of the β loop and C terminus out of the RNA binding groove (second slow step), allowing the enzyme to transition into a processive elongation state.

By using an extensive hydrogen bond network to recognize the 2′-hydroxyl of the incoming nucleotide (Fig. 3C), HCV NS5B displays stringent selectivity for ribonucleotides. Consequently, 2′-deoxyribonucleotide chain terminators such as azidothymidine are ineffective against HCV, whereas 2′-modified nucleotides are effective HCV antivirals (6). The nucleotide analog inhibitor sofosbuvir is a 2′-F-2′-C-methyluridine monophosphate prodrug (28–30) and is approved for the treatment of chronic HCV infection (7, 8). Efficacy of chain-terminating nucleotide analogs requires viral RdRps to recognize and successfully incorporate the inhibitors into the growing RNA strand. To gain insight into the molecular details of 2′-modified nucleotide analog recognition, we determined elongation-phase ternary complexes of both 2′-OH/2′-CH3-UDP and 2′-F/2′-CH3-UDP at 2.65 and 2.9 Å resolution, respectively (table S7). The stalled ternary complex with 2′-OH/2′-CH3-UDP as the incoming nucleotide was essentially identical to the UDP-bound elongation assembly, with S282 undergoing the same conformational change to interact directly with the 2′-hydroxyl group and hydrogen bond with D225 (Fig. 3D). Although the addition of the 2′-C-methyl group of the inhibitor places it within 3.1 Å of the S282 Oγ, previous biochemical studies using 2′-C-methyl nucleotides reveal that these analogs are readily incorporated into the growing chain with Km values approaching those of the natural ribonucleotide substrates (31). In contrast, the trapped elongation assembly containing 2′-F/2′-CH3-UDP (i.e., sofosbuvir diphosphate) as the incoming nucleotide reveals that the hydrogen bonding network is disrupted (Fig. 3E). D225 is oriented away from the incoming nucleotide, but S282 remains in the same conformation as in the apo enzyme. The loss of the hydrogen bonding network involving S282 results in a substantially higher Km for 2′-F/2′-CH3–modified nucleotides.
Nevertheless, recognition of the 2′-F by N291 and Watson-Crick pairing with the template allow sofosbuvir to form the in-line conformation necessary for incorporation into the growing chain, thereby promoting nonobligate chain termination. Key contacts formed by S282 with the incoming nucleotide and the surrounding environment give insight into the in vitro selection of a threonine as a potential resistance mutation to some 2′-CH3–modified nucleotides (32, 33), although S282→T282 has been infrequently observed in the clinic (34). In particular, a steric clash between the T282 side chain and the 2′-CH3 would be predicted based on the structure determined with 2′-OH/2′-CH3-UDP.

The results presented here define the structural requirements for HCV genomic replication from primed initiation to elongation and demonstrate the structural basis for inhibitor recognition. The primed initiation state explains much of the known biochemistry behind the slow steps in the enzymatic pathway and highlights the differences between Flaviviridae RdRps and other polymerases. These structural and mechanistic differences have been exploited for the development of HCV nucleotide therapeutics that feature pangenomic activity and a high barrier to the development of drug resistance. Thus, they may provide an avenue for the development of therapeutics against related viruses.

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S8

Tables S1 to S7

References (35–43)

References and Notes

  1. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.
  2. Acknowledgments: We thank D. Smith at the Advanced Photon Source, Argonne National Labs, Life Sciences Collaborative Access Team (LS-CAT), for assistance in data collection, as well as our co-workers from Pharmasset, Beryllium, and Gilead Sciences, who have participated on the HCV polymerase collaboration. Coordinates and structure factors have been deposited with the Protein Data Bank under accession codes 4WT9, 4WTA, 4WTC, 4WTD, 4WTE, 4WTF, 4WTG, 4WTI, 4WTJ, 4WTK, 4WTL, and 4WTM. Expression plasmids are freely available from the authors. Gilead Sciences and the authors (T.E.E. and E.M.) have filed international patent application number PCT/US2013/021130, which relates to crystal structures of HCV polymerase complexes and their methods of use.
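
The accession codes listed above can be turned into coordinate download links for readers who want to inspect the structures. This is a small sketch under one stated assumption: that the RCSB file server continues to serve legacy PDB-format files at `https://files.rcsb.org/download/<code>.pdb`. The helper name `coordinate_url` is hypothetical, and the code only builds URLs; it does not contact the network.

```python
# Accession codes as listed in the acknowledgments of this report.
ACCESSION_CODES = [
    "4WT9", "4WTA", "4WTC", "4WTD", "4WTE", "4WTF",
    "4WTG", "4WTI", "4WTJ", "4WTK", "4WTL", "4WTM",
]


def coordinate_url(code):
    """Build a download URL for one PDB entry.

    Assumption: the RCSB download path pattern used here remains valid.
    """
    code = code.strip().upper()
    if len(code) != 4 or not code.isalnum():
        raise ValueError(f"not a four-character PDB code: {code!r}")
    return f"https://files.rcsb.org/download/{code}.pdb"


if __name__ == "__main__":
    for code in ACCESSION_CODES:
        print(coordinate_url(code))
```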