The piRNA targeting rules and the resistance to piRNA silencing in endogenous genes

Science  02 Feb 2018:
Vol. 359, Issue 6375, pp. 587-592
DOI: 10.1126/science.aao2840

Self-defense by avoiding self-targeting

By silencing transposons, Piwi-interacting RNAs (piRNAs) protect the stability of animal genomes in germ lines. However, many piRNAs do not map to transposons, and their functions have remained undefined. Zhang et al. described the piRNA targeting logic in Caenorhabditis elegans and identified an intrinsic sequence signal in endogenous germline genes that confers resistance to piRNA silencing. Thus, diverse piRNAs silence foreign nucleic acids but spare self genes to defend the C. elegans genome. In addition, multiple foreign transgenes can be engineered to escape piRNA targeting, allowing successful expression in the germline.

Science, this issue p. 587


Piwi-interacting RNAs (piRNAs) silence transposons to safeguard genome integrity in animals. However, the functions of the many piRNAs that do not map to transposons remain unknown. Here, we show that piRNA targeting in Caenorhabditis elegans can tolerate a few mismatches but prefers perfect pairing at the seed region. The broad targeting capacity of piRNAs underlies the germline silencing of transgenes in C. elegans. Transgenes engineered to avoid piRNA recognition are stably expressed. Many endogenous germline-expressed genes also contain predicted piRNA targeting sites, and periodic An/Tn clusters (PATCs) are an intrinsic signal that provides resistance to piRNA silencing. Together, our study reveals the piRNA targeting rules and highlights a distinct strategy that C. elegans uses to distinguish endogenous from foreign nucleic acids.

P-element induced wimpy testis (Piwi) proteins and their associated Piwi-interacting RNAs (piRNAs) guard animal genomes by silencing transposons (1–5). However, many animals produce piRNAs that do not match transposon sequences. For example, the vast majority of the 15,000 piRNAs encoded by the Caenorhabditis elegans genome do not exhibit extensive complementarity to transposons (3, 4, 6). In mice, tens of thousands of distinct piRNAs produced at the pachytene stage during spermatogenesis do not map to transposons (7). These observations suggest additional targets and functions of piRNAs.

Identification of piRNA targets and the piRNA targeting rules has proven to be rather difficult. Cross-linking immunoprecipitation (CLIP) analyses of Piwi proteins suggest that they associate with diverse mRNAs (8–10). However, because diverse piRNAs engage with many mRNAs, it is difficult to infer the target of a given piRNA from these CLIP analyses. Therefore, additional approaches are required to identify piRNA sites in vivo. In some cases, targets of piRNAs can be inferred if the mRNA target is cleaved by Piwi (11, 12). However, these cleaved mRNAs likely represent only a fraction of piRNA targets in vivo because the slicer activity of Piwi is dispensable for silencing in some animals, including C. elegans (13–16). Because only a few piRNA targets other than transposons have been identified, the piRNA targeting rules remain undefined, and both sequence-specific and sequence-nonspecific functions of the Piwi/piRNA complex have been proposed (8–10, 12, 17, 18).

To gain insight into the piRNA targeting mechanism, we identified the targets of a single piRNA and examined how the piRNA recognizes its targets. In C. elegans, piRNA targeting leads to the recruitment of RNA-dependent RNA polymerases (RdRPs) that produce secondary small RNAs named 22G-RNAs (fig. S1A) (3, 13, 15). These 22G-RNAs are loaded onto worm-specific Argonautes (WAGOs) to induce gene silencing (19–21). Because these 22G-RNAs are produced around the targeting site, the 22G-RNAs can serve as a “signature” for piRNA targeting sites in vivo (13, 15). Therefore, we identified the targets of a piRNA by examining the 22G-RNA species gained in animals expressing a synthetic piRNA or the 22G-RNA species lost in animals carrying a deletion of a specific piRNA (fig. S1B). We obtained animals expressing a synthetic piRNA or losing an endogenous piRNA through a CRISPR/Cas9–based genome-editing strategy that modified the locus of an endogenous piRNA (fig. S1B). Small-RNA sequencing confirmed the expression or loss of specific piRNAs in these animals (fig. S1, C to F) and was used to identify changes in 22G-RNA levels. Together, we identified six RNA targets in the animals producing the synthetic piRNAs and 11 RNA targets in animals lacking the endogenous piRNAs (Fig. 1, A to C, and table S1). We noticed that a region of the piRNAs, from the second to seventh nucleotide, pairs well to the identified targeting sites (Fig. 1D). This implies a critical role for the pairing of this region in piRNA targeting, which we define as the piRNA seed. The piRNA seed is reminiscent of the microRNA (miRNA) seed, which is essential for miRNA target recognition (22). In addition, we observed apparent pairing outside of the piRNA seed region (Fig. 1D and fig. S1G). These observations suggest that base pairing outside of the seed region also contributes to piRNA target recognition, but a few mismatches can be tolerated. Furthermore, we noticed that GU wobble pairs are overrepresented, relative to other non–Watson-Crick pairs, in these targeting events (table S1). Last, the first nucleotide does not appear to contribute to piRNA targeting (Fig. 1D).
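
The position-by-position pairing analysis described above can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the function names (`pairing_states`, `percent_paired`) and the example sequence are hypothetical, and the sketch assumes gap-free duplexes of equal length in which piRNA position 1 faces the 3′-most nucleotide of the site.

```python
# Hypothetical helper names; a sketch of per-position pairing classification.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
GU_WOBBLE = {("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Upper-case a sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq):
    """Return the perfectly complementary site for a guide, written 5'->3'."""
    return "".join(COMPLEMENT[nt] for nt in reversed(to_rna(seq)))


def pairing_states(pirna, site):
    """Classify each piRNA position (5'->3') as 'WC', 'GU', or 'MM'."""
    pirna, site = to_rna(pirna), to_rna(site)
    if len(pirna) != len(site):
        raise ValueError("piRNA and site must have the same length (no bulges)")
    states = []
    # Pairing is antiparallel: piRNA position 1 faces the last site nucleotide.
    for guide_nt, site_nt in zip(pirna, reversed(site)):
        pair = (guide_nt, site_nt)
        if pair in WATSON_CRICK:
            states.append("WC")
        elif pair in GU_WOBBLE:
            states.append("GU")
        else:
            states.append("MM")
    return states


def percent_paired(pirna_site_pairs, gu_counts_as_paired=True):
    """Percentage of duplexes that are paired at each piRNA position."""
    paired = {"WC", "GU"} if gu_counts_as_paired else {"WC"}
    all_states = [pairing_states(p, s) for p, s in pirna_site_pairs]
    n = len(all_states)
    return [
        100.0 * sum(states[i] in paired for states in all_states) / n
        for i in range(len(all_states[0]))
    ]
```

With this representation, the seed (piRNA positions 2 to 7) corresponds to the slice `states[1:7]`, and position 1 can simply be ignored when counting mismatches.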

Fig. 1 22G-RNA loci as a proxy to identify the targets of specific piRNAs.

(A) An example of 22G-RNA distributions at one of the RNA targets of the synthetic piRNA (GFP-targeting piRNA #1) in the indicated strains with biological replicates. Each pink bar indicates the first nucleotide position and abundance of 22G-RNAs. The red bar indicates the position targeted by the synthetic piRNA. rpm, reads per million. (B) A scatter plot showing the abundance of 22G-RNAs around each potential targeting site [100-nucleotide (nt) window centered on each target site] of the synthetic piRNA (GFP-targeting #1) in the control strain and in the strain expressing the synthetic piRNA. The potential targeting sites are sites of RNA transcripts that pair to the specific piRNA with six or fewer mismatches. Marked in red are sites at which 22G-RNA levels increased more than fourfold in the strain expressing the synthetic piRNA relative to the control strain. (C) A scatter plot showing the abundance of 22G-RNAs at each potential targeting site of 21U-RNA-X1 in the N2 (wild-type) strain and in the strain containing a deletion of the 21U-RNA-X1 coding loci. Marked in green are sites at which 22G-RNA levels decreased by more than 75% in the strain lacking 21U-RNA-X1 relative to the N2 wild type. (D) The pairing between piRNAs and identified targets. (Top) Examples of pairings between the piRNAs and their targets. (Bottom) A bar graph showing the percentage of base pairing at each position within the piRNAs with all 17 identified targets. GU wobble pairing is considered as paired here in order to highlight the near-perfect pairing at the seed region when all GU pairs are allowed.
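
The window-based comparison described for panels (B) and (C) can be sketched as follows. The thresholds (more than fourfold gain, more than 75% loss, 100-nt window) come from the legend; the data layout (lists of position and rpm pairs), the function names, and the pseudocount used to avoid division by zero are assumptions made for illustration and are not taken from the authors' methods.

```python
# A minimal sketch; data layout, names, and pseudocount are assumptions.

def window_rpm(reads, center, window=100):
    """Sum the rpm of 22G-RNAs whose first nucleotide lies in the window.

    reads  -- iterable of (first_nucleotide_position, rpm) tuples
    center -- transcript coordinate of the predicted piRNA site
    window -- total window width in nucleotides, centered on the site
    """
    half = window // 2
    return sum(rpm for pos, rpm in reads if center - half <= pos < center + half)


def classify_site(reference_rpm, test_rpm, pseudocount=0.1):
    """Label a site by the change in local 22G-RNA abundance.

    'gained' -- more than fourfold increase over the reference strain
    'lost'   -- more than 75% decrease relative to the reference strain
    The pseudocount is an illustrative choice to keep the ratio defined.
    """
    ratio = (test_rpm + pseudocount) / (reference_rpm + pseudocount)
    if ratio > 4.0:
        return "gained"
    if ratio < 0.25:
        return "lost"
    return "unchanged"


# Toy example: one transcript, two candidate sites at positions 500 and 700.
control_reads = [(480, 1.0), (700, 5.0)]
synthetic_reads = [(470, 6.0), (505, 4.0), (700, 5.0)]

for site in (500, 700):
    ref = window_rpm(control_reads, site)
    test = window_rpm(synthetic_reads, site)
    print(site, ref, test, classify_site(ref, test))
```

For the synthetic-piRNA experiment the control strain serves as the reference; for the deletion experiment the N2 wild type serves as the reference and the deletion strain as the test.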

In light of these findings, we developed a piRNA reporter assay to gain further insights into the piRNA targeting rules. In this assay, we examined whether synthetic gfp-targeting piRNAs with various mismatches to the green fluorescent protein (GFP) sequence can trigger the silencing of an expressed GFPdpiRNA::CDK-1 transgene (dpiRNA stands for depletion of piRNA targeting sites, in which the GFP sequence has been recoded to avoid silencing by endogenous piRNAs) (Fig. 2A, left, and fig. S2A). Because we noticed that the synthetic piRNAs can be produced from animals carrying extrachromosomal arrays with synthetic piRNA loci, we chose this method to systematically produce various gfp-targeting piRNAs (fig. S2A). We observed that GFPdpiRNA::CDK-1 was silenced in the animals expressing synthetic piRNAs that are perfectly complementary to GFP mRNAs or contain two mismatches outside of the piRNA seed region (Fig. 2, A to C, and fig. S2B). In contrast, we failed to observe the silencing of GFPdpiRNA::CDK-1 when one or two mismatches were located at the piRNA seed region (Fig. 2, B and C, and fig. S2B). In addition, our reporter assay suggests that piRNAs tolerate up to three nonseed mismatches but not RNA bulges (fig. S2, C to E). We also observed that one GU wobble pair is tolerated in the seed region, and GU pairs are moderately more tolerated than mismatches in the nonseed region (fig. S2, C and D). Last, we obtained consistent results in our reporter analyses using gene-edited worms that express synthetic gfp-targeting piRNAs from an endogenous piRNA locus (fig. S2F). Overall, our reporter assay revealed a similar but more stringent piRNA-targeting logic than that from our analyses of synthetic piRNA targets. Together, our analyses suggest that piRNA targeting in C. elegans prefers near-perfect pairing at the piRNA seed region. In addition, supplementary pairing outside of the seed region also contributes to piRNA targeting, but a few mismatches are tolerated.

Fig. 2 A piRNA reporter assay to investigate the piRNA targeting rules.

(A) Fluorescence micrographs showing the expression of transgene GFPdpiRNA::CDK-1 in worms carrying an extrachromosomal array to express the gfp-targeting piRNA with perfect pairing or in the control strain that does not express the synthetic piRNA. Arrows indicate the germline nuclei with expressed transgene. Circles indicate the germline nuclei with silenced transgene. The unmarked green fluorescent signals are autofluorescent signals generated from worm intestinal granules. (B) The sequences of the gfp-targeting piRNAs, the positions of the mismatches (red), and their effects on the expression of GFPdpiRNA::CDK-1. Asterisk indicates gfp-targeting piRNAs produced by gene-edited animals modified at an endogenous piRNA locus (21U-5499). (C) Percentage of transgenic animals that exhibit the silencing of GFPdpiRNA::CDK-1 in animals expressing specific gfp-targeting piRNAs. At least eight independent strains carrying extrachromosomal arrays (roller) are examined for each piRNA.

It has been known for decades that transgenes carrying various foreign sequences, such as GFP or mCherry, are frequently silenced in the germline of C. elegans (23, 24). A previous study has shown that Piwi protein PRG-1 is required for the silencing of the transgene GFP::CDK-1 (19). If piRNAs recognize GFP sequences, then removal of piRNA targeting sites from the GFP sequences should allow transgene GFP::CDK-1 expression in the germline. To predict more piRNA targeting sites on transgenes, we used the relaxed piRNA targeting criteria similar to those derived from our analyses of synthetic piRNA targets (Data File S1, algorithms of target prediction). These criteria predicted 17 piRNA targeting sites on GFP mRNA (Fig. 3A and table S2). We introduced silent mutations in the GFP sequences so that we no longer identified piRNA targeting sites, yielding the recoded-GFPdpiRNA sequences. Remarkably, although the GFP::CDK-1 transgene is always silenced in the germline of wild-type animals, we observed strong GFP expression from all five independent transgenic strains that we obtained with recoded GFPdpiRNA::CDK-1 inserted at the same locus (Fig. 3B).
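
The recoding step, introducing silent mutations that disrupt a predicted piRNA site without changing the encoded protein, can be illustrated with a short Python sketch. It is not the procedure in Data File S1: the function names are hypothetical, the standard genetic code is assumed, and the sketch only enumerates single-codon synonymous changes that alter at least one nucleotide inside a given site. A real workflow would also need to re-scan each candidate against the full piRNA set.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid, sorted."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


def silent_variants(cds, site_start, site_end):
    """List single-codon silent changes that alter the site [site_start, site_end).

    Coordinates are 0-based on the coding sequence. Each result is a tuple
    (codon_index, new_codon, new_cds).
    """
    variants = []
    for k in range(len(cds) // 3):
        start = 3 * k
        if start >= site_end or start + 3 <= site_start:
            continue  # this codon does not overlap the predicted site
        old = cds[start:start + 3]
        for new in synonymous_codons(old):
            changed = [start + i for i in range(3) if old[i] != new[i]]
            if any(site_start <= pos < site_end for pos in changed):
                variants.append((k, new, cds[:start] + new + cds[start + 3:]))
    return variants


# Toy example: Met-Gly-Lys with a hypothetical site covering the Gly codon.
for codon_index, new_codon, new_cds in silent_variants("ATGGGAAAA", 3, 6):
    print(codon_index, new_codon, new_cds, translate(new_cds))
```

Every variant returned translates to the same protein as the input, so any of them could serve as a starting point for removing a predicted site.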

Fig. 3 Silencing-prone transgenes can be expressed in the germline by avoiding piRNA targeting.

(A) Predicted piRNA sites in GFP mRNA sequence. The numbers of piRNA sites that contain different types of mismatches are shown. The relaxed criteria are used to predict piRNA sites on transgenes: All GU wobble pairing is allowed (considered as paired), and up to three non-GU mismatches are allowed when sites have perfect seed pairing, or up to one non-GU mismatch is allowed when sites have one non-GU mismatch in the seed region. The mismatch at the first nucleotide of a piRNA is not counted or considered. (B) The expression of original GFP::CDK-1 that contains the predicted piRNA targeting sites, or the modified GFPdpiRNA::CDK-1 where all predicted piRNA sites have been removed by introducing silent mutations (right). Arrows indicate the germline nuclei with expressed GFPdpiRNA::CDK-1. Circles indicate the germline nuclei with silenced GFP::CDK-1. (C) Predicted piRNA sites in mCherry mRNA sequence (left). (D) The expression of original mCherry::ANI-1(681–1159) that contains the predicted piRNA targeting sites, or the modified mCherrydpiRNA::ANI-1(681–1159) in which the predicted piRNA sites have been removed by introducing silent mutations (right). Arrows indicate the expression of mCherrydpiRNA::ANI-1(681–1159) at cleavage furrows of the one-cell embryo. (E) Predicted piRNA sites in Cas9 mRNA sequence. (F) A schematic showing the procedure followed to examine if genome editing occurs in transgenic animals that carry the original or modified Cas9 transgenes. Plasmids containing unc-22 sgRNA and rol-6(su1006) dominant transformation marker plasmid are coinjected into transgenic animals that have been carrying the Cas9 transgene for more than five generations. F1 transformed roller animals are picked, and their F2 progeny are scored for unc-22 gene editing through twitcher phenotype. (G) Sequences of various unc-22 edited alleles obtained in the animals carrying the modified Cas9 transgene injected with plasmid encoding unc-22 sgRNA. Indels are highlighted in red.
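
The relaxed criteria stated in panel (A) can be written as a small predicate. The sketch below is one reading of the legend, not the algorithm in Data File S1: the function names are hypothetical, the seed is taken as piRNA positions 2 to 7, bulges are not modeled, and the clause for sites with one seed mismatch is interpreted as allowing at most one further non-GU mismatch outside the seed.

```python
# One interpretation of the relaxed criteria; names and edge cases are assumptions.
PAIRED = {
    ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),  # Watson-Crick pairs
    ("G", "U"), ("U", "G"),                          # GU wobble counts as paired
}
SEED_FIRST, SEED_LAST = 2, 7  # 1-based piRNA positions of the seed


def non_gu_mismatch_positions(pirna, site):
    """Return 1-based piRNA positions that are neither Watson-Crick nor GU paired.

    Position 1 is skipped because the first nucleotide is not considered.
    Both sequences are written 5'->3'; pairing is antiparallel.
    """
    pirna = pirna.upper().replace("T", "U")
    site = site.upper().replace("T", "U")
    if len(pirna) != len(site):
        raise ValueError("piRNA and site must have the same length (no bulges)")
    return [
        pos
        for pos, pair in enumerate(zip(pirna, reversed(site)), start=1)
        if pos > 1 and pair not in PAIRED
    ]


def passes_relaxed_criteria(pirna, site):
    """Apply the relaxed targeting criteria to one gap-free piRNA/site duplex."""
    mismatches = non_gu_mismatch_positions(pirna, site)
    seed = [p for p in mismatches if SEED_FIRST <= p <= SEED_LAST]
    nonseed = [p for p in mismatches if p > SEED_LAST]
    if not seed:
        return len(nonseed) <= 3  # perfect seed: up to three non-GU mismatches
    if len(seed) == 1:
        return len(nonseed) <= 1  # one seed mismatch: at most one more elsewhere
    return False
```

Scanning a transcript then amounts to sliding a window of piRNA length along the mRNA and applying `passes_relaxed_criteria` to each window for every piRNA of interest.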

To test whether our approach can be generally applied to other transgenes in order to avoid gene silencing, we chose to modify the mCherry-tagged C-terminal region of Anillin [mCherry::ANI-1(681–1159)], another transgene that is always silenced in the germline (25). We predicted 10 piRNA targeting sites in mCherry mRNA and introduced silent mutations to disrupt predicted piRNA targeting sites (Fig. 3C and table S2). Whereas the original mCherry::ANI-1(681–1159) transgene was silenced in all six transgenic lines, the modified mCherrydpiRNA::ANI-1(681–1159) was robustly expressed at the cleavage furrow of the one-cell embryo in all six transgenic lines we obtained (Fig. 3D).

As a last test, we applied this approach to modify Cas9 sequences. Transgenic C. elegans strains stably expressing Cas9 have not been successfully obtained (26). Again, we introduced silent mutations in order to remove all predicted piRNA targeting sites (Fig. 3E and table S2) and obtained transgenic animals carrying the original or the modified Cas9 transgene. To test whether Cas9 is stably expressed, we injected the transgenic animals with an unc-22 single-guide RNA (sgRNA)–expressing plasmid and a rol-6(su1006) plasmid as a dominant transformation marker (Fig. 3F). The animals carrying unc-22 mutations exhibit a visible twitcher phenotype and can be easily identified (27). Out of 30 F1 transformed progeny (roller), nine animals produced F2 twitcher progeny from animals carrying the modified Cas9dpiRNA transgene, whereas no F2 twitcher progeny were observed from animals carrying the original Cas9 transgene. DNA sequencing of these F2 twitcher animals confirmed that they carry various unc-22–edited alleles (Fig. 3G). These observations functionally demonstrate that the modified Cas9dpiRNA transgene is stably expressed and thus can create edited alleles. Taken together, these experiments verify that our predictions of piRNA targeting sites encompass the critical sites that trigger gene silencing.

We next wondered whether endogenous germline genes have evolved to avoid piRNA recognition. Previous studies have shown that most C. elegans germline transcripts are targeted by either WAGO Argonaute–associated 22G-RNA, which correlates with silencing of the transcript, or CSR-1 Argonaute–associated 22G-RNA, which correlates with expression of the transcript (28, 29). Using stringent piRNA targeting criteria corresponding to the ones derived from our reporter analyses, we predicted that around half of germline-expressed genes (CSR-1 targets), as well as germline-silenced genes (WAGO-1 targets), contain at least one piRNA targeting site (Fig. 4A and fig. S3A), which is sufficient for silencing, at least in our gfp reporter assay. In addition, the density of piRNA targeting sites in germline-expressed genes is only slightly less than that of somatic-specific genes and control sequences (fig. S3B). Although the germline-silenced genes contain more predicted piRNA sites than germline-expressed genes, such differences alone cannot explain why only one set of genes is silenced (fig. S3B). Taken together, these predictions implied that some endogenously expressed germline genes are resistant to piRNA silencing. To test this hypothesis, we obtained animals that produce synthetic piRNAs that are perfectly complementary to several of these genes, including pie-1, nop-1, cdk-1, and oma-1. We engineered these piRNAs using the same locus (21U-5499) we used for producing gfp-targeting piRNAs, and these synthetic piRNAs were expressed at similar levels (Fig. 4B). In the animals expressing synthetic piRNAs targeting endogenous genes, we did not observe a reduction of mRNA levels or the phenotypes associated with silencing of these genes (Fig. 4B and fig. S4, A to D). In addition, no phenotype associated with silencing was observed in animals expressing any of six additional synthetic piRNAs that target various regions of the pie-1 or nop-1 genes (fig. S4, A and B). 
This is in stark contrast to the animals expressing seven distinct gfp or mCherry-targeting piRNAs, which all trigger potent silencing of GFPdpiRNA::CDK-1 or mCherrydpiRNA::ANI-1, respectively (Fig. 4B and fig. S4, E and F). Together, our results suggested that at least some endogenous germline genes exhibit resistance to piRNA-mediated gene silencing in C. elegans.

Fig. 4 Germline-expressed genes exhibit resistance to piRNA silencing through their intrinsic signals, such as PATCs.

(A) Numbers of predicted piRNA sites on germline-expressed RNA transcripts. To predict more confident targeting sites, the stringent piRNA targeting criteria are used here, in which up to one GU wobble pair was allowed in the seed region, and overall only up to two mismatches plus an additional GU mismatch were allowed. In addition, the mismatch at the first nucleotide of the piRNA is not counted or considered. The RNA targets of CSR-1 Argonaute (CSR-1 targets) are used to define the germline-expressed genes. (B) Quantitative reverse transcription polymerase chain reaction measurements of (left) the abundance of the synthetic piRNAs in comparison with the level of endogenous 21U-5499 (value = 1) in the control strain and (right) the expression levels of corresponding mRNA targets in the indicated strains. nop-1–, cdk-1–, and oma-1–targeting piRNAs were produced by gene-edited animals, whereas pie-1–targeting piRNAs were produced by animals carrying extrachromosomal arrays. Error bars represent standard error of the mean from biological duplicate samples. The statistics for synthetic piRNA expression were calculated by comparing the levels of specific piRNAs and 21U-5499. n.s., not significant; *P < 0.05, **P < 0.01, ***P < 0.001, Student’s t test. (C) A box-and-whisker plot showing the density of PATC in the germline-specific and somatic-specific genes. ***P < 0.001, Wilcoxon rank sum test. (D) The density of 22G-RNAs within a 100-nt window around predicted piRNA target sites of germline-specific transcripts with high PATC density (PATC > 50) or low PATC density (PATC < 10). The plots are centered at the sequence targeted by piRNAs (green). The stringent piRNA targeting criteria were used here to predict piRNA target sites. n = number of predicted piRNA sites. (E) The box-and-whisker plots showing the number of predicted piRNA-targeted sites on germline-expressed genes that contain the indicated range of PATC density. 
The stringent piRNA targeting criteria were used here to predict piRNA target sites. n.s., not significant; ***P < 0.001, Wilcoxon rank sum test. (F) Fluorescence micrographs showing the expression of the original mCherry::ANI-1(681–1159) harboring synthetic introns (no PATC) and mCherryPATC::ANI-1(681–1159) harboring PATC-containing introns. (G) 22G-RNA distribution at mCherry coding sequence of the indicated transgenes. Each bar indicates the first nucleotide position and abundance of 22G-RNAs. The red bars mark the location of piRNA targeting sites predicted by using the relaxed piRNA targeting criteria.

Previous studies have proposed that CSR-1 Argonaute–associated 22G-RNAs may form an epigenetic memory of “self” to promote gene expression in the germline (30, 31). We therefore examined whether the nop-1–targeting piRNA can trigger nop-1 silencing in csr-1 mutants. In csr-1 F2 one-cell embryos, either with or without csr-1 RNAi treatment, the synthetic piRNA did not confer the phenotype associated with nop-1 silencing (fig. S5A). To further test whether the piRNA resistance of endogenous genes is mediated by epigenetic signals, we used a Cas9-based gene-editing strategy to delete the nop-1 gene and its untranslated regions (UTRs) from the genome, which would remove all chromatin-based signals as well as the DNA/RNA templates that are required to produce CSR-1 22G-RNAs targeting nop-1 (fig. S5B). We then reinserted the nop-1 gene into its original locus or into the locus where our silenced transgenes are inserted. The reinserted nop-1 gene remained resistant to silencing by endogenous or synthetic nop-1–targeting piRNA (fig. S6, C and D). Although these data represent negative results, our analyses provided no evidence for epigenetic mechanisms in licensing germline gene expression.

We therefore investigated whether intrinsic sequences of germline genes provide resistance to piRNA silencing. One such candidate is 10-base periodic An/Tn clusters (PATCs), an intrinsic DNA sequence element found in the introns or promoters of some germline genes in C. elegans (32). A recent study has reported that PATCs can promote the expression of transgenes inserted at heterochromatin in C. elegans, but whether PATCs can provide resistance to piRNA silencing has not been explored (33). We found that PATCs are enriched in germline genes and particularly enriched in the germline-expressed genes (Fig. 4C and fig. S6, A and B). To examine the global effect of PATCs on piRNA silencing, we compared the local 22G-RNA distribution at predicted piRNA sites in germline-specific transcripts with high or low PATC density. We observed that local 22G-RNAs accumulated around the targets only for genes with low, but not high, PATC density (Fig. 4D and fig. S6, C and D). These observations imply that PATCs negatively affect the ability of the piRNA pathway to induce and/or maintain 22G-RNAs at the piRNA targeting sites. Furthermore, we observed that germline-specific genes of higher PATC density contain more predicted piRNA targeting sites than genes with lower PATC density (Fig. 4E and fig. S7, A and B), implying that PATCs allow the expression of germline genes despite harboring piRNA targeting sites. Last, if PATCs can provide resistance to piRNA silencing, insertion of PATC introns into a silencing-prone transgene should license its expression. Indeed, replacing the mCherry introns of the mCherry::ANI-1 transgene with PATC-containing introns from the smu-1 gene led to the stable expression of the transgene (Fig. 4F). Small-RNA sequencing further showed that dramatically fewer mCherry antisense 22G-RNAs are produced in the worms carrying mCherryPATC::ANI-1(681–1159) than in those carrying the original mCherry::ANI-1(681–1159) (Fig. 4G). 
Together, our findings suggest that PATCs act as a licensing signal that provides resistance to piRNA silencing.

Overall, our study revealed the piRNA targeting logic in C. elegans. In addition, our research suggested that diverse piRNAs can recognize and silence various foreign nucleic acids because of their broad targeting capacity. Because several different modes of miRNA targeting have been described in animals and plants (22, 34–38), additional modes of piRNA targeting are likely to exist as well. Nonetheless, our study demonstrated that piRNA-mediated gene silencing underlies the transgene silencing phenomenon in the germline of C. elegans and provided a simple solution to achieve transgene expression by avoiding piRNA recognition.

Our study showed that many endogenous genes also contain piRNA targeting sites but exhibit resistance to piRNA silencing. Our analyses suggested PATCs to be a licensing signal protecting endogenous genes from piRNA silencing. How PATCs counteract piRNA silencing remains unknown. A recent study showed that PATCs are enriched in germline genes within repressive chromatin domains, suggesting that PATCs may prevent piRNAs from establishing heterochromatin at their targets (33). Our data suggest that PATCs function not simply by promoting euchromatin formation but also by inhibiting the production of 22G-RNA at piRNA targeting sites (Fig. 4, D and G). If so, this would suggest that the formation of heterochromatin may feed back to promote the production of 22G-RNAs. Such a relationship between chromatin and small-RNA production is reminiscent of RNA-induced transcriptional silencing described in Schizosaccharomyces pombe, in which small-RNA–guided heterochromatin recruits RdRPs to produce more small RNAs (39). In addition, because our data suggested that some endogenous genes (such as nop-1, cdk-1, or oma-1) exhibit resistance to piRNA silencing despite low PATC density (Fig. 4B) (33), other mechanisms may exist to license self genes for expression. Taken together, our studies revealed a strategy by which C. elegans defends its genome against foreign nucleic acids, whereby diverse piRNAs silence foreign genes that are not licensed for expression.

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S9

Tables S1 to S4

References (40–45)

Data File S1

  • * These authors contributed equally to this work.

References and Notes

Acknowledgments: We thank C. Mello for his guidance of H.-C.L. during his postdoctoral training and for resources to initiate the project; J. Staley, E. Ferguson, A. Ruthenburg, and J. Brown for critical comments on the manuscript; E. Xiao for assistance on designing and cloning transgenes; M. Glotzer and K. Longhini for sharing reagents and unpublished results; and members of the H.-C.L. laboratory and J. Staley laboratory for helpful discussions. The deep sequencing data described in this manuscript are available at the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (accession no. SRP108932). This work is supported by Ministry of Science and Technology of Taiwan grants (MOST-105-2918-I-006-002, MOST-105-2221-E-006-203-MY2, and MOST-106-2628-E-006-006-MY2) to W.-S.W., an NIH P01 grant (HD078253) to Z.W., and an NIH R00 grant (GM108866) to H.-C.L.