Arrested replication forks guide retrotransposon integration


Science  25 Sep 2015:
Vol. 349, Issue 6255, pp. 1549-1553
DOI: 10.1126/science.aaa3810

Parasitic DNA targets a genomic home

Long-terminal-repeat (LTR) retrotransposons are a form of parasitic DNA that can jump around within the host's genome. To avoid damaging resident genes, they have been selected to integrate away from protein-coding sequences. For instance, the fission yeast LTR retrotransposon Tf1 inserts at nucleosome-free regions in gene promoters. Jacobs et al. show that Tf1 is directed to these insertion sites by a specific DNA binding protein, Sap1, which forms DNA replication–fork barriers.

Science, this issue p. 1549


Long terminal repeat (LTR) retrotransposons are an abundant class of genomic parasites that replicate by insertion of new copies into the host genome. Fungal LTR retrotransposons prevent mutagenic insertions through diverse targeting mechanisms that avoid coding sequences, but conserved principles guiding their target site selection have not been established. Here, we show that insertion of the fission yeast LTR retrotransposon Tf1 is guided by the DNA binding protein Sap1 and that the efficiency and location of the targeting depend on the activity of Sap1 as a replication fork barrier. We propose that Sap1 and the fork arrest it causes guide insertion of Tf1 by tethering the integration complex to target sites.

Retrotransposons are mobile genetic elements that replicate through an RNA intermediate that is reverse transcribed into a cDNA capable of insertion elsewhere in the genome. By virtue of this amplifying mechanism, retrotransposons constitute large portions of many eukaryotic genomes and have a critical influence on their evolution (1). Fungal LTR retrotransposons minimize their mutagenic potential by carefully selecting integration sites away from protein coding sequences (2). The different families of LTR retrotransposons employ a variety of strategies for this target site selection, but current models posit tethering interactions between retrotransposon proteins and host DNA binding factors.

The fission yeast genome shows signs of ancient and persistent colonization by the LTR retrotransposons Tf1 and Tf2, members of the Metaviridae/Ty3-gypsy–like group of transposable elements (3). Both Tf1 and Tf2 exhibit a preference for insertion into promoters of RNA polymerase (Pol) II–transcribed genes (4, 5) coinciding with the nucleosome-free region (NFR) that usually precedes the transcription start site. The main determinant of NFR presence in fission yeast promoters is Sap1 (6), which binds DNA as homopolymers to clusters of a 5-base pair (bp) sequence motif (7, 8). To determine whether Sap1 binding coincided with transposition hotspots, we performed high-throughput sequencing of transposon–host genome junctions in cultures overexpressing a genetically marked Tf1 transposon (4). Genome-wide correlation analysis shows a strong association of Sap1 enrichment (9) with insertion sites (Fig. 1, A and B, and fig. S1a). Sap1 is strongly enriched at the previously described Tf1 hotspots, such as the promoters of class II genes (Fig. 1A and fig. S1b). Peaks of significant Sap1 enrichment [MACS (10)] account for 63.1% of transposition points, while covering only 5.1% of the host genome, and contain more efficient insertion points than the rest of the genome (fig. S1c). Logistic regression analysis revealed that Sap1 binding is a strong predictor of insertion position [area under the curve (AUC) - 0.5 = 0.217 for WT; fig. S2, a and b]. However, correlation between Sap1 fold enrichment and number of insertion points, though significant (Spearman’s rho = 0.70, P = 1 × 10^-10), shows a wide variability beyond the threshold of significant enrichment (fig. S1, a and b), suggesting that Sap1 binding is not the only factor affecting target site competence. Insertion points coincide precisely with a maximum of Sap1 enrichment (9), strongly indicating that Sap1 determines Tf1 target site selection (Fig. 1C).
To investigate the involvement of Sap1 in Tf1 transposition, we performed high-throughput insertion analysis in a sap1 mutant with a lower affinity for DNA (sap1-c) (9). sap1-c mutants exhibited a markedly reduced transposition frequency (t test, P < 0.001, n = 21; Fig. 1D). Additionally, the strong association of insertion points with Sap1 was decreased (Fig. 1C), the portion of insertions in Sap1-enriched regions fell to 49.9%, and the accuracy of Sap1 binding as a predictor of insertion dropped (AUC - 0.5 = 0.097 for sap1-c; fig. S2a), indicating that transpositions are dispersed away from Sap1 binding peaks. The sap1-c background showed no defects in cDNA processing or altered levels of integrase, suggesting that the transposition defect is due to impaired integration (fig. S3). Together, these data show that Sap1 is a major determinant of Tf1 insertion target site selection.
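
The percentages reported above can be put on a common scale as a fold enrichment over random insertion. The following is a minimal sketch and not part of the original analysis: the function name fold_enrichment is hypothetical, and the calculation assumes that sap1-c insertions are scored against the same set of Sap1 peaks (covering 5.1% of the genome) as WT insertions.

```python
def fold_enrichment(fraction_of_insertions: float, fraction_of_genome: float) -> float:
    """Observed share of insertions divided by the share expected
    if insertions were spread uniformly over the genome."""
    if not 0 < fraction_of_genome <= 1:
        raise ValueError("fraction_of_genome must be in (0, 1]")
    return fraction_of_insertions / fraction_of_genome


# Values taken from the text; the shared peak set is an assumption.
PEAK_GENOME_FRACTION = 0.051  # Sap1 peaks cover 5.1% of the genome

wt = fold_enrichment(0.631, PEAK_GENOME_FRACTION)     # 63.1% of WT insertions
sap1c = fold_enrichment(0.499, PEAK_GENOME_FRACTION)  # 49.9% of sap1-c insertions

print(f"WT: {wt:.1f}-fold, sap1-c: {sap1c:.1f}-fold")
```

Under these assumptions, insertions in Sap1 peaks are roughly 12-fold over-represented in WT and roughly 10-fold in sap1-c, so targeting is weakened but not abolished in the mutant.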

Fig. 1 Tf1 transposition into Sap1 binding regions.

(A) Sap1, nucleosome positioning, and average insertion number in reads per million (rpm) at type II genes aligned at the transcription start site (TSS). (B) Genome-wide correlation between transposition (insertion number, rpm) and Sap1 binding in 500-bp windows. Black: genomic windows; red: randomized value pairs. (C) WT Sap1 enrichment around WT insertions (blue) and sap1-c insertions (red). (D) Transposition frequency in WT and sap1-c mutant of integrase + (+) and catalytic dead (CD) Tf1. Error bars represent SDs and asterisks depict statistically significant differences (***P < 0.001).

Sap1 is essential for maintaining genome integrity during DNA replication (11). It has a demonstrated role in forming directional replication fork barriers (RFBs) (12, 13). We plotted Tf1 insertion density around Sap1 binding motifs, taking into account their orientation (Fig. 2A). Insertions were enriched around Sap1 binding motifs (Fig. 2B), indicating that Sap1 binding directs transposition but protects its footprint. Notably, most insertion events occurred 3′ of the Sap1 binding motif [Wilcoxon signed rank test (5.99 < mu < 7.99, 95% confidence interval), P < 2 × 10^-16, n = 888], displaying a prominent periodicity of peaks (Fig. 2B and fig. S4). Sap1-dependent RFBs have been shown to cause fork arrest on this side of the motif (8, 9, 12). Moreover, in both known Sap1-dependent RFBs, the replication terminator Ter1 located at ribosomal DNA (rDNA) (12, 13) and the solo LTR interspersed in the genome (9), most insertions occurred on the blocking side of the Sap1 barrier, suggesting that the RFB influences site selection (Fig. 2, C and D). Consistently, Tf1 insertion hotspots and Sap1 binding regions coincide with domains of phosphorylated histone H2A (γH2A) deposition and with DNA Pol ε (Cdc20) maxima in undisturbed S phase, both of which are markers of replication fork arrest (14, 15) (fig. S5). Because Sap1 fork barrier activity depends on binding site structure rather than on binding affinity (8), this observation could explain the variability in transposition competence of Sap1 binding sites (fig. S1, a and b) and why the sap1-c allele, which only modestly lowers DNA binding but strongly affects RFB activity (9), decreases Tf1 transposition to such a notable extent (Fig. 1D).
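
An orientation-aware insertion profile of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the function names, the single-chromosome coordinates, the "+"/"-" strand convention (a "+" motif has its 3′ side at higher coordinates), and the toy data are all hypothetical.

```python
from collections import Counter


def oriented_offset(insertion_pos: int, motif_pos: int, motif_strand: str) -> int:
    """Signed distance from motif to insertion; positive means 3' of the motif."""
    delta = insertion_pos - motif_pos
    if motif_strand == "+":
        return delta
    if motif_strand == "-":
        return -delta  # flip so that all motifs share one orientation
    raise ValueError(f"unknown strand: {motif_strand!r}")


def insertion_profile(insertions, motifs, window=250):
    """Count insertions at each oriented offset within +/- window of any motif."""
    profile = Counter()
    for motif_pos, strand in motifs:
        for pos in insertions:
            offset = oriented_offset(pos, motif_pos, strand)
            if -window <= offset <= window:
                profile[offset] += 1
    return profile


# Toy data (made up): two motifs on opposite strands, five insertions.
motifs = [(1000, "+"), (5000, "-")]
insertions = [1010, 1020, 990, 4990, 4980]

profile = insertion_profile(insertions, motifs, window=50)
three_prime = sum(n for off, n in profile.items() if off > 0)
five_prime = sum(n for off, n in profile.items() if off < 0)
print(three_prime, five_prime)
```

Flipping the sign for motifs on the opposite strand is what lets insertions from all motifs be pooled into a single profile in which one side always corresponds to the side facing the arrested fork.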

Fig. 2 Insertion profile around Sap1 binding motifs.

(A) Sap1 binding motifs are oriented as blocking replication forks advancing in the right-to-left orientation. (B) Averaged transposition frequency around Sap1 binding motifs (n = 888). Upper panel: 500-bp window; middle panel: 100-bp zoom-in window; lower panel: 100-bp heat-map of individual motifs. (C) Averaged transposition frequency around Tf2 LTR (n = 152). (D) Averaged transposition frequency around Ter1 (n = 3).

To test the hypothesis that Sap1-dependent fork arrest guides Tf1 insertion, we assessed whether the transposition competence of Sap1 binding sites correlates with the intensity of Sap1 binding signal or with their RFB activity. We examined the influence of Sap1 binding site orientation with respect to fork progression on transposition efficiency in wild-type (WT) cells, using three well-characterized Sap1 binding sites: (i) the rDNA replication terminator Ter1 (12, 13), a very efficient RFB; (ii) the synthetic sequence DR2, derived from in vitro Sap1 binding selection (16) but an inefficient RFB; and (iii) DR2D, a mutation of DR2 that enhances its RFB activity (8). We introduced these Sap1 binding sites in one of the two orientations [blocking (B) or nonblocking (NB)] into autonomously replicating plasmids, in close proximity to a replication origin so as to control the predominant direction of fork progression over the motif. We then used these plasmids as transposition acceptors in a targeting assay (17) (Fig. 3A). The results are summarized in Fig. 3B. Two-dimensional (2D) native-native gel electrophoresis of replication intermediates confirmed that Ter1 and DR2D, but not DR2, are efficient RFBs in their blocking orientation (Fig. 3B). Chromatin immunoprecipitation (ChIP) analysis showed little difference in Sap1 enrichment between the two orientations of each motif but revealed that DR2 binds to Sap1 more strongly than do the other binding sites, with DR2D showing the lowest and Ter1 showing intermediate enrichment (fig. S6A). The results of the transposition trap experiment show that RFB competency (Ter1 and DR2D, but not DR2), as well as blocking ability (B orientation), determined higher transposition frequency into the target site (n = 3 biological replicates per condition, Tukey Range test, P = 0.0006). All insertions displayed 5-bp target site duplications (TSDs, not shown), indicating that they were integrase-mediated transpositions. 
These results indicate that transposition into Sap1 binding regions depends not on their Sap1 binding affinity but on their efficiency as RFBs.

Fig. 3 Transposition competence of Sap1 binding sites depends on RFB activity.

(A) Plasmid transposition trap strategy. (B) Transposition into Ter1, DR2, DR2D, and Scrambled binding motifs in plasmid transposition trap assay. Left column: 2D gel electrophoresis; RFB signals are marked with an arrowhead. Middle column: diagram of target site with insertion sites as columns (blue in forward, red in reverse orientation) at the insertion position of height proportional to number of insertions. Sap1 binding sites depicted by triangles, in blocking (B, pointing left) and nonblocking (NB, pointing right) orientations. Number of insertions into the target site (150-bp window around the Sap1 binding motif) divided by the total number of insertions in black and percentage in red numbers. Right column: Frequency of transposition into the plasmid (KanR/AmpR plasmids divided by AmpR total plasmids), with insertions into the target site in red, and insertions into the plasmid backbone in blue. Error bars represent SDs. Asterisks depict statistically significant differences (***P < 0.001). ns, not significant (P > 0.05). (C) Intron transposition trap strategy. (D) Insertion into Ter1 and Scrambled binding motifs in intron transposition trap assay. Diagram of motif arrangement and insertions as in (B). Proportion of 5-FOA–resistant colonies due to transposition into ura4 are shown in red; those due to other mutations are in blue.
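
The two ratios defined in this legend can be combined as in the sketch below. It is illustrative only: the function names and all counts are hypothetical, and splitting the overall frequency by the share of sequenced insertions that fall in the target window is an assumption about how the red and blue bar segments relate to each other, not a statement of the authors' procedure.

```python
def transposition_frequency(kan_amp_plasmids: int, amp_plasmids: int) -> float:
    """KanR/AmpR plasmids divided by total AmpR plasmids."""
    if amp_plasmids <= 0:
        raise ValueError("amp_plasmids must be positive")
    return kan_amp_plasmids / amp_plasmids


def target_share(target_insertions: int, total_insertions: int) -> float:
    """Fraction of mapped insertions inside the 150-bp target window."""
    if total_insertions <= 0:
        raise ValueError("total_insertions must be positive")
    return target_insertions / total_insertions


# Hypothetical counts, chosen only to show the arithmetic.
freq = transposition_frequency(kan_amp_plasmids=30, amp_plasmids=100_000)
share = target_share(target_insertions=12, total_insertions=20)

freq_target = freq * share          # assumed "red" segment
freq_backbone = freq * (1 - share)  # assumed "blue" segment
print(f"{freq:.1e} total = {freq_target:.1e} target + {freq_backbone:.1e} backbone")
```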

We next examined if the effect of target site orientation extended to genomic positions. We set up a transposon trap system in which the target site is placed inside an artificial intron in the reporter gene ura4, allowing selection of insertions by treatment with the counterselection drug 5-fluoroorotic acid (5-FOA). ura4 is passively replicated by forks approaching from two nearby replication origins on its centromeric side (18), which allowed us to correlate the target site efficiency with its competence as a RFB (Fig. 3C). Blocking or nonblocking orientations of Ter1 showed equal binding of Sap1 (fig. S6b). However, insertion frequency was 10-fold larger in the Ter1 motif placed in the blocking orientation (t test, P < 0.001, n = 4 biological replicates per condition). Again, all insertions exhibited TSD (not shown). We conclude that the efficiency of insertion near a Sap1 binding motif depends on its ability to cause fork arrest.

These observations prompted us to examine how Sap1, the Tf1 intasome (integration complex), and the replication fork interact. We split Sap1 into functional domains and tested for interactions with the full-length Tf1 integrase with a yeast two-hybrid assay, which revealed that the Tf1 integrase interacts directly with the C-terminal dimerization domain of Sap1 (16) (fig. S7). To evaluate the role of this interaction and the arrested fork in Tf1 transposition, we used chromosome conformation capture (3C) to measure tethering of mature cDNA at Sap1-dependent and -independent RFBs (Fig. 4A). Sap1 bound to Ter1 in the blocking orientation led to prominent recruitment of Tf1 cDNA, whereas the nonblocking orientation was unable to recruit (Fig. 4B, Ter1 B/NB). Tethering to Ter1 was also dependent on WT Sap1 and the presence of Tf1 integrase in the intasome (Fig. 4B, sap1-c, intΔ panels). This suggests that the direct interaction of Sap1 with integrase (fig. S7) participates in intasome recruitment and that Sap1 bound to cDNA (fig. S8) through its cognate binding sequences in the LTR (9) is not sufficient to localize the intasome by multimerization with genome-bound Sap1. A Sap1-independent RFB [Ter2, dependent on the DNA binding factor Reb1 (19)] did not tether the cDNA [Fig. 4B, Ter2 (B) and Ter2 (NB)]. This is consistent with our genome-wide observations that class III genes and other Sap1-independent RFBs are not hotspots for Tf1 transposition despite causing polar fork arrest (not shown) (15, 19). Combined, these results suggest that integrase, Sap1, and fork barrier activity must be present to tether Tf1 cDNA to the target site and guide insertion.

Fig. 4 Sap1 binding and RFB activity collaborate to tether the intasome.

(A) 3C analysis of cDNA tethering. (B) cDNA tethering at Sap1-dependent (Ter1) and -independent (Ter2) RFB in WT, sap1-c mutant, and integrase frameshift Tf1 mutant. (C) Separation of Sap1 binding and RFB activities. (D) Results of plasmid trap assay. Left column: 2D gel electrophoresis. Middle column: Diagram of arrangement of Sap1 (DR2 and Scrambled) and Reb1 (Ter2) binding motifs, insertion points, and frequency depicted as in Fig. 3B. Right panel: Transposition frequency into target plasmid as in Fig. 3B.

We next investigated whether the RFB and Sap1 binding requirements can be separated. If so, we could rescue insertion into a non-RFB Sap1 binding site by providing an independent RFB in cis. We cloned Ter2 next to the DR2 binding site placed in the nonblocking orientation (Fig. 4C and fig. S9). Ter2 rescued the targeting efficiency of DR2 only when placed in the blocking orientation (Mann-Whitney U test, P < 0.001, n = 3 biological replicates per condition, Fig. 4D). Part of the increase in targeting to DR2 could be caused by replication forks converging onto the Ter2 blocked fork to complete S phase, approaching DR2 in the blocking orientation. Accordingly, insertions were detectable on the blocking side of DR2 (Fig. 4D). However, transposition also occurred near Ter2 into the side of the motif where Reb1 stops the fork, suggesting that features of the arrested fork, and not the location of binding sites, are the major determinants of target site choice. Together, these results reveal that Tf1 transposition targeting requires two separable conditions—Sap1 binding and an active RFB—both of which are necessary but neither of which is sufficient by itself.

Insertion of Tf1 into an arrested replication fork could be a result of strand-transfer reactions between the cDNA and replication intermediates, like the insertion of the bacterial Tn7 transposon into the lagging-strand template at replication terminators (20). Alternatively, fork arrest or restart may fire DNA damage checkpoint and repair mechanisms that participate in insertion (21), or leave epigenetic marks to guide it (22, 23) (figs. S5 and S10). Other LTR retrotransposons may display a similar preference for RFBs. The Saccharomyces cerevisiae LTR retrotransposons Ty1 (Copia group) and Ty3 (Gypsy group) insert upstream of RNA Pol III–transcribed genes like tRNA and 5S (22–25), which are confirmed RFBs (26) (fig. S10). Similarly, other LTR retrotransposons with an insertion preference for heterochromatin might use fork stalling at satellite repeats in pericentromeric DNA (27).

Supplementary Materials

Figs. S1 to S10

Materials and Methods

Tables S1 to S3

References (28–39)

References and Notes

Acknowledgments: We are indebted to H. Levin for strains, reagents, and technical advice as well as critical discussions. This work was supported by a Searle Scholar Award and NIH grant 1R01GM105831 (to M.Z.) and a Rutgers Biotechnology Training Program Grant 5T32GM008339 (to J.Z.J.). The high-throughput sequencing data reported in this paper are archived at the Gene Expression Omnibus repository under accession no. GSE67692. The authors declare no financial conflicts of interest.