Abstract
Gene marking with replication-defective retroviral vectors has been used for more than 20 years to track the in vivo fate of cell clones. We demonstrate that retroviral integrations themselves may trigger nonmalignant clonal expansion in murine long-term hematopoiesis. All 29 insertions recovered from clones dominating in serially transplanted recipients affected loci with an established or potential role in the self-renewal or survival of hematopoietic stem cells. Transcriptional dysregulation occurred in all 12 insertion sites analyzed. These findings have major implications for diagnostic gene marking and the discovery of genes regulating stem cell turnover.
Because of their unspecific insertion properties, replication-defective retroviral vectors represent unique tools for genetic marking studies, enabling the numbers and progeny of transplanted (stem) cells to be determined (1, 2). However, retrovirally marked cells can only be expected to behave normally if vector insertion itself does not confer a selective advantage or disadvantage.
Most marking studies published to date have used vectors based on murine leukemia virus (MLV), which is a simple gammaretrovirus with strong enhancer-promoters in the long terminal repeats (LTRs). In replication-competent retroviruses, these sequences may trigger up-regulation of randomly hit proto-oncogenes (3, 4). Given the multistep nature of oncogenesis (5), the accumulation of several such events in single cells potentially results in malignant transformation (3). Retrovirally marked genes that have been repeatedly associated with experimentally induced murine tumors have been summarized in a database of ∼300 common insertion sites (CISs), representing proposed or established cellular proto-oncogenes (4). Besides these, there may be many other genes with the potential to influence stem cell kinetics when up-regulated or disrupted. The identification of such genes is of great interest for regenerative medicine, as exemplified by the powerful effect of Hoxb4 on the self-renewal of hematopoietic stem cells (HSCs) (6).
In light of the recent demonstration that MLV vectors preferentially target the promoter regions of active genes (7), proto-oncogene activation might be more frequent (>1% of all insertions) than previously anticipated (8). However, in the numerous studies performed with retroviral vectors in human subjects, nonhuman primates, dogs, and mice (9), few cases have been identified in which malignant complications were caused by insertional mutagenesis. We previously observed an acute myeloid leukemia after insertion of a vector expressing an engineered low-affinity nerve growth factor receptor (dLNGFR) into the Evi1 proto-oncogene (10). Evi1 encodes a transcription factor with a role in both self-renewal and transformation of HSCs (11). Because dLNGFR was not associated with side effects in other studies (12), it remained unclear whether vector insertion into Evi1 was sufficient for leukemogenesis in mice. Similarly, unusual lymphatic leukemias occurred after insertional up-regulation of the LMO2 proto-oncogene by a vector expressing the interleukin-2 receptor common γ-chain, in a clinical trial performed to correct inborn severe combined immunodeficiency (13). We have also shown that leukemias reproducibly evolve in mice if vector insertions occur in independent proto-oncogenes of a single clone (14), consistent with the hypothesis that at least two collaborating signal alterations are required for leukemogenesis (15).
To examine the effect of retroviral gene hits in normal HSCs, we analyzed cohorts of healthy mice in which a single or very few clones dominated hematopoiesis after serial bone marrow transplantation (BMT) (16, 17) (Fig. 1A). The vectors expressed either a signal-deficient variant of human CD34 (tCD34) or its full-length variant (flCD34); there was no evidence of a selective advantage related to transgene expression (16, 18). Using ligation-mediated polymerase chain reaction (LM-PCR), which introduces a bias for dominant clones (19, 20), we detected several clones contributing to hematopoiesis in each transplant recipient of the first cohort (after 28 weeks observation) (Fig. 1B). We mapped 22 different insertions in the six primary recipients of flCD34-marked cells (mice fl1 to fl6) and 19 insertions in the six primary recipients of tCD34-marked cells (mice t1 to t6) (Table 1 and table S1).
Fig. 1. Gene marking studies and LM-PCR. (A) Retroviral vectors expressed tCD34, flCD34, or dsRED. Waves indicate flanking chromosomal DNA, and arrows represent LM-PCR primers. Mice were observed during serial BMT, with detailed analysis of transgene expression and histopathology. ψ, retroviral packaging signal; MACS, magnetic cell sorting. (B) LM-PCR gel showing insertions recovered from primary (1°) and secondary (2°) recipients of cells marked with tCD34. Arrows indicate internal control bands. M, molecular weight marker; H, H2O control.
Table 1. Insertion sites recovered by LM-PCR from long-term repopulated primary and secondary recipients. Twenty-two other insertion sites were from other or unknown gene classes (table S1). Genome hits and sequence codes (Gene ID) are according to the National Center for Biotechnology Information (NCBI) mouse genome database (frozen December 2004). Underlined loci recovered from secondary recipients are listed in an HSC transcriptome database (26). Insertions are defined with respect to the transcriptional start sites (TSS, the mRNA start according to NCBI) of neighboring genes. In total, we analyzed 69 locus insertions. Six of the 25 insertions for tCD34, 1 of the 23 insertions for flCD34, and 10 of the 21 insertions for dsRED were newly detected in secondary recipients. t1 to t6 represent tCD34-marked primary mice 1 to 6; fl1 to fl6, flCD34-marked primary mice 1 to 6; r1 to r4, dsRED-marked primary mice 1 to 4; t.MS1 to t.MS6, tCD34-marked secondary recipients 1 to 6 of magnetically sorted cells; t.NS1 to t.NS5, tCD34-marked secondary recipients 1 to 5 of nonselected cells; fl.MS1 to fl.MS5, flCD34-marked secondary recipients 1 to 5 of magnetically sorted cells; r1.1, r1.2, etc., secondary recipients of r1, etc. (17); F, forward (with respect to the gene's transcriptional direction); R, reverse. TF, transcription factor; ND, not detected (below the sensitivity of LM-PCR); GF, growth factor; Znf, zinc finger. Refer to the NCBI database for gene name abbreviations.
Locus | Gene ID | Chromosome | Definition or (proposed) function | Position to TSS (intron) | Orientation | Recipients, 1° BMT | Recipients, 2° BMT |
---|---|---|---|---|---|---|---|
CIS/proto-oncogenes (n = 13) | | | | | | | |
Ccnd3 | 12445 | 17B4 | cyclin D3, protein kinase activity | 1629(1) | R | ND | t.MS1-t.MS6 |
Elk4 | 13714 | 1E3-G | ELK4, TF of ETS family | -1669 | F | t6 | ND |
Evi1 (hit #1) | 14013 | 3A3 | Znf TF, murine and human leukemia | -6571 | F | t1 | ND |
Evi1 (hit #2) | 14013 | 3A3 | Znf TF, murine and human leukemia | -155802 | R | t5 | ND |
Evi1 (hit #3) | 14013 | 3A3 | Znf TF, murine and human leukemia | -149613 | R | t6 | ND |
Evi1 (hit #4) | 14013 | 3A3 | Znf TF, murine and human leukemia | -11311 | F | ND | t.NS1-t.NS5 |
Evi1 (hit #5) | 14013 | 3A3 | Znf TF, murine and human leukemia | -547816 | F | ND | fl.MS1 |
Bcl11a | 14025 | 11A3.2 | Znf TF essential for lymphopoiesis | 167714 | F | r3 | r3.1 |
BC031781/Lefty2 | 208768 | 1H4 | unknown function/morphogen | -1388/-44275 | R/R | ND | r1.2 |
Hoxa7 | 15404 | 6B3 | homeobox A7 TF | -986 | R | ND | r2.1, r2.2 |
Pdcd1lg1 | 60533 | 19C2 | programmed cell death 1 ligand 1 | 34046 | R | r3 | ND |
Runx2 | 12393 | 17B3 | runt TF, essential for hematopoiesis | 141712(4) | R | r3 | ND |
Scl=Tal1 | 21349 | 4D1 | Ebox TF, essential for hematopoiesis | -2012 | F | ND | r4.1, r4.2 |
Signaling genes (n = 34) | | | | | | | |
Atf6 | 22664 | 1H3 | activating TF 6 | 136053(21) | R | t5 | ND |
Ccl27 | 20301 | 4A5 | chemokine ligand 27, cytokine | 3688 | R | t6 | ND |
Cnn2 | 12798 | 10C1 | calponin 2, calmodulin binding | 373(1) | R | t1 | ND |
Map3k14 | 53859 | 11E1 | mitogen-activated protein kinase 14 | 20716(6) | R | t5 | ND |
Igfbp4 | 16010 | 11D | insulin-like GF binding protein 4 | -1010 | R | t2 | ND |
LOC381899 | 381899 | 7E1 | similar to adaptor molecule Gab2 | -1365 | F | t6 | ND |
Socs5 | 56468 | 17E4 | suppressor of cytokine signaling 5 | 87655 | R | t3 | ND |
Srebf2 | 20788 | 15E1 | sterol regulatory element binding factor 2 | 7652(1) | F | t3 | ND |
Hoxb5/Hoxb4 | 15413 | 11D | homeobox B5 and B4 TF | 8555/-6630 | F/F | t6 | t.NS1, t.MS2-t.MS6 |
Ly78 | 17079 | 13D1 | lymphocyte antigen 78, receptor | -26986 | F | ND | t.MS1-t.MS6 |
Pip5k2a | 18718 | 2A3 | PI-4-phosphate 5-kinase, II alpha | 46080(1) | R | ND | t.MS3 |
S100a3 cluster | 20197 | 3F1 | S100 calcium binding protein A3 | -13251 | F | ND | t.MS1, t.MS5 |
5832424M12 | 218503 | 13D1 | hypothetical cell cycle-associated protein | 6242(1) | R | ND | t.NS3, t.NS4, t.MS6 |
Dapk1 | 69635 | 13B2 | death-associated protein kinase 1 | 71753(1) | F | fl5 | ND |
Shb | 230126 | 4B1 | Src homology 2 adaptor protein B | 43028(2) | F | fl1 | ND |
Stxbp4 | 20913 | 11C | insulin receptor signaling pathway | 84916(11) | R | fl4 | ND |
Irf2bp1 | 272359 | 7A2 | interferon regulatory factor 2 binding | -1335 | R | fl2 | fl.MS1 |
Cflar=c-Flip | 12633 | 1C1.3 | CASP8/FADD-like apoptosis regulation | -15374 | R | fl5 | fl.MS2 |
Map3k5 locus | 26408 | 10A3 | mitogen-activated protein kinase 5 | 4122(1) | F | fl2, fl6 | fl.MS1-fl.MS5 |
Vegfa | 22339 | 17B3 | vascular endothelial growth factor A | -14963 | F | fl2, fl6 | fl.MS1-fl.MS5 |
Abr | 109934 | 11B5 | active BCR-related | 28918(2) | R | ND | r3.1 |
Btd | 26363 | 14A3 | biotinidase, involved in proliferation | 27328 | F | r4 | r4.1, r4.2 |
Crsp6 | 234959 | 9A3 | cofactor for TF Sp1 | -25697 | R | r1, r4 | r4.1, r4.3 |
Depdc2 | 76135 | 1A3 | DEP domain-containing 2 | 18431(1) | R | ND | r1.1 |
Dlx2 | 13392 | 2C3 | distal-less homeobox TF | -116230 | R | ND | r4.2 |
Ifnb1 | 15977 | 4C4 | interferon beta 1; cytokine | 72106 | R | r2 | r2.1, r2.2 |
Ing1 | 26356 | 8A2 | inhibitor of growth protein 1, TF | -1143 | F | ND | r1.2, r4.2 |
LOC433959 | 433959 | 5 | transcriptional domain-associated | -100466 | R | ND | r3.1 |
Mybbp1a | 18432 | 11B4 | Myb-binding protein | 46577 | R | r1, r4 | r1.1, r4.1, r4.3 |
Phemx | 27027 | 7F5 | absence enhances T cell proliferation | 2431(2) | F | r3 | ND |
Rtn4r | 65079 | 16 | nogo (reticulon 4) receptor | 34527 | R | ND | r1.2 |
Slfn2 | 20556 | 11C | schlafen 2, negative regulation of proliferation | 13391 | R | r4 | r4.2 |
Stk38 | 106504 | 17A3.3 | serine/threonine kinase 38 | 55555 | F | r2 | r2.1, r2.2 |
2410075D05Rik | 73681 | 10A4 | methyltransferase activity | -650 | R | ND | r1.2, r4.1, r4.2 |
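
The position and orientation columns of Table 1 can be read mechanically. The following is a minimal sketch, assuming only what the table legend and the text state: a negative offset lies upstream of the TSS, a number in parentheses is an intron, and F/R give the vector orientation relative to the gene. The `Insertion` class and the `describe` helper are hypothetical names introduced for illustration; they are not part of the study's methods.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Insertion:
    locus: str
    position_to_tss: int          # signed bp offset; negative = upstream of the TSS
    orientation: str              # "F" (forward) or "R" (reverse) relative to the gene
    intron: Optional[int] = None  # intron number, when Table 1 gives one in parentheses


def describe(ins: Insertion) -> str:
    """Turn one Table 1 row into a plain-language description."""
    side = "upstream" if ins.position_to_tss < 0 else "downstream"
    where = f"{abs(ins.position_to_tss)} bp {side} of TSS"
    if ins.intron is not None:
        where += f" (intron {ins.intron})"
    direction = {"F": "forward", "R": "reverse"}[ins.orientation]
    return f"{ins.locus}: {where}, {direction}"


# Three rows copied from Table 1.
rows = [
    Insertion("Evi1 (hit #1)", -6571, "F"),
    Insertion("Ccnd3", 1629, "R", intron=1),
    Insertion("Evi1 (hit #5)", -547816, "F"),
]

for row in rows:
    print(describe(row))
```

Keeping the offset signed preserves the table's convention and avoids a separate upstream/downstream column.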
In the tCD34 primary cohort, we observed four insertions in CIS/proto-oncogenes (Table 1), three of which occurred in Evi1 (Fig. 2A). We have seen a similar incidence of Evi1 insertions in primary recipients after retroviral expression of the human multidrug resistance 1 (MDR1) cDNA (14). Such a recovery of independent hits in identical loci, or pathways, strongly suggested in vivo selection (4). Another hit was located between Hoxb5 and Hoxb4 (Fig. 2B), marking a clone that we could detect even in two independent primary recipients by locus-specific PCR (Table 1 and table S2).
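The argument that repeated independent hits in one locus point to in vivo selection can be illustrated with a small tally. This is only a sketch under stated assumptions: the rows are a subset of the primary tCD34 entries in Table 1, distinct insertion positions are used as a stand-in for independent integration events, and the function name and the `min_independent` threshold are hypothetical.

```python
from collections import defaultdict

# (locus, signed position to TSS, primary recipient), copied from Table 1.
primary_tcd34_hits = [
    ("Elk4", -1669, "t6"),
    ("Evi1", -6571, "t1"),
    ("Evi1", -155802, "t5"),
    ("Evi1", -149613, "t6"),
    ("Hoxb5/Hoxb4", 8555, "t6"),
]


def recurrent_loci(hits, min_independent=2):
    """Return loci that carry at least `min_independent` distinct insertion positions."""
    positions = defaultdict(set)
    for locus, position, _recipient in hits:
        positions[locus].add(position)
    return {
        locus: sorted(found)
        for locus, found in positions.items()
        if len(found) >= min_independent
    }


print(recurrent_loci(primary_tcd34_hits))
```

With these rows, only Evi1 passes the threshold, in line with the three Evi1 insertions reported for this cohort. A real analysis would also need a background model of insertion frequency, which this sketch does not attempt.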
Fig. 2. Selected insertion sites. (A) Distribution of Evi1 insertions. F and R represent forward and reverse orientations of the vector, respectively; E1 to E3 represent exons 1 to 3. Insertion coordinates are given based on the first start codon (ATG) of the NCBI database (June 2004). Exon definition follows the ENSEMBL database (April 2005). The asterisk indicates an insertion from the leukemia associated with the use of a dLNGFR vector; in 2002, this insertion was annotated to exon 1 according to GenBank record M64494 (10). bp, base pairs. (B) Insertion within the Hoxb cluster. Schematic representation of the retroviral vector insertion in the Hoxb cluster, with distance to the transcriptional start sites of neighboring Hoxb genes. Transcripts and their expression levels are indicated above the scheme of the locus. SD, splice donor.
In the flCD34 primary cohort, LM-PCR showed no insertions in CIS/proto-oncogenes and also fewer insertions within or close to signaling genes (Table 1 and table S1). Among the latter, Map3k5 is highly related to Map3k14 detected in primary tCD34 recipients, and both Stxbp4 (flCD34) and Igfbp4 (tCD34) map to insulin pathways. Even in this small set of dominant clones from primary flCD34 recipients, we found two hits belonging to the same growth factor pathway (Vegfa, Shb) (21). Vascular endothelial growth factor encoded by Vegfa enhances HSC self-renewal (22), and again, the affected clone was detected in two independent primary recipients (Table 1).
To further determine the fate of marked clones, we examined 16 secondary recipients (11 for tCD34, 5 for flCD34) observed for 22 weeks after receiving bone marrow cells pooled from the primary recipients (Fig. 1A). This analysis thus focused on true HSCs, characterized by serial repopulation activity. Two of the three secondary groups received cells after magnetic sorting (MS) for expression of vector-encoded tCD34 or flCD34, resulting in a high frequency of transgene expression in vivo (16). Table S2 summarizes transgene expression and clonal distribution over serial BMT.
In the tCD34 MS group (t.MS), we detected only five clones by Southern blot (16). LM-PCR, a more sensitive method (19), detected six distinct insertions (Fig. 1B). All animals harbored the clone with the insertion between Hoxb5 and Hoxb4 (Table 1 and table S2). To date, Hoxb4 is the best investigated of the transcription factors that, when up-regulated, promote self-renewal of HSCs without malignant transformation (6). Hoxb5 is closely related but differently regulated by enhancer elements located between these two genes (23). Although we detected a fusion transcript between the upstream LTR and the second exon of Hoxb4 (17), real-time reverse transcription (RT)–PCR revealed no substantial change of wild-type Hoxb4 transcripts (Figs. 2B and 3). The upstream primer of this real-time RT-PCR was located in the first (coding) exon, thus ignoring the fusion transcript. When examining neighboring genes of the Hoxb cluster, we found that Hoxb5 was up-regulated by 65.3 times and there was a mild induction of Hoxb3 (Figs. 2B and 3) (17).
Fig. 3. Transcriptional dysregulation of targeted loci. Real-time RT-PCR shows dysregulation of targeted alleles in selected recipients. Each analysis was performed in triplicate. Error bars indicate standard deviations between independent determinations. RNA from normal hematopoietic cells served as controls (RNA level = 1). With the exception of the dLNGFR leukemia (Evi1 dLNGFR), RNAs were from uncloned cells, potentially underreporting transcriptional changes.
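Relative RNA levels of the kind plotted here (control = 1, triplicate determinations, standard deviation as error) can be computed along the following lines. The sketch assumes the widely used 2^-ΔΔCt scheme with a reference gene; the legend does not state which quantification method was used, so this is an illustration rather than the study's procedure. All Ct values below are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def fold_change(target_ct, reference_ct, control_target_ct, control_reference_ct):
    """Relative expression by 2^-ddCt; the normal-cell control is defined as 1."""
    delta_sample = target_ct - reference_ct
    delta_control = control_target_ct - control_reference_ct
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical (target Ct, reference Ct) pairs for three determinations.
replicates = [
    (24.0, 18.0),
    (24.0, 18.0),
    (23.0, 18.0),
]
# Hypothetical Ct pair for RNA from normal hematopoietic cells.
control = (26.0, 18.0)

values = [fold_change(target, reference, *control) for target, reference in replicates]
print("mean fold change:", mean(values))
print("standard deviation:", stdev(values))
```

Because the control is run through the same function, it evaluates to exactly 1, matching the convention "RNA level = 1" in the legend.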
All remaining insertions obtained from secondary recipients of tCD34-transduced cells also affected potential growth-regulatory genes (Table 1 and table S1). One dominant clone in this group showed another insertion into Evi1 (Evi1#4), strongly (by 876.1 times) up-regulating the transcript (Figs. 2A and 3). Although the RNA used for RT-PCR studies was not prepared from recloned cells, Evi1#4 and the cloned dLNGFR-associated leukemia showed similar high levels of up-regulation (Fig. 3). In contrast, the three clones with different Evi1 insertions detected in primary recipients were counterselected in serially transplanted animals (Table 1) (17). Like Evi1#4, a clone with an insertion in Ccnd3 (Table 1) also showed delayed dominance. Because this insertion was located in the first intron with reverse orientation, it was no surprise to find the transcript between exons 1 and 2 down-regulated (Fig. 3). Repression of Ccnd3, one of the D cyclins that are essential for hematopoiesis, may occur in leukemias (24, 25). The other loci recovered from secondary recipients of tCD34-marked cells were all up-regulated (Ly78, Pip5k2a, S100a3, and 5832424M12) (Fig. 3).
Southern blot data of the five recipients of MS cells expressing flCD34 (fl.MS) revealed only six different clones (16). As in the tCD34 cohorts, one clone with delayed dominance had an Evi1 insertion, Evi1#5 (Table 1). This occurred in the neighboring Mds1 locus and thus as far as 548 kb upstream of Evi1's first exon, nevertheless resulting in its up-regulation (by 5.8 times) (Figs. 2A and 3) (17). In all secondary mice, we recovered the clone with the Vegfa insertion (Table 1), also associated with up-regulation (by 3.6 times) (Fig. 3) (17). As in the primary recipients, this was associated with a hit upstream of Map3k5, suggesting clonal linkage (Table 1 and table S2). Furthermore, two additional clones had insertions in signaling genes, Cflar=c-Flip and Irf2bp1 (Table 1 and table S2). RT-PCR indicated down-regulation of these loci, which in the case of Cflar was consistent with its proapoptotic function (Fig. 3).
In all, using LM-PCR to analyze long-term repopulating hematopoietic cells marked with vectors encoding tCD34 or flCD34 (Table 1), we identified 48 different insertions that matched the murine genome, 41 in primary recipients and 7 additional hits only in secondaries. All of the 12 hits recovered from secondary recipients identified CIS/proto-oncogenes or other (predicted) signaling genes. While there were several clones with insertions in (and dysregulation of) Evi1 (Fig. 2A and table S3), none of the 21 clones with hits in nonsignaling genes or gene-free regions (51% of all insertions detected in tCD34 and flCD34 primary animals) (table S2) dominated in secondary recipients.
To test whether the vector-encoded transgenes derived from human CD34 were required for insertional selection, we examined cohorts of mice that received cells marked with a related vector expressing the dsRED fluorescent protein (Fig. 1A and Table 1). The results were similar. All insertions of clones dominating secondary recipients affected CIS/proto-oncogenes (Bcl11a, Hoxa7, BC031781, and Scl=Tal1) or other signaling genes. In a leukemic clone associated with the use of a retroviral vector encoding MDR1, we recently identified very similar insertions in Hoxa7 and Bcl11a (14). We chose Hoxa7 for RT-PCR analysis and verified transcriptional induction (by 3.5 times) (Fig. 3). Hoxa7 as well as 20 of the other 28 genes marked in secondary recipients of all cohorts (72.4%) are also found in a transcriptome database of highly enriched mouse HSCs (Table 1) (26, 27). This database also contains numerous nonsignaling genes that were not marked by vector insertions in dominant clones.
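The tallies quoted in the two preceding paragraphs and in the Table 1 legend can be cross-checked with plain arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Insertions mapped in primary recipients (flCD34 and tCD34 cohorts).
primary = {"flCD34": 22, "tCD34": 19}
# Insertions newly detected in secondary recipients (Table 1 legend).
secondary_only = {"tCD34": 6, "flCD34": 1}

primary_total = sum(primary.values())
overall_total = primary_total + sum(secondary_only.values())

# Hits in nonsignaling genes or gene-free regions among primary insertions.
nonsignaling_primary = 21
share_nonsignaling = 100 * nonsignaling_primary / primary_total

# Genes marked in secondary recipients of all cohorts: Hoxa7 plus 28 others,
# of which Hoxa7 and 20 others appear in the HSC transcriptome database.
secondary_genes = 1 + 28
in_hsc_database = 1 + 20
share_in_database = 100 * in_hsc_database / secondary_genes

print(primary_total, overall_total)
print(round(share_nonsignaling), round(share_in_database, 1))
```

The results agree with the figures in the text: 41 primary and 48 total insertions, about 51% nonsignaling hits among primary insertions, and 72.4% of secondary-recipient genes present in the transcriptome database.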
Although we would not question that the transgene product might be involved in the selection process (10, 14), it is safe to conclude that insertional mutagenesis gives rise to clonal imbalance in the context of different transgenes. Clonal dominance after retroviral gene marking of hematopoietic cells has been observed since the beginning of such studies in the mouse model (1, 27). Our exploration of the associated insertion sites strongly suggests a selection process in which preferential survival of long-term repopulating clones is triggered by insertional dysregulation of genes that enhance their “fitness,” without necessarily resulting in malignant transformation (17). Similar insertional effects may have contributed to the sequential activation of cell clones noted in earlier marking studies. Thus, clonal succession (1, 27) may not necessarily reflect a normal hematopoietic property.
Related marking studies performed in nonhuman primates so far have provided little evidence of clonal dominance, despite hits in potential or established growth-regulatory genes (28). This may reflect experimental and/or genetic differences. The latter may affect cell cycle regulation, apoptosis, and senescence, all known to be more complexly regulated in primates (5). Our mouse model may represent a situation of accelerated stress hematopoiesis that is not expected to occur in the majority of clinical applications of retroviral gene transfer. Thus, the insertional bias of replication-defective retroviral vectors for actively transcribed genes (7) may be exploited to literally mark genes and entire pathways involved in developmental decisions of defined cell populations, depending on a specific milieu. Consequently, we would suggest the use of vectors lacking strong enhancer-promoter elements for future clonal tracking studies.
Supporting Online Material
www.sciencemag.org/cgi/content/full/308/5725/1171/DC1
Materials and Methods
SOM Text
Fig. S1
Tables S1 to S3
References and Notes