A Transposon-Based Genetic Screen in Mice Identifies Genes Altered in Colorectal Cancer

Science  27 Mar 2009:
Vol. 323, Issue 5922, pp. 1747-1750
DOI: 10.1126/science.1163040


Human colorectal cancers (CRCs) display a large number of genetic and epigenetic alterations, some of which are causally involved in tumorigenesis (drivers) and others that have little functional impact (passengers). To help distinguish between these two classes of alterations, we used a transposon-based genetic screen in mice to identify candidate genes for CRC. Mice harboring mutagenic Sleeping Beauty (SB) transposons were crossed with mice expressing SB transposase in gastrointestinal tract epithelium. Most of the offspring developed intestinal lesions, including intraepithelial neoplasia, adenomas, and adenocarcinomas. Analysis of over 16,000 transposon insertions identified 77 candidate CRC genes, 60 of which are mutated and/or dysregulated in human CRC and thus are most likely to drive tumorigenesis. These genes include APC, PTEN, and SMAD4. The screen also identified 17 candidate genes that had not previously been implicated in CRC, including POLI, PTPRK, and RSPO2.

Recent genomic studies have revealed that human colorectal cancers (CRCs) undergo numerous genetic and epigenetic alterations (1–4). These alterations probably derive from a mixture of “drivers” that play a causal role in tumor formation and progression and “passengers” that have little or no effect on tumor growth. The design of targeted therapeutics for CRCs depends on the ability to distinguish drivers from passengers.

To help identify potential driver genes in CRC, we developed a forward genetic screen in mice by using a Sleeping Beauty (SB) system to generate insertional mutations. To confine transposition to the gastrointestinal tract, SB11 transposase cDNA, preceded by a LoxP-flanked stop cassette, was knocked into the Rosa26 locus (fig. S1) (5). These mice were then crossed with Villin-Cre transgenic mice to activate SB transposase in epithelial cells of the gastrointestinal tract (6). Once expressed, SB transposase catalyzed the transposition of T2/Onc, a mutagenic SB transposon (Fig. 1A) (7). T2/Onc contains a murine stem-cell virus long terminal repeat and splice donor site (MSCV-LTR-SD), which can deregulate the expression of a nearby proto-oncogene. T2/Onc also carries splice acceptor sites in both DNA strands and a bidirectional polyadenylate signal, which can inactivate the expression of a tumor suppressor gene. Because SB transposition is biased toward reintegration of the transposon into the same chromosome as the donor transposon (a phenomenon referred to as “local hopping”), we used two T2/Onc transgenic lines that each carried approximately 25 copies of the T2/Onc transposon in a concatamer on different donor chromosomes (chrs 1 and 15) (7).

Fig. 1.

Triple transgenic mice develop intestinal tumors and become moribund faster than double transgenic controls. (A) Breeding scheme for generating triple transgenics. Vil, Villin promoter; RLS, Rosa26-Lox-stop-Lox-Sleeping Beauty 11; Tg, transgenic. (B) Kaplan-Meier survival curve comparing triple transgenics with double transgenic controls. (C to E) Photomicrographs of hematoxylin- and eosin-stained representative tissue showing an adenoma (C); an adenocarcinoma (D), the arrow pointing to a cluster of glands that has invaded through the serosa, and gastrointestinal intraepithelial neoplasia (E), the arrow indicating a cluster of dysplastic glands accompanied by fusion of villi. Scale bars, 250 μm in (C), 500 μm in (D), and 100 μm in (E).

A histochemical analysis of the triple transgenic mice (Rosa26-LsL-SB11, T2/Onc, and Villin-Cre) showed that SB transposase was strongly expressed in epithelial cells of the gut and pancreas but undetectable in other tissues (fig. S2). We created a cohort of 28 triple transgenic mice and 72 double transgenic control mice carrying all possible dual combinations of the three transgenes. Mice in this first cohort were monitored daily for 18 months. We generated a second cohort of 50 triple transgenic mice that were maintained in a separate facility for 12 months and also monitored daily.

Triple transgenic mice died at a faster rate than double transgenic controls, beginning around one year of age (Fig. 1B). Examination of the gastrointestinal tract of moribund animals revealed discrete raised lesions ranging from 2 mm to as large as 5 mm in diameter in the small and large intestine. In the first cohort, 100% (12 out of 12) of the experimental mice that died before 18 months harbored intestinal lesions (table S1), whereas none of the control mice killed before 18 months had lesions. In the second cohort, 72% (36 out of 50) of triple transgenic mice had intestinal lesions, with an average of 1.9 lesions in the small intestine and 0.2 lesions in the large intestine.

We performed histopathologic analyses of tumor tissue sections from 11 mice. These analyses identified 39 and 16 intraepithelial neoplasias, 50 and 15 adenomas, and 3 and 0 adenocarcinomas in the small and large intestines, respectively (Fig. 1, C to E). An additional adenocarcinoma was identified whose site of origin was undetermined. We also selected six large tumors (≥5 mm) from six additional mice and found that three were adenocarcinomas and three were adenomas.

For use in DNA isolation and sequencing experiments, we harvested 135 tumors: 42 tumors from 11 of the triple transgenic mice from the first cohort (data set 1) and 93 tumors from 36 mice in the second cohort (data set 2). The majority of the tumors were small, and the entire tumor was used for DNA isolation. This prevented us from performing histological analysis and from linking the molecular data to the histopathology of specific tumors. However, given the distribution of frank intestinal lesions from the histopathological analysis, the majority of tumors were likely to be adenomas. We then performed linker-mediated polymerase chain reaction (PCR) on DNA from these 135 harvested tumors in order to generate PCR products containing transposon-genomic junction fragments. We sequenced over 195,000 of these PCR products, of which 99,624 could be specifically mapped to thymine-adenine (TA) dinucleotides in the mouse genome, which is consistent with SB insertion-site requirements. After combining duplicate insertions within a given tumor, we found that 45% of the insertions mapped to the same chromosome as the donor concatamer (chr 1 or 15), which is consistent with the local hopping seen in previous SB screens. We removed these insertions from further analysis to eliminate statistical bias due to local hopping. We also eliminated insertions mapping to the same precise TA dinucleotide in tumors from two or more different mice because these insertions could represent a PCR artifact. The final total of 16,690 mapped, nonredundant genomic loci (table S2) equates to an average of 124 mapped insertions per tumor.
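The three filtering steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the record layout (mouse, tumor, chromosome, position), the `donor_chrom_by_mouse` mapping, and the function name are hypothetical, and it assumes that donor-chromosome insertions are dropped per mouse according to which T2/Onc line that mouse carries.

```python
from collections import defaultdict

def filter_insertions(reads, donor_chrom_by_mouse):
    """Return nonredundant insertions as sorted (tumor, chrom, pos) tuples.

    reads: iterable of (mouse, tumor, chrom, pos) tuples (hypothetical layout).
    donor_chrom_by_mouse: mouse -> chromosome carrying its donor concatamer.
    """
    # Step 1: collapse duplicate reads of the same site within a tumor.
    unique = set(reads)
    # Step 2: drop insertions on the mouse's donor chromosome (local hopping).
    off_donor = [r for r in unique if r[2] != donor_chrom_by_mouse[r[0]]]
    # Step 3: drop sites hit at the identical position in two or more mice,
    # since these could be PCR artifacts.
    mice_per_site = defaultdict(set)
    for mouse, _tumor, chrom, pos in off_donor:
        mice_per_site[(chrom, pos)].add(mouse)
    return sorted(
        (tumor, chrom, pos)
        for _mouse, tumor, chrom, pos in off_donor
        if len(mice_per_site[(chrom, pos)]) == 1
    )

# Toy data (invented for illustration only).
donors = {"m1": "chr1", "m2": "chr15"}
reads = [
    ("m1", "t1", "chr4", 100),
    ("m1", "t1", "chr4", 100),   # duplicate read within the same tumor
    ("m1", "t1", "chr1", 500),   # on m1's donor chromosome
    ("m1", "t2", "chr7", 900),
    ("m2", "t3", "chr7", 900),   # identical site in a second mouse
    ("m2", "t3", "chr15", 7),    # on m2's donor chromosome
    ("m2", "t3", "chr9", 42),
]
filtered = filter_insertions(reads, donors)

# Average reported in the text: 16,690 insertions over 135 tumors.
mean_per_tumor = 16_690 / 135
```

In the toy data only the chr4 and chr9 insertions survive all three filters. The last line reproduces the reported average: 16,690 / 135 rounds to 124 insertions per tumor.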

To define common insertion sites (CISs), we performed Monte Carlo simulations using randomly assigned insertions. Genomic window sizes were chosen on the basis of simulations that used the same number of insertions as the data sets so that one would not expect to find a single CIS after randomly distributing transposon insertions throughout the genome (expected value E < 1). For example, in a random assignment of 16,690 insertions we would not expect to find a single cluster of five insertions within 25 kb, six insertions within 50 kb, seven insertions within 80 kb, and so on (8). Any cluster of insertions meeting or exceeding these parameters was defined as a CIS. We removed one CIS from this list because it was composed entirely of insertions from a single mouse, indicating that those tumors may be related.
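The idea behind the window-size simulation can be illustrated with a simplified sketch. It makes assumptions that the text does not: the genome is treated as one linear sequence of `genome_len` base pairs, insertions land uniformly at any position rather than only at TA dinucleotides, and clusters are counted greedily without overlap. The function names and parameters are hypothetical; the actual procedure is described in the supporting material (8).

```python
import random

def count_clusters(positions, k, window):
    """Count non-overlapping runs of k insertions spanning <= window bp."""
    positions = sorted(positions)
    count, i = 0, 0
    while i + k <= len(positions):
        if positions[i + k - 1] - positions[i] <= window:
            count += 1
            i += k  # skip past the insertions used by this cluster
        else:
            i += 1
    return count

def expected_clusters(n_insertions, genome_len, k, window, trials, seed=0):
    """Monte Carlo estimate of the mean number of chance clusters."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Scatter insertions uniformly at random (a simplifying assumption).
        pos = [rng.randrange(genome_len) for _ in range(n_insertions)]
        total += count_clusters(pos, k, window)
    return total / trials
```

A call such as `expected_clusters(16_690, 2_600_000_000, k=5, window=25_000, trials=100)` estimates E for the five-insertions-in-25-kb criterion; the genome length here is a rough placeholder, and a window is acceptable under the rule in the text when the estimate stays below 1.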

As a final control, we amplified and sequenced 15,556 SB insertions present in tail DNA derived from 89 double transgenic weanling mice. These mice contained a ubiquitously expressed SB11 transposase transgene and the T2/Onc transposon concatamers (7, 9). Because there was no selection pressure for tumor outgrowth in these mice and because SB integration does not have a strong bias for any individual TA dinucleotide (10), we expected the insertions to be distributed randomly throughout the genome, except for local hopping. From this control data set, we identified six CISs. This was more than expected but considerably less than was observed in the tumor DNA (table S3). These CISs could be previously unknown hotspots for transposon integration. Alternatively, they could reflect incipient clonal neoplastic growth because these genetically manipulated mice eventually develop lymphoma (9). Two of these six CISs were also identified in the tumor data sets; thus, they were eliminated from the list of tumor CISs, which left us with 77 CISs (table S4).

Candidate genes were assigned to the 77 CISs according to the percentage of insertions in or near a gene within the CIS boundaries. Most insertions were located within introns (51%); only 2% were in an exon, and the remaining 47% were either upstream or downstream of a coding region. The top 10 CISs, as ranked by the number of insertions found, are listed in Table 1.

Table 1.

Top 10 CIS candidate genes, ranked according to the number of distinct insertions defining the CIS.

The goal of this study was to identify genes that are drivers of tumorigenesis in order to identify new candidate genes whose mutational status in human CRC can then be tested. We compared our list of mouse CIS genes with the human genes listed in the Catalog of Somatic Mutations in Cancer (COSMIC) database (11). Among our list of 77 CIS genes, 38 have human homologs present in the COSMIC database, and 18 (47%) of these 38 homologs have documented nonsilent mutations in human cancers (table S5), which would not be expected by chance (P < 0.05). Furthermore, if we limit our analysis of COSMIC to genes mutated in human CRC, the overlap has a lower probability of being due to chance (P < 0.001) (8).

Similarly, our CIS list overlaps with a recent large-scale exon-resequencing project that cataloged mutations in 18,191 human genes in 11 colorectal tumor samples (1). That project identified 848 human gene mutations, 140 of which were considered likely to be driver mutations for CRC. Of the 77 CIS mouse genes identified in our study, 74 have human homologs that were included in the exon-resequencing study. Among these 74 homologs, 10 had a mutation and four were identified as candidate driver mutations in human CRC (table S6), making these findings highly significant (P < 0.005) (8).
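One common way to judge whether such an overlap exceeds chance is a hypergeometric tail probability. The sketch below applies it to the counts given above (18,191 genes screened, 140 candidate drivers, 74 CIS homologs, 4 shared). This is an assumption made for illustration: the statistical test actually used is described in the supporting material (8) and may differ, and the function name is hypothetical.

```python
from math import comb

def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` are candidate drivers."""
    upper = min(successes, draws)
    favorable = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(observed, upper + 1)
    )
    return favorable / comb(population, draws)

# Counts taken from the text; the choice of test is an assumption.
p_overlap = hypergeom_tail(population=18_191, successes=140, draws=74, observed=4)
```

By chance alone, about 74 × 140 / 18,191 ≈ 0.6 of the CIS homologs would be expected among the candidate drivers, so observing four is unlikely; under these assumptions the tail probability falls below the 0.005 threshold quoted in the text.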

We then investigated whether the human homologs of the mouse CIS genes were amplified or deleted in human CRC. We examined a data set that identified 482 deletions and 224 amplifications in human CRCs (8). The human homologs of 10 CIS genes were located in deleted regions, and 23 were located in amplified regions (tables S7 and S8). This represents a significant overlap (P < 0.05) (8) and suggests that the candidates found in this screen are relevant to human CRC.

Finally, we analyzed cDNA microarray data to determine whether CIS genes were differentially expressed in human CRC versus normal colonic tissue. The Oncomine database (12) contains five microarray data sets that compare gene expression levels in 138 CRCs and 88 normal samples. Of the 77 CIS human homologs, 50 were identified as being differentially regulated (P < 0.05) in one or more of these studies (table S9).

By comparing our list of mouse CIS genes with human genes that are (i) mutated in CRC, (ii) listed in COSMIC, (iii) amplified or deleted in CRC, (iv) aberrantly expressed in CRC, or (v) known cancer genes identified by the Cancer Genome Project (CGP) (13), we identified 15 CIS genes that are the most likely to be drivers in human CRC (Table 2) by virtue of being present in at least three of the five categories listed above. Among these 15 genes is adenomatous polyposis coli (Apc), a member of the Wnt signaling pathway and the most commonly mutated gene in human CRC (70 to 80%) (14). Also included in this list are bone morphogenetic protein receptor, type IA (Bmpr1a), Smad4, and phosphatase and tensin homolog (Pten), which are responsible for juvenile polyposis syndrome, juvenile intestinal polyposis, and Cowden disease, respectively. Another gene on the list, F-box and WD repeat domain–containing 7 (Fbxw7), encodes a component of the SKP1/cullin/F-box protein (SCF) ubiquitin ligase complex and is mutated in 11.5% of human CRCs (15). Thus, of the 15 prioritized genes in our study, five are validated human CRC genes and together represent some of the most commonly mutated genes identified in human CRC.
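The three-of-five rule used for this prioritization amounts to a simple set-membership count. The sketch below is illustrative only: the category labels paraphrase the five criteria, and the gene names and evidence sets are invented placeholders rather than data from the study.

```python
# Labels paraphrasing criteria (i) to (v) in the text.
CATEGORIES = frozenset({
    "mutated_in_crc",
    "listed_in_cosmic",
    "amplified_or_deleted_in_crc",
    "aberrantly_expressed_in_crc",
    "cgp_cancer_gene",
})

def prioritize(evidence, min_categories=3):
    """Return genes supported by at least `min_categories` of the criteria."""
    return sorted(
        gene for gene, cats in evidence.items()
        if len(CATEGORIES & set(cats)) >= min_categories
    )

# Placeholder genes and evidence (hypothetical, not study data).
toy_evidence = {
    "gene_a": {"mutated_in_crc", "listed_in_cosmic", "cgp_cancer_gene"},
    "gene_b": {"aberrantly_expressed_in_crc"},
    "gene_c": {"listed_in_cosmic", "amplified_or_deleted_in_crc",
               "aberrantly_expressed_in_crc", "cgp_cancer_gene"},
}
shortlist = prioritize(toy_evidence)
```

Here `gene_a` and `gene_c` meet the threshold and `gene_b` does not. Raising `min_categories` would shorten the list at the cost of excluding genes with less prior evidence.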

Table 2.

Candidate CIS genes likely to be drivers of human CRC.

Three other genes on the complete CIS list (table S4) are also implicated in human CRC: cyclin-dependent kinase 8 (CDK8), mutated in colorectal cancers (MCC), and Staphylococcal nuclease and tudor domain–containing 1 (SND1). CDK8, which encodes cell division protein kinase 8, is commonly amplified in human CRC and plays a direct role in β-catenin–driven cell transformation (16). MCC encodes the colorectal mutant cancer protein. In addition to earlier reports of somatic mutations in MCC (17), a recent study found that 50% of primary CRCs exhibited MCC-promoter methylation (18). Furthermore, MCC interacts with β-catenin, and its reexpression in CRC cells inhibits Wnt signaling and proliferation, suggesting that MCC is a tumor suppressor (19). SND1, a component of the RNA-induced silencing complex, is highly expressed in CRC, and its overexpression in rat epithelial cells leads to a loss of contact inhibition and increased cell growth (20). SND1 overexpression leads to a down-regulation of APC protein, even though mRNA levels are unchanged.

In addition to identifying genes whose human homologs are known to be altered in cancer, our screen identified a number of other candidate CRC genes that could, on the basis of their function, be drivers of CRC. These candidate CRC genes include polymerase (DNA directed) iota (POLI), protein phosphatase 1 regulatory (inhibitor) subunit 13B (PPP1R13B), and R-spondin 2 (RSPO2), which affect DNA stability, p53-induced apoptosis, and Wnt signaling, respectively. The POLI protein, encoded by the POLI gene, is an error-prone DNA polymerase responsible for the high frequency of ultraviolet-induced mutations in xeroderma pigmentosum variant cells (21). The PPP1R13B protein, encoded by the PPP1R13B gene, enhances the ability of p53 to stimulate the expression of pro-apoptotic genes (22). RSPO2 is a member of a previously undiscovered family of Wnt-signaling regulators, the R-spondins (23). Finally, two microRNA genes not previously associated with CRC, microRNA 181b-2 (Mirn181b-2) and Mirn181a-2, reside within an intron of nuclear receptor subfamily 6, group A, member 1 (Nr6a1), one of the CIS genes identified in our screen. Both of these microRNAs are aberrantly expressed in CRC (24, 25) and inhibit glioma cell growth in vitro (26).

Our transposon-based forward genetic screen had some limitations. We believe the screen was unable to recapitulate the effect of certain activating point mutations, such as the Kras G12V mutation that is found in a large percentage of CRCs. In addition, random transposon insertions could potentially miss small genetic loci such as microRNAs. By design, the statistical method we used to define CISs and thereby identify likely candidate driver mutations ignores the majority of mapped transposon insertions, which occurred in only one or two tumors. These non-CIS insertions may also have contributed to carcinogenesis by creating CRC driver or cooperating mutations or by causing some other level of genomic instability.

Our transposon-mediated forward genetic screen in mice identified genetic mutations that lead to the development of an epithelial cancer. The discovery of a significant overlap of mouse candidate genes and human genes that are altered in cancer indicates that this mouse model will be useful for distinguishing between driver and passenger mutations. In addition, the large number of CISs uncovered in this screen affirms the hypothesis that the growth of human CRC is driven by a few commonly mutated genes and a much larger number of genes that are rarely mutated (1).

Supporting Online Material

Materials and Methods

Figs. S1 and S2

Tables S1 to S9

