Report

Identification of the Tuberous Sclerosis Gene TSC1 on Chromosome 9q34


Science  08 Aug 1997:
Vol. 277, Issue 5327, pp. 805-808
DOI: 10.1126/science.277.5327.805

Abstract

Tuberous sclerosis complex (TSC) is an autosomal dominant disorder characterized by the widespread development of distinctive tumors termed hamartomas. TSC-determining loci have been mapped to chromosomes 9q34 (TSC1) and 16p13 (TSC2). The TSC1 gene was identified from a 900-kilobase region containing at least 30 genes. The 8.6-kilobase TSC1 transcript is widely expressed and encodes a protein of 130 kilodaltons (hamartin) that has homology to a putative yeast protein of unknown function. Thirty-two distinct mutations were identified in TSC1, 30 of which were truncating, and a single mutation (2105delAAAG) was seen in six apparently unrelated patients. In one of these six, a somatic mutation in the wild-type allele was found in a TSC-associated renal carcinoma, which suggests that hamartin acts as a tumor suppressor.

TSC is a systemic disorder in which hamartomas occur in multiple organ systems, particularly the brain, skin, heart, lungs, and kidneys (1, 2). In addition to its distinct clinical presentation, two features serve to distinguish TSC from other familial tumor syndromes. First, the tumors that occur in TSC are very rare in the general population, such that several TSC lesions are, by themselves, diagnostic of TSC. Second, TSC hamartomas rarely progress to malignancy. Only renal cell carcinoma occurs at increased frequency in TSC (∼2.5%) and with earlier age of onset; it appears to arise in TSC renal hamartomas, termed angiomyolipomas (3). Nonetheless, TSC can be a devastating condition, as the cortical tubers (brain hamartomas) frequently cause epilepsy, mental retardation, autism, or attention deficit–hyperactive disorder, or a combination of these conditions (1, 4).

TSC affects about 1 in 6000 individuals, and ∼65% of cases are sporadic (5). Linkage of TSC to chromosome 9q34 was first reported in 1987, and this locus was denoted TSC1 (6). Later studies provided strong evidence for locus heterogeneity (7) and led to the identification of chromosome 16p13 as the site of a second TSC locus (denoted TSC2) (8). The TSC2 gene was identified by positional cloning, and the encoded protein, denoted tuberin, contains a domain near the COOH-terminus with homology to a guanosine triphosphatase (GTPase) activating protein (GAP) for rap1, a Ras-related GTPase (9).

The focal nature of TSC-associated hamartomas has suggested that TSC1 and TSC2 may function as tumor suppressor genes. The occurrence of inactivating germline mutations of TSC2 in patients with tuberous sclerosis (9-11) and of loss of heterozygosity (LOH) at the TSC2 locus in about 50% of TSC-associated hamartomas (12-14) supports a tumor suppressor function for TSC2. In contrast, LOH at the TSC1 locus has been detected in <10% of TSC-associated hamartomas (13, 14), suggesting the possibility of an alternative pathogenic mechanism for lesion development in patients with TSC1 disease.

As part of a comprehensive strategy to identify TSC1, we identified 11 microsatellite markers from the 1.4-Mb TSC1 region and developed an overlapping contig (with only a single gap of 20 kb) of cosmid, P1 artificial chromosome (PAC), and bacterial artificial chromosome (BAC) clones (15). Figure 1 shows the TSC1 region (16, 17), including limiting centromeric and telomeric markers, as derived from analyses of affected individuals (solid arrows) from families with individual lod scores of >2 (18). These limits are also consistent with the information available from LOH studies (13). Two additional recombination events were identified in unaffected individuals (open arrows), also from families with lod scores of >2 (19). In each of these families, two individuals from different generations carried the same recombinant chromosome, and all four had no evidence of TSC. Because the penetrance of TSC is nearly 100% (2), we concentrated our search within the 900-kb region between markers D9S2127 and DBH.

Figure 1

The TSC1 region on chromosome 9. The ideogram (top) represents a normal G-banded metaphase chromosome 9, with the TSC1 region located at 9q34. The male genetic map (next line) shows selected anchor polymorphic loci mapped to 9q34. The detailed physical map of the candidate region (next level) shows the positions of polymorphic markers and key recombination events in affected members (filled arrows) and unaffected members (open arrows) of families showing linkage of TSC to 9q34; the approximate positions of Mlu I (M) sites (with sites that partially cut in genomic DNA shown in parentheses) and of probes used to screen the region for rearrangements in patients with TSC by means of pulsed-field gel electrophoresis (orange boxes); genes previously mapped to the TSC1 candidate region (blue boxes); novel cDNAs isolated from the region (red boxes); ESTs mapped to the region (green); and additional putative genes predicted by GRAIL analysis of genomic sequence (light blue boxes). There was a single 20-kb gap in the contig near D9S1793. The map of the TSC1 gene (bottom) shows the 23 exons, of which exons 3 to 23 are coding.

In a search for further positional information, we looked for large deletions and rearrangements by means of pulsed-field gel electrophoresis (Fig. 1) (9) and through analysis of patient-derived hybrid cell lines retaining a single chromosome 9 bearing a TSC1 mutation (20). No abnormalities were detected, and we therefore began a systematic gene-by-gene analysis.

Several techniques were used to identify genes in the TSC1 region, which proved to be relatively gene-rich. Using a combination of exon trapping (21), cDNA selection, expressed sequence tag (EST) mapping, and whole-cosmid hybridization (22), we identified 142 exons and 13 genes between D9S1199 and D9S114. In all, 30 genes were identified or mapped to the 900-kb critical region.

In parallel, we began sequencing the entire contig (23). We used the polymerase chain reaction (PCR) to amplify putative (24) and confirmed exons found in 208 kb of sequence on a screening set of 60 DNA samples from 20 unrelated familial TSC cases with linkage to 9q34, and 40 sporadic TSC cases (18). Amplification products were analyzed for heteroduplex formation using weakly denaturing polyacrylamide gels (25). The 62nd exon screened demonstrated mobility shifts in 10 of the 60 patient samples (Fig. 2A).

Figure 2

Identification of mutations in TSC1 exon 15. (A) Heteroduplex analysis. Control sample (left lane) is followed by 10 samples with a shift. (B) Sequence analysis demonstrating the 2105delAAAG mutation. The sequence reactions were done in antisense orientation, so that reading from the top down (bases 2083 to 2124 of the normal sequence are shown), the allele sequenced on the left has the deletion, the middle sequence is a normal allele, and the sequence on the right is the heteroduplex product with both alleles. (C) In a sporadic case, the heteroduplex mobility shift is not present in either parent. (D) Segregation of heteroduplex mobility shifts in a large family with TSC (left) and digestion of amplification products with Mwo I in another family (right) demonstrate segregation of the 2105delAAAG mutation with the disease.

Sequence analysis revealed seven small frameshifting deletions (three identical), one nonsense mutation, one missense change, and one polymorphism that did not change the encoded amino acid (Fig. 2B). Eight of the nine mutations were from the 20 familial cases tested, and only one mutation was seen among the 40 sporadic cases (Fig. 2C). Analysis of samples from other family members confirmed that each of the familial mutations segregated with TSC and that a frameshift mutation had occurred de novo in the sporadic case (Fig. 2D). The recurrent mutation, 2105delAAAG, was identified in two apparently unrelated familial cases and a sporadic case. Haplotype analysis of the families, using markers flanking the mutation (D9S2126, D9S1830, and D9S1199, Fig. 1), confirmed that the three mutations were of independent origin.
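The reason a small deletion such as 2105delAAAG is classed as frameshifting can be illustrated with a minimal Python sketch. It assumes only that a change shifts the reading frame when the net change in coding length is not a multiple of three; the function names are hypothetical, and the 3-bp deletion is an invented contrast case, not a mutation reported here.

```python
def net_length_change(deleted: str, inserted: str = "") -> int:
    """Net change in sequence length caused by a deletion/insertion."""
    return len(inserted) - len(deleted)


def shifts_reading_frame(deleted: str, inserted: str = "") -> bool:
    """True when the net length change is not a multiple of three codon positions."""
    return net_length_change(deleted, inserted) % 3 != 0


# 2105delAAAG removes four bases, so every downstream codon is read out of frame.
print(shifts_reading_frame("AAAG"))

# Hypothetical 3-bp deletion (illustration only): one codon is lost, frame is kept.
print(shifts_reading_frame("AAG"))
```

The check says nothing about where the premature stop codon falls; that depends on the downstream sequence, which is not reproduced in this report.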

The exon with mutations was part of a transcriptional unit identified by earlier gene discovery efforts (26). The full sequence of the TSC1 gene was determined by comparison of genomic sequence and cDNA clone sequence, including clones obtained by 5′ rapid amplification of cDNA ends (RACE). The TSC1 gene consists of 23 exons, of which the last 21 contain coding sequence and the second is alternatively spliced (Fig. 1, bottom). The open reading frame (ORF) of the longest transcript begins at nucleotide 162, and the likely initiator ATG codon occurs at nucleotide 222. The first stop codon is at nucleotide 3738, leaving a 4.5-kb 3′ untranslated region. Northern (RNA) blot analysis with a coding region probe (nucleotides 1100 to 2100) revealed a major 8.6-kb transcript that was widely expressed and was particularly abundant in skeletal muscle (Fig. 3).

Figure 3

Northern blot analysis of TSC1 expression. Each lane contained 2 μg of polyadenylated RNA from adult human organs, and the probe consisted of base pairs 1100 to 2100 of the TSC1 gene. Minor hybridization signals of size 4 and 2.5 kb are also seen.

The predicted TSC1 protein, which we call hamartin, consists of 1164 amino acids with a calculated mass of 130 kD (Fig. 4). The protein is generally hydrophilic and has a single potential transmembrane domain at amino acids 127 to 144 (27) as well as a probable 266–amino acid coiled-coil region beginning at position 730 (28). Database searches identified a possible homolog of TSC1 in the yeast Schizosaccharomyces pombe (GenBank accession number Q09778), a hypothetical 103-kD protein, but there were no strong matches with vertebrate proteins (29).
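As a rough plausibility check on the stated size of hamartin, the sketch below estimates protein mass from residue count. It assumes the common rule of thumb of about 110 Da per amino acid residue, which is an approximation and not the method used to obtain the calculated 130-kD figure; the constant and function names are hypothetical.

```python
# Assumption: average residue mass of ~110 Da, a textbook rule of thumb.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approximate_mass_kd(n_residues: int) -> float:
    """Rough protein mass in kilodaltons from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def coding_length_bp(n_residues: int) -> int:
    """Number of coding bases needed for n_residues codons (stop codon excluded)."""
    return 3 * n_residues


hamartin_residues = 1164  # residue count stated in the text
print(approximate_mass_kd(hamartin_residues))  # about 128 kD, near the stated 130 kD
print(coding_length_bp(hamartin_residues))     # 3492 coding bases
```

An exact mass would require the actual amino acid composition shown in Fig. 4, so the estimate is expected to agree only approximately.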

Figure 4

Predicted amino acid sequence of the TSC1 protein, hamartin. A potential transmembrane domain (amino acids 127 to 144) and a coiled-coil domain (amino acids 730 to 965) are underlined. The TSC1 genomic sequence and the cDNA sequence have been deposited in GenBank (accession numbers AC002096 and AF013168, respectively).

Because the initial screen identified a high frequency of mutations in exon 15, we studied this exon in a large sample of patients. Mutations in exon 15 [559 base pairs (bp), 16% coding region] were identified in 8 of 55 (15%) familial DNA samples with linkage to the TSC1 region, and in 15 of 607 (2.5%) DNA samples from sporadic patients or families uninformative for linkage (Table 1). A screen for mutations in all coding exons in 20 familial cases and 152 sporadic patients yielded eight mutations in each group (40% and 5%, respectively). In total, 19 mutations were found in coding exons other than exon 15. No mutations have been detected thus far in exons 3 to 6, 8, 11 to 14, 16, or 21 to 23. Of the 32 distinct mutations seen in 42 different patients or families, five were recurrent. Thirty were predicted to be truncating, one was a possible missense mutation, and one was a splice site mutation. Analysis of a renal cell carcinoma from a TSC patient with germline mutation 2105delAAAG revealed a somatic mutation, 1957delG, in the wild-type TSC1 allele (30). A giant cell astrocytoma from another patient with germline mutation 1942delGGinsTTGA had retained the mutant allele but lost the wild-type allele.
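The percentages quoted in this paragraph follow directly from the stated counts, as the short Python sketch below shows. All counts are taken from the text; the dictionary labels and the helper name are hypothetical, and the coding length is assumed to be 3 × 1164 bases, based on the 1164-residue protein.

```python
def percent(hits: int, total: int) -> float:
    """Mutation detection rate as a percentage."""
    return 100.0 * hits / total


# (mutations found, samples screened), as stated in the text.
screens = {
    "exon 15, familial (linked to TSC1)": (8, 55),
    "exon 15, sporadic or uninformative": (15, 607),
    "all coding exons, familial": (8, 20),
    "all coding exons, sporadic": (8, 152),
}

for label, (hits, total) in screens.items():
    print(f"{label}: {hits}/{total} = {percent(hits, total):.1f}%")

# Share of the coding region covered by exon 15 (559 bp).
coding_bp = 3 * 1164  # assumption: 1164 codons, stop codon excluded
print(f"exon 15 share of coding region: {percent(559, coding_bp):.1f}%")
```

Seen this way, exon 15 holds about 16% of the coding sequence but accounts for 23 of the 42 mutation-bearing patients or families (42 in total, less the 19 with mutations in other coding exons), which is consistent with the description of this exon as a site of frequent mutation.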

Table 1

All mutations found in TSC1. Both heteroduplex and single-strand conformation polymorphism (33) gels were used to search for mutations after the initial screening. F, familial; S, sporadic.


Our results support the hypothesis that TSC1 functions as a tumor suppressor gene. First, the majority of mutations are likely to inactivate protein function. Second, in two TSC-associated tumors we have shown that loss of the wild-type TSC1 allele occurred through LOH or intragenic somatic mutation. The paucity of LOH for the TSC1 region found in patient lesions (13, 14) may reflect the same mutational spectrum seen in the germline of TSC patients with a high frequency of small mutations causing inactivation of the second allele. It is also possible that there is a greater frequency of TSC2- versus TSC1-associated disease among the sporadic cases providing the lesions analyzed. This is suggested by the low frequency of mutations we have detected in TSC1 in sporadic cases. However, in families suitable for linkage analysis, about half show linkage to TSC1 and half to TSC2 (16, 31).

The mutations observed in TSC1 consist of small deletions, small insertions, and point mutations. No genomic deletions or rearrangements in TSC1 were detected by Southern (DNA) blot analysis of 250 TSC patients. This restricted mutational spectrum may reflect an intrinsic tendency for this type of mutation in this region of the genome. Alternatively, it may reflect selection against more disruptive mutations such as large deletions, which would involve neighboring genes.

The mechanism by which loss of hamartin expression produces TSC lesions is unknown. It is likely that hamartin and tuberin participate in the same pathway of cellular growth control, because the clinical features of TSC1 and TSC2 disease are so similar (31). Tuberin has modest GAP activity for both rap1 and rab5, members of the Ras superfamily of small GTPases. The physiological function of the rap1 GTPase is not understood, whereas rab5 is thought to be involved in early endosomal transport. Tuberin-deficient rat embryo fibroblasts display increased endocytosis, which suggests that the rab5 interaction of tuberin has physiological relevance (32). It is unclear how a deficiency of GAP activity for rap1 or rab5, if that is the critical function of tuberin, leads to hamartoma development. The sequence homology of hamartin to a putative S. pombe protein suggests that it may participate in an evolutionarily conserved pathway of eukaryotic cell growth regulation. The identification of TSC1 will enable analysis of the functions of both hamartin and tuberin, and may permit further insight into the molecular pathogenesis of TSC.

* To whom correspondence should be addressed. E-mail: kwiatkowski@calvin.bwh.harvard.edu

REFERENCES AND NOTES
