Report

Nonmethylated Transposable Elements and Methylated Genes in a Chordate Genome


Science  19 Feb 1999:
Vol. 283, Issue 5405, pp. 1164-1167
DOI: 10.1126/science.283.5405.1164

Abstract

The genome of the invertebrate chordate Ciona intestinalis was found to be a stable mosaic of methylated and nonmethylated domains. Multiple copies of an apparently active long terminal repeat retrotransposon and a long interspersed element are nonmethylated and a large fraction of abundant short interspersed elements are also methylation free. Genes, by contrast, are predominantly methylated. These data are incompatible with the genome defense model, which proposes that DNA methylation in animals is primarily targeted to endogenous transposable elements. Cytosine methylation in this urochordate may be preferentially directed to genes.

DNA methylation in the dinucleotide sequence 5′-CpG can silence transcription. The genome defense model (1) posits that the primary role of methylation in animal genomes is to repress potentially damaging transposition of endogenous elements. By analogy with fungal systems (2), the elements are hypothesized to be targets for methylation because of their repetition in the genome (3). The hypothesis is difficult to test through analysis of the globally methylated mammalian genome. We therefore studied the specificity of methylation in a fractionally methylated genome belonging to the sea squirt Ciona intestinalis, an invertebrate member of the chordate phylum. Like most invertebrate genomes, that of C. intestinalis contains comparable amounts of methylated and nonmethylated DNA (4). We reasoned that any bias in the distribution of transposons or genes between the two fractions should therefore be readily detectable.

Three cosmids containing C. intestinalis genomic DNA were studied in detail (5). The cosmids were sequenced, and the locations of likely protein-coding regions were determined by GENEFINDER exon-prediction software (6) and database homology searches (Fig. 1, A through C). Putative proteins encoded by 10 of 13 potential genes showed similarity to known proteins. A systematic search for repetitive elements among the cosmid sequences and 1486 short random genomic sequences (5) identified four transposable elements belonging to recognizable families (7): a gypsy/Ty3–like long terminal repeat (LTR) retrotransposon (Cigr-1), a long interspersed element (LINE)–like element (Cili-1), a miniature inverted repeat (Cimi-1), and a composite short interspersed element (SINE) (Cics-1) with consensus RNA polymerase III promoter A and B box sites (Table 1) (8). Cigr-1, Cimi-1, and Cics-1 elements occurred in the cosmids, the latter two being very abundant (Fig. 1, A through C).

Figure 1

Genes, repetitive elements, and endogenous methylation patterns in three cosmids containing C. intestinalis genomic DNA. (A through C) Diagrams of cosmids cicos2, cicos46, and cicos41, showing (from the top down) (i) repeats: the repetitive element families Cigr-1 (yellow), Cimi-1 (red), and Cics-1 (blue). (ii) Genes: predicted coding sequences (linked exon boxes) on 5′ (above the line) and 3′ (below the line) strands. C2.3 (A) and C46.3 (B) correspond to homologs of AND-1 (EMBL accession number X98884) and a zinc-finger protein gene, respectively. (iii) Nonmethylated sites for a range of methyl-CpG–sensitive restriction enzymes project above the line, and methylated sites project below the line. Horizontal bars denote probes used to study methylation. (iv) CpG frequency is shown as plots of the observed-over-expected (o/e) ratio across the cosmid using a 1000–base pair (bp) window and a step size of 200 bp. The dotted line shows a ratio of 1. (D) Examples of data used to establish methylation patterns of cicos2 and cicos41. The enzymes used to digest genomic DNA were Msp I (M), Hpa II (H), Hha I (Hh), and Aci I (A). Probe coordinates are given in (10). The ∼12-kb band that appears in carcass DNA with probe p2.20 (cicos2 panel) corresponds to the methylated domain in the center of this sequence (A).
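
The CpG observed-over-expected plot described in (iv) can be outlined with a sliding window. The following is a minimal Python sketch, assuming the commonly used ratio (CpG count × window length) / (C count × G count); the caption does not state the exact formula used, and the function names `cpg_obs_exp` and `sliding_cpg_obs_exp` are hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one stretch of DNA.

    Assumed formula (not stated in the source):
    (number of CpG * length) / (number of C * number of G).
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # ratio is undefined without both bases; report 0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def sliding_cpg_obs_exp(seq: str, window: int = 1000, step: int = 200):
    """Return (window start, o/e ratio) pairs across the sequence.

    Defaults follow the caption: 1000-bp window, 200-bp step.
    """
    return [
        (start, cpg_obs_exp(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Windows with a ratio well below 1 would correspond to the CpG-deficient stretches that the report associates with methylated domains.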

Table 1

Major repetitive elements in the genome of C. intestinalis. Elements were identified according to (7). The copy number estimate is based on extrapolation from the number of copies found in 0.88 Mbp of genomic sequence in 1486 fragments of mean size 592 bp, assuming a haploid genome size of 162 Mbp.
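
The extrapolation described in this legend amounts to scaling the copies seen in the sample by the ratio of genome size to sampled sequence. A minimal Python sketch follows; the function name and the example count of 10 copies are hypothetical, since the table values are not reproduced here.

```python
SAMPLED_FRAGMENTS = 1486          # number of random genomic fragments (from the legend)
MEAN_FRAGMENT_BP = 592            # mean fragment size in bp (from the legend)
HAPLOID_GENOME_BP = 162_000_000   # assumed haploid genome size (from the legend)

# 1486 * 592 = 879,712 bp, i.e. the 0.88 Mbp quoted in the legend.
SAMPLED_BP = SAMPLED_FRAGMENTS * MEAN_FRAGMENT_BP


def extrapolate_copy_number(copies_in_sample: int) -> float:
    """Scale copies found in the sampled sequence up to the whole genome."""
    return copies_in_sample * HAPLOID_GENOME_BP / SAMPLED_BP


# Hypothetical example: 10 copies observed in the sampled fragments.
example_estimate = extrapolate_copy_number(10)
```

Under these figures each copy found in the sample corresponds to roughly 184 copies per haploid genome.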


Mosaic methylation of invertebrate genomes has been inferred previously (4, 9), but domain maps have not been reported. DNA methylation was analyzed in genomic DNA from C. intestinalis by means of methylation-sensitive restriction endonucleases (10). The 39-kb cicos2 and cicos46 inserts each contained a domain of several kilobase pairs that is methylated in the genome and flanked by nonmethylated DNA (Fig. 1, A and B). Most of the cicos41 insert was methylated in the genome, but the right extremity was nonmethylated (Fig. 1C). Methylation patterns in sperm and carcass DNA were similar. Projection of the methylation patterns onto a map of CpG frequency showed striking coincidence between methylated domains and domains of CpG deficiency (Fig. 1, A through C). Given strong evidence that CpG deficiency is due to hypermutability of methyl-CpG (11), the data suggest that the observed methylation patterns are evolutionarily stable. The C. intestinalis genome appears to be equally partitioned between methylated and nonmethylated domains, based on the separation of methylated and nonmethylated DNA after restriction endonuclease treatment (4, 8). The presence of methylation at about 25% of genomic CpGs (4) is compatible with this estimate, as the CpG density of methylated domains is about half that of nonmethylated domains.
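
The consistency check in the last sentence can be made explicit. The sketch below is a rough illustration only, assuming that exactly half the genome lies in methylated domains and that their CpG density is exactly half that of nonmethylated domains; the function name is hypothetical.

```python
def cpg_share_in_methylated_domains(
    methylated_genome_fraction: float = 0.5,
    relative_cpg_density: float = 0.5,
) -> float:
    """Fraction of all genomic CpGs that lie inside methylated domains.

    relative_cpg_density is the CpG density of methylated domains
    divided by that of nonmethylated domains.
    """
    in_methylated = methylated_genome_fraction * relative_cpg_density
    in_nonmethylated = 1.0 - methylated_genome_fraction
    return in_methylated / (in_methylated + in_nonmethylated)


share = cpg_share_in_methylated_domains()   # one-third under these assumptions
# If about 25% of all CpGs are methylated, the implied fraction of CpGs
# methylated within methylated domains is 0.25 / share.
implied_occupancy = 0.25 / share
```

With these inputs about one-third of all CpGs fall in methylated domains, so methylation at about 25% of genomic CpGs would imply that roughly three-quarters of the CpGs in those domains are methylated.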

The Cigr-1 element was identified in cicos41, where it is flanked by methylated DNA (Fig. 1C). Methylation of the element (Fig. 2A) was investigated by Southern (DNA) blotting and bisulfite sequencing (12). We estimated that there were approximately 75 copies of Cigr-1 per genome, based on Southern blots (Fig. 2C) and element frequency in a database of C. intestinalis DNA sequences (Table 1). All tested CpG restriction sites were nonmethylated in most Cigr-1 copies (Fig. 2, B and D). Because transposition depends on transcription from the LTR promoter (13), we tested methylation of all 20 CpGs in this region and found that 8 out of 10 sampled elements were methylation free, with 2 showing low-level methylation (Fig. 2B). Comparison of the genomic contexts of Cigr-1 elements in four individuals showed a high degree of heterogeneity, which suggests that Cigr-1 is transpositionally active (Fig. 2C). The presence of an internal 1.9-kb Pst I fragment confirmed that the probes detect bona fide Cigr-1 elements and that heterogeneity is not due to incomplete digestion with Pst I. A Cigr-1 transcript was detected by reverse transcriptase polymerase chain reaction (8). Thus, an apparently active multicopy transposon escapes DNA methylation in C. intestinalis.

Figure 2

Nonmethylated transposable elements in the C. intestinalis genome. (A) Diagram of the Cigr-1 element, showing LTRs, an open reading frame (ORF), and hybridization probes (RT1 and RT2). (B) The upper map shows the absence of methylation at sites in Cigr-1 for CpG methylation–sensitive restriction endonucleases (open lollipops). Hpa II sites detected by probing with RT1 are labeled H. The lower maps show the analysis of methylation at all CpGs in the LTR promoter region (nucleotides 19 through 405) of Cigr-1 by bisulfite sequencing (12). Solid lollipops indicate methylated CpGs. (C) Variable genomic locations of Cigr-1 in four C. intestinalis individuals collected from the same location. Pst I digests were probed with RT2 to detect fragments extending downstream from the rightmost Pst I site in Cigr-1 to the next site in flanking genomic DNA, or with RT1 to show the internal 1.9-kb band. (D) The absence of methylation at Hpa II sites in Cigr-1. DNA from carcass and sperm were digested with no enzyme (−), Msp I (M), and Hpa II (H), and blots were probed with RT1. (E) Identical Hpa II and Msp I fragment patterns in genomic DNA probed with part of the LINE-like element Cili-1. An absence of methylation is also seen at Hha I (Hh) sites. (F) Analysis of methylated and nonmethylated sites at seven copies of the miniature inverted repeat Cimi-1 by bisulfite sequencing. Missing CpG sites are due to sequence heterogeneity. Solid lollipops indicate methylated sites; open lollipops indicate nonmethylated sites.

Similar results were obtained with a probe against Cili-1. Hpa II and Msp I patterns were identical and the element was susceptible to Hha I, indicating the absence of methylation (Fig. 2E). The short element Cimi-1 is abundant in both methylated and nonmethylated domains of the cosmids (Fig. 1, A through C). Bisulfite sequencing confirmed that many copies are nonmethylated at all CpGs but that methylated copies also occur (Fig. 2F). The abundant element Cics-1 occurs in methylated and nonmethylated domains of the cosmids (Fig. 1, A through C). Correspondingly, five out of eight genomic copies of Cics-1 were nonmethylated by bisulfite analysis (8). In the sequenced cosmids, six tested CpGs were within Cics-1 elements and all conformed to the methylation status of the surrounding domain. Four of the elements were intergenic (Fig. 1, A through C) and one was within an intron (see Fig. 3B). These findings suggest that the methylated copies of these elements are those that happen to lie within methylated domains.
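
The logic of the Hpa II/Msp I comparison can be illustrated in silico. Both enzymes cut at CCGG, but Hpa II is blocked when the internal cytosine is methylated, so identical fragment patterns indicate nonmethylated sites. The following Python sketch is a simplified model, not the experimental procedure: the function name, the toy sequence, and the representation of methylation as a set of 0-based positions are assumptions, and Msp I is treated as fully insensitive to CpG methylation.

```python
def digest(seq: str, methylated_positions: set, enzyme: str) -> list:
    """Return fragment lengths after cutting at CCGG sites (C^CGG).

    methylated_positions holds 0-based indices of methylated cytosines.
    Hpa II does not cut a site whose internal C is methylated;
    Msp I is modeled as cutting every CCGG site.
    """
    if enzyme not in ("HpaII", "MspI"):
        raise ValueError("enzyme must be 'HpaII' or 'MspI'")
    seq = seq.upper()
    cuts = []
    site = seq.find("CCGG")
    while site != -1:
        internal_c = site + 1
        blocked = enzyme == "HpaII" and internal_c in methylated_positions
        if not blocked:
            cuts.append(site + 1)  # cut between the two cytosines
        site = seq.find("CCGG", site + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


toy = "AAAACCGGTTTTCCGGAAAA"  # hypothetical sequence with two CCGG sites
nonmethylated_same = digest(toy, set(), "HpaII") == digest(toy, set(), "MspI")
methylated_differs = digest(toy, {13}, "HpaII") != digest(toy, {13}, "MspI")
```

In this toy case the two digests match when no site is methylated and diverge once the internal cytosine of the second site is marked as methylated.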

Figure 3

Most C. intestinalis genes lie within methylated DNA. (A) Comparison of Msp I (M) and Hpa II (H) digestion patterns for five cDNAs that have good matches to known gene transcripts. The matches are to translation initiation factor EIF-5A, isoleucyl-tRNA synthetase (ile.synth), GPI anchor biosynthesis protein PIG-A, epidermal surface antigen (epi.s.a.), and chymotrypsinogen (chymo). All but the chymotrypsinogen probes hybridize to Hpa II–resistant DNA sequences. (B) Methylation at CpGs across exon 6 of a guanylate cyclase–like gene identified in the cosmid cicos1 (EMBL accession number Z80904), as determined by bisulfite sequencing (12). Solid and open lollipops represent methylated and nonmethylated sites, respectively. The site labeled by an asterisk lies within a Cics-1 element.

Most of the C. intestinalis genes found in the three cosmids (9 out of 13) occur in methylated domains (Fig. 1, A through C). The methylated domain in cicos2 covers the central region of the AND1-related gene but does not extend to either the promoter or the last exon (Fig. 1A, gene C2.3). Similarly, the downstream transcribed region and upstream flanks of the zinc finger protein gene in cicos46 are methylated (Fig. 1B, gene C46.3). The 5′ end of the gene is within a nonmethylated, CpG-rich sequence patch that resembles a mammalian CpG island. All seven genes in cicos41 are within heavily methylated DNA (Fig. 1C). The fact that exons of genes can be heavily methylated was confirmed by bisulfite sequencing of exon 6 of a guanylate-cyclase–like gene (Fig. 3B). We also examined methylation of 10 C. intestinalis cDNAs with strong homology to known genes (Fig. 3A) (8). Eight hybridized to highly methylated sequences on the basis of Hpa II/Msp I susceptibility and two were nonmethylated. The two independent estimates suggest that about three-quarters of C. intestinalis genes, including many housekeeping genes, lie within methylated DNA.
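
The arithmetic behind the "about three-quarters" figure can be checked from the two counts given above. The short Python sketch below pools the two samples for illustration; pooling is an assumption made here, not a method stated in the report.

```python
from fractions import Fraction

# Counts taken from the text above.
cosmid_estimate = Fraction(9, 13)   # cosmid genes in methylated domains
cdna_estimate = Fraction(8, 10)     # cDNAs hybridizing to methylated DNA

# Illustrative pooled estimate (assumption: both samples weighted by size).
pooled_estimate = Fraction(9 + 8, 13 + 10)

for label, value in [
    ("cosmids", cosmid_estimate),
    ("cDNAs", cdna_estimate),
    ("pooled", pooled_estimate),
]:
    print(f"{label}: {float(value):.2f}")
```

The two separate estimates bracket the pooled value, which lies close to three-quarters.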

The origin and significance of mosaic methylation patterns in invertebrate genomes are uncertain. No correlation between methylation and expression of genes is yet evident, although promoter methylation has not been specifically studied. Methylated genes of several invertebrates are clearly transcribed, whereas nonmethylated genes can be silent (4). If the primary function of methylation in these organisms is to repress transcription, then a role in reducing transcriptional interference (14) is compatible with the data. Active gene domains often generate incorrectly initiated transcripts whose synthesis may interfere with authentic transcription (15). Methylation of genes might reduce interference and thereby focus initiation on the genuine promoter. Why then are some genes nonmethylated? Perhaps genes that are highly expressed in a few cell types, which appear to be overrepresented among the nonmethylated class (4), have robust promoters that are less susceptible to transcriptional interference.

Genome defense can explain the role of DNA methylation in preventing sequence duplication in certain fungi (2). The mechanism is not universal, however, as the nonmethylated genomes of both Drosophila and Caenorhabditis elegans harbor transposons. The present study shows that where methylation is present in an invertebrate genome, it is not necessarily targeted to transposable elements. DNA sequence repetition does not provoke methylation in C. intestinalis, as multicopy transposons and genes [for example, ribosomal RNA genes (4)] often are nonmethylated, whereas single-copy genes are methylated. Thus, for invertebrates, the idea that transposition is controlled by DNA methylation lacks supporting evidence [see also (16)]. In the human genome, transposon sequences are heavily methylated (3), but methylation affects most genomic DNA (about 85% of those CpGs that are not within CpG islands), including exons of genes (17). Also, several retrotransposons are reportedly nonmethylated for many cell generations in mammalian germ cells and early embryos, where protection against transposition might be thought to be important (18). In mammals, therefore, the evidence for a relationship between DNA methylation and the restraint of endogenous elements remains inconclusive.

  • * These authors contributed equally to this work.

  • To whom correspondence should be addressed. E-mail: A.Bird@ed.ac.uk

