Extensive Demethylation of Repetitive Elements During Seed Development Underlies Gene Imprinting


Science  12 Jun 2009:
Vol. 324, Issue 5933, pp. 1447-1451
DOI: 10.1126/science.1171609


DNA methylation is an epigenetic mark associated with transposable element silencing and gene imprinting in flowering plants and mammals. In plants, imprinting occurs in the endosperm, which nourishes the embryo during seed development. We have profiled Arabidopsis DNA methylation genome-wide in the embryo and endosperm and found that large-scale methylation changes accompany endosperm development and endosperm-specific gene expression. Transposable element fragments are extensively demethylated in the endosperm. We discovered new imprinted genes by the identification of candidates associated with regions of reduced endosperm methylation and preferential expression in endosperm relative to other parts of the plant. These data suggest that imprinting in plants evolved from targeted methylation of transposable element insertions near genic regulatory elements followed by positive selection when the resulting expression change was advantageous.

Cytosine DNA methylation is a stable epigenetic modification that has roles in transposable element silencing and gene imprinting in plants and animals. In plants, gene imprinting occurs in the endosperm during seed development (1). At fertilization, one sperm fertilizes the haploid egg cell, which becomes the diploid embryo, and the other sperm fertilizes the diploid central cell, generating the triploid endosperm. In Arabidopsis, the 5-methylcytosine DNA glycosylase DEMETER (DME) demethylates maternal alleles of imprinted genes in the central cell before fertilization, thus establishing methylation asymmetry between embryo and endosperm. Similarly, in maize an imprinted gene is less methylated in the central cell than in the egg cell or sperm (2). The asymmetry between embryo and endosperm represents an opportunity to characterize DNA methylation in parallel genomes established simultaneously at fertilization within the same seed.

To compare tissue-specific methylation patterns within developing seeds, we dissected embryo and endosperm from torpedo-stage seeds of two Arabidopsis thaliana accessions, Col-gl and Ler (fig. S1A) (3). Methylated DNA was immunoprecipitated with an antibody to 5-methylcytosine, sequenced with Illumina Genome Analyzer technology (Illumina, San Diego, CA), and aligned to the reference Col-0 genome. We created methylation profiles using high-quality reads that mapped to only one position in the genome (table S1). Embryo and endosperm methylation profiles were highly correlated (Pearson’s R = 0.91 for Col-gl and 0.89 for Ler) and share similar features with other whole-genome methylation profiles that have been generated for Arabidopsis with other platforms (Fig. 1 and fig. S1B) (4, 5). Methylation levels are relatively high in gene-poor heterochromatic regions around centromeres and decrease in gene-rich chromosome arms (Fig. 1A). Transposable element genes (a set of 3900 elements with open reading frames) are more heavily methylated than protein-coding genes, and genes are more methylated within their bodies than at their 5′ and 3′ ends (Fig. 1B). However, transposable element genes and regions flanking genes are on average less methylated in the endosperm than in the embryo (Fig. 1B). If the calculation of average methylation profiles 5′ and 3′ of protein-coding genes excludes methylation from transposable elements or their fragments, the methylation difference between embryo and endosperm in regions flanking genes almost entirely disappears, which indicates that repetitive elements are hypomethylated in the endosperm (Fig. 1C). These results suggest that there is a genome-wide decrease in methylation in the endosperm as compared with the embryo. Reduction in methylated DNA in the endosperm is also observed when immunoprecipitated methylated DNA is hybridized to genomic tiling arrays (fig. S2). This small genome-wide reduction is consistent with the report that maize endosperm has 13% less 5-methylcytosine than do embryos or leaves (6).
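As an illustration of how two tissue profiles can be compared with Pearson’s R, the following is a minimal sketch in Python. It assumes each profile has already been reduced to one value per genomic window; the function name `pearson_r` and the example values are hypothetical and are not taken from the study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances are left unnormalized; the factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Made-up per-window signal values, for illustration only.
embryo = [12, 30, 5, 44, 21, 8]
endosperm = [10, 26, 6, 39, 17, 7]
r = pearson_r(embryo, endosperm)
print(f"Pearson's R = {r:.3f}")
```

A value near 1 means the two profiles rise and fall together across windows, even if one tissue is uniformly somewhat less methylated than the other.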

Fig. 1

Endosperm is less methylated than embryo. (A) Typical methylation profiles from a gene-rich and a gene-poor region of the genome showing highly similar methylation patterns in embryo, endosperm, and dme endosperm from the Ler and Col-gl accessions. (B) Average methylation in 100-bp windows of protein-coding genes and transposable element genes aligned at either their 5′ or 3′ ends in Col-gl embryo and endosperm. (C) Methylation profiles of protein-coding genes when a set of 31,076 transposable element fragments (12) is included in or excluded from the averaging.

To identify regions of the genome subject to the largest changes in DNA methylation [differentially methylated regions (DMRs)], we calculated an embryo-endosperm difference score in overlapping 300–base pair (bp) segments and set the cutoff to include the top 0.5% of differences (top DMRs) (Fig. 2A). This cutoff detects the previously described methylation differences 5′ and 3′ of the imprinted MEA gene (7), whereas those 5′ of FWA (8) fall just below it (+1.18 versus a cutoff of 1.20) (Fig. 2B). This cutoff also detects previously hypothesized methylation differences (9, 10) at the PHE1 and FIS2 imprinted genes (Fig. 2B). About 90% of the top DMRs have a positive score, which indicates greater methylation in the embryo than endosperm (fig. S3). In plants, DNA methylation is actively targeted by small RNAs, which often arise from and target repetitive elements (11). The top DMRs were almost threefold enriched in regions of the genome corresponding to transposable elements (12) and small RNAs (13) (table S2), which indicates that demethylation occurs at regions that are actively targeted for DNA methylation in other tissues. The distribution of the DMRs along the chromosomes parallels the distribution of transposable elements (fig. S4).
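The windowing and cutoff step can be sketched as follows. This is a minimal illustration and not the scoring method used in the study, which this text does not spell out: it assumes the signal is binned at 100 bp, that a 300-bp segment is three consecutive bins sliding one bin at a time, and that the score is a plain embryo-minus-endosperm difference. The names `window_sums` and `top_dmrs` are hypothetical.

```python
def window_sums(values, size=3, step=1):
    """Sum per-bin signal over overlapping windows of `size` bins."""
    return [sum(values[i:i + size]) for i in range(0, len(values) - size + 1, step)]

def top_dmrs(embryo, endosperm, fraction=0.005, size=3, step=1):
    """Score each window and keep the given fraction with the largest |score|.

    Assumed score: embryo minus endosperm, so a positive value means
    more methylation in the embryo.
    """
    emb = window_sums(embryo, size, step)
    endo = window_sums(endosperm, size, step)
    scores = [e - n for e, n in zip(emb, endo)]
    n_top = max(1, round(len(scores) * fraction))
    # Rank window indices by absolute difference, largest first.
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    top = sorted(ranked[:n_top])
    cutoff = abs(scores[ranked[n_top - 1]])
    return scores, top, cutoff

# Toy per-bin values with one region that loses signal in the endosperm.
embryo = [5, 5, 9, 9, 9, 5, 5, 5]
endosperm = [5, 5, 2, 2, 2, 5, 5, 5]
scores, top, cutoff = top_dmrs(embryo, endosperm, fraction=0.2)
print(scores, top, cutoff)
```

Ranking by absolute value keeps both tails of the distribution, which matches a cutoff that admits positive and negative scores; the toy `fraction=0.2` is used only because the example has six windows.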

Fig. 2

Known imprinted genes are associated with top DMRs. (A) Histogram of Col-gl embryo–endosperm difference scores for 1.2 million overlapping 300-bp segments. Dashed lines represent the cutoff for the top 0.5% of methylation differences (~6000 300-bp segments). (B) Embryo and endosperm methylation profiles of known imprinted genes. Red arrows indicate regions within the top 0.5% of methylation differences; the gray arrow indicates a region below the cutoff.

Bisulfite sequencing around 15 different regions that fell above and below the top 0.5% cutoff largely validated the predictions from the deep-sequencing analysis (fig. S5). For DMRs with a positive score, methylation of individual bisulfite clones from the endosperm was more variable than from the embryo. Often, two distinct subpopulations of clones were observed in the endosperm, including clones with no methylation, which were never observed in the embryo. Therefore, unmethylated clones in the endosperm might represent specific demethylation of the maternal genome by DME in the central cell before fertilization. In support of this possibility, clones with no methylation were nearly eliminated in dme mutant endosperm.

More than half of the top positive DMRs (endosperm less methylated than embryo) occur within 2 kilobases (kb) upstream or 2 kb downstream of genes (fig. S3) (7, 14). We identified all protein-coding genes in which a top positive DMR fell within the body of the gene or 1 kb 5′ or 3′. This yielded 1276 genes for the Col-gl embryo-endosperm comparison (table S3) and 1163 for the Ler embryo-endosperm comparison (table S4). Embryo methylation profiles for these genes display prominent peaks centered at ~700 bp both upstream and downstream that are largely absent in the endosperm (Fig. 3 and fig. S6). This change represents loss of methylation in the endosperm and not gain of methylation in the embryo because these genes are similarly methylated in embryos and adult plants (figs. S1C and S6C). Upstream and downstream methylation peaks are partially restored in dme mutant endosperm (Fig. 3), despite the fact that dme endosperm is overall hypomethylated as compared with wild type [supporting online material (SOM) text]. Thus, much of the methylation 5′ and 3′ of genes that is depleted in the endosperm is probably lost because of active demethylation by DME in the central cell before fertilization, although other mechanisms probably also contribute (15).
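The gene-association step described above amounts to an interval overlap test. The sketch below is one possible reading, assuming 1-based inclusive coordinates and treating any overlap between a DMR and the gene body extended by 1 kb on each side as a hit; the `Gene` type, the function name, and all coordinates are hypothetical.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

def genes_near_dmrs(genes, dmrs, flank=1000):
    """Return names of genes whose body plus `flank` bp on each side overlaps a DMR."""
    hits = []
    for gene in genes:
        lo, hi = gene.start - flank, gene.end + flank
        # Two closed intervals overlap when each starts before the other ends.
        if any(d_start <= hi and d_end >= lo for d_start, d_end in dmrs):
            hits.append(gene.name)
    return hits

# Made-up gene models and 300-bp DMRs, for illustration only.
genes = [Gene("geneA", 5000, 7000), Gene("geneB", 20000, 22000), Gene("geneC", 40000, 41000)]
dmrs = [(4200, 4499), (30000, 30299)]
print(genes_near_dmrs(genes, dmrs))
```

Because the flank is symmetric, strand does not affect which genes are selected; strand would matter only when reporting whether the DMR lies 5′ or 3′ of the gene.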

Fig. 3

Methylation is lost 5′ and 3′ of genes in the endosperm. Shown are average methylation profiles in embryo, endosperm, and dme endosperm of the (A) 1276 Col-gl and (B) 1163 Ler genes associated with more methylation in embryo than endosperm. Genes were aligned at their 5′ or 3′ end, and the average methylation was determined every 100 bp.
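An averaged profile of this kind can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' pipeline: the chromosome signal is assumed to be pre-binned at 100 bp, each gene is represented by the bin index of its 5′ end and its strand, and the name `average_profile` is hypothetical.

```python
def average_profile(bins, anchors, flank_bins=2):
    """Average per-bin signal around anchor bins, oriented by strand.

    bins: per-bin signal along one chromosome (for example, 100-bp bins).
    anchors: (bin_index, strand) pairs marking each gene's aligned end.
    Negative offsets are upstream of the anchor, positive offsets downstream.
    """
    offsets = range(-flank_bins, flank_bins + 1)
    totals = {o: 0.0 for o in offsets}
    counts = {o: 0 for o in offsets}
    for anchor, strand in anchors:
        direction = 1 if strand == "+" else -1  # flip minus-strand genes
        for o in offsets:
            idx = anchor + direction * o
            if 0 <= idx < len(bins):  # skip positions past the chromosome ends
                totals[o] += bins[idx]
                counts[o] += 1
    return {o: totals[o] / counts[o] for o in offsets if counts[o]}

# Toy signal with a peak one bin upstream of each of two genes.
bins = [0, 8, 2, 1, 0, 1, 2, 6, 0]
anchors = [(2, "+"), (6, "-")]
profile = average_profile(bins, anchors, flank_bins=1)
print(profile)
```

Orienting minus-strand genes before averaging is what lets an upstream peak from genes on both strands line up at the same offset.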

DNA methylation of promoters inhibits transcriptional initiation. The most prominent losses of gene-associated methylation in endosperm occur well upstream of the transcriptional start site (Fig. 3) and, for most genes, the presence of nearby DNA methylation apparently has little effect on gene expression (fig. S7). However, we found that genes with endosperm-preferred expression (16) are less methylated at 5′ sequences in the endosperm than embryo (fig. S8), which suggests that 5′ loss of methylation is associated with increased expression of a subset of genes in the endosperm.

Known imprinted genes are less methylated in the endosperm than embryo (Fig. 2B) and exhibit endosperm-preferred expression (16). To identify previously unknown imprinted genes, we chose genes with top DMRs in comparisons between embryo and endosperm and between wild type and dme endosperm for Col-gl and Ler data sets. A set of 113 genes have top DMRs in three of the four comparisons, including the known imprinted gene MEA (table S5). Two of these genes, HDG3 and HDG9, belong to the 16-member class IV homeodomain leucine zipper (HD-ZIP) transcription factor gene family that also includes the known imprinted gene FWA. As with FWA, three HD-ZIP genes are expressed primarily or exclusively in siliques: HDG3, HDG8, and HDG9 (17).
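The candidate filter across the four comparisons can be illustrated with a simple tally. The sketch below reads "three of the four" as "at least three", which is an assumption; the function name and the gene labels are hypothetical placeholders, not genes from the study.

```python
from collections import Counter

def genes_in_min_comparisons(comparisons, min_hits=3):
    """Return genes that carry a top DMR in at least `min_hits` comparisons."""
    # set() guards against a gene being listed twice within one comparison.
    counts = Counter(g for genes in comparisons.values() for g in set(genes))
    return sorted(g for g, n in counts.items() if n >= min_hits)

# Placeholder gene lists for the four comparisons named in the text.
comparisons = {
    "Col-gl embryo vs endosperm": ["geneA", "geneB", "geneC"],
    "Ler embryo vs endosperm": ["geneA", "geneC", "geneD"],
    "Col-gl wild type vs dme endosperm": ["geneA", "geneB"],
    "Ler wild type vs dme endosperm": ["geneA", "geneC", "geneC"],
}
print(genes_in_min_comparisons(comparisons))
```

Requiring agreement across accessions and across the dme comparison trades sensitivity for specificity: a gene supported by only one or two comparisons is dropped even if its individual DMR is strong.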

To test parent-of-origin–specific expression of putative imprinted genes, we performed reciprocal crosses between Ler and Col-gl and assayed expression patterns by means of reverse transcriptase polymerase chain reaction (RT-PCR). Only the maternal HDG9 allele was expressed in the endosperm (Fig. 4A). We confirmed that methylation was lost around the 5′ end of the gene specifically on maternal alleles in the endosperm, a region annotated as overlapping the remnant of a Helitron transposable element (Fig. 4C). HDG8 is also primarily, but not exclusively, expressed from the maternal allele (Fig. 4A and SOM text). HDG3 is reciprocally imprinted; expression is predominantly paternal (Fig. 4B and SOM text). Methylation is lost from maternal alleles on a 1.4-kb Helitron remnant that begins 100 bp 5′ of the gene (Fig. 4D). We confirmed allele-specific expression of two other genes in the endosperm: the ATMYB3R2 transcription factor, which is maternally expressed, and AT5G62110, a gene annotated as containing a homeodomain-like domain, which is predominantly paternally expressed (Fig. 4 and SOM text).

Fig. 4

Expression and methylation analysis of new imprinted genes. (A) RT-PCR allele-specific expression analysis from endosperm RNA of Col-gl females crossed with Ler males and Ler females crossed with Col-gl males. FWA is a control imprinted gene; AT3G25260 is biallelic. (B) RT-PCR sequencing chromatograms for the paternally expressed genes AT2G32370 and AT5G62110. The biallelically expressed gene αVPE is shown as a control. (C and D) Methylation profiling and allele-specific bisulfite sequencing analysis of new imprinted genes. Dashed lines represent the cutoff for the top 0.5% of differences. Each line of circles represents a bisulfite sequencing clone from embryo or endosperm. Filled circles indicate methylated cytosine. Red, CG; blue, CHG; gray, CHH.

We tested several other genes that were less methylated in the endosperm than embryo for imprinting (table S6). CYCA1;1 is a differentially methylated A-type cyclin that exhibits endosperm-preferred expression but is also expressed at many other stages of development (fig. S8C). It is biallelically expressed in the endosperm. All of the differentially methylated genes with endosperm-preferred expression that were not imprinted were expressed in other tissues throughout development, whereas the five genes with top positive DMRs that we confirmed as imprinted were expressed at very low levels or not at all in other tissues (fig. S9). This suggests that genes with low levels of expression in other plant tissues require loss of methylation for gene expression in the endosperm. Thus, the best candidates for imprinted genes are those that are less methylated in the endosperm than embryo, exhibit endosperm-preferred expression (16), and are expressed at low levels in other parts of the plant (table S7). On the basis of these considerations, we estimate that there are ~50 imprinted genes in Arabidopsis. Gene ontology analysis of molecular function indicates that these genes are enriched in what is termed “DNA or RNA binding,” which includes transcription factors and genes with chromatin-related functions (fig. S9).

Our identification and verification of five previously unknown imprinted genes, all of which are flanked by repetitive elements or harbor them in coding sequences, doubles the number of known imprinted genes in Arabidopsis. Four of the 10 imprinted genes are class IV homeodomain transcription factors, which offers the opportunity to study the evolution of imprinting (SOM text). Our results support the theory that imprinting arose as a byproduct of silencing invading foreign DNA (18). Transposable elements (TEs) have demonstrated toxic effects on genomes, but they are also a source of genetic and epigenetic material that can be utilized by the host (19). Insertion of a TE near a gene will have little functional impact in most instances, and strongly deleterious TE insertions will be selected against. However, a subset of genes, perhaps depending on promoter strength, is susceptible to epigenetic regulation by TEs. Regulation of gene expression by means of DNA methylation could be selected for if imprinting of these genes is adaptive in the context of parental conflict or gene dosage balance in the triploid endosperm.

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S9

Tables S1 to S9


References and Notes

  1. Materials and methods are available as supporting material on Science Online.
  2. We thank the FHCRC Genomics Shared Resources for performing microarray hybridizations and Illumina sequencing, J. Henikoff for assistance with computational analyses, R. Deal and J. Cooper for comments on the manuscript, and D. Zilberman and R. Fischer for sharing unpublished data. M.G. is an HHMI Fellow of the Life Sciences Research Foundation. Illumina and NimbleGen data are deposited in the Gene Expression Omnibus (GEO) (accession number GSE14570).
