Report

Antisense Transcription in the Mammalian Transcriptome

Science  02 Sep 2005:
Vol. 309, Issue 5740, pp. 1564-1566
DOI: 10.1126/science.1112009

Abstract

Antisense transcription (transcription from the opposite strand to a protein-coding or sense strand) has been ascribed roles in gene regulation involving degradation of the corresponding sense transcripts (RNA interference), as well as gene silencing at the chromatin level. Global transcriptome analysis provides evidence that a large proportion of the genome can produce transcripts from both strands, and that antisense transcripts commonly link neighboring “genes” in complex loci into chains of linked transcriptional units. Expression profiling reveals frequent concordant regulation of sense/antisense pairs. We present experimental evidence that perturbation of an antisense RNA can alter the expression of sense messenger RNAs, suggesting that antisense transcription contributes to control of transcriptional outputs in mammals.

The sense strand of DNA generally provides the template for production of mRNA, which in turn encodes proteins. Transcription from the opposite (antisense) strand can produce transcripts that hybridize with the coding DNA strand, or with the sense transcript, to interfere with transcription or mRNA stability.

Although previous analysis of the mammalian transcriptome suggested that up to 20% of transcripts may contribute to sense-antisense (S/AS) pairs (1-3), large-scale cDNA sequencing in the FANTOM3 project (4) suggests that antisense transcription is more widespread. To elucidate the function of S/AS pairs, we used the FANTOM3 data set to analyze their location in the mouse genome, the extent and position of their overlap, and promoter architecture and regulation (4).

Analysis of the imprinted gnas locus in mice demonstrated numerous sense and antisense transcripts expressed selectively depending on parental chromosomal origin (5). However, paired S/AS expression is not restricted to imprinted loci. For example, fig. S1 shows the complex transcript overlap patterns of the HoxA locus. To analyze such complex loci on a genomewide scale, we generated a cDNA set comprising 158,807 full-length transcripts obtained by merging the 102,801 FANTOM3 cDNA set (http://fantom3.gsc.riken.jp/db/) with mouse cDNAs from GenBank (www.ncbi.nlm.nih.gov/Genbank/) and clustering the cDNAs into transcriptional units (TUs), in which members share sequence transcribed from the same strand. There were 50,111 overlapping transcript pairs, grouped into 29,780 nonredundant overlapping regions in 8331 TU pairs (9713 distinct representative overlapping regions).
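
The pairing step described above can be illustrated with a minimal sketch. It assumes each TU is reduced to a single half-open genomic interval with a strand, and it reports every pair of TUs on opposite strands that share at least one base. The `TU` class, the function names, and the coordinates are hypothetical and are not taken from the FANTOM3 pipeline, which worked from transcript-level and exon-level alignments.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class TU:
    """A transcriptional unit reduced to one genomic interval (hypothetical model)."""
    name: str
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"


def overlap_length(a: TU, b: TU) -> int:
    """Number of shared bases between two intervals, or 0 if they are disjoint."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def sense_antisense_pairs(tus):
    """Return (plus_name, minus_name, overlap) for overlapping TUs on opposite strands."""
    pairs = []
    for a, b in combinations(tus, 2):
        if a.strand == b.strand:
            continue  # same-strand overlaps are not S/AS pairs
        n = overlap_length(a, b)
        if n > 0:
            plus, minus = (a, b) if a.strand == "+" else (b, a)
            pairs.append((plus.name, minus.name, n))
    return pairs


# Made-up coordinates, for illustration only.
tus = [
    TU("tuA", "chr1", 100, 500, "+"),
    TU("tuB", "chr1", 400, 900, "-"),
    TU("tuC", "chr1", 450, 700, "+"),
    TU("tuD", "chr2", 100, 500, "-"),
]
pairs = sense_antisense_pairs(tus)
print(pairs)
```

The all-against-all comparison is quadratic in the number of TUs, which is acceptable for a sketch. A genome-scale analysis would more plausibly sort intervals per chromosome or use an interval index.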

In the accompanying paper (4), transcription and termination sites were identified. On the basis of this information, more than 72% of all genome-mapped TUs (43,553) overlap with some cDNA, 5′ or 3′ expressed sequence tag (EST) sequence, or tag or tag-pair region mapped to the opposite strand (Table 1). From the above data, 4520 TU pairs contain full-length transcripts, which form S/AS pairs on exons (Table 2). S/AS interaction might also occur between immature RNAs (heterogeneous nuclear RNA, hnRNA) in the nucleus. Furthermore, introns themselves can originate smaller RNA with biological activity (6). In addition to transcript pairs that share exons in opposite orientations, 4129 TU pairs were transcribed from different strands of the same locus without apparently sharing overlapping exons (Table 2). Although conservative, the combined S/AS prediction is 1.5- to 2-fold greater than that from previous studies of mouse (1) and human (2, 3, 7) transcripts.

Table 1.

Number of individual TUs showing S/AS overlap. “Single or multiple evidence”: at least one type of evidence was used for classification. “Multiple evidence”: at least two independent transcripts were detected. “Overlapping cDNA”: overlap using only the cDNA data set. Noncoding TUs do not have any coding cDNA in the cluster. Coding TUs may contain noncoding variants of coding transcripts.

Columns 3 to 5 give the TUs potentially involved in an S/AS pair.

| TU | Total no. of TUs | Overlapping cDNA, tag, or tag pair: single or multiple evidence | Overlapping cDNA, tag, or tag pair: multiple evidence | TUs with overlapping cDNA evidence |
|---|---|---|---|---|
| Coding TU | 20,714 | 18,021 (87.0%) | 13,711 (66.2%) | 7,223 (34.9%) |
| Noncoding TU | 22,839 | 13,401 (58.7%) | 8,593 (37.6%) | 5,296 (23.2%) |
| Total | 43,553 | 31,422 (72.1%) | 22,304 (51.2%) | 12,519 (28.7%) |

Table 2.

Pairwise analysis of S/AS TUs with cDNA support. Nonexon overlapping bidirectional pairs indicate S/AS pairs having exons overlapping introns of the counterpart, but no exon-exon overlaps, including TUs within antisense TUs.

| TU pairing types | Cis-S/AS pairs | Nonexon overlapping S/AS pairs |
|---|---|---|
| Coding-coding | 1687 | 1081 |
| Coding-noncoding | 2478 | 2452 |
| Noncoding-noncoding | 355 | 596 |
| Total | 4520 | 4129 |

Overlaps of cis S/AS pairs can target different portions of the corresponding TU, giving rise to three basic types of S/AS pairs (fig. S2): head-to-head or divergent (D), tail-to-tail or convergent (C), and fully overlapping (F). The relative abundance of these classes is shown in Table 3. The divergent (head-to-head) classes are the most prevalent, in contrast to previous studies emphasizing convergent cis S/AS pairs (3′-3′ end) (2, 8, 9). For example, the insulin-like growth factor 1 receptor (IGF1R) shows a very strong antisense CAGE tag overlapping the promoter of the sense transcript, which provides a parallel to the AIR noncoding RNA (ncRNA) in the IGF2R locus (10).
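
The three classes can be made concrete with a small sketch. It assumes each member of a pair is summarized by one half-open interval, so that a plus-strand transcript has its 5′ end at `start` and a minus-strand transcript has its 5′ end at `end`. The function name and coordinates are hypothetical, and the exact rules used for fig. S2 are not reproduced here.

```python
def classify_pair(plus, minus):
    """Classify an S/AS pair from (start, end) half-open intervals.

    plus  -- interval of the transcript on the + strand (5' end at start)
    minus -- interval of the transcript on the - strand (5' end at end)
    """
    p_start, p_end = plus
    m_start, m_end = minus
    if min(p_end, m_end) <= max(p_start, m_start):
        return "none"  # no shared bases
    plus_in_minus = m_start <= p_start and p_end <= m_end
    minus_in_plus = p_start <= m_start and m_end <= p_end
    if plus_in_minus or minus_in_plus:
        return "full"  # one transcript lies entirely within the other
    if m_start < p_start:
        # The overlap covers the 5' end of both transcripts (head to head).
        return "divergent"
    # Otherwise the overlap covers the 3' end of both transcripts (tail to tail).
    return "convergent"


# Made-up coordinates, for illustration only.
examples = {
    "divergent": classify_pair((400, 900), (100, 500)),
    "convergent": classify_pair((100, 500), (400, 900)),
    "full": classify_pair((100, 900), (300, 600)),
    "none": classify_pair((100, 200), (300, 400)),
}
print(examples)
```

A TU with several transcripts can fall into more than one class, which this single-interval sketch does not model.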

Table 3.

Directionality of S/AS TUs. Type of S/AS pairs overlap, as in fig. S2. Convergent S/AS pairs overlap tail to tail (3′-3′), and divergent S/AS pairs overlap on the promoter (5′-5′ overlap). Numbers in parentheses: relative difference from expected values (coefficient of variation). The plus or minus sign indicates direction of deviation. Discrepancies from Table 2 derive from counting in all corresponding columns TUs having more than one transcript overlap.

| TU pairing types | Divergent | Convergent | Full overlap |
|---|---|---|---|
| Coding-coding | 727 (+0.03) | 885 (+0.31) | 375 (−0.38) |
| Coding-noncoding | 1092 (0.00) | 832 (−0.20) | 1129 (+0.22) |
| Noncoding-noncoding | 113 (−0.18) | 132 (+0.01) | 140 (+0.20) |
| Total | 1932 [+0.07] | 1849 [+0.02] | 1644 [−0.09] |
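
The caption of Table 3 does not spell out how the expected values were obtained. One reading, offered here as an assumption rather than the authors' stated method, is that each expected count follows from independence of pairing type and overlap class (row total × column total / grand total), and that the bracketed totals are compared with an equal split across the three classes. The sketch below computes (observed − expected) / expected under those assumptions, and the results appear to agree with the printed values to two decimal places.

```python
# Observed counts from Table 3: (divergent, convergent, full overlap).
counts = {
    "Coding-coding": (727, 885, 375),
    "Coding-noncoding": (1092, 832, 1129),
    "Noncoding-noncoding": (113, 132, 140),
}

col_totals = [sum(row[i] for row in counts.values()) for i in range(3)]
grand_total = sum(col_totals)


def relative_difference(observed, expected):
    """Signed deviation of an observed count from its expected value."""
    return (observed - expected) / expected


# Assumption: expected cell count = row total * column total / grand total.
table = {}
for label, row in counts.items():
    row_total = sum(row)
    table[label] = [
        round(relative_difference(obs, row_total * col_totals[i] / grand_total), 2)
        for i, obs in enumerate(row)
    ]

# Assumption: expected column total = grand total / 3 (equal split).
totals = [round(relative_difference(t, grand_total / 3), 2) for t in col_totals]

for label, values in table.items():
    print(label, values)
print("Total", totals)
```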

S/AS phenomena affect different types of genes (tables S1 and S2) and are unevenly distributed across the genome (table S3). Mouse chromosomes 4 and 17 show an S/AS pair density that is greater than average, whereas chromosomes 6, 9, and 13 show an S/AS pair density that is significantly lower than average (table S3). Chromosome 6 is largely homologous to human chromosome 7, which is known to be rich in RNAs transcribed by RNA polymerases I and III, a facet not captured by our approach (11). The X chromosome contains the fewest bidirectional pairs, which could be related to monoallelic inactivation because S/AS pairs are also enriched in imprinted regions (5). We have identified 2114 transcripts (1985 TUs) that are potentially imprinted (12). Among them, on the basis of directly overlapping cDNAs, EST, CAGE tags, and GSC/GIS ditags, 1281 (23% more than expected, P = 2.26 × 10^-15) showed S/AS pairs, and up to 81% of all imprinted TUs showed S/AS pairs when AS sequences to introns were included. This result suggests that S/AS pairing is almost universally associated with candidate imprinted loci. In view of the low frequency of S/AS pairs on the X chromosome, it should be noted that random allelic inactivation also occurs on the autosomes, albeit at lower density than on the X chromosome (13).

Together with microarray analysis (table S4), CAGE tag frequency data represent a de facto expression profiling approach and allowed further validation of the coexpression of S/AS pairs (table S5). Randomly primed CAGE libraries identified more S/AS pairs than did oligo-dT primed CAGE libraries, suggesting that some polyadenylate [poly(A)] minus RNA transcripts or very long noncoding RNA transcripts are involved in S/AS pairs (fig. S3, table S5). In keeping with an earlier report (14), S/AS CAGE tags were detected concordantly at greater than the expected frequency. These coexpressed S/AS pairs (table S6) show complex and tissue-specific regulation. Specific examples are considered in fig. S4. Different types of genes are preferentially involved in S/AS regulation, with particular overrepresentation for cytoplasmic proteins and underrepresentation for membrane and extracellular proteins (tables S1 and S2) (15).

Possible regulatory interactions between S/AS pairs can be assessed by monitoring correlation of expression with time. To assess such patterns of regulation, we selected S/AS pairs for transcripts where the expression was substantially increased or decreased during the activation of bone marrow-derived macrophages by bacterial lipopolysaccharide (LPS) (16). Out of 15 S/AS pairs tested, 7 showed various patterns of reciprocal regulation (Fig. 1). Three S/AS pairs showed proportional coregulation, where both members of the S/AS pair decreased with time. Two pairs showed reciprocal regulation, where one transcript concentration was induced while the other declined in response to LPS. Two more regulated S/AS pairs showed no obvious connection. A transcriptional map of these transcripts is available in fig. S5.
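
As an illustration of how a time-course correlation separates these patterns, the sketch below computes a Pearson correlation coefficient between sense and antisense expression series and assigns a label. The expression values, the 0.5 cutoff, and the function names are hypothetical and chosen only for the example. They are not the measurements or criteria behind Fig. 1.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def describe(r, threshold=0.5):
    """Label a pair from its correlation; the threshold is an arbitrary assumption."""
    if r >= threshold:
        return "concordant"
    if r <= -threshold:
        return "reciprocal"
    return "no obvious connection"


# Made-up relative expression values at four time points after stimulation.
sense = [1.0, 0.8, 0.5, 0.3]
antisense_falling = [1.0, 0.7, 0.6, 0.2]
antisense_rising = [0.2, 0.5, 0.9, 1.4]

for name, series in [("falling", antisense_falling), ("rising", antisense_rising)]:
    r = pearson_r(sense, series)
    print(name, round(r, 2), describe(r))
```

With only a handful of time points, a correlation coefficient is a coarse summary, so any cutoff of this kind should be treated as descriptive rather than as a statistical test.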

Fig. 1.

Time-course analysis of S/AS pairs. Expression of S/AS RNA pairs was verified by reverse transcription polymerase chain reaction over 24 hours after activation of macrophages with LPS. R, correlation coefficient. y axis, relative expression; blue/pink symbols ratio, actual expression levels at time 0 hours. (A to G) Different S/AS transcript pairs.

Although concordant regulation is more frequent in S/AS pairs, there are many examples in which the two transcripts are expressed reciprocally. Examples were chosen to test the effect of disturbing the expression of one or the other partner in the S/AS pair. Out of five S/AS pairs selected from expression profiling (17), two produced divergent coregulation. Figure 2A shows an example of reciprocal regulation of two coding transcripts, Ddx39 (AK012002) and CD97 [a G protein-coupled receptor (AK004577)]. Targeted small interfering RNA (siRNA) inhibition of Ddx39 led to an increase in CD97 mRNA, but the reciprocal effect was not observed (Fig. 2A).

Fig. 2.

Expression perturbation of S/AS pairs. siRNAs were designed against the indicated transcripts to specifically inhibit only the target transcripts without producing an off-target effect. (A) Relative expression of the coding transcripts Ddx39 and CD97 24 hours after transfection. The Ddx39 transcript was silenced by siRNA designed to inhibit the transcript at two positions, Ddx39-1 and Ddx39-2, outside the CD97 overlap. (B to D) Hepa1-6 mouse cells. (B) siRNA perturbation of CEBPD (CCAAT/enhancer-binding protein related) and I530027A02 (hypothetical aminoacyl-tRNA synthetase class II). (C) Overexpression of I530027A02 transcript induces overexpression of CEBPD. (D) Perturbation of KIF20a (rabkinesin-6) and CDC23 (cell division cycle 23 yeast homolog) testing both cytoplasmic and nuclear RNA. (E) HeLa cells. TS-S, thymidylate synthase; TS-AS, thymidylate synthase antisense. Results represent the mean ± SE of three independent experiments performed in triplicate relative to GAPDH (glyceraldehyde-3-phosphate dehydrogenase) controls. Controls, no siRNA added.

CAGE data identified potentially coregulated S/AS pairs in mouse hepatocyte Hepa1-6 cells. In contrast with the above correlation, the inhibition of sense hypothetical aminoacyl-tRNA synthetase class II-containing protein (I530027A02) resulted in decreased antisense C/EBP delta expression, but the reverse interaction was not observed (Fig. 2B). The association between these two transcripts was tested further by transiently overexpressing I530027A02 (Fig. 2C), which caused induction of C/EBP delta expression. This finding argues against the simplistic assumption of a negative regulatory role of antisense transcription.

Similarly, the cytoplasmic level of CDC23 was decreased by siRNA against the AS transcript Kif20a for 48 hours (Fig. 2D). Here, the RNA concentration in the nucleus was diminished, suggesting moderate reduction at a nuclear level as well. Another example is shown in the human HeLa cell line, in which siRNA-mediated ablation of an antisense thymidylate synthetase transcript produced a marginal, but reproducible, elevated level of the thymidylate synthetase mRNA (Fig. 2E).

The examples above involve S/AS pairs in which both partners encode protein, and the transcripts are processed and exported to the cytoplasm. We also manipulated the expression of six pairs in which one partner is noncoding, and in four of them there was a slight positive correlation (18). In three other S/AS pairs in the two different cellular systems tested, there was no evidence that ablation of the AS transcript altered the level of the sense transcript. This finding is consistent with the unaffected phenotype in ROSA-26 locus knockout mice, in which ablation of ncRNA did not alter expression of overlapping coding transcripts (19).

S/AS hybrids can potentially provide the templates for transcript cleavage involving the enzyme Dicer, which forms the molecular basis for so-called RNA interference (RNAi) (20). Dicer cleaves the RNA duplex to produce siRNAs, which in turn catalyze cleavage of the corresponding mRNA. siRNAs can also participate in transcriptional gene silencing in the nucleus (21-23). In addition, Dicer seems to be essential for heterochromatin formation in vertebrate cells (24). In addition to siRNA-mediated activities, noncoding antisense RNAs apparently contribute to local chromatin modification or methylation when they overlap the sense promoter (25-27).

Both RNAi and RNA-dependent heterochromatin assembly, as the basis for function in S/AS pairs, would predict that the transcripts display divergent regulation, but most S/AS pairs in our study were positively correlated in their expression. Alternatively, coexpression would occur if the transcription of the S/AS pair was controlled by the same enhancer elements (28). If antisense transcripts do reflect the transcriptional activity of enhancers, the act of transcription from the antisense promoter may generate the regulatory interaction. In the imprinted IGF2R locus, the antisense transcript, AIR, does not imprint the overlapping Mas1 gene, and elimination of the transcriptional overlap with IGF2R in a transgene does not prevent silencing (29). Hence, some effects of antisense transcription may not require the formation of an RNA duplex.

The large-scale transcriptome profiling of the mouse by the Fantom3 Consortium reveals that antisense transcription is widespread in the mammalian genome. Although there are some examples in which the pairs are discordantly regulated, and some experimental evidence of a direct regulatory interaction, generally the S/AS pairs are positively correlated. Whether concordant or discordant regulation reflects common or divergent regulation, or cis/trans-acting regulatory interactions, will require detailed analysis of the kind presented here for each of the pairs of transcripts under a wide range of conditions.

RIKEN Genome Exploration Research Group and Genome Science Laboratory (Genome Network Project Core Group) and the FANTOM Consortium:

S. Katayama,1,* Y. Tomaru,1,* T. Kasukawa,1 K. Waki,1,2 M. Nakanishi,1 M. Nakamura,1 H. Nishida,1 C. C. Yap,1 M. Suzuki,1 J. Kawai,1,2 H. Suzuki,1 P. Carninci,1,2 Y. Hayashizaki,1,2 C. Wells,3 M. Frith,1,3 T. Ravasi,3 K. C. Pang,3,4 J. Hallinan,3 J. Mattick,3 D. A. Hume,3 L. Lipovich,5 S. Batalov,6 P. G. Engström,7,* Y. Mizuno,7,* M. A. Faghihi,7,8 A. Sandelin,7 A. M. Chalk,7 S. Mottagui-Tabar,7,8 Z. Liang,7 B. Lenhard,7 C. Wahlestedt7,8

Supporting Online Material

www.sciencemag.org/cgi/content/full/309/5740/1564/DC1

Materials and Methods

Figs. S1 to S5

Tables S1 to S6

References and Notes

DDBJ Accession Codes
