Not Junk After All

Science  23 May 2003:
Vol. 300, Issue 5623, pp. 1246-1247
DOI: 10.1126/science.1085690

From bacteria to mammals, the DNA content of genomes has increased by about three orders of magnitude in just 3 billion years of evolution (1). Early DNA association studies showed that the human genome is full of repeated segments, such as Alu elements, that are repeated hundreds of thousands of times (2). The vast majority of a mammalian genome does not code for proteins. So, the question is, “Why do we need so much DNA?” Most researchers have assumed that repetitive DNA elements do not have any function: They are simply useless, selfish DNA sequences that proliferate in our genome, making as many copies as possible. The late Susumu Ohno coined the term “junk DNA” to describe these repetitive elements. On page 1288 of this issue, Lev-Maor and colleagues (3) take junk DNA to new heights with their analysis of how Alu elements in the introns of human genes end up in the coding exons, and in so doing influence evolution.

Although catchy, the term “junk DNA” for many years repelled mainstream researchers from studying noncoding DNA. Who, except a small number of genomic clochards, would like to dig through genomic garbage? However, in science as in normal life, there are some clochards who, at the risk of being ridiculed, explore unpopular territories. Because of them, the view of junk DNA, especially repetitive elements, began to change in the early 1990s. Now, more and more biologists regard repetitive elements as a genomic treasure (4, 5). Genomes are dynamic entities: New functional elements appear and old ones become extinct. It appears that transposable elements are not useless DNA. They interact with the surrounding genomic environment and increase the ability of the organism to evolve. They do this by serving as recombination hotspots, and providing a mechanism for genomic shuffling and a source of “ready-to-use” motifs for new transcriptional regulatory elements, polyadenylation signals, and protein-coding sequences. The last of these is especially exciting because it has a direct influence on protein evolution.

More than a decade ago, Mitchell et al. showed that a point mutation in an Alu element residing in the third intron of the ornithine aminotransferase gene activated a cryptic splice site, and consequently led to the introduction of a partial Alu element into an open reading frame (6). The in-frame stop codon carried by the Alu element resulted in a truncated protein and ornithine aminotransferase deficiency. This discovery led to the hypothesis that a similar mechanism may result in fast evolutionary changes in protein structure and increased protein variability (7). Several genome-wide investigations have shown that all types of mobile elements in all vertebrate genomes can be used in this way. The unsolved mystery is how a genome adapts to the drastic changes conferred on a protein by the insertion of a mobile element into the coding region of its gene. Lev-Maor and co-workers and a second group now demonstrate how this process takes place without disturbing the function of the original protein (see the figure) (3, 8).

Junk DNA caught in the act.

Two ways in which a repetitive DNA element, such as an Alu element, can be incorporated into the coding region of a gene without destroying the gene's function. (Top) A TE-cassette is inserted into the mRNA as an alternative exon. (Bottom) Insertion of a TE-cassette is preceded by a gene duplication. In both cases, the genome gains two forms of the mRNA transcript—one with and one without the TE-cassette.

Last year, Sorek et al. (9) noticed that about 5% of alternatively spliced internal exons in the human genome originate in an Alu sequence. Interestingly, because Alu elements are primate specific, these exons must be primate or human specific as well as much younger than other exons in a gene. Additionally, they noticed that the vast majority of “Alu exons” are alternatively spliced (that is, there is always another messenger RNA without the Alu element in the coding region). They concluded that “Alu elements have the evolutionary potential to enhance the coding capacity and regulatory versatility of the genome without compromising its integrity” (9).

In their new work, this group now shows how alternative splicing of Alu exons is regulated (3). It is well established that the precise selection of the 3′ splice site depends on the distance between the branch point site (BPS) and the AG dinucleotide downstream of the BPS. The optimal distance between the BPS and the AG dinucleotide is relatively narrow (19 to 23 nucleotides). Interestingly, if there is another AG dinucleotide closer to the BPS, it will be recognized by a spliceosome even if a second AG located more optimally is used in the transesterification reaction (10). A splicing factor, hSlu7, is required to facilitate recognition of the correct AG. Thus, the correct selection of the 3′ splice site is an interplay between AG dinucleotides and certain splicing factors.
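The distance rule described above can be illustrated with a small sketch. It is a toy illustration only, not the method used in (3) or (10): the function name `downstream_ags`, the example sequence, and the convention that distance is counted from the branch point nucleotide to the G of each AG are all assumptions made for this example. Only the 19 to 23 nucleotide window comes from the text.

```python
# Toy sketch: list AG dinucleotides downstream of a branch point site (BPS)
# and flag those that fall inside the optimal window quoted in the text.
OPTIMAL_MIN, OPTIMAL_MAX = 19, 23  # nucleotides, as stated in the text


def downstream_ags(seq, bps_index):
    """Return (distance, in_optimal_window) for every AG after the BPS.

    Assumption: distance = index of the G in the AG minus the BPS index.
    """
    seq = seq.upper()
    hits = []
    start = bps_index + 1
    while True:
        pos = seq.find("AG", start)  # position of the A in the next AG
        if pos == -1:
            break
        distance = (pos + 1) - bps_index
        hits.append((distance, OPTIMAL_MIN <= distance <= OPTIMAL_MAX))
        start = pos + 1
    return hits


# Hypothetical sequence: BPS at index 0, a proximal AG, then a distal AG.
example = "A" + "T" * 10 + "AG" + "T" * 8 + "AG"
print(downstream_ags(example, 0))  # [(12, False), (22, True)]
```

In this made-up sequence the proximal AG lies closer to the BPS than the window, while the distal AG lies inside it, which is the two-AG arrangement the paragraph describes. The sketch says nothing about which AG is actually used; per the text, that depends on factors such as hSlu7.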

Maintaining the delicate balance of signals that cause an exon to be spliced alternatively is even trickier: a single mistake (a point mutation) either makes a splicing signal too strong, so that the exon is spliced constitutively, or makes it too weak, so that the exon is skipped. Lev-Maor and colleagues (3) performed a series of experiments to identify an ideal sequence signal surrounding the 3′ splice site within the Alu element that kept the Alu element alternatively spliced. It appears that in addition to the distance between two AG dinucleotides, the nucleotide immediately upstream of the proximal AG is also important. Hence, a proximal GAG sequence serves as a signal weak enough to create an alternatively spliced Alu exon. Any mutation of a proximal GAG in the first position results in a constitutive Alu exon. This is an important observation because most of the more than 1 million Alu elements populating the human genome contain such a potential 3′ splice site. Of these, 238,000 are located within introns of protein-coding genes, and each one can become an exon. Unfortunately, most mutations will lead to abnormal proteins and are likely to result in disease. Yet a small number may create an evolutionary novelty, and nature's “alternative splicing approach” guarantees that such a novelty may be tested while the original protein stays intact.
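The proximal GAG rule reported above can be written down as a one-line check. This is a deliberately simplified sketch: the function name `predicted_splicing` and the example sequences are hypothetical, and the rule encodes only the single observation quoted in the text (a G immediately upstream of the proximal AG gives an alternatively spliced Alu exon, any other nucleotide gives a constitutive one). Real splice-site strength depends on many other signals that are ignored here.

```python
def predicted_splicing(seq, proximal_ag_index):
    """Classify an Alu exon from the nucleotide just upstream of the proximal AG.

    proximal_ag_index is the position of the A in the proximal AG
    (an assumed convention for this sketch).
    """
    seq = seq.upper()
    if proximal_ag_index < 1 or seq[proximal_ag_index:proximal_ag_index + 2] != "AG":
        raise ValueError("proximal_ag_index must point at an AG with an upstream base")
    upstream = seq[proximal_ag_index - 1]
    # GAG: weak signal, exon is spliced alternatively (per the text).
    # Any other first position: exon becomes constitutive (per the text).
    return "alternative" if upstream == "G" else "constitutive"


print(predicted_splicing("TTGAGTT", 3))  # alternative
print(predicted_splicing("TTCAGTT", 3))  # constitutive
```

The second call models a point mutation in the first position of the GAG, the case the text says converts the Alu exon to constitutive splicing.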

Another way to exploit an evolutionary novelty without disturbing the function of the original protein is gene duplication (see the figure). Gene duplication is one of the major ways in which organisms can generate new genes (11). After a gene duplication, one copy maintains its original function whereas the other is free to evolve and can be used for “nature's experiments.” Usually, this is accomplished through point mutations and the whole process is very slow. However, recycling some modules that already exist in a genome (for example, in transposons) can speed up the natural mutagenesis process tremendously. Several years ago, Iwashita and colleagues discovered a bovine gene containing a piece of a transposable element (called a TE-cassette) in the middle of its open reading frame (12). This cassette contributes a whole new domain to the bovine BCNT protein, namely an endonuclease domain native to the ruminant retrotransposable element-1 (RTE-1). Interestingly, the human and mouse homologs of bovine BCNT lack the endonuclease domain but instead contain a different one at their carboxyl terminus. This raised two questions: When did the BCNT protein acquire the endonuclease domain, and how did the bovine genome manage such a drastic rearrangement of BCNT without losing its fitness? Iwashita et al. give the answers to both questions in their new study (8). They discovered another copy of the bovine bcnt gene that resembles mammalian bcnt homologs (also called CFDP1) just six kilobases downstream of the gene with the TE-cassette. Both copies of the gene are apparently expressed and both proteins are functional. Phylogenetic analysis suggests that shortly after gene duplication in the ruminant lineage, one of the copies acquired an endonuclease domain from an RTE-1 retrotransposon. Not surprisingly, this gene undergoes accelerated evolution.

The reports by Lev-Maor et al. and Iwashita and colleagues describe different ways in which genes can be rapidly rearranged and acquire evolutionary novelty through the use of so-called junk DNA. These discoveries wouldn't be so exciting if they didn't show how genomes achieve this without disturbing an original protein. To quote an old Polish proverb: “A wolf is sated and a lamb survived.” These two papers demonstrate that repetitive elements are not useless junk DNA but rather are important, integral components of eukaryotic genomes. Risking personification of biological processes, we can say that evolution is too wise to waste this valuable information. Therefore, repetitive DNA should be called not junk DNA but a genomic scrapyard, because it is a reservoir of ready-to-use segments for nature's evolutionary experiments (13).

