Introduction to special issue
BREAKTHROUGH OF THE YEAR

Genomics Comes of Age


Science  22 Dec 2000:
Vol. 290, Issue 5500, pp. 2220-2221
DOI: 10.1126/science.290.5500.2220

2000 was a banner year for scientists deciphering the “book of life”; this year saw the completion of the genome sequences of complex organisms ranging from the fruit fly to the human.

Genomes carry the torch of life from one generation to the next for every organism on Earth. Each genome—physically just molecules of DNA—is a script written in a four-letter alphabet. Not too long ago, determining the precise sequence of those letters was such a slow, tedious process that only the most dedicated geneticist would attempt to read any one “paragraph”—a single gene. But today, genome sequencing is a billion-dollar, worldwide enterprise. Terabytes of sequence data generated through a melding of biology, chemistry, physics, mathematics, computer science, and engineering are changing the way biologists work and think. Science marks the production of this torrent of genome data as the Breakthrough of 2000; it might well be the breakthrough of the decade, perhaps even the century, for all its potential to alter our view of the world we live in.

The pace has been frantic. A year ago researchers had completely read the genome of only one multicellular organism, a worm called Caenorhabditis elegans. Now sequences exist for the fruit fly, the human, and plant geneticists' beloved benchmark weed, Arabidopsis thaliana. And drafts of the genomes of the mouse, rat, zebrafish, and two species of pufferfish are not far behind. Researchers have also been churning through the genomes of simpler organisms: Some five dozen microbial genomes are now on file, including those of the villains that cause cholera and meningitis. Most of these data are accessible to scientists free of charge, catalyzing a vast exploration for new discoveries.

As a result, genomics—the study of genome data—is now in hyperdrive. By comparing mouse to human, worm to fly, or even mouse to mouse, a new breed of computer-savvy biologists is hacking through the thickets of DNA code, discovering not just genes but also other important bits of genetic material, and even evolutionary secrets. We are learning, for example, that we have a lot more in common with Earth's other biota than we thought. Far from being a culmination, these genome libraries will break open decades of new laboratory investigations. And rather than investigate single genes, many 21st-century researchers will tackle whole families of genes and whole pathways of interacting proteins. Indeed, researchers are already studying how patterns of gene expression differ from one tissue to another and under different conditions.

This is a long way from the start of the 20th century, when geneticists were just rediscovering the seminal work of Gregor Mendel, a monk whose experiments with pea plants led to the first insights about heredity. It took until the 1950s for researchers to unmask DNA as the bearer of the genetic code. During the next 2 decades, biochemists developed the cloning and sequencing tools needed to fish out genes. By 1990, an insatiable hunger to know all the genes encoded in the DNA of humans prompted the establishment of the international Human Genome Project. It was biology's first foray into big science, and by almost any measure, it has been a great success. The genome achievements this past year epitomize this century-long and decade-long quest.

Most remarkably, sequencing output has skyrocketed: In May 1999, the public archives contained about 700 million bases of the human genome; by May this year, the figure was more than 3 billion and just 3 months later, more than 4 billion. This is partly thanks to an increase in government, corporate, and foundation support. But new technology in the form of better automated sequencers, as well as intense competition between public and private sequencing efforts, also drove this acceleration.

Sequencing is also faster because gene wranglers have shifted their focus from turning out finished genomes, in which all the bases are in the right order, to generating draft sequence, with bases and even whole sections of DNA still missing or in the wrong place. So whereas human chromosomes 22 (finished late last year) and 21 (completed in May), as well as the Arabidopsis genome (published last week), have all undergone time-consuming “finishing” in which the pieces are put in the right order and discrepancies are resolved, most of the rest of the human genome data that are freely available to the public exist as draft sequence and are only now being cleaned up.

One of the first tests of the value of less-than-completely-finished sequences came in March, when academic experts and Celera Genomics of Rockville, Maryland, teamed up to publish the genome of Drosophila melanogaster, a fruit fly long studied by geneticists. Even with some 1200 gaps where bases were missing altogether, the sequence yielded many new insights about how genomes work.

The Drosophila project also demonstrated the potential of whole-genome shotgun sequencing for large genomes. In this approach, the entire genome is chopped up into millions of overlapping pieces, which are sequenced and reassembled all at once. Shotgunning had worked for sequencing microbial genomes—smaller than 10 million bases—but until Drosophila, most sequencers had tackled larger genomes piece by piece, dividing each piece into small chunks for sequencing and assembly. The completion of the Drosophila genome convinced most that some combination of a whole-genome shotgun and the piece-by-piece approach might be the most efficient way to decipher big genomes.
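
To make the idea of reassembling overlapping pieces concrete, here is a minimal sketch in Python. It is a toy illustration only, not the software used for Drosophila or any other genome: the function names (`overlap`, `greedy_assemble`), the `min_len` threshold, and the 18-letter example sequence are all hypothetical, and the greedy merge shown here ignores the sequencing errors and repeated DNA that make real assembly hard.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Toy assembler: repeatedly merge the two pieces with the largest overlap.

    Assumes error-free reads from a sequence without long repeats
    (an illustrative simplification, not a property of real data).
    """
    # Drop exact duplicates, then reads wholly contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]

    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            # No remaining pair overlaps enough; whatever is left stays separate.
            return contigs
        # Join the best pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    # Hypothetical 18-base "genome" cut into three overlapping pieces.
    pieces = ["GCTTCAGTCA", "GATTACAGG", "ACAGGCTTC"]
    print(greedy_assemble(pieces))
```

When pieces fail to overlap, the function returns several separate stretches rather than one sequence. That is a rough analogue of the gaps left in a draft.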

The Drosophila project was hailed as a model of public-private collaboration. It stands in sharp contrast to the acrimony between Celera and the publicly funded international Human Genome Project over the sequencing of the human genome. That rivalry reached an all-time low in early spring, with barbs flying in the press. But by June, relations had improved enough that the two groups jointly celebrated the near completion of a rough draft at the White House. Although the two groups are not working together per se, they have agreed to publish their work to date on the human genome sequence simultaneously, most likely in early 2001. Meanwhile, the Human Genome Project has released its sequence data free of charge through a publicly accessible database and is moving ahead with finishing the draft. It should have that job done by the end of 2003, if not sooner. Until its paper is published, Celera is making its human data available only to academic and commercial subscribers.

Already, researchers are using the new technology to study many genes or proteins at once. Dotting thousands of bits of genetic material on gene chips for studying the simultaneous expression of thousands of genes has resulted in new insights into the heterogeneity of cancer, the causes of aging, and the complexity of the immune system. And databases of genetic markers called single-nucleotide polymorphisms, or SNPs, which differ from one person to the next, should prove useful in tracking down disease genes and assessing an individual's susceptibility to certain diseases.
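
As a rough illustration of what a single-nucleotide difference looks like in sequence data, the Python sketch below compares two already-aligned stretches of DNA and reports the positions where one letter differs. The function name `find_snps` and the two short sequences are hypothetical. The sketch also simplifies: real SNP discovery must first align the sequences and must confirm that a variant recurs in a population, neither of which is attempted here.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are zero-based. Both sequences are assumed to be aligned
    and of equal length (an assumption made for this illustration).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


if __name__ == "__main__":
    # Two made-up eight-base sequences differing at one position.
    print(find_snps("ACGTACGT", "ACGTTCGT"))
```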

Yet, it's not the genes but the proteins they code for that actually do all the work. A host of promising procedures has cranked up the study of these workhorses, including microarrays made with protein spots instead of DNA spots. In 2000, during their search for new protein-protein interactions, researchers parlayed information about 27 nematode proteins with known roles so as to glean the functions of 100 others that had been complete mysteries. These efforts bespeak the coming era of proteomics, the identification and characterization of each protein and its structure, and of every protein-protein interaction.

This explosion of genetic knowledge comes with some heavy ethical and social baggage: It is not clear how society will deal with the growing potential to manipulate genomes, and many governments are grappling with how to protect individual rights once the technology exists for reading each person's genome. But the allure of this knowledge has made the quest irresistible. The world eagerly awaits the published draft of the human genome, with its genes outlined and its character explained. And almost as eagerly, the gene searchers are chasing down the genomes of many other organisms, a quest that will tell us more about our own genome as well as about our place in the grand library of life.
