This Week in Science

Science 16 Feb 2001:
Vol. 291, Issue 5507, p. 1155
  1. A Tale of Two Sequences

    The publication of the sequence of the human genome is the result of years of intense effort and debate within the scientific community. A special news section (p. 1177) describes how we arrived at this momentous occasion, the debates that led to the publication of two separate sequences, some of the key players, and what lies in store for the sequencing centers. A pullout timeline (p. 1195) hits the high points of the genome story so far.

  2. The Data Horizon

    Although the impact of two human genome sequences will be enormous, it can be difficult to evaluate what the sequences represent or how to use them. Galas (p. 1257) provides a user-friendly guide to some of the fundamental aspects of the genome sequence. He also includes a number of tools and approaches to gene finding and gene structure that can provide access to the new data for the working biologist.

    Filling in the blanks in the sequences and verifying the results will be aided by additional data sets, such as the radiation hybrid map of approximately 36,000 markers described by Olivier et al. (p. 1298). They examined the concordance and discordance between the human sequence derived by Venter et al. and the human genome draft sequence derived by the publicly funded effort. This map should help to identify missing segments in the draft sequence from hard-to-clone regions of the genome.

    The massive quantities of data generated by genomic research provide challenges for bioinformatics, which seeks to integrate computer science with applications derived from molecular biology. Roos (p. 1260) identifies two major challenges to the advancement of bioinformatics research: how to use data released before publication, and how to define restrictions on community use of the data after they have been published.

    Proteomics—the large-scale analysis of a cell's proteins—has already supplanted genomics as the focus of biological research, according to Fields (p. 1221). New technologies combined with traditional molecular strategies are revealing protein function, interactions, and modifications. Better technologies and closer collaborations between scientific disciplines will be needed to mine, analyze, and compare proteomics data.

    Questions of protein function and interactions, developmental and physiological pathways, and systems biology are the focus of the International Mouse Mutagenesis Consortium (p. 1251), which has proposed goals and outlined plans for the next 10 years for annotating the mouse genome and compiling data on mouse mutations.

  3. Insights from Genomic Data

    The complete assembly of the entire human genome sequence by Venter et al. confirms recent estimates that the total number of human protein-coding genes might be fewer than 30,000, only one-third more than in the nematode Caenorhabditis elegans. Claverie (p. 1255) points out that such a low number of genes could drastically modify our understanding of organism complexity and evolution, as well as our current interpretation of transcriptome analyses. He suggests that there may be severe consequences for the long-term sustainability of the biomedical industry in the postgenomic era.

    Courseaux and Nahon (p. 1293) analyze the structural organization, pattern of expression, and origin of two genes that have emerged during primate evolution by a combination of retrotransposition of an RNA sequence, sequence mutations, and de novo creation of splice sites in adjoining sequences. These findings shed light on the first steps in the origins of new genes, and offer clues to the process by which humans and their close primate relatives diverged genetically from other mammals.

    Cells are continually exposed to environmental and endogenous insults that damage DNA. Left unrepaired, this damage will eventually lead to genome instability, with devastating consequences for both the cell and the organism. Wood et al. (p. 1284) have surveyed the human genome sequence and compiled a comprehensive list of genes that help the cell recognize and repair DNA damage. Ongoing studies of how the products of these repair genes interact with one another promise to shed new light on fundamental cellular control mechanisms that go awry in cancer as well as in normal aging.

    Comparison of the proteins coded in the human genome with those from the fruit fly and worms (nematodes) confirms that, in the course of evolution, the process of programmed cell death, or apoptosis, has become more complex. Aravind et al. (p. 1279) found that nematode cells function with just one protein in the NACHT family of nucleoside triphosphatases in their apoptotic arsenal. The human genome, however, shows no fewer than 18 proteins that belong in this family and that are related to NAIP (neuronal apoptosis inhibitory protein), a protein defective in spinal muscular atrophy. Oddly, homologs of proteins in the human apoptotic machinery are found in some bacteria, suggesting that there has been relatively recent gene transfer.

    Once gene sequences are determined, the next question is often how these data relate to expression. Caron et al. (p. 1289) describe the integration of existing serial analysis of gene expression (SAGE) data, which show the level of messenger RNA expression, with the human gene map to reveal the pattern of genome-wide expression. This transcriptome map, created from both normal and diseased tissue types, indicates that highly expressed genes tend to be clustered in specific chromosomal regions, or RIDGEs. This organization is unlike that of yeast and suggests that the human genome exhibits a higher order structure.

  4. The Sequence of the Human Genome

    The sequence of the human genome assembled by Venter et al. (p. 1304) was obtained by pursuing a “shotgun strategy” in which the genome was broken into a random set of fragments of known sizes for sequencing. Computer algorithmic methods were then used to assemble the fragments into contiguous stretches that could be assigned to the correct location within the genome. The article (with representations of Fig. 1 as a chart polybagged with the issue and on Science Online) contains the method and documentation of the quality of the reconstructed genome as well as initial looks at the genes and their organization.
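
    The idea of assembling overlapping fragments into contiguous stretches can be illustrated with a toy example. The Python sketch below is not the assembler used by Venter et al.; it is a minimal greedy overlap merge on error-free strings, and the function names, the `min_len` parameter, and the sample fragments are all hypothetical. It ignores sequencing errors, repeats, reverse strands, and the fragment-size information that a real shotgun assembly would have to take into account.

```python
from itertools import permutations


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Toy assembler: repeatedly merge the pair with the largest overlap."""
    # Drop exact duplicates and fragments fully contained in another one.
    contigs = list(dict.fromkeys(fragments))
    contigs = [f for f in contigs
               if not any(f != g and f in g for g in contigs)]
    while len(contigs) > 1:
        best_n, best_a, best_b = 0, None, None
        for a, b in permutations(contigs, 2):
            n = overlap(a, b, min_len)
            if n > best_n:
                best_n, best_a, best_b = n, a, b
        if best_n == 0:
            break  # no remaining overlaps; leave separate contigs
        contigs.remove(best_a)
        contigs.remove(best_b)
        # Join the two fragments, writing the shared bases only once.
        contigs.append(best_a + best_b[best_n:])
    return contigs


# Hypothetical fragments drawn from the made-up sequence ATGGCGTACGTTAGC.
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGTT"]
print(greedy_assemble(reads))
```

    With these three made-up fragments the sketch rebuilds the single stretch from which they were cut. Greedy merging is chosen here only because it is short; it can misassemble repeated regions, which is one reason genome-scale assembly is far harder than this example suggests.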

  5. Medicine and Genomics

    From SNP maps to individual drug response profiling, the human genome sequence should lead to improved diagnostic testing for disease-susceptibility genes and individually tailored treatment regimens for those who have already developed disease symptoms. Peltonen and McKusick (p. 1224) discuss how the human genome sequence and the completed genome sequences of other organisms will expand our understanding of human diseases, both those caused by mutations in a single gene and those where many genes and multiple factors are involved.

    Behavioral variations among individuals, and behavioral disorders, generally have a large and complex genetic component, according to studies with twins and adopted children. In these complicated systems it has been difficult to identify genes, each of which may contribute only a small amount to the phenotype, but the availability of multiple complete human genome sequences will now facilitate the application of methods such as allelic association. McGuffin et al. (p. 1232) argue that a greater understanding of the biological bases of behavioral disorders will destigmatize such diseases.

    One of the most difficult issues is determining the proper balance between privacy concerns and fair use of genetic information. Although U.S. federal law does provide some protection against discrimination in health insurance, continued scrutiny is needed. Senators Jeffords and Daschle (p. 1249) argue that policies protecting confidentiality in employment, research, and genetic testing (particularly in the reproductive sciences) should be developed and put into federal law. Eventually every country must decide what genetic information should be protected, who will have access to it, and how it may be used.

  6. Metaphors and Meanings

    The sequencing of the human genome affects how we think about ourselves. Pääbo (p. 1219) points out that comparisons between our genome and those of other mammals (particularly the great apes) will reveal how similar we are to the rest of life on Earth. At the same time, the small differences between our genome and those of other animals will provide insights into what makes us uniquely human.

    Recent books have already begun to chronicle the impact of genome sequencing, and three are reviewed in this issue. Kay has examined how the metaphors of information, language, and code influenced the research program and claims of molecular biology. In his review, Lewontin (p. 1263) argues that biologists should not be dissuaded by her poststructuralist terminology, because she effectively demonstrates why the metaphors can be misleading. Keller has used a sketch of the history of 20th-century genetics to argue that the “primacy of the gene” is now in crisis. Carroll's review (p. 1264) finds her conclusions unconvincing, in part because they do not consider recent developments such as the growing success of genetic analyses of complex traits. The race for the genome itself provides some insight into the human condition, and a book by Davies traces the efforts to sequence the human genome from their beginnings in the mid-1980s to the June 2000 finish. Brenner's review (p. 1265) offers his own perspective on these stories of the science, politics, and people drawn into the race.