The Data Horizon

Science 16 Feb 2001:
Vol. 291, Issue 5507, p. 1155
DOI: 10.1126/science.291.5507.1155b

Although the impact of the two human genome sequences will be enormous, it can be difficult to evaluate what the sequences represent or how to use them. Galas (p. 1257) provides a user-friendly guide to some of the fundamental aspects of the genome sequence. He also describes a number of tools and approaches to gene finding and gene structure that can give the working biologist access to the new data.

Filling in the blanks in the sequences and verifying the results will be aided by additional data sets, such as the radiation hybrid map of approximately 36,000 markers described by Olivier et al. (p. 1298). They examined the concordance and discordance between the human sequence derived by Venter et al. and the human genome draft sequence derived by the publicly funded effort. This map should help to identify missing segments in the draft sequence from hard-to-clone regions of the genome.

The massive quantities of data generated by genomic research pose challenges for bioinformatics, which seeks to integrate computer science with applications derived from molecular biology. Roos (p. 1260) identifies two major obstacles to the advancement of bioinformatics research: how to use data released before publication, and how to define restrictions on community use of the data after they have been published.

Proteomics—the large-scale analysis of a cell's proteins—has already supplanted genomics as the focus of biological research, according to Fields (p. 1221). New technologies combined with traditional molecular strategies are revealing protein function, interactions, and modifications. Better technologies and closer collaborations between scientific disciplines will be needed to mine, analyze, and compare proteomics data.

Questions of protein function and interactions, developmental and physiological pathways, and systems biology are the focus of the International Mouse Mutagenesis Consortium (p. 1251), which has proposed goals and outlined plans for the next 10 years for annotating the mouse genome and compiling data on mouse mutations.

