Keeping Genome Databases Clean and Up to Date

Science  15 Oct 1999:
Vol. 286, Issue 5439, pp. 447-450
DOI: 10.1126/science.286.5439.447


As GenBank, the public archive that contains every published DNA sequence, grows in size and other biological databases grow in number, so does the need for ways to update and coordinate the information they contain. Genomics experts estimate that some 2% of GenBank's entries may contain DNA introduced by experimental procedures. In other entries, bases are missing or incorrect in stretches of supposedly finished sequence, or genes are even placed on the wrong chromosome. Dozens of teams of bioinformaticists and biologists are trying to tackle these problems, but the task is daunting, and a lack of funds is hampering more systematic approaches, such as having experts review incoming information or developing ways to link existing entries with new data.