# News this Week

Science  15 Oct 1999:
Vol. 286, Issue 5439, pp. 382
1. BIOMEDICAL RESEARCH

# Varmus to Leave NIH in December to Run Sloan-Kettering

1. Eliot Marshall

Last week, Harold Varmus made it official. After weeks of rumors (Science, 30 July, p. 649), Varmus informed President Clinton that he will resign as director of the National Institutes of Health (NIH) “at the end of the year” to succeed Paul Marks as president of the Memorial Sloan-Kettering Cancer Center in New York City.

The long-anticipated announcement immediately touched off speculation about who will succeed Varmus at the top of the world's largest and best-known biomedical research institution. It may be hard to fill the post with an outsider just a year before a change of administration. And Varmus's departure rattled some NIH staffers involved in unfinished projects. As one said last week: “You understand why [Varmus is] going… but you can't help feeling, as the song says: ‘You picked a fine time to leave me, Lucille.’”

Varmus will leave NIH in good shape. Last year, Congress gave NIH the largest funding increase on record—$2 billion—bringing its total budget to $15.6 billion. Despite gridlock in Congress this fall, both Republicans and Democrats seem ready to repeat their generosity. The House appropriations committee recently approved a $1.1 billion increase for NIH in 2000, and last week the full Senate approved a $2 billion raise. This would put NIH's budget on the second year of a 5-year path toward doubling.

Varmus says he will “work intensively on NIH problems until the day I leave” in late December. After that, deputy director Ruth Kirschstein will “hold down the fort.” Kirschstein is “diligent, loyal, [and] hard-working” and is “adored” by her peers, says Elizabeth Marincola, executive director of the American Society for Cell Biology. Kirschstein has run NIH before: In 1993, she was acting chief while Varmus awaited approval. But it would add “credibility,” Varmus says, to have “someone who has the documented support of a search committee … and has been endorsed by the Senate.”

Varmus is urging the president to appoint a new director quickly. In his 7 October letter to Clinton, he wrote, “I am conscious of the risks you assumed in 1993” by choosing a bench scientist to head NIH. But, Varmus added, “I hope that the achievements of the past several years will encourage you and your successors to consider other active medical scientists to run this extraordinary agency.” He has been lobbying for a late-term appointment, he says, partly because it would “send a good signal about the bipartisan nature of the job.”

Few seem to think that will happen, however. “I think it would be hard,” says a House Democratic aide. “It's highly unlikely,” says a Senate Republican staffer. “A long shot,” says a key House Republican aide, adding that he can't see someone of national stature “pulling up stakes and coming here for 18 months,” which is how long the job might last if the next president decided to appoint someone new. But conventional wisdom has been wrong before, and Varmus insists there's “a reasonable prospect” of getting a permanent chief soon.

The Administration could speed up the process by recruiting from within NIH. Last week, speculation centered on National Cancer Institute director Richard Klausner, National Institute of Allergy and Infectious Diseases director Anthony Fauci, and National Institute of Mental Health director Steven Hyman. But no local recruitment effort seems to have begun—at least not yet.

The lame-duck status of the NIH director isn't likely to have a big impact either on NIH's budget prospects or on the outcome of the controversy over whether NIH should finance research on human embryonic stem cells, according to congressional aides. But it could slow some projects—like Varmus's plan for the online database called PubMed Central (Science, 3 September, p. 1466). Varmus himself insists that PubMed Central “won't need me” after January, “when we get started and people see how effective it is and how few side effects there are for existing journals.”

Varmus, who will turn 60 next year, says he took the Sloan-Kettering job because “I wanted to be in a place where there was medicine and a prospect of seeing laboratory findings affect a patient. I wanted to be in a great city.” Varmus also has a high regard for Marks, his former Columbia University professor, whose molecular medicine lectures were “pivotal … in my intellectual growth” in the 1960s. (Marks will continue doing research at Sloan-Kettering.)

Varmus's compensation will be a little less than $1 million a year, six times his NIH salary. But he took a pay cut to come to NIH and notes that “I haven't had a salary raise in 6 years.” He and his wife, journalist Constance Casey, “like the idea of being back in the incredible cultural richness of New York.” He will continue his work on mouse models of cancer in his lab and “possibly branch out” to new subjects. But there is one drawback, he concedes: Manhattan isn't ideal for one of his passions—bicycling.

2. FRANCE

# Kourilsky Takes Helm at the Pasteur Institute

1. Michael Balter

Paris—France's preeminent center for biomedical research, the Pasteur Institute in Paris, will start the next millennium with a new leader. Philippe Kourilsky, an internationally known immunologist at Pasteur, will replace outgoing director-general Maxime Schwartz on 1 January. Schwartz is stepping down after serving the maximum allowed tenure of two 6-year terms.

The decision, which was made by the institute's 20-member executive board on 7 October, has been broadly welcomed by Pasteur scientists, who have been engaged in a long debate on the institute's future. “Philippe is universally known as a good scientist and a good man,” comments Jean-Louis Virelizier, a viral immunologist at Pasteur.

The appointment comes at a critical juncture for the institute and its 1100 full-time researchers. Since its founding in 1888 by Louis Pasteur, the institute has occupied an elite position in French biomedical research. It has a stellar scientific track record—eight Nobel laureates in physiology or medicine this century—and finances much of its current $165 million budget through donations, legacies, and income from its own activities such as patented vaccines and diagnostic tests.

In recent years, however, Pasteur has been struggling with both financial problems and an identity crisis. Although income from donations and legacies has been increasing, this is not expected to continue, and some lucrative patents, particularly for hepatitis vaccines and AIDS tests, are either expiring or are being taken over by industry partners. At the same time, the proportion of state funding has been decreasing: Although annual spending has nearly doubled since 1989, the government now provides only 30% of Pasteur's budget, compared to 48% a decade ago.

Meanwhile, Pasteur researchers have increasingly been disagreeing over key issues such as the proper balance between fundamental research and public health concerns. At times these behind-the-scenes debates have broken into the open. For example, Schwartz and Pasteur medical director Philippe Sansonetti ran into stiff resistance when they tried to create an epidemiology department at the institute (Science, 13 November 1998, p. 1241). Sansonetti, whom some had considered the heir-apparent to Schwartz, later quietly took himself out of the running for the top job. By the time the executive board met, the widely respected Kourilsky, who has a long career in basic research but also serves as a government and industry adviser on public health issues, was the only serious choice.

“Everyone was pointing to Philippe,” says Pasteur immunologist Antonio Freitas, who adds that Kourilsky is the right person to lead the institute out of its traditional ivory-tower isolation. “Pasteur needs to be modernized, to be much more open to the outside and establish collaborations with other institutions.” And while Pasteur developmental biologist Margaret Buckingham praises Schwartz's efforts to put the institute “back on the map” in microbiology and other public health-related fields after some years of stagnation in these areas, she says that Kourilsky “will now be well placed to do the same” in booming areas of biomedical research such as “immunology, virology, and animal models of human disease.” (Kourilsky, when contacted by Science, declined to comment on his appointment until he has taken up his duties.)

Whether Kourilsky will be able to resolve the issues facing the Pasteur Institute, only time will tell. But given the nearly unanimous approval of Pasteur scientists, his appointment may be the first thing they have all agreed on in years.

3. PLANETARY SCIENCE

# Neptune's Icy Cold Satellite Comes to Life

1. Richard A. Kerr

No planet lives forever, geologically speaking. After 4.5 billion years, Earth is in its middle age, its inner stores of heat trickling out to drive plate tectonics. Earth's moon, being smaller, has lost its life-giving heat faster; the flow of its surface-renewing lavas slowed and it died eons ago, like most of the other small bodies in the solar system. But planetary scientists are now realizing that a satellite even smaller than the moon, Neptune's Triton, is still showing signs of life.

The realization is all the more startling because it is based on reanalyses of 10-year-old observations. “This is showing us some real surprises in the outer solar system,” says planetary scientist Alan Stern of the Boulder, Colorado, office of the Southwest Research Institute. Triton's meager heat from lingering radioactive decay, researchers assume, can still melt its interior of exotic ices to produce lavas or otherwise reshape and renew the surface. Recent telescopic observations even hint that geologic activity has made itself evident in recent years.

Planetary scientists use the drizzle of comets and asteroids onto a body's surface as a kind of geologic clock: The more craters there are, the longer it's been since mountain building, flooding by lavas, and other geologic processes reshaped the surface. And a young surface means a lively inner planet. In 1989, when the Voyager 2 spacecraft took the first and only close-up images of Triton, planetary scientists set to work counting impact craters. They assumed that all the craters were made by comets coming from the Oort Cloud, far beyond the outer planets. Given the calculated flux of Oort-based impactors, they eventually concluded that Triton's surface on average has been accumulating impacts for about 600 million years or less, a young face for a 4.5-billion-year-old body but hardly as young as Europa's 50-million-year-old visage.

A discovery out past Neptune has changed that view dramatically. In 1993, astronomers first spied a resident of the Kuiper Belt, a long-hypothesized disk of bodies left over from the formation of the solar system (Science, 23 June 1995, p. 1704). Although billions of Kuiper Belt objects (KBOs) normally orbit 1 billion to 3 billion kilometers beyond Neptune and Pluto, some of them fall inward to add to the solar system's comets, and a few of these collide with planets and their satellites. Two planetary scientists—Stern and William McKinnon of Washington University in St. Louis—have now factored in a rain of impactors onto Triton from the nearby Kuiper Belt, a flux calculated to be five times that from the Oort Cloud. Stern and McKinnon told a workshop* last month that Triton now appears to have been resurfacing itself fast enough to make the average age of its surface around 100 million years old.
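The revised age follows from simple proportionality, which can be sketched in a few lines of Python. The sketch assumes, as a simplification not spelled out in the article, that craters accumulate at a constant rate, that the crater count itself is unchanged, and that the Kuiper Belt flux adds to the Oort Cloud flux rather than replacing it. The function name and parameters are illustrative only.

```python
def surface_age_myr(oort_only_age_myr, kuiper_to_oort_flux_ratio):
    """Rescale a crater-count age when an extra impactor source is added.

    For a fixed number of craters, the inferred age is inversely
    proportional to the total impactor flux (assumed constant in time).
    """
    # Total flux relative to the Oort Cloud flux alone (Oort = 1.0).
    total_flux_relative = 1.0 + kuiper_to_oort_flux_ratio
    return oort_only_age_myr / total_flux_relative


# Figures quoted in the article: about 600 million years with Oort Cloud
# comets alone, and a Kuiper Belt flux five times the Oort Cloud flux.
revised_age = surface_age_myr(600.0, 5.0)
print(f"Revised average surface age: about {revised_age:.0f} million years")
```

Under these assumptions the 600-million-year estimate shrinks by a factor of six, which is in line with the figure of around 100 million years that Stern and McKinnon reported.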

The most likely implication of such youthfulness, says planetary scientist Jeffrey Kargel of the U.S. Geological Survey in Flagstaff, Arizona, is that “Triton has been very active [geologically] through 98% of its history. … If it was active 100 million years ago, it probably still is active.”

Most researchers would agree with Kargel, but another pair of planetary scientists is offering evidence of an even younger age for Triton's surface. Kevin Zahnle of NASA's Ames Research Center in Mountain View, California, and Paul Schenk of the Lunar and Planetary Institute in Houston factored in the Kuiper Belt, too, but Schenk also took another look at Voyager images and counted craters again. This time, Schenk sharpened the images with the same mathematical technique used to clarify flawed images from the Hubble Space Telescope. Now he could more easily recognize true impact craters in previously cryptic terrain.

Surprisingly, Schenk found that “all the craters are on one side of the satellite.” As Triton orbits Neptune, it sweeps up debris “like a car driving through a rainstorm,” says Schenk, “so the raindrops all hit on one side of the car.” Where the debris came from is a mystery, but Zahnle thinks the best bet is the destruction of an inner satellite in a collision with a comet. If that's all true, Zahnle and Schenk told the workshop, Triton has been resurfaced so rapidly of late that few or no KBOs have had a chance to pock it; therefore, its surface would clearly be less than 100 million years old and quite possibly less than 10 million years old. That would make it as geologically young as Europa.

Whichever age is correct, “the important thing is Triton's surface really is relatively young,” says McKinnon. Given its meager supply of heat, its youthfulness requires a resurfacing agent so easily mobilized that it can modify Triton's 37-kelvin surface. Lavas of water plus agents like ammonia or methanol that lower water's melting point are a possibility, says McKinnon. They may be rising from an “ocean” 150 kilometers down, bounded above and below by water ice, he says.

Whatever has kept Triton looking young over the eons may have been at work in recent years. Astronomers Michael Hicks and Bonnie J. Buratti of the Jet Propulsion Laboratory in Pasadena reported at the workshop that telescopic observations show Triton taking on a strong reddish tint for a few months at a time. Somehow, Buratti says, “most of the surface” is being altered. “It looks like there's something geological going on.” Maybe it's just Triton freshening up once again.

• *Pluto and Triton: Comparisons and Evolution Over Time, held from 23 to 24 September in Flagstaff, Arizona. See www.lowell.edu/workshop

4. PLANETARY SCIENCE

# Another Distant Consort for the Sun?

1. Richard A. Kerr

The age of discovery for planet-size bodies in this solar system would seem to have ended in 1930 with the discovery of Pluto. That tiny body turns out to be just the largest bit of debris remaining from the formation of the planets. Most of the smaller bits ring the sun in the asteroid belt or in the Oort Cloud, the spherical swarm of distant comets far beyond Pluto. Astronomers therefore generally take Pluto to be the end of the line for planet formation. But a small band of astronomers has kept up the search for a tenth planet, and this week, researchers announced two independent proposals for the location of yet another companion to the sun. And if they are right, it would be no Pluto-sized midget.

Both proposals suggest that, out among the comets of the Oort Cloud, an object several times more massive than Jupiter is orbiting some 25,000 to 30,000 times farther from the sun than Earth. Both groups argue that this unseen behemoth gravitationally perturbs Oort Cloud comets, sending them toward Earth along a distinctive sky-girdling band. But the evidence doesn't impress many other researchers. “I just don't believe it,” says planetary dynamicist Harold Levison of the Boulder, Colorado, office of the Southwest Research Institute.

The attempt to track down unseen objects through their gravitational effects “has a long and not very honorable tradition,” says astrophysicist Scott Tremaine of Princeton University. “It worked in 1846 with Neptune,” he notes, when two mathematicians independently fingered the yet-to-be-discovered planet as the cause of unexplained squiggles in the orbital motion of Uranus. But “people have tried it since,” he says, “without much success.” For example, proposed tenth planets have failed to materialize, including “Planet X,” which was supposed to graze the inner edge of the Oort Cloud and explain periodic impacts and extinctions on Earth (Science, 22 March 1984, p. 1451). A proposed stellar companion to the sun, dubbed Nemesis, has also failed to turn up so far.

Now two groups—including some veterans of Planet X and Nemesis—are again proposing a tenth major body orbiting the sun. This week at the Division for Planetary Sciences annual meeting in Padua, Italy, physicists John Matese and Daniel Whitmire of the University of Louisiana at Lafayette argued that a planet or even a brown dwarf—a massive gas ball still too small to ignite stellar fires within it—orbits through the outer Oort Cloud.

They base their assertion on the paths taken by a third of the 82 most closely studied comets observed to fall from the Oort Cloud into the inner solar system. Most comets that make it into the inner solar system are shaken loose by the galaxy's gravitational jiggling of the Oort Cloud, which Matese and his colleagues assume would send an even rain of comets falling from all parts of the sky. But the Louisiana group, which included the late Patrick Whitman, finds that about three times as many comets as expected approach in a band of the sky that circles Earth like the longest stripe on a croquet ball. And these comets, bunched in the sky, also tend to have atypically short orbits, which don't take them as far into the Oort Cloud or as close to the sun as other comets. The best explanation, the group will report in Icarus, is a body having 1.5 to 6 times the mass of Jupiter and orbiting the sun at a mean distance of about 25,000 times the Earth-sun distance—that is, in the heart of the outer Oort Cloud. “The [orbital] statistics are not compelling,” says Matese, “but they're very, very suggestive.”

Planetary scientist John Murray of The Open University in Milton Keynes, United Kingdom, also thought the bunching of comets in the sky was suggestive. In this week's Monthly Notices of the Royal Astronomical Society, he follows much the same trail as the Louisiana group and arrives at much the same conclusion. But he goes further, locating the putative comet perturber precisely near the constellation Aquila the Eagle.

Those familiar with the vagaries of cometary orbits remain skeptical. “There are some anomalies in the distribution” of comet orbits, says planetary dynamicist Julio Fernandez of the Institute of Astronomy in Montevideo, Uruguay, “but the statistical sample is not large enough to draw such conclusions.” Tremaine agrees about the small sample size and adds that the Oort Cloud is not likely to be as uniform as Matese and Murray assume. Recent close encounters with passing stars may explain the comet clumping, he says. Levison agrees with both those criticisms and raises the possibility of observational bias, the tendency of comets to be found in a band near the plane of the solar system because that is where astronomers tend to search.

Matese, for one, rebuffs the criticisms but remains philosophical. The orbital anomalies “are not likely to be explained by chance, bad data, or selection effects,” he says, but “nothing is going to be settled by most of these statistical arguments. The vast majority will remain skeptical, perhaps rightly so.” Matese and Whitmire will be patient. They are still waiting for their mid-1980s proposal of Planet X to pan out; and Whitmire had his own version of Nemesis. The final resolution of their latest proposal, says Matese, may come with infrared telescopes capable of detecting the perturber's warmth, like the Space Infrared Telescope Facility, due for launch in 2001.

5. BUDGET 2000

# Congress Boosts NSF, Reverses NASA Cuts

1. Andrew Lawler*
• *With reporting by Jeffrey Mervis.

The budgetary roller coaster ride for many U.S. scientists ended last week when President Bill Clinton said he would sign a bill that gives the National Science Foundation (NSF) a significant boost for 2000. The bill also grants NASA's science program more than either the Senate or the House had been willing to provide, although still less than the agency asked for. That victory, however, comes with a steep price tag for NASA: millions of dollars in pork-barrel spending.

House and Senate members who met on 7 October in a crowded chamber in the Capitol set aside $13.65 billion for NASA and $3.91 billion for NSF for the budget year that began 1 October. Both figures are close to the amount Clinton wanted. “I am delighted … it's a win for the economy and the nation,” said NSF director Rita Colwell in a prepared statement.

Legislators apparently robbed housing programs and the space station to put some money back into space science. Last month, the House had approved $240 million less than the agency's $2.1 billion request, whereas the Senate had cut the request by $120 million. Both actions were loudly protested by White House and NASA space science officials (Science, 24 September, p. 2045). But last week lawmakers, led by Senator Barbara Mikulski (D-MD) and Rep. Alan Mollohan (D-WV), agreed to a complicated maneuver that reduces the space science request by a mere $46 million. An additional $75 million will be spread across science, aeronautics, and technology programs, although it remained unclear early this week which programs will benefit from that money. And the lawmakers retained some $70 million in earmarks—unrequested spending—that NASA must swallow, including $15 million for a solar terrestrial observatory to be built and operated by two Maryland institutions, Johns Hopkins University and the Applied Physics Laboratory. The bill also reduces NASA's Discovery program of cheaper and faster space probes by $24 million, which NASA officials say could delay announcement of the next two missions.

Nevertheless, there was relief among space scientists. “Despite the fact we got caught up in serious budgetary give-and-take, we came out in the end with real support,” says Steven Squyres, a Cornell astronomer who chairs NASA's space science advisory panel.

Lawmakers compromised on the controversial Triana mission, a $75 million effort inspired by Vice President Al Gore that would beam back pictures of the whole Earth. Work on the spacecraft will be halted until the National Academy of Sciences conducts a study of its scientific goals. NASA had planned to launch the mission at the end of next year, but agency officials expressed relief: “This is not a termination,” said one. NASA life and microgravity sciences won a boost of $21 million above the $264 million requested, whereas earth science will receive only a $4 million cut to the $1.46 billion request—a far cry from the threatened $285 million House reduction. Much of that regained money will go to NASA's Goddard Space Flight Center in Greenbelt, Maryland. “The funding will save 2000 jobs cut by the House bill,” Mikulski said after the conference.

For NSF, the conferees voted a 6.6% increase, to $3.91 billion. That amount overrides a flat budget approved by the House (Science, 6 August, p. 813) and falls only $9 million short of NSF's requested hike of $250 million. It also restores funds for a key administration computing initiative and several new projects. NSF may have cemented its leading role in the proposed $366 million information technology initiative by receiving all but $5 million of its $110 million request for research and the full $36 million for a terascale computer. The conferees ratified both the Senate's $10 million boost to a $50 million plant genome program and its support for a $50 million biocomplexity initiative that the House had trimmed by $15 million. They also removed Senate language that would have shifted $25 million in logistical support for Arctic research—a boost of $3 million over the request—from NSF to the independent Arctic Research Commission (Science, 1 October, p. 24). “I guess it was a tempest in a teapot,” says commission director Garrett Brass, “and we appreciate their continued support for Arctic logistics.”

6. ARMS CONTROL

# Scientific Groups Endorse Test Ban

1. Eliot Marshall

Physicists took center stage in Washington, D.C., last week for a quick reprise of the military debates of the 1980s. President Clinton appeared with a group of scientists and military leaders on 6 October for a spirited defense of the Comprehensive Test Ban Treaty (CTBT), which would ban all nuclear testing. Opponents of the treaty, who regard it as a threat to national security, cited their own technical experts. They also mixed in the carefully worded testimony of the heads of the three U.S. weapons laboratories about the limitations of any treaty, which were also aired at a congressional hearing held 1 day after the White House event.

This Cold War-era rhetoric was the result of a surprise 30 September announcement by Senate Majority Leader Trent Lott (R-MS) that the treaty, which President Clinton sent to the Senate 2 years ago, would be brought up for a vote by mid-October after 2 days of debate. (A two-thirds majority is required for approval.) Recognizing that he lacks the votes to win, Clinton at press time was negotiating for an indefinite delay.

The eight pro-CTBT physicists who participated in the White House event represented a group of 32 Nobel laureates who signed a statement arguing that it is “imperative” that Congress approve the treaty to “halt the spread of nuclear weapons.” Charles Townes, a University of California, Berkeley, physicist who co-invented the laser, noted that the United States began a unilateral moratorium on nuclear testing under President George Bush in 1992. “My colleagues and I … have concluded that continued nuclear testing is simply not required to retain confidence in America's nuclear deterrent,” he said.

On the same day, two other scientific societies, the American Geophysical Union (AGU) and the Seismological Society of America (SSA), released an unprecedented joint statement expressing confidence in the treaty's verification scheme.

The CTBT forbids parties from conducting or helping others conduct “any nuclear weapon test explosion,” and it establishes a complex administrative system to keep everyone honest. It would create an analytical center to collect data from a global network of sensors: 170 seismic stations (more than 70 of which are now functioning), 80 radionuclide sensors, 60 infrasound detectors of low-frequency blast waves, and 11 hydroacoustic ocean detectors. Under the CTBT, any nation that suspects another of conducting a test could demand, and presumably get, an on-site inspection. A challenger could also use evidence from its own “national technical means,” such as spy satellites.

Clinton agreed to these terms and signed the treaty in 1996, sending it to the Senate for ratification in 1997. Fifty-one other countries, including Britain and France, have now ratified it. Lott opposes ratification, as do many Republican senators, including John Warner (R-VA), chair of the Armed Services Committee, and Jesse Helms (R-NC), chair of the Foreign Relations Committee. Warner, for example, has said he's concerned that the treaty would deprive scientists of the best means—actual nuclear explosions—of checking the safety and reliability of U.S. weapons. Other opponents doubt the monitoring network is good enough to prevent cheating.

Treaty opponents trumpeted a 3 October story in The Washington Post in which unnamed “senior officials” said the Central Intelligence Agency has “concluded that it cannot monitor low-level nuclear tests by Russia precisely enough to ensure compliance” with the treaty. CIA spokesperson Bill Harlow says this is a simplification of the CIA's report but declines to clarify the CIA's view. The effect was “devastating,” says one physicist lobbying for the treaty.

The chiefs of the Department of Energy (DOE) weapons labs, called to testify before the Senate Armed Services Committee on 7 October, didn't provide their political bosses with much ammunition, either. Bruce Tarter, director of the Lawrence Livermore National Laboratory, said that simulated weapons testing under DOE's $4.5 billion-per-year Stockpile Stewardship Program “has an excellent chance of ensuring that this nation can maintain the safety, security, and reliability of the stockpile without nuclear testing” but that “it is not a sure thing.” C. Paul Robinson, director of the Sandia National Laboratory, said the best guarantee of security is to continue testing weapons. “To forego that validation … [is] to live with uncertainty,” Robinson warned. Los Alamos National Laboratory chief John Browne said the reliability of nuclear weapons requires a “national commitment”—in other words, generous funding of the stewardship program and less criticism of lab management.

The CTBT did receive a technical vote of confidence last week from a joint AGU-SSA review panel, which had examined the plan for detecting low-yield nuclear tests. The CTBT allows for on-site inspections covering no more than 1000 square kilometers of any alleged test site. This is the area within which current technology can pinpoint the location of a magnitude 4 seismic event (equivalent to a 1-kiloton blast). The AGU-SSA panel, chaired by seismologist Terry Wallace of the University of Arizona in Tucson, concluded that the CTBT verification network, when complete, “can be relied upon” to detect and locate 1-kiloton tests. Members of the panel—including Gregory van der Vink of the Incorporated Research Institutions for Seismology in Washington, D.C., and Jeffrey Park of Yale University—said they didn't think it would be difficult to spot a weapons development program, even if the tests were very small.

The AGU-SSA group acknowledged a major uncertainty, however: No one has reliable data on a blast deliberately “decoupled” from the environment. Some research suggests the seismic signal would be nearly cut in half by decoupling, a process in which a damping substance (or air) is introduced between a bomb and the surrounding structure to reduce the transmission of blast waves. But doing this would require “extraordinary technical expertise,” according to the AGU-SSA statement, and in any case, “the likelihood of detection is high.” Van der Vink said that decoupling a bomb might increase the risk of radionuclide release, which would be picked up by an independent set of sensors. “No nation could rely upon successfully concealing a program of nuclear testing, even at low yields,” the panel concluded.

Whatever the Senate does, the test ban is likely to be discussed at election time. And that may mean an encore for the scientists who appeared in last week's drama.

7. CELL BIOLOGY

# New Insights Into Cystic Fibrosis Ion Channel

1. Michael Hagmann

For commuters all over the world, a broken traffic light can be a nuisance. But when the proteins that regulate the traffic of molecules into and out of cells malfunction, it can spell disaster. Take the protein encoded by the gene at fault in cystic fibrosis, which strikes about one in 3000 newborns every year in the United States alone. Known as the cystic fibrosis transmembrane conductance regulator (CFTR), this protein channels chloride ions through the cell membrane, thereby regulating the water and salt balance in cells that line organs such as the lungs and intestines. Mutations that prevent the CFTR from doing its job disrupt this chloride transport, which in turn causes the lungs and certain other organs of affected individuals to fill up with thick, sticky secretions, setting the stage for life-threatening lung infections.

New insights into how the CFTR works may now help researchers design drugs that regulate the operation of this vital cellular channel. On page 544, physiologist Kevin Kirk of the University of Alabama, Birmingham, and his colleagues report that they've identified an interaction between two parts of the CFTR molecule that seems to help keep the channel open. Drugs directed at the CFTR regions involved in that interaction might therefore serve to either enhance or inhibit chloride transport through the channel.

Although few cystic fibrosis patients are expected to benefit from any channel-activating drugs—most CFTR mutations prevent the protein from even getting to the membrane—channel inhibitors may have a wide application. Various forms of watery diarrhea, including those caused by the cholera bacterium and pathogenic Escherichia coli, are due to toxins that kick the CFTR into overdrive. Such infections kill far more people than cystic fibrosis, mainly children in developing countries. “This is a solid step forward, one of the more important insights into CFTR regulation in recent years,” says biochemist and CFTR co-discoverer Jack Riordan of the Mayo Clinic in Scottsdale, Arizona.

The new findings stem from a discovery Kirk, Anjaparavanda Naren, and their colleagues made about 2 years ago. They found that a membrane protein called syntaxin 1A shuts down the CFTR channel by holding on to one of the channel protein's “tail” regions, which protrude into the cell interior. This suggested that the tail somehow controls CFTR function. To find out more about what it does, the team performed a series of experiments in which they either mutated specific amino acids in the tail region that binds syntaxin 1A or deleted the tail altogether.

The researchers introduced the mutant genes separately into frog eggs, where the protein products were made and inserted into the cell membranes. Because chloride transport affects the cell's electrical properties, the team assessed CFTR function by measuring the overall current across the membrane in response to signals known to activate CFTR. The researchers found that the tail had to be present for normal channel opening to occur. And they identified four negatively charged amino acids, all clustering on the same side of a predicted helical region of the tail, as crucial to that operation. “This suggested that the tail probably interacts with some other part of the CFTR,” Kirk says. Fishing for the partner region, the researchers came up with “the most obvious candidate,” as Kirk puts it. This is the so-called R domain.

Previous work has shown that the R domain keeps the CFTR channel closed until it is decorated with chemical tags called phosphate groups in response to CFTR activation signals. At that point it seemingly “swings out” of the way and sets the stage for a second incoming signal, the binding and subsequent cleavage of a small molecule called ATP, which provides the energy necessary to pop the channel open. Kirk now proposes that the tail binding to the R domain helps keep the channel unlocked. He and his team found, for example, that a CFTR with mutations in the presumed R-binding region opens at about the same rate as the normal CFTR but, once open, closes much faster. This, “of course, reduces the total amount of chloride ions” a cell can shuttle in and out, Kirk says.
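
One way to see why faster closing lowers chloride transport is a two-state gating sketch. This is a textbook simplification, not the model used in the study: it assumes the channel flips between a closed and an open state with an opening rate $k_{\text{open}}$ and a closing rate $k_{\text{close}}$, both symbols introduced here for illustration only.

$$
P_{\text{open}} = \frac{k_{\text{open}}}{k_{\text{open}} + k_{\text{close}}},
\qquad
\tau_{\text{open}} = \frac{1}{k_{\text{close}}}
$$

Under that assumption, leaving $k_{\text{open}}$ unchanged while raising $k_{\text{close}}$ shortens the mean open time $\tau_{\text{open}}$ and lowers the fraction of time $P_{\text{open}}$ that the channel spends open, so less chloride passes over a given period. Real CFTR gating also involves the phosphorylation and ATP steps described above, which this sketch ignores.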

In the past, the CFTR tail had attracted little attention, but that is now likely to change. “The tail is a new player in the game, and this suggests a new way of regulating the gating of CFTR which may involve other proteins that can bind to the tail,” says William Guggino, a cell physiologist at the Johns Hopkins School of Medicine in Baltimore. One of these other proteins is likely syntaxin 1A, which may keep the tail from interacting with the R domain until activating signals somehow disrupt syntaxin binding and release the tail to capture the R domain.

What's more, says Richard Boucher, a physiologist at the University of North Carolina, Chapel Hill, the newfound tête-à-tête between the tail and the R domain “gives you a clear-cut target” for drugs against cholera or for drugs to treat the 10% to 20% of cystic fibrosis patients whose CFTR makes it to the cell surface but is crippled by mutations. And though it's unclear whether such compounds will work, specific CFTR blockers, for instance, should help clarify the intricate workings of the CFTR, which even today—10 years after its discovery—still holds many secrets.

8. MATERIALS SCIENCE

# Words Writ--Very--Small by a Nanopen

1. Robert F. Service

In 1959, physicist Richard Feynman gave a speech that inspired later generations of scientists. Titled “Plenty of Room at the Bottom,” the talk foreshadowed one of today's hottest trends in materials science: nanotechnology, the assembly of chosen molecules into tiny materials that can be used for everything from fluorescent dyes to solar cells. Feynman also took some seemingly wild flights of fantasy, such as wondering whether crafty researchers would one day find a way to write an encyclopedia on the head of a pin. Now, chemists at Northwestern University have memorialized a paragraph of Feynman's speech in a most appropriate way: They've used a nanoscale pen and ink set to write it in an area just one-thousandth the size of a pinhead.

The work, performed by postdoc Seunghun Hong and graduate student Jin Zhu along with group leader Chad Mirkin, is described on page 523. It isn't the first example of nanoscale writing, but it is the first time researchers have accomplished the job with multiple “inks” that line up with one another to produce features as small as 5 nanometers.

More than just a gimmick, the achievement could pave the way for applications, ranging from testing novel catalysts to creating nanoscale electronic devices, that might reveal whether the dream of making electronic devices with the dimensions of molecules could ever succeed. “It seems like a real enabler” of nanotechnology, says Clifford Kubiak, a chemist and nanotechnology expert at the University of California, San Diego.

Previously, researchers have used either electron beam lithography or, more recently, the tiny styluslike arm of an atomic force microscope (AFM) to create nanometer-sized features on a surface. But these techniques can damage the surface, particularly if it's already been patterned with an organic ink, or leave behind molecular contaminants, making it hard to add new, pristine layers that line up in perfect registry with the ones below, a typical requirement for making electronic devices.

To get around these problems, Mirkin and his colleagues came up with a technique called dip pen nanolithography (Science, 29 January, p. 661). It takes advantage of another problem encountered by researchers trying to write features with a conventional AFM: water from the air that condenses between the tip and sample, interfering with the tip's motion and thereby the resolution of the image. Rather than being stymied by this problem, Mirkin's team used the water layer to transport organic ink on the AFM tip to a surface. Thus, dragging the tip over the surface produces well-defined lines.

With just one ink, they could write simple structures, including letters. But making an electronically active nanostructure requires positioning different organic conductors, insulators, and semiconductors in different regions. The Mirkin team hasn't yet accomplished that, but it has taken a step in that direction by figuring out how to align a second set of ink marks with the first.

They began by coating one AFM tip with an ink consisting of 16-mercaptohexadecanoic acid (MHA), an organic molecule capped with a water-attracting carboxylic acid group. They then used this ink, along with their computer-controlled AFM, to write a set of parallel lines 70 nanometers apart. Because they feared that their second AFM pass would damage these lines if they used it to locate them directly, they also put in cross-shaped alignment marks, which sit 2 micrometers on either side of the lines.

Next, the researchers changed their AFM tip to one dipped in a second ink called 1-octadecanethiol (ODT), which is capped with a water-repelling methyl group, and scanned this tip across the surface to find the alignment marks. The computer then positioned the tip near the original set of parallel lines and wrote another set alongside the first. Alternatively, the researchers can simply use the second ink to fill in the space around the first set of features, because the two inks they chose don't mix.

Finally, to view the patterns they created, the team switched to an uncoated AFM tip, which they used to scan the entire surface. Since the tip encountered higher friction with the MHA than with the ODT, it could tell the two materials apart and create an image of the pattern.

All this tip-changing sounds slow and tedious, but the team has recently automated the procedure, enabling them to write letters reasonably quickly. Indeed, the 115-word paragraph from Feynman's speech took 10 minutes—“about the same amount of time it took us to print it out on our color printer,” says Mirkin.

While nanowriting could generate some interest among spies, Kubiak believes its real value will be in making numerous nanoscale electronic devices in a highly reproducible fashion. That would benefit researchers trying to understand how well such small devices operate. Mirkin adds that the technique could also be used by researchers trying to understand catalyst behavior, since it would enable them to place catalysts and reactants in precise locations, just a few nanometers apart, and then to watch to see how they react over time. But, as with most enabling technologies, the best use is probably not yet even a glimmer in anyone's eye.

9. POLICING OF SCIENCE

# A Misconduct Definition That Finally Sticks?

1. Jocelyn Kaiser

A White House panel was due this week to unveil the first government-wide definition of improper conduct in scientific research. True to long-circulating rumors, the new definition would narrow research misconduct to three specific acts: fabrication, falsification, and plagiarism (FFP). But officials at the White House's Office of Science and Technology Policy (OSTP) say they've fleshed out these categories to ensure that a variety of serious misdeeds are explicitly included.

The proposed misconduct policy, expected to appear in the Federal Register on 14 October, spells out a range of procedural steps for policing misconduct that all federal agencies would be forced to follow. The most controversial issue, however, is the definition, which the scientific community has agonized over for years. Those who have advocated a minimalist definition appear to have won the battle. “I think it is something that we are comfortable with,” says David Kaufman, a toxicologist at the University of North Carolina, Chapel Hill, and president of the Federation of American Societies for Experimental Biology (FASEB), who notes nonetheless that his organization had not seen the final phrasing before Science went to press.

The misconduct definitions now used by the Department of Health and Human Services (HHS) and the National Science Foundation (NSF) consist of FFP and “other serious deviations” from accepted practice, a clause that has been criticized as too vague. But other attempts to broaden FFP have faltered: A 1995 proposal for an updated HHS definition, from a commission headed by Harvard reproductive biologist Kenneth Ryan, drew fire for being too open-ended and potentially stifling creativity (Science, 21 June 1996, p. 1735). In April 1996, an OSTP committee called the National Science and Technology Council (NSTC) set out to craft its own definition of research misconduct.

The NSTC's proposal, obtained by Science, starts out much like the existing HHS definition, defining research misconduct as “fabrication, falsification, or plagiarism in proposing, performing, or reviewing research, or in reporting research results.” The added value of the NSTC wording is that it spells out each of these concepts in a sentence to ensure that they encompass misdeeds that may have fallen through the cracks. In addition, to make it clear that destroying a colleague's research data is considered misconduct, the NSTC definition added “manipulating research equipment” to the falsification category. The definition also explicitly covers plagiarism during peer review.

Included in the proposed policy is the statement that a misconduct finding must amount to “a significant departure from accepted practices of the scientific community.” NSF had argued in favor of such wording, which echoes its own misconduct definition. The agency has invoked a similar clause in at least one case—to discipline a professor who sexually harassed several students. Although “there's no flexibility to go beyond research misconduct more broadly than it's defined in this policy,” an OSTP official says, agencies and universities are free to investigate and prosecute other transgressions during the course of research.

Parties now have 60 days to comment on the definition; the Office of Research Integrity and the National Academy of Sciences are planning a meeting on 17 November with FASEB, journal editors, and other groups to vet the proposal. Judging from the subdued reaction so far, it appears the agencies have reached the end of a long road toward a standard definition of research misconduct.

10. DIPLOMACY

# Gibbons Joins Effort to Boost Science at State

1. David Malakoff

The State Department now has a 12-step self-help plan for producing science-savvy diplomats. A National Academy of Sciences (NAS) panel last week sent Secretary of State Madeleine Albright a dozen recommendations for rebuilding her department's depleted expertise in science, technology, and health. But while top diplomats welcome the ideas—and have asked former White House science advisor Jack Gibbons to help put them into practice—they say a budget crunch could slow the progress. “The problem is not one of will but of resources,” says Ken Brill, acting head of the agency's Bureau of Oceans and International Environmental and Scientific Affairs.

The new report,* which Albright requested in April 1998, is a fleshed-out version of a preliminary study the panel released late last year (Science, 25 September 1998, p. 1937). It concludes that science-based issues—from trade in genetically modified crops to global climate change—are moving “to the forefront of the international diplomatic agenda” just as the State Department is losing technically trained staff. The number of full-time science counselors at embassies, for instance, has slipped from 22 in the 1980s to about 10 today. The panel also found it “striking and alarming” that foreign service officers assigned to the agency's roughly 300 science-related posts, many of them part-time, had “weak” academic credentials. “Ironically, as the world becomes more technologically interdependent, the trend at the State Department has been to downplay science and technical expertise,” says panel chair Robert Frosch, a research fellow at Harvard University in Cambridge, Massachusetts.

To reverse that trend, Frosch's committee goes straight to the top. Albright, the panel says, “should articulate and implement a policy that calls for greater attention to [science] dimensions of foreign policy throughout the department” and should appoint a high-ranking aide to make sure that technical advice is injected into policy discussions. The panel also recommends setting up an external advisory committee, training all diplomats on technical issues, strengthening the department's ties with research-oriented agencies, and assigning 25 new full-time science counselors to key outposts abroad. But the department shouldn't pick and choose among the recommendations, Frosch says. “We want it to be a package and not a menu.”

State Department officials won't say if they'll go that far. Albright plans to meet with NAS president Bruce Alberts “as soon as their schedules permit.” But Brill notes that State Department officials have already asked Gibbons, who left the White House last year, to help a committee review the report and draw up a game plan by next spring. Gibbons will also help develop the science advisor's post and aid ongoing efforts to strengthen training, increase dialogue with scientists, and recruit more academics to serve stints as science fellows within several departments. Brill warns, however, that all that may be tough to do on a budget that has shrunk by 15% since 1993.

• *“The Pervasive Role of Science, Technology, and Health in Foreign Policy: Imperatives for the Department of State” (www.nap.edu)

11. SCIENTIFIC COMMUNITY

# 10 Years After the Wall, Science Revives in Eastern Germany

1. Robert Koenig

The scientists of former East Germany endured perhaps the toughest overhaul of any national research system. They came through it, but there are some regrets

BERLIN—A decade ago, bioinformatics researcher Jens Reich was struggling to keep working within communist East Germany's corrupt research system. Because he had been pegged as a political dissident, his research group in East Berlin had been dissolved and the secret police were tapping his phones. Biochemist Benno Parthier—who had refused to join the communist party—was stuck for the 24th straight year as a group leader at an Academy of Sciences institute in Halle. And Dagmar Schipanski, a successful electronics researcher, was languishing as an assistant professor in Ilmenau with almost no opportunity to travel to Western countries.

Today, Reich is an influential researcher at the Max Delbrück Center for Molecular Medicine, a national research center created in eastern Berlin. Parthier finally got to direct the Institute for Plant Biochemistry in Halle, and in 1990 he was elected president of the Leopoldina science academy (see sidebar). And Schipanski became the first former East German to head the Science Council, then ran for the German presidency earlier this year, and is now science minister for the state of Thüringen.

These three emerged successfully from the years of upheaval that followed the fall of the Berlin Wall on 9 November 1989, but many of their colleagues were not so fortunate. In the general euphoria after the East-West divide was bridged, East German scientists looked forward to joining the well-organized and generously funded research system of their West German compatriots; finally, they would be free to travel and would be insulated from political interference. But for many, absorption into West German research was no salvation. Within 2 years of the wall's demolition, the East German Academy of Sciences was disbanded and the Science Council carried out a tough evaluation of research institutes, which it reorganized, merged, or simply shut down. Researchers had to reapply for their positions, and a large fraction of the academy's 24,000 employees lost their jobs.

This harsh medicine, perhaps the most fundamental overhaul ever applied to a major research enterprise, has transformed former East Germany's once bloated and poorly funded research system. Several basic research institutes—some incorporating parts of centers from the old regime, others started from scratch in the past decade—are now widely regarded as being on a par with those in the west, or at least rapidly gaining ground. University research has not fared so well, however; it is catching up only slowly. And industrial research, which collapsed after reunification, is only now showing signs of life, thanks largely to a growing number of small start-up companies in fields such as computer software and biotech.

Today, 10 years after the wall came down, many former East German scientists view with satisfaction the transformations their institutions have endured. But the process has left scars that will be slow to heal. “Overall, I think the restructuring of science in the east has been a success,” Schipanski told Science. “Researchers are motivated now, and there have been fantastic improvements in their laboratory equipment.” Parthier says “the big science transformation decisions were inevitable,” but he regrets that so many older East German scientists lost their jobs after 1990 and too many talented young researchers left the country. Reich is pleased that “we now have the opportunity to do world-class research here,” but he thinks “we could have done better with the transition, which was done too hastily.”

One complaint continues to resonate among researchers in the east: Many feel they have been “colonized.” Westerners got the lion's share of the top positions after reunification, and institutes were shoehorned into West Germany's research structures. Germany, some critics believe, missed an opportunity to overhaul the whole system.

Indeed, government officials, business leaders, and editorial writers are trying to boost the sluggish economy by exhorting scientists like Kanegasaki and Murai to take the plunge into the business world. Three agencies have requested $85 million in the fiscal year beginning 1 April for a package of subsidies, tax breaks, and loans to nurture new companies, particularly “bioventures,” in the hope of expanding the country's minuscule presence in commercial biotechnology. This planned surge of money is the latest in a series of steps—from loosening regulations covering stock offerings to clarifying intellectual property rights—aimed at generating more start-ups, particularly from university and national institute labs. As a result, a few companies like Effecter are beginning to dot the landscape. But Japan's scientific community still has a long way to go to match the entrepreneurial vigor of U.S. researchers. Fostering a true venture business culture in Japan, says Yoshihiro Ohtaki, a molecular biologist who now heads Biofrontier Partners, “could take 10 years.”

One sign of progress, in a country where the establishment still holds enormous sway, is the participation of some senior scientists in these start-ups. This spring, for example, Kenichi Matsubara, professor emeritus of molecular biology at Osaka University and a key figure in Japan's early human genome research efforts, joined a dozen colleagues to establish DNA Chip Research Inc. The company hopes to develop DNA chips for diagnostic purposes before tackling the technology needed to characterize the subtle genetic differences among individuals known as single-nucleotide polymorphisms. These differences are expected to help scientists trace disease genes and develop drugs tailored to those characteristics.

Observers estimate that a dozen or more start-ups, covering everything from computer-based rational drug design to improvements in NMR techniques, have sprung from work at the Institute of Physical and Chemical Research (RIKEN) outside Tokyo, the University of Tsukuba, and other schools. And more are on the way. Megumi Takata, director of the Center for Advanced Science and Technology Incubation Ltd. (CASTI), the licensing organization affiliated with the University of Tokyo that helped to set up Effecter, expects another company to be formally established this year and several more by next spring. Officials of the licensing organization affiliated with the University of Tsukuba know of at least three groups working on business plans. And Biofrontier's Ohtaki says he is getting inquiries “from all over Japan.”

These new entrepreneurs are taking advantage of a raft of changes in government policies. “There wasn't just a single bottleneck,” says Ohtaki. “It was more like a jigsaw puzzle with too many pieces missing.” In the past few years financial regulations that made it nearly impossible for start-ups to raise money and offer shares to the public have been relaxed, restrictions on the use of stock options loosened, and tax incentives created for financial “angels” to get behind venture businesses. Earlier this year, legislators enacted what is seen as the Japanese version of the 1980 Bayh-Dole Act, the U.S. law that gives universities the right to commercialize publicly funded research. But there's one big difference: In Japan the rights go directly to researchers. To help national university and institute researchers patent and market their discoveries, the government has also authorized special technology licensing organizations.

Although these regulatory moves are important, scientists say, a shift in attitude among both private sector researchers and university professors will be essential. It's still extremely rare for someone like Murai to give up a well-paying, secure position at a big company for a risky start-up. And even a decade ago it would have been “considered unseemly for academics to engage in commercial activities,” says Matsubara. Most faculty at top schools join a university after receiving their doctorate and spend their careers climbing the ladder. Indeed, a pool of retired but still active professors may be a key to success: Kanegasaki says he would not have given up his professorship to start a business before reaching retirement age, and Matsubara also retired before launching his business career.

A distaste for business shows up at every stage of the process. Kanegasaki says opposition from his former colleagues at University of Tokyo's Institute of Medical Science, for example, prevented him from renting lab space for his fledgling company. Many start-ups even try to retain an academic flavor by using “institute” or “research laboratory” rather than “company” in their corporate names. And few professors plan to take their companies public, says Kazunori Kondo, who studies venture businesses at the National Institute of Science and Technology Policy. Such privately held businesses, he argues, are less likely to become powerful engines of economic growth.

Those who break through these prejudices face a different set of problems. Whereas faculty members can license rights to their discoveries, their active participation in private companies is still strictly limited. They cannot serve on a corporate board, for example, although they can serve as scientific advisors. And mid-career moves are still a lot more treacherous in Japan than in the United States, says Takata. “This makes it difficult for the start-ups to find the bench-level researchers they need to turn a discovery into a product,” he notes. One bright spot may be the growing ranks of researchers entering the scientific workforce after completing postdoctoral appointments (Science, 3 September, p. 1521).

An even bigger problem, Ohtaki warns, may be a shortage of managers capable of building companies from the ground up. “There isn't such a pool of managers in Japan now,” he says. The only answer, he says, is to start more businesses so that potential managers can get the necessary on-the-job training.

Good management principles are uppermost in Murai's mind, too. Sitting in his office at a conference table that once served as a kitchen table, he explains that his company's immediate goal is to show that the protein Kanegasaki discovered really has the potential to be developed into a drug. At that point, he says, the company could sell stock to the public and use the proceeds to conduct clinical trials or hook up with an established drug firm. But long before Effecter gets that far, Murai says, it's going to need a second round of venture financing.

Even while spouting the language of a venture business manager, however, Murai is carrying out his share of the lab work. While that combination may be rare—“I don't think there are many like me in Japan yet,” he says—he believes that his training is right for the job: “I think it's going to be easier for researchers to learn about starting businesses than for business managers to learn about biotechnology.”

16. COMPUTER GAMES

# Physics Meets the Hideous Bog Beast

1. Mark Sincell*

• *Mark Sincell is a science writer in Houston, Texas.

Programmers are turning to physics to add more reality to computer games, but so far the early market tests have been disappointing

When you slime a hideous bog beast with your laser blaster in real life, the beast doesn't consult a table to figure out which way it is supposed to fall. In the virtual world of most computer games, however, that's exactly what happens. A programmer has carefully scripted each potential event—like the fall of the blasted bog beast—long before you tear the shrink-wrap off your new game. If a particular combination of causes and effects isn't found in the programmer's predetermined table of allowed possibilities, it just doesn't happen.

While this approach worked well enough when Pong was the state of the art in video games, many game designers think the traditional scripted game is becoming too restrictive. Their attempts to exploit advances in computer technology and inject more natural behavior into gaming have given birth to a whole new form of interactive entertainment: physics-driven computer games. So far only one physics-driven game, Trespasser, has made it to market—where it flopped. Nevertheless, several companies are now spending millions of dollars developing new games and the software engines to drive them.

In one sense, the computer game industry is driven by novelty, and the potential payoff of the first truly physics-driven game is huge. “We are all looking for the next big thing,” says David Wu of Pseudo Interactive Inc., a Toronto-based game design company, “and physics is the biggest frontier in gaming right now.”

Computer games are all about movement: prowling through a dungeon in search of treasure, skidding around corners in a high-speed chase, or sending an opponent tumbling with a well-placed flying drop kick. In a scripted game, movement is like a movie with several different endings. For each choice a player makes, a graphic designer has pre-recorded a video clip of the resulting motion. Although game designers are expert at linking the clips to produce an almost seamless illusion of continuous motion, they can't always cover every possibility. Inevitably, bugs creep into the animation code and the seams start to show.

Although serious gamers don't expect the virtual world to be perfectly true to life, the last few years have seen dramatic advances in computer graphics that are raising players' expectations. “As the graphics get more realistic, your eye starts to pick out movement problems,” says Chris Hecker, founder of the Seattle-based game company Definition Six. The inconsistencies can cause players to “lose the suspension of disbelief that makes a game fun,” says Hecker.

But how can a programmer possibly account for all the complexities of real motion? That's where the physics comes in. Instead of scripting each event, the new generation of programmers uses physical laws to create objects that obey a specific set of rules in all circumstances. Instead of saying that a car traveling around a curve at high speed will always skid into the wall and then creating a film clip showing the crash, the programmer writes in the appropriate coefficient of tire friction. Then the computer handles the gritty work of summing all the forces on the moving car to determine where it moves next. And if the car hits the wall, the computer can even predict the flight path of errant tires as they bounce over the wall into the grandstands.

To apply the laws of physics, the game developer breaks down each physical object in the game into a collection of simple geometric components—cubes, spheres, or cylinders—connected by joints. “Then you assign masses to the parts of the body and add the properties of real joints, like a ball-and-socket or a hinge,” explains Anselm Hook, a game-physics developer at the London-based company MathEngine. The objects can be as simple as a boulder or as complicated as the human body. When two bodies interact, the physics “engine,” the portion of the computer code that handles the physics, first computes the forces on each object, including gravity, collisional impulses, and friction. It then solves the constrained differential equations of motion governing the components of each body and moves them forward in real time.

As you might imagine, the physics engine soaks up precious computational resources. At any given moment, a game player's view might include several other creatures and various objects, and the engine must continuously monitor the forces on every item in the scene. “You can't forget about a box on a table if you want to keep it there with forces,” says Hecker. Top-of-the-line home computers and commercial video games have only recently acquired the horsepower to drive these complex engines.

While the inexorable increase in computer speed should soon take care of that problem, physics-based games face other hazards. “Physics engines can get blown up” when a differential equation solver becomes unstable, says MathEngine's Bryan Galdrikian. Instability—a central problem in numerical analysis—happens when tiny differences between the numerical solution and the “real” solution accumulate, causing the computer to lose track of the “real” solution.

Instabilities were one explanation for the commercial failure of the first physics-based computer game, Trespasser. Based on the movie Jurassic Park 2: The Lost World, Trespasser placed players on an island filled with malevolent dinosaurs that obeyed only the laws of physics. After several years in development and almost $7 million, Trespasser was released with all the fanfare—and hopes for profitability—that a Jurassic Park tie-in can provide. Unfortunately, few people bought it. “It looked more like a research project than a game,” says Wu. And Trespasser was plagued with instabilities that caused weird things to happen. “An object could suddenly sink into a rock,” says Galdrikian.
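
The kind of instability Galdrikian describes can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not code from Trespasser or from any commercial engine: it follows a single frictionless mass on an ideal spring, a stand-in for one joint of a game object, and advances it with two textbook update rules. Every function name and parameter value is hypothetical.

```python
# Toy illustration of numerical instability in a physics step.
# Assumptions (not from the article): one mass on an ideal spring, no friction,
# unit mass and stiffness, a fixed time step. All names are hypothetical.


def acceleration(x, k=1.0, m=1.0):
    """Total force divided by mass; the only force here is the spring, -k*x."""
    return -k * x / m


def explicit_euler(x, v, dt, steps):
    """Advance position and velocity using only values from the previous step."""
    for _ in range(steps):
        a = acceleration(x)
        x, v = x + v * dt, v + a * dt
    return x, v


def semi_implicit_euler(x, v, dt, steps):
    """Update the velocity first, then move the position with the new velocity."""
    for _ in range(steps):
        v = v + acceleration(x) * dt
        x = x + v * dt
    return x, v


def energy(x, v, k=1.0, m=1.0):
    """Kinetic plus spring energy; the exact solution keeps this constant."""
    return 0.5 * m * v * v + 0.5 * k * x * x


if __name__ == "__main__":
    start = energy(1.0, 0.0)
    rules = (("explicit", explicit_euler), ("semi-implicit", semi_implicit_euler))
    for name, step in rules:
        x, v = step(1.0, 0.0, dt=0.1, steps=1000)
        ratio = energy(x, v) / start
        print(f"{name:>13}: energy ratio after 1000 steps = {ratio:.3g}")
```

With these settings the explicit rule lets small per-step errors compound until the energy has grown by orders of magnitude, while the semi-implicit rule keeps it close to its starting value. A real engine faces the same trade-off across many jointed, colliding bodies, where runaway errors show up as objects flying apart or sinking into one another.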

Bloodied but unbowed by Trespasser's flop, game designers are turning to academics to learn how to build more robust engines. Wu, for one, is looking to the world of robotics for better ways to control the creatures in his games. “I spend a great portion of my time reading research papers and journals,” says Wu. “If you want to innovate in game development, you must look deeper—into more focused and less applied research.” But that can be a hard assignment for game designers. Hecker, an entirely self-taught physicist, estimates he has spent “3 years and counting” studying game-related physics.

Alan Milosevic thinks he has a better idea. “Game developers are not experienced in physics,” he points out, “and they are gasping for help.” Instead of forcing developers to learn physics, he argues, why not provide them with a general-purpose physics engine that they can plug directly into their game? With a prepackaged engine controlling movement in the game, designers would be free to worry about making the game fun.

Betting that developers would rather buy an engine than learn physics, his company MathEngine—and competitors Ipion and Telekinesys—are working furiously to get the first fully functional engine to market. MathEngine has hired several physicists to help with the design, but Milosevic says that potential employees need more than an advanced research degree to succeed in gaming. “We can't just solve the fluid dynamics equations,” says Milosevic, “so our employees have to be able to improvise and imagine.”

After all, physics-based games need to be more than physically consistent: They need to be fun. Some video game aficionados worry that physics-based games will be dully realistic. If you can't jump 30 feet, they say, what's the point? But Galdrikian argues that “realistic doesn't mean the game has to be completely real, and gravity doesn't have to equal Earth gravity, or even be constant. It could change in a game.” Says Hecker: “The real motivation behind incorporating physics engines isn't reality at all,” but creating total consistency. If all the objects in a game obey a consistent set of rules, a gamer's absorption is less likely to be disrupted by actions that don't “feel right.”

Although game designers are betting that the “rightness” of physics-based games will eventually strike a chord with consumers, few are yet willing to risk years and millions of dollars to produce the next Trespasser. So, instead of sinking the entire investment into a game completely driven by physics, game designers are focusing on creating entertaining games while slowly incorporating more physics. For the time being, physics will be used “as eye candy, until we get used to it,” says Hecker.

17. # Do-It-Yourself Gene Watching

1. Eliot Marshall

The growing use of relatively inexpensive microarrays to monitor the expression of thousands of genes at once is creating a flood of data on everything from strawberry ripening to viral pathogenicity

Next week, students will begin arriving at the Cold Spring Harbor Laboratory on Long Island to begin “our most oversubscribed laboratory course on record,” says David Stewart, director of meetings. Sixteen people paid $1955 each to learn how to build and use a machine for genetics research—a device that deposits thousands of pieces of DNA in precise microarrays on glass slides. For another $30,000, four will actually take the machine home. “We were somewhat amazed,” says Stewart, surrounded by boxes of parts waiting to be assembled. The course is new and it wasn't even advertised, yet eight times as many people signed up as could be accepted.

Microarrays are hot. People who never thought they would do large-scale gene studies suddenly are eager to try their hand at monitoring thousands of genes at once. They are watching patterns of gene expression change as strawberries ripen, viruses cause disease, and tuberculosis infects host cells (see sidebar). And they are cataloging the genes that are overexpressed or suppressed when normal cells become cancerous. The National Institutes of Health (NIH) is supporting this trend, funding its own microarray studies and providing grants to institutions to buy the technology. All this is generating a flood of data that traditional journals find hard to accommodate and digital databases don't yet know how to handle.

The basic idea behind this surge of interest isn't new: Researchers have been using microarrays since the early 1990s to study gene expression en masse. What is new is the relatively low cost of entry into the field. Over the past year or so, inexpensive, do-it-yourself techniques like the one being demonstrated at Cold Spring Harbor have become widespread, replacing or complementing the high-tech “GeneChip” technology that was once about the only game in town.

The GeneChip system, made by the Affymetrix Corp. of Santa Clara, California, paved the way, and is still the system of choice for many pharmaceutical companies and academic labs that can afford it. Affymetrix uses a photolithographic method borrowed from the electronics industry to deposit probes for thousands of different genes on a single wafer the size of a dime. Each probe is a short stretch of synthetic DNA called an oligonucleotide that replicates a unique sequence identifying a gene. These “oligos” are laid down in precise, sequence-specific arrays. To determine which genes have been expressed in a sample, researchers isolate messenger RNA from test samples, convert it to complementary DNA (cDNA), tag it with fluorescent dye, and run the sample over the wafer. Each tagged cDNA will stick to an oligo with a matching sequence, lighting up a spot on the wafer where the sequence is known. An automated scanner then determines which oligos have bound, and hence which genes were expressed.
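The readout step described above, in which a known probe position that lights up implies an expressed gene, can be sketched in a few lines. This is a toy illustration under stated assumptions: the function name `expressed_genes`, the grid layout, the gene names, the intensity values, and the fixed threshold are all hypothetical, and it does not represent Affymetrix's scanner software.

```python
def expressed_genes(layout, intensities, threshold):
    """Return the genes whose probe spot is at least as bright as the threshold.

    layout:      maps a (row, column) spot position to the gene its probe identifies
    intensities: maps a (row, column) spot position to the scanned fluorescence
    """
    return sorted(
        gene
        for position, gene in layout.items()
        if intensities.get(position, 0.0) >= threshold
    )


# Hypothetical three-spot array and made-up scanner readings.
layout = {(0, 0): "geneA", (0, 1): "geneB", (1, 0): "geneC"}
intensities = {(0, 0): 950.0, (0, 1): 40.0, (1, 0): 610.0}
calls = expressed_genes(layout, intensities, threshold=200.0)
print(calls)
```

The point of the sketch is only that the array's value comes from the lookup table: because each sequence sits at a known location, a bright spot can be translated directly into a gene name.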

Affymetrix sells a variety of standard kits for yeast, Arabidopsis, mouse, rat, and human genes, among others, which are listed at $500 to $2000 per chip. (The chips are good for one use.) The company donates equipment to collaborators at major genome centers, but few labs get free chips and few can afford the estimated $175,000 it costs to install an Affymetrix setup. Several researchers claim that, until recently, it was also hard to get GeneChip arrays because supplies were short.

Among those responsible for lowering barriers to the field are the three scientists who will be teaching the Cold Spring Harbor course, all from Stanford University: geneticist Patrick Brown, his former grad student Joseph DeRisi, and bioinformatics expert Michael Eisen. Brown, along with an engineering student named Dari Shalon, devised a cheap way of generating microarrays in the mid-1990s to study patterns of gene expression in yeast. It's simple but effective: Instead of using expensive and time-consuming photolithography to lay down oligo arrays, the Stanford team uses metal rods like fountain pens to deposit carefully selected cDNAs at known locations on a microscope slide. These cDNAs act as probes for genes expressed in a test sample.

Shalon left Stanford to found a company based on this concept, Synteni Inc. of Palo Alto, California. Last year, Incyte Pharmaceuticals, also in Palo Alto, acquired Synteni for $80 million. Incyte now processes microarray chips for a fee, much as film is processed. But Brown and his lab took a different tack: They give the technology away.

Last year, DeRisi launched a Web site that explains exactly how to build a microarray machine with off-the-shelf parts (see sidebar, p. 446). And Eisen has given away gene-clustering software that identifies patterns in microarray data. Brown, meanwhile, has become a big proselytizer, inviting dozens of collaborators into the field. Kenneth Burtis, a Drosophila expert at the University of California, Davis, who followed DeRisi's lead and built his own arrayer, says, “Joe's take on it was: ‘People don't realize this isn't rocket science, and they shouldn't be afraid of it.’ That's the way I got swept up in this.”

Many other researchers are building machines, and several companies are now selling machines like Stanford's at roughly twice the price of the do-it-yourself model. Geoffrey Childs and Aldo Massimi at the Albert Einstein College of Medicine in New York City, and Vivian Cheung at the University of Pennsylvania, Philadelphia, designed and built microarrayers from scratch. Others, including a team at Rosetta Inpharmatics Inc. in Kirkland, Washington, and at the Hewlett-Packard Co. of Palo Alto, have developed ink-jet oligo printers, but these are not generally available.

Affymetrix, meanwhile, has taken steps to increase its production of GeneChip arrays and offer terms more agreeable to academics. In September, the company also moved into the spot microarray world, acquiring a small company that sells these machines, Genetic Microsystems of Woburn, Massachusetts. DeRisi views this move as an attempt to swallow the competition, but Affymetrix's vice president of marketing, Thane Kreiner, describes it as a way to give clients a technology that “complements” the GeneChip, although the company insists that GeneChip arrays yield higher quality data.

All of this points to a boom in microarray experimentation by “mom-and-pop” genetics labs. What is the attraction? Simple, Brown says: “As people look at large-scale pictures of the expression programs of genomes, they've begun to realize that there's at least as much information in genomes entirely devoted to [controlling] where and at what level the genes are expressed” as to defining proteins. Gene expression, he points out, is what really distinguishes one cell type from another. “And suddenly, that's just an open book.”

Among the sponsors of this technology is National Cancer Institute (NCI) director Richard Klausner. NCI was an early collaborator on GeneChip technology and has been funding large-scale studies of gene expression in cancers since 1996. Now NCI is backing low-cost microarrayers as well. On 21 September, the institute awarded $4 million to 24 institutions, including cancer clinics, to help them set up microarray facilities. “It is absolutely imperative that cancer researchers have open access to this technology,” Klausner said in a prepared statement. Klausner and others are hoping that the ability to monitor gene expression will enable them to “produce a snapshot of the genes that are active in a tumor cell.”

This thrust was advocated by an advisory panel chaired by Eric Lander of the Massachusetts Institute of Technology and Arnold Levine, president of The Rockefeller University in New York City, both of whom are themselves major users of the Affymetrix technology. Lander, for example, has recently been developing tools for cataloging leukemias by their gene expression signatures (see Golub Report, p. 531). And Levine recently published a study of gene expression in colon cancer.

Several lab chiefs at NIH also began collaborating on microarray studies with Brown, Eisen, and Stanford geneticist David Botstein in the mid-1990s. Now they're hooked. Jeffrey Trent, intramural research chief at the National Human Genome Research Institute (NHGRI), built a Stanford-style arrayer 3 years ago on NIH's campus in Bethesda, Maryland, and has been using it to study genes involved in melanoma. Like other devotees, Trent believes that GeneChip arrays and microarrays are powerful because of their huge data output. Big samples make it easier to spot patterns, such as common sets of genes expressed in different kinds of cells.
The Stanford “mantra” is quite simple, says Eisen: “More data is good.” Eisen's software sorts through the color-coded microarray readouts, clustering genes that exhibit similar patterns of expression in various cells. Trent and his colleagues at NHGRI made their own slides to monitor the expression of more than 8000 human genes from 31 melanoma tumors. Offering a visitor a glimpse of the results last month, Trent pulled out a sheet with colored dots grouped in what he calls “Eisenized” clusters. Along the top are names of the melanoma cell types; down the side, in fine print, are the names of human genes whose fragments were deposited on the slides.

To generate the data for this chart, Trent tagged cDNA from cancerous and normal control samples with red and green fluorescent dye, respectively, then washed the samples over the slides. Genes strongly expressed in the cancer cells as compared to a reference standard gleamed a lurid red when excited by a laser, while those underexpressed showed up in green. Genes expressed in roughly equal proportions came out yellow. Eisen's algorithm grouped genes with similar expression patterns across the range of cell types in colored blocks on the chart, on the assumption that the function of these genes is similar as well. Genes of known and unknown function turn up in clusters, so researchers tentatively assign functional labels to unknown genes based on their cluster mates. Trent acknowledges that this approach is “speculative,” but it is a first step, he believes, in developing new, molecular definitions of high- and low-risk types of melanoma.

A short distance from Trent's lab on NIH's Bethesda campus, an NCI team led by Edison Liu and Louis Staudt is using a locally made arrayer to investigate breast cancer, leukemia, and lymphomas. Staudt described some of this work at a meeting of microarray researchers in Scottsdale, Arizona, last month, comparing it to astronomy. His lab is doing “discovery” research, he explained.
Like Galileo, he suggested, NCI scientists have a new instrument so powerful it will let them see patterns that just weren't visible before. Staudt warned, however, that there are professional risks in this venture. Galileo was denied tenure, he joked, because he was handed “a pink slip saying [his telescope] wasn't hypothesis-driven”—something for which microarray studies are sometimes faulted.

Staudt and colleagues have created what they call the “Lymphochip,” an array with 18,500 carefully selected genes involved in the development of the immune system's antibody-producing B cells. “We had absolutely no trouble getting the technology up and running,” says Staudt, who's working with Stanford to create a shared gene expression database. Already, he says, it looks as though microarray profiling “will be a very useful tool” for “subdividing disease categories and giving them a molecular identity.”

Ash Alizadeh, one of Staudt's collaborators at Stanford, described how he used the Lymphochip to look at gene expression profiles in 50 cases of diffuse large cell lymphoma, long considered a “wastebasket category” of poorly defined illnesses. After linking genetic profiles to case outcomes, he identified two distinct subgroups—“diseases within a disease,” Staudt says. One gene expression profile appears to carry a good chance of survival; the other does not. If such results hold up, genetic profiling could be useful in diagnosing and treating lymphoma.

## Data overload

With such tools coming on line and interest in expression studies on the rise, the volume of data in this field is likely to grow exponentially in the next few years. Already, Brown and others have been talking about new ways of storing, sharing, and publishing these huge files. Each experiment produces a flood of data: Trent's melanoma expression data, for example, would produce a print-out about 10 meters long if printed at full length—too big to publish in a journal.
For the moment, Brown says, microarray users are storing results in their own Web-accessible files and opening them to the public when they publish a journal article. Personally, Brown would be happy to skip the journal-controlled part of this process and put the data right out on the Web. That's why he's enthusiastic about NIH's plan for online publishing, PubMed Central (Science, 3 September, p. 1466).

One problem—where to archive data—may be solved soon. At the Arizona microarray meeting in September, David Lipman, director of NIH's National Center for Biotechnology Information, announced that NCBI staffer Alex Lash is heading up the design of a new database for the field, to be called the Gene Expression Omnibus (GEO). It will connect sets of experiments that appear “relevant to each other,” so that a user could quickly find all the experiments involving certain gene families and look for common themes. “We're working on fields and data structures and will load samples this fall,” Lipman says. He hopes GEO will be running by spring.

Yet to be resolved, however, is how to make results comparable. GEO will ask researchers submitting the data to define the experimental “platforms” they use. That may be simple for people using arrays or array services such as those provided by Affymetrix and Incyte. But there are no standards for homemade devices, and small differences in experimental conditions may lead to discrepancies in results.

But Lipman isn't rushing to impose standards on the young field. Brown thinks that's the right course: It would be a mistake, he says, to try to impose rules on the field while investigators are still in exploratory mode, pointing their microarray telescopes at the universe of genes. Better to let standards evolve gradually, as the data start pouring into GEO in 2000.

18. # An Array of Uses: Expression Patterns in Strawberries, Ebola, TB, and Mouse Cells

1. Eliot Marshall

When scientists began using microarray devices to study gene expression in the early 1990s, many focused on the same humble organism: brewer's yeast. Since then, they've widened their horizons. Experiments are under way studying how genes are turned on and off in complex plants, pathogens, model animals such as the nematode and mouse, and human cancer cells. Some of these projects were on display last month at a meeting of microarray users organized by Nature Genetics in Scottsdale, Arizona*, where the following examples were highlighted.

• Just about everyone likes strawberries, but no one has identified the genes involved in fruit development, according to Asaph Aharoni, who decided to look for the answer in a gene expression study. Aharoni, a biologist at the Center for Plant Breeding and Reproduction Research in Wageningen, the Netherlands, focused on a group of 1800 genes from strawberries. Using microarray technology developed at Stanford (see main text), he printed strawberry cDNAs—probes for expressed genes—on slides and monitored which genes were being expressed in fruit, from green to fully ripe. Aharoni found 200 genes whose expression varies with development, including a late-stage cluster that is turned on during membrane breakdown. Now he aims to look at genes affecting hormonal control.

• Kevin Anderson and Chunsheng Xiang of the U.S. Army Medical Research Institute of Infectious Diseases in Frederick, Maryland, investigated a more sinister organism: Ebola virus. They were curious about what makes the Ebola-Zaire strain a feared killer and the Ebola-Reston strain—which turned up in a Virginia primate lab in 1989—not a known threat to humans. Using a cDNA array of 1400 human genes, Xiang compared the gene expression profiles of normal human monocyte cells and cells that had been infected with two strains of virus. The Ebola-Zaire strain produced a “remarkably different” pattern from the Reston strain, according to Anderson.
It strongly induced genes that produce immune-system regulators called cytokines and chemokines, along with inhibitors of apoptosis. Anderson says this may suggest how the deadly Zaire strain spreads rapidly.

• A large research team is building a comprehensive collection of full-length mouse genes under the leadership of Yoshihide Hayashizaki at the RIKEN Genomic Sciences Center in Tsukuba, Japan. Speaking for RIKEN, Yasushi Okazaki presented data from the most recent addition to this massive database—a map of gene expression in the mouse body. Okazaki and colleagues, with help from Stanford, have arrayed 20,000 mouse cDNAs and recorded distinctive gene expression patterns for the heart, liver, tongue, kidney, lung, spleen, placenta, and other tissues.

• A group of British researchers used the genomic sequence of Mycobacterium tuberculosis, finished just last year, to look at which genes are turned on in this lethal bug during infection. Joseph Mangan of St. George's Hospital Medical School in London used an array of TB genes to see which were most highly expressed as the organism invaded the scavenger cells called macrophages. Among the genes that appear most active during early infection are a group involved in capturing iron, suggesting that the organism competes with the host for iron, and in a “dormancy” response that may help TB evade immune attack.

*The Microarray Meeting, 22 to 25 September, Scottsdale, Arizona.

19. # Companies Battle Over Technology That's Free on the Web

1. Eliot Marshall

The microarray revolution reached a flash point at Stanford University on 17 April 1998. That's when Joseph DeRisi, then a grad student in Patrick Brown's lab, posted a document called the “MGuide” on the Web. It isn't a radical tract; it's just a “lighthearted” manual, DeRisi says, telling the reader how to build a microarray robot and listing all the necessary parts, suppliers, and prices (cmgm.stanford.edu/pbrown/mguide/index.html). The estimated cost: $23,500.

The Brown-DeRisi machine employs a cluster of metal pens to print thousands of tiny DNA spots on glass slides, which can be used to perform rapid studies of gene expression. Researchers like the design because it allows them to do gene expression studies for a fraction of what it would cost to obtain the equipment for similar studies from large commercial enterprises, such as Affymetrix Inc. of Santa Clara, California, maker of GeneChip systems (see main text).

Affymetrix has never challenged the MGuide. But the company has been engaged in a furious legal battle with business rivals using the same technology, including a competitor that was born in Brown's lab called Synteni Inc. of Palo Alto, California.

Synteni is the brainchild of Dari Shalon, a former grad student of Brown's and co-inventor with Brown of the microarray gadget described in the MGuide. Brown, Shalon, and Stanford filed for a U.S. patent in 1994 and received one in 1999. Stanford gave Shalon exclusive rights to commercialize the arrayer, and in 1994 he founded Synteni. He then sold the company and its patent rights to Incyte Pharmaceuticals of Palo Alto for \$80 million in January 1998. (Shalon is now director of Harvard University's Center for Genomics Research.) Incyte uses the technology to provide gene expression monitoring services to clients but doesn't sell machines. In May 1998, it closed a big deal to supply data to Monsanto.

Right after Incyte went into the business, the legal battle over microarrays began to heat up. Affymetrix, which had filed broad patents on microarray concepts and systems between 1989 and 1996, sued Incyte in January 1998 in the U.S. District Court in Delaware for patent infringement. Incyte responded with a countersuit for infringement against Affymetrix. In September 1998, Affymetrix upped the ante: It asked the Delaware court for an immediate injunction to stop Incyte from “making, selling, or offering to sell their Gene Expression Microarray products and services.” Incyte, meanwhile, appealed to the U.S. Patent and Trademark Office for an extraordinary “interference proceeding,” arguing that its patents voided key claims advanced by Affymetrix.

Both maneuvers failed: The court dismissed Affymetrix's injunction request, and the Patent Office ruled that the evidence did not support Incyte's argument that the rival patents were void. It's now up to the courts to decide which company is violating the other's turf; the trial is scheduled to begin in September 2000 in the U.S. District Court in San Francisco. This battle ultimately will include other contenders, as well, including Hyseq Inc. of Sunnyvale, California, and Edwin Southern of Oxford University in Oxford, U.K., who hold broad patents on gene array technologies.

Meanwhile, DeRisi says that thousands visit the MGuide, and several dozen labs around the world—including in China, Japan, Australia, and Eastern Europe—download updated versions of the manual, presumably because they use it. Could this giveaway of the technology that companies are battling over draw legal attacks as well? Brown and DeRisi don't think so. They note that Stanford supports free use of the technology for research. Anyway, where patent issues are concerned, Brown says, “I don't want to have anything to do with them if I don't have to.” DeRisi adds: “I've looked at the legal documents; I can't understand what they're talking about.”

20. # Keeping Genome Databases Clean and Up to Date

1. Elizabeth Pennisi

As the size of GenBank and the number of other biological databases grow, so does the need for ways to update and coordinate the information they contain

Last year, Michael Kelner thought he had finally gotten his hands on the elusive front end of a gene he'd been working on, on and off, for months. But when he searched GenBank, a public archive that contains every published DNA sequence, looking for similar genes that might hold clues to his gene's function, he knew something was wrong: He turned up more than 100 matches, or “hits,” from a wide array of genes from many different organisms. Kelner, a molecular pathologist at the University of California, San Diego, soon realized that the sequence all these genes had in common was a contaminant, introduced by the commercial kit he had used to clone his gene. And the fact that it turned up in so many genes in GenBank suggests that many other scientists unknowingly had the same problem. As a result, “there's a huge number of public sequences that are incorrect,” he says.

John Mallatt had a similar sobering experience recently. An evolutionary biologist at Washington State University in Pullman, Mallatt was trying to determine evolutionary relationships between various organisms by comparing the sequences of specific genes. After several months' work, he realized that the GenBank sequence he had been relying on for one of the ribosomal RNA genes of Xenopus, a species of frog, was incorrect. “I found the error entirely by accident as I stumbled on [a report in the scientific literature] with the corrected sequence,” he recalls. “It took me about 10 hours, crawling through the correct and incorrect sequences base by base, to fix it and enter the correct sequence into my phylogenetic analysis.” Mallatt subsequently discovered that GenBank contains both sequences, but there is no indication which is the correct one. Because Xenopus is one of the few amphibians whose genes have been sequenced, it is widely used for evolutionary studies, so it's likely that other researchers have completed and published phylogenies with the wrong data.

Databases like GenBank have revolutionized biology, providing researchers with powerful tools to hunt for new genes, compare the way genes have evolved in many different organisms, and figure out the functions of newly discovered genes. But more and more researchers like Kelner and Mallatt are discovering that this mother lode of information contains some fools' gold that can mislead the unwary biological prospector. Based on their surveys, genomics experts estimate that some 2% of GenBank's entries may contain DNA introduced by experimental procedures. In other entries, bases are missing or incorrect in stretches of supposedly finished sequence, or genes are even placed on the wrong chromosome. Even more problematic, some say, are inaccuracies in the labels and annotations that accompany many sequences. Hamster sequence is called human DNA, for example, and partial genes are misclassified as complete.

“GenBank is full of mistakes,” says Michael Ashburner, who helps run a fruit fly database called FlyBase and other databases at the European Molecular Biology Laboratory (EMBL)-European Bioinformatics Institute (EBI) in Cambridge, U.K. GenBank's counterparts, the DNA Database of Japan (DDBJ) and EBI, and archival databases such as the Protein Data Bank (PDB), are also accumulating errors. Even databases whose entries are reviewed and updated, such as FlyBase or SwissProt, a long-established protein database based at EBI and at the Swiss Institute of Bioinformatics in Geneva and Lausanne, can have mistakes or missing data. And these problems are only going to intensify as labs around the world pour out sequence data from the human genome and other organisms.

Dozens of teams of bioinformaticists and biologists are trying to tackle the problems, but it's a daunting task. For one, “the databases are getting so large that getting people to correct the errors and knowing what the errors are is a major thing,” laments Terry Attwood, a biophysicist at the University of Manchester in the U.K. Neither GenBank, nor EBI, nor DDBJ, discriminates between correct and incorrect data. Like PDB, they expect the discoverers and depositors of the data to update and correct the information they supply, yet many researchers find that task too burdensome. “There's no reward for it,” says William Gelbart, a Harvard developmental geneticist who is one of the coordinators of FlyBase.

And a lack of funds is hampering more systematic approaches to the problem, such as having experts review incoming information or developing ways to link existing entries with new data. Until now, bioinformaticists have been able to “roll a solution together with bubble gum and baling wire,” says Owen White, a bioinformaticist at The Institute for Genomic Research (TIGR) in Rockville, Maryland, but that won't be enough. “We must have real money from the granting agencies, or we're basically going to have the [biological equivalent of the] Hubble [Space] Telescope and no way to look at the data.” And it will also require a change of attitude on the part of many researchers. “We have to educate people about databases so [researchers] don't assume [the databases] are right,” says Attwood.

## Problematic sequences

Douglas Crawford knows only too well the need for a better way to look at the biological data in databases. Over the past 5 years, “the utility of GenBank has declined greatly,” says Crawford, an evolutionary biologist at the University of Missouri, Kansas City. At one time, Crawford and his colleagues, who study genes for metabolic enzymes, were eager to look through GenBank for any matches to a new gene they isolated. Now, he says, such searches tend to turn up “hundreds of hits,” including “a lot of sequences which by themselves are meaningless” because they are just pieces of genes, or worse, because they are slight variations on the same gene from the same species. It takes many hours of tracking down the primary literature to sort out if any of those matches are useful. Sometimes, after going through the trouble to find a gene that is supposedly complete enough to warrant further study, Crawford says he finds that the sequence is missing key bases at the beginning or end, or it may lack a coding region found in that same gene from other species, “and we don't know whether that [loss] is real or not.”

One annoying type of contamination is what led Kelner astray: the inclusion of a piece of DNA from an entirely unrelated organism in a stretch of sequence. Most such problems arise because vector DNA—bits of genome from the phage or bacterium used to clone the sequence under study—was not removed before the new sequence was submitted to the databases. By the end of last year, according to EBI's Rodrigo Lopez and his colleagues in Cambridge, U.K., some 219 reports of contamination in the major sequence databases had been published, several of which noted dozens of different kinds of problems. Because it's up to the discoverers to make corrections, “there is very little that database curators can do to remove [the errors],” Lopez and his colleagues wrote in the September 1998 EMBnet newsletter.

Kelner's experience illustrates this problem. After he scored so many hits in GenBank, he quickly suspected the commercial cloning kit he had used. Sure enough, when he looked up the journal reports of a few of the sequences that matched his, he realized that all the researchers had used the same CLONTECH Marathon kit to pull out their gene's front end. Like many researchers before him, Kelner hadn't seen instructions buried in the appendices telling him to trim out the kit's DNA. Kelner caught the error before he deposited the sequence in GenBank, but many other researchers evidently did not, and the contaminant is now officially recorded as part of scores of genes. GenBank's Paul Kitts says he had not been aware of contamination with the CLONTECH sequence until Kelner brought it to his attention, but he is not surprised.

Sequences that contain errors or missing segments represent a more insidious problem that can trip up the unsophisticated database user. If a gap in a supposedly completed genome or gene region is large enough, or the region is studied by enough people, then it is likely to be caught fairly quickly, says Mark Boguski of the National Center for Biotechnology Information (NCBI). Such was the case with some six megabases of sequence from the nematode genome that was still missing from GenBank when it was published late last year (Science, 11 December 1998, pp. 1972 and 2012). Several months later, after other researchers complained to GenBank, the nematode sequencing teams made public the missing segment. But misassembled data, in which adjacent sequences don't really belong next to each other, or small gaps such as the loss of a single coding region in a gene with multiple coding regions, are more likely to go unnoticed, says Boguski. The same is true for incorrect sequences. And even when these problems are detected, there is no mechanism to flag them or to replace them with the correct data. For this reason, “there will always be an error legacy which will be very difficult to correct,” says Attwood.

## Missteps in translating form to function

Even if a gene's sequence is complete and accurate, an unwary biologist can still be led astray if information about the gene's function is incorrect. Indeed, Amos Bairoch, a biochemist who heads SwissProt, believes that errors in the annotation that accompanies sequence data are more worrisome than errors in the sequences themselves. Five years ago, he notes, geneticists mostly sequenced genes whose functions they already knew. Now the reverse is true. “The [genes] that have been characterized are a very small island in a flood of [sequence] data,” says Bairoch.

Researchers take a first stab at figuring out an unknown gene's function by running its sequence through computer programs that suggest what type of protein it codes for, based on how closely its sequence resembles that of a known gene. But the computer programs can be tripped up, because most proteins consist of several parts, or domains, that have different roles. One may bind to DNA, say, while a second attracts another type of protein, and a third catalyzes some chemical reaction. Protein A may look like B because it has a similar catalytic domain and thus is assumed to have the same function, say as an alcohol dehydrogenase. Later, a new protein, C, looks like B, again because the sequences share a degree of similarity, and so the computer program assumes that C is also an alcohol dehydrogenase. But this time, the similarity might be between B and C's protein-binding domains, not their catalytic portions, and C may not be an alcohol dehydrogenase after all. Down the line, a fourth protein, D, that resembles C will also be called an alcohol dehydrogenase. Yet, in reality, “you now have no clue what the function is,” White points out, and it becomes harder and harder to figure out where the assignment went wrong.
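The chain of inference described above can be made concrete with a toy sketch. Everything in it is a hypothetical illustration rather than any real annotation program: the protein names, the domain labels, and the `annotate_by_similarity` helper are invented, and "similarity" is reduced to sharing at least one domain.

```python
# Toy model: each protein is a set of domain labels (all hypothetical).
proteins = {
    "A": {"catalytic_ADH"},                     # experimentally characterized
    "B": {"catalytic_ADH", "protein_binding"},  # shares the catalytic domain with A
    "C": {"protein_binding", "dna_binding"},    # resembles B, but not catalytically
    "D": {"dna_binding"},                       # resembles C only
}


def annotate_by_similarity(proteins, seed_annotations, order):
    """Copy a function label from any already-annotated protein sharing a domain."""
    annotations = dict(seed_annotations)
    for name in order:
        if name in annotations:
            continue
        for other, label in list(annotations.items()):
            if proteins[name] & proteins[other]:  # any shared domain counts as a hit
                annotations[name] = label
                break
    return annotations


annotations = annotate_by_similarity(
    proteins, {"A": "alcohol dehydrogenase"}, order=["B", "C", "D"]
)

# Proteins that actually carry the catalytic domain justifying the label.
supported = {name for name, domains in proteins.items() if "catalytic_ADH" in domains}
unsupported = sorted(set(annotations) - supported)

print(annotations)
print("labels without a catalytic domain:", unsupported)
```

In this sketch every protein ends up labeled an alcohol dehydrogenase, although only A and B contain the domain that would justify it. Nothing in the final annotations records which domain produced each match, which is why tracing the mistake back becomes harder with every step.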

Peer Bork, a computational biologist at EMBL in Heidelberg, Germany, estimates that about 15% of the annotation in GenBank is either unverified or not up to date, and “the error propagation is explosive,” he says. And these errors are showing up in print as well, Bairoch notes. At SwissProt, for example, a team of some 40 Ph.D. researchers, each with a year's training in interpreting sequence and annotation data, are finding an alarming number of published reports with problems similar to those seen in GenBank.

Even seemingly simple information, such as where a gene begins and ends, can be wrong: Complex genes often code for more than one protein, depending on where the DNA-transcribing enzymes start and stop, and the different proteins often have different functions. “As many as a third of the genes are alternatively spliced,” says Sylvia Spengler, a biophysicist at Lawrence Berkeley National Laboratory in California. Charting these alternative ways of reading a gene “is important,” she adds, “but we don't yet know how to deal with that [in databases].” Both NCBI and Spengler's team have begun experimenting with ways to indicate when genes code for more than one protein, but it's still uncertain how clear these kinds of annotations will be.

## Interpreting electronic “literature”

The jury is still out on the best way to tackle these problems. Automated screening systems can catch some of the errors, such as contaminant sequences. GenBank's Kitts, for example, has compiled and updated a list of possible contaminating sequences—more than 2000 have been identified to date—and perfected a computer program that flags these sequences in new submissions to the archive. EBI is taking similar steps. When researchers in Europe search for matches in EBI, a computer program automatically scans the data to screen out possible contaminants.
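As a rough illustration of what such screening involves, the sketch below checks a submission against a list of known contaminant sequences on both strands. It is a simplification under stated assumptions, not the GenBank or EBI program: the sequences and the names `flag_contaminants` and `hypothetical_adaptor` are made up, and only exact matches are reported, whereas a production screen would presumably use alignment to tolerate mismatches.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def flag_contaminants(submission, contaminants, min_len=12):
    """Return (name, start, end) for each exact contaminant match on either strand."""
    seq = submission.upper()
    hits = []
    for name, contaminant in contaminants.items():
        contaminant = contaminant.upper()
        if len(contaminant) < min_len:
            continue  # too short for an exact match to be meaningful
        # A set avoids double counting when a contaminant is its own reverse complement.
        for probe in {contaminant, reverse_complement(contaminant)}:
            start = seq.find(probe)
            while start != -1:
                hits.append((name, start, start + len(probe)))
                start = seq.find(probe, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Made-up sequences for illustration only.
adaptor = "ACGTTGCAAGGCTTAACCGGTT"
known = {"hypothetical_adaptor": adaptor}
gene_fragment = "ATGGCCAAGCTGACC"

hits = flag_contaminants(adaptor + gene_fragment, known)
for name, start, end in hits:
    print(f"possible contaminant {name} at bases {start}-{end}")
```

Reporting the coordinates of each hit, rather than rejecting the submission outright, leaves the decision about trimming to the submitter, which matches the article's point that corrections are up to the discoverers.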

These efforts should cut down on the amount of contaminating sequence entering these databases and will help ensure that such sequences don't throw off genomics analyses. For other kinds of errors, there's no quick fix. Some experts, such as Nathan Goodman, a bioinformaticist at Compaq Computer Corp. in Marlboro, Massachusetts, would like individual scientists or groups of scientists to take responsibility for keeping the information about their particular genes of interest up to date. To foster this kind of care, he says, journals should publish notifications of these corrections, enabling researchers to get credit for their efforts. But databases would still have to develop ways for the corrections to be entered.

Others advocate new, intensively curated databases that would be the equivalent of review articles in the world of electronic literature, containing time-tested, authoritatively annotated entries. Several such databases are already being set up. NCBI's RefSeq, for example (Science, 30 April, p. 707), identifies one “best” sequence for each way the gene can be expressed, lists the gene's approved name and synonyms, and describes or links to functional information. Unlike GenBank, curators of this new database “may make editorial comments and corrections that the original authors don't agree with,” NCBI computational biologist James Ostell points out, even though that may miff the original authors. Moreover, “the reference collection provides an armature on which we can put all sorts of information.” Using a “Link Out” system, individuals or groups will eventually be able to insert pointers in GenBank that will direct users to sites with more, perhaps better, information about a particular sequence. Ostell thinks users will eventually go to RefSeq first, bypassing many of GenBank's shortcomings.

Other groups around the world are also setting up their own curated databases, some with specialized foci, others with the intent of annotating the human genome with their own customized tools. In this way, “there will be competing products; then the audience can judge which is most useful,” says Boguski. TIGR, for example, downloads new sequence data daily from GenBank and has a software package that includes Glimmer, a program that trains itself to recognize genes in microbial genomes, and a program called AAT that matches new DNA data to sequences in protein and cDNA databases. Likewise, Chris Overton, a computational biologist at the University of Pennsylvania, Philadelphia, and his colleagues have come up with a set of annotation tools that use grammar rules and other aspects of computational linguistics to sort out genes and their function. They and others are working feverishly to make their sequence comparisons more sophisticated so as to improve their programs' ability to predict function correctly.

Still others are positioning themselves to be go-betweens who can help researchers who don't use electronic resources very often to search GenBank data more intelligently and thoroughly, and they, too, have their own approaches to annotation. “The key for biologists,” says John Bouck, a bioinformaticist at Baylor College of Medicine in Houston, Texas, “is to be able to understand what the limitations are but also to realize how much information is there.” The Web-accessible Genotator, for example, combines several programs that find genes, look for matches between genes, or check for sequence that signals the beginnings and ends of genes, thereby enabling a gene hunter to explore a range of tools all at once. The GENOME CHANNEL, based at Oak Ridge National Laboratory in Tennessee, is a tool for a comprehensive sequence-based view of genomes. And courses and workshops are popping up to teach researchers what they can—and cannot—expect from their cyberspace explorations.

As ideas for fixing the databases proliferate and funding agencies step up support for bioinformatics (Science, 11 June, p. 1742), even the more skeptical researchers are expressing some optimism. “These [problems] are all solvable,” says Goodman, “if there is enough will in the community to solve them.”

# Seeking Common Language in a Tower of Babel

1. Elizabeth Pennisi

With gene sequences by the thousands pouring into databases, efforts are revving up to figure out what all those genes do—what proteins they make and how they fit into the workings of living things. Comparing genomic data from different organisms will be key to answering those questions. Already, researchers have found that organisms from microbes to the sequoia tree and whales to mushrooms have much more in common than was once appreciated, and those similarities are shedding light on the functions of unknown genes and their protein products.

But these efforts suffer a big handicap: Genetic information is stored in different ways in different databases, which makes it hard to compare their holdings. So, while computational biologists are trying to improve the quality of the databases (see main text), they are also working to build bridges between them. So far, they have had only limited success. “The main problem is interoperability—how to merge information from different databases,” says William Gelbart, the Harvard developmental geneticist who helps run FlyBase, a database devoted to the fruit fly Drosophila.

Ideally, researchers want to do one-stop shopping among the scores of databases that now collect genetic data, conducting a single search for all of the information on record about a particular gene, protein, organism, or pathway. And bioinformaticists have begun to try to make this possible. In the meantime, each database has its own Web site with unique navigation tools and data-storage formats that make such searching difficult, by a person or a program. Users have to master the idiosyncrasies of each database's tools, and programs can't easily recognize data that are not stored in a uniform way. The lack of a common language for gene functions is also proving to be a serious problem.

Alternate spellings, different names for the same gene, or different uses for the same word can trip up even smart search programs. Take, for example, a search for genes involved in vein development: It is likely to pull up information related to the human circulatory system, a leaf, or a fruit fly wing. Remedying the problem, says Gelbart, is “a question of how to undo 100 years of [building] a tower of Babel.” And the digital babble threatens to become deafening as new kinds of databases, such as those cataloging gene expression data (see p. 444) or data from large-scale protein identification experiments, come online.

Even 2 years ago, when there were fewer databases and much less data, cross-searching was difficult, says Nathan Goodman, a bioinformaticist at Compaq Computer Corp. in Marlboro, Massachusetts. Goodman, who was then at The Jackson Laboratory in Bar Harbor, Maine, and Jackson Lab mouse geneticist John Macauley conducted a test to see how easily they could cross-search GenBank, a sequence archive, and smaller mouse and human genome databases, which contain other types of information about these genes, to pool information on particular genes. They reported in the August 1998 issue of Bioinformatics that they failed to identify counterparts for 26% of the mouse genome database entries that were known to have an equivalent human gene; the reverse was true for 17% of the genes in the human genome database.

Some promising efforts are under way to tackle these problems. One, the gene ontology project spearheaded by Michael Ashburner at the European Bioinformatics Institute (EBI) in Cambridge, U.K., seeks to come up with a set of common, shared definitions for each term used to describe biological data. Another, now coordinated by EBI's Tomás Flores, is establishing standard ways of representing data. Both have a long way to go and do not yet have the full support of the community, however.
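To see why shared definitions matter, consider a minimal sketch of the "vein" search problem. The database names, gene names, and term labels below are all hypothetical and are not taken from the gene ontology project itself; the point is only that a free-text keyword cannot separate three meanings that an agreed vocabulary can.

```python
# Hypothetical records from three databases, all tagged with the free-text word "vein".
records = [
    {"db": "human_db", "gene": "geneH", "keyword": "vein"},
    {"db": "plant_db", "gene": "geneP", "keyword": "vein"},
    {"db": "fly_db", "gene": "geneF", "keyword": "vein"},
]

# Hypothetical shared vocabulary: (database, local keyword) -> one agreed term.
shared_terms = {
    ("human_db", "vein"): "blood vessel development",
    ("plant_db", "vein"): "leaf vein development",
    ("fly_db", "vein"): "wing vein development",
}


def search_free_text(records, word):
    """Match on the raw keyword, as a naive cross-database search would."""
    return [r["gene"] for r in records if r["keyword"] == word]


def search_shared_term(records, shared_terms, term):
    """Match on the agreed term that each database's keyword maps to."""
    return [
        r["gene"]
        for r in records
        if shared_terms.get((r["db"], r["keyword"])) == term
    ]


print(search_free_text(records, "vein"))
print(search_shared_term(records, shared_terms, "wing vein development"))
```

The free-text search returns genes from all three organisms, while the search on the shared term returns only the fruit fly entry. The cost is that someone has to build and maintain the mapping, which is the effort the article says still lacks full community support.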

A few researchers, such as Goodman, argue that the best way to minimize incompatibility is to centralize the data collection and storage. If one “federation” oversees the various databases, they argue, then it is more likely that standards will be established and links between the databases will be kept up to date. “A centralized database might be much easier to maintain,” Gelbart notes.

But that approach seems to be losing ground, as new data archives proliferate. At least three groups are independently coming up with their own way to store and display data from microarrays, for example. They include Stanford—where microarrays were invented—EBI, and the National Center for Genome Resources. Similarly, at least three databases for cataloging slight variations in genes called single nucleotide polymorphisms, or SNPs, have been set up in the past 2 years.

Some experts argue that such multiple efforts are healthy. “It allows people to experiment and come up with new ideas,” says Ashburner. But others worry about conflicting standards and duplication of effort. Letting 100 flowers bloom “is expensive, and it doesn't scale well,” says Owen White of The Institute for Genomic Research in Rockville, Maryland. “And the hazard you head toward is that everyone has a different way of representing their data.” As White sees it, the way things are going now, “in the near future, people will want to ask simple questions and will find the databases inadequate.”