# News this Week

Science  01 Oct 2010:
Vol. 330, Issue 6000, pp. 18
1. U.S. Graduate Education

# Academy Rankings Tell You a Lot, But Not Who's No. 1 in Any Field

1. Jeffrey Mervis

Perhaps it should be called the Mr. Potato Head of graduate school rankings.

Remember how easy it was to alter the appearance of that toy's bland, tubular face by sticking an ear or an eye in an unexpected place? Well, the latest analysis of the quality of U.S. research doctoral programs by the National Academies' National Research Council (NRC) can be manipulated in much the same way. But the exercise is hardly child's play.

This week's release of the long-awaited assessment, the first since 1995 and 3 years behind schedule, disgorges a massive amount of information about 5100 doctoral programs in 62 fields at 212 U.S. universities. More than a decade in the making, the assessment is meant to reflect the collective wisdom of the U.S. research community on what defines a top-quality graduate program. In an era of increased accountability, it's also designed to address questions from students, faculty members, university administrators, elected officials, and the public about the quality of any particular graduate program.

Yet that strength—the ability to serve as many audiences as possible—may also be the assessment's most controversial feature. Those who simply want to know who's No. 1 in neuroscience, for example, or read a list of the top 10 graduate programs in any particular field will walk away disappointed after massaging the report's Excel spreadsheets, available at www.PhDs.org or www.nap.edu/rdp. That's because, like Mr. Potato Head, the NRC assessment can look quite different depending on your definition of “best.”

To be sure, NRC does rank programs—but oh so carefully. Instead of assigning a single score to each program in a particular field, the assessment ranks the program on five different scales. Each score is also presented as a range of rankings reflecting the 5th and 95th percentiles of the scores it received. The scales themselves are based on 20 characteristics (see table, p. 19) that the NRC panel deemed appropriate for a quantitative assessment. Two are supposed to portray the overall quality of the program—one derived from a reputational survey (the R scale), the other from a quantitative analysis (the S scale). Three others rely on subsets that address important dimensions of quality: research activity, student support and outcomes, and diversity. The report itself highlights the uncertainties generated by such an exercise by calling the results “illustrative rankings [that] are neither endorsed nor recommended by the NRC as an authoritative conclusion about the relative quality of doctoral programs.”
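The percentile-range idea can be illustrated with a small simulation. Everything below is invented for illustration — the program names, mean scores, and spreads are hypothetical, and the resampling loop is a simplified stand-in for the NRC's far more elaborate procedure — but it shows why a program ends up with a range of rankings rather than a single rank:

```python
import random

# Hypothetical programs: (mean quality score, uncertainty in that score).
programs = {"A": (4.2, 0.5), "B": (3.9, 0.6), "C": (3.7, 0.4), "D": (3.5, 0.7)}

def simulate_rank_ranges(programs, n_draws=500, seed=0):
    """Repeatedly perturb each program's score, rank the programs, and
    report the 5th-95th percentile range of each program's rank."""
    rng = random.Random(seed)
    ranks = {name: [] for name in programs}
    for _ in range(n_draws):
        drawn = {name: rng.gauss(mu, sd) for name, (mu, sd) in programs.items()}
        ordered = sorted(drawn, key=drawn.get, reverse=True)  # best first
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    ranges = {}
    for name, rs in ranks.items():
        rs.sort()
        lo = rs[int(0.05 * (len(rs) - 1))]
        hi = rs[int(0.95 * (len(rs) - 1))]
        ranges[name] = (lo, hi)
    return ranges

ranges = simulate_rank_ranges(programs)
```

Because the score distributions overlap, a program like "B" can plausibly land anywhere from 1st to 3rd — which is exactly the kind of spread the NRC ranges express.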

Given all those caveats, some university administrators are taking the rankings with more than a grain of salt. “We're pleased with how well our own programs ranked,” says Patricia Gumport, dean of the graduate school at Stanford University. “But we have concerns about the methodology. So we're not planning to use the range of rankings.”

It's easy to see the source of Gumport's concern by looking at what the assessment says about Stanford's anthropology department, to pick just one example. The department is ranked between 13th and 47th on the R scale and between 3rd and 9th on the S scale. In addition, it falls between 3rd and 14th on research activity, between 1st and 43rd on student support and outcomes, and between 12th and 33rd on diversity.

“It's difficult to draw meaningful conclusions about the relative quality of programs from these ranges of rankings,” says Gumport with impressive understatement. Instead, she and her deans plan to mine the database to compare the performance of the university's 47 programs on one or more characteristics, or to see how a particular Stanford program stacks up with its peers around the country on those characteristics.

That's exactly what NRC hopes will happen. “We wanted to give people the chance to create rankings based on variables that they thought were important,” says Charlotte Kuh, the NRC staffer who has lived and breathed the $6.7 million assessment since it was launched in 2004 and who has made countless presentations to the graduate school community since then on its progress or lack thereof. That's especially true for different audiences, she adds: “While faculty have certain values, students may be worried about other things.” For example, an undergraduate who's thinking of becoming a microbiologist can find out how long it takes students at University X to complete their Ph.D. degrees, or what share of graduates from University Y find academic jobs. Likewise, an engineering dean interested in increasing the number of women or minority faculty members and students can compare the gender and racial diversity of her programs with those of others.

The absence of a single score separates NRC's rankings from those done by several other organizations, in particular, U.S. News and World Report, whose influential annual assessments of the “best” universities emulate those for college sports teams by offering the type of ordinal ranking that readers seem to crave. “We felt it was more responsible to be accurate,” says Jeremiah Ostriker, a professor of astrophysics and former provost at Princeton University who chaired the committee that carried out the assessment. “That's especially the case if a small change in a range of rankings could make one school seem better than another.”

To understand why one number is never enough, you need to understand how the panel went about its business. Asked by the National Academies to prepare a data-based assessment, the committee first whipped up a batch of 20 program characteristics. Then it served up those characteristics in two different ways.
The first involved asking 87,000 faculty members to weigh the importance of each characteristic. Then it applied those weights, in combination with data on those measures from institutions, faculty members, and students, to rate each one of 4838 programs in 59 fields. (It collected data but did not rate three fields that fell below a minimum size and frequency.) That's the S, or quantitative, ranking.

The second method asked 8000 faculty members to rank the overall quality of up to 15 programs in their field. Then NRC used a regression analysis to determine the perceived weights that each faculty member used in rating each program. It did this 500 times, selecting a different set of raters each time. That's the R, or reputational, ranking.

That process differs markedly from the 1995 assessment, which ranked programs based directly on their reputations. “This time around, we were not interested in the reputations per se,” explains Ostriker. “Reputations suffer from many flaws, including a halo effect, time lag, and so on.”

The panel intended to combine the R and S rankings into a single ranking, “for simplicity's sake,” he says. But a National Academies' review panel blew the whistle on that approach. “They said we were actually measuring slightly different things and that they should be kept separate,” explains Kuh. Ostriker says that readers can combine them by following a formula given in the appendix of a second volume, on the panel's methodology. “It's more confusing, but logically, I have to agree with the reviewers,” he says.

The committee also took the review panel's advice to broaden the confidence level of each ranking from 50% to 90% (that is, to say there is a 90% chance that the quality of a given program falls within a particular range of rankings). At a 50% confidence level, the two rankings for a given program by definition overlapped only half the time, an outcome that Ostriker says reviewers found confusing.
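The regression step behind the R ranking — inferring the weights a rater implicitly applies from the overall ratings he or she hands out — can be sketched in a toy form. The two characteristics, the numbers, and the noise-free setup below are all invented; the NRC's actual analysis involved 20 characteristics and 500 resamplings of the rater pool:

```python
def least_squares_2d(x1, x2, y):
    """Ordinary least squares for y = w1*x1 + w2*x2 (no intercept),
    solved directly via the 2x2 normal equations."""
    a11 = sum(v * v for v in x1)
    a12 = sum(a * b for a, b in zip(x1, x2))
    a22 = sum(v * v for v in x2)
    b1 = sum(a * b for a, b in zip(x1, y))
    b2 = sum(a * b for a, b in zip(x2, y))
    det = a11 * a22 - a12 * a12
    w1 = (a22 * b1 - a12 * b2) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2

# Four programs measured on two hypothetical characteristics, and the
# overall ratings one hypothetical rater assigned to them. Here the rater
# implicitly weighs publications at 0.7 and citations at 0.3.
pubs      = [1.0, 2.0, 3.0, 4.0]
citations = [2.0, 1.0, 4.0, 3.0]
ratings   = [0.7 * p + 0.3 * c for p, c in zip(pubs, citations)]

w_pubs, w_cites = least_squares_2d(pubs, citations, ratings)
```

Regressing the ratings on the measured characteristics recovers the hidden weights, which can then be applied to every program's data — the essence of turning reputational judgments into an explicit formula.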
On the other hand, as Gumport points out, divergent scores like those for Stanford's anthropology program cast additional doubt on the NRC exercise.

One curious finding in this year's assessment is the correlation between a program's quality and its size. Faculty members did not put that characteristic high on their list, says Kuh. But, as in 1995, it shows up strongly in the rankings themselves. “What is it that size does?” Kuh asks. “I think it makes the program more visible. It's more likely to place students in other programs, and people from that institution have more articles in the literature, which may make readers think, ‘Say, this must be a pretty good program.’” But the data show there's also a downside to size, she adds, with students from larger programs taking a longer time to earn their degrees.

The age of the data has attracted much criticism. Many university officials say the data—collected during the 2005–06 academic year—no longer present an accurate picture of many programs. “We've had 462 new faculty hires and 325 departures in the past 3 years,” Gumport notes, a significant proportion of the 1900-member faculty. But the panel disagrees. Ostriker notes that the rankings are based in part on 5-year averages and says he doubts moving the time frame forward by a few years would have made much of a difference. In addition, the NRC questionnaire found that 80% of the faculty members had worked at their programs for at least 8 years. “That points to a huge level of stability,” says Kuh.

For all the data's richness, however, panel members say the results should not be used to determine the fate of a particular program. “None of the results dictates any decision, but they should stimulate discussion on why you are where you are,” says Richard Wheeler, vice provost at the University of Illinois, Urbana-Champaign. NRC hopes to update the database if it can raise sufficient funds to continue the project.
In the meantime, says Ostriker, universities are encouraged to collect and disseminate new information about their graduate programs.

2. Infectious Diseases

# Rival Teams Identify a Virus Behind Deaths in Central China

1. Richard Stone

BEIJING—When Xue-jie Yu came to China last year to probe a lethal fever outbreak, everyone—Yu included—assumed he would provide damning testimony against a known suspect. Every summer for 3 years, hundreds of people in central China came down with an illness characterized by high fever and gastrointestinal (GI) distress. Many victims bled profusely, and an alarming number of the sick—rough estimates are as high as 30% in some areas—died. By early 2007, scientists at the Chinese Center for Disease Control and Prevention (CDC) here fingered the killer as human granulocytic anaplasmosis (HGA), an emerging bacterial infection from tick bites.

But to Yu, an expert on tick-borne diseases at the University of Texas Medical Branch in Galveston, things didn't add up. “The fatality rate was too high,” Yu says, and in his experience it was “rare” for HGA patients to have GI symptoms. Working at the Chinese CDC's National Institute of Communicable Disease Control and Prevention (NICDC) here, Yu tested blood samples for Anaplasma phagocytophilum, the HGA bacterium—and came up empty. Last December, his team identified a new kind of bunyavirus, a family that includes infamous members such as hantavirus and Rift Valley fever virus. The finding, in a paper submitted to The New England Journal of Medicine (NEJM), unmasks a dangerous new emerging virus—not a bacterial outbreak—and explains why antibiotics failed to stop it.

Behind the scenes, however, a fierce argument has broken out over who discovered the virus. This summer, a Chinese CDC team led by hantavirus expert Li Dexin, director of the agency's Institute of Virology, also uncovered a bunyavirus—possibly the same one Yu's group identified.
They have submitted a deeper analysis of the pathogen, including complete RNA sequences of 11 strains, to The Lancet. Yu charges Li's group with trying to rob him of the discovery; Li says Yu's viral sequence is incomplete and that his team identified the virus as the killer.

Several key questions are disputed or unanswered. For starters, researchers do not know how lethal the virus is. The mortality rate may be high in China in part because clinics often prescribe the steroid dexamethasone to bring down high fevers; steroids suppress the immune system, which usually worsens infections. Scientists also differ on whether the virus should be handled in a biosafety level 3 facility—reserved for dangerous pathogens—or in less secure laboratories. And although the infection shows a seasonal pattern associated with tick-borne diseases—cases begin in early spring and peak in midsummer before tapering off by autumn—the vector is still a mystery.

One indisputable fact is that the emerging disease has claimed scores of lives—mostly of farmers—in China's heartland. The first documented outbreak was in 2006, in Anhui Province. At that time, a team led by NICDC Director Xu Jianguo, Chinese CDC's chief bacteriologist, rushed to Anhui. Using PCR they found A. phagocytophilum DNA in the blood of one patient who died and in family members and hospital staff who became infected. HGA had been recognized in the United States in 1990 and in Europe in 1997; Xu's group reported China's first cases in the 19 November 2008 issue of The Journal of the American Medical Association.

Curiously, however, none of the patients recalled having been bitten by ticks. And when outbreaks recurred in 2007 and 2008, the disease did not respond to antibiotics. Thinking it might have an unusual Anaplasma variant on its hands, NICDC in 2009 invited Yu as a short-term visiting researcher under the 1000 Talents Program, which brings overseas scientists to China.
That summer, the pathogen surfaced in Henan and Hubei provinces. In June, Yu went to Hubei's capital, Wuhan, to collect blood samples from patients. “They did not look like typical HGA cases,” he says. After he failed to detect A. phagocytophilum, Yu says he urged Chinese CDC scientists to consider a viral pathogen—but researchers there flatly rejected the idea. Yu persisted and spotted virus particles that December in cell culture using electron microscopy. Then in February, he says, a member of his Texas lab, Yan Liu, “cracked the code of the viral genome.” Two days after he informed scientists at the Chinese CDC about his findings, he says, his 1000 Talents affiliation with NICDC was terminated.

Chinese CDC Director Wang Yu was intrigued by Xue-jie Yu's findings and invited him to share them at a 15 April meeting at CDC headquarters to plot strategy for studying the disease. Among the attendees were Li and CDC virologist Liang Mifang; they found Yu's presentation unconvincing. “He said he isolated a bunyavirus, but he had gotten just fragments,” says Liang. Yu confirms that the virologists were dismissive: “Li tried to deny the importance of my work,” Yu says.

Yu and his colleagues have named the virus Dabie Mountain virus after the range that straddles the borders of Hubei, Anhui, and Henan provinces where they collected samples. But Yu was not invited back to China this summer to continue his research. “I am the first scientist to discover the viral pathogen for an emerging infectious disease who has no access to study the virus and the disease anymore,” he says.

In May, the Chinese CDC set up surveillance for the pathogen in Henan and Hubei provinces. The disease flared up in four other provinces as well, and Li's team collected blood and serum from all six affected provinces. They amplified viral RNA sequences and from more than 500 clones linked 14 to bunyavirus. They also isolated bunyavirus in cell culture and sequenced 11 strains.
They have named it severe febrile and thrombocytopenic syndrome (SFTS) virus and have classified it in the phlebovirus genus of bunyavirus. Li's group also detected the virus in 35 patients from three provinces. “It's solid work. They clearly show that a new virus is causing disease,” says a U.S. scientist who has seen the data and asked to remain anonymous.

But the rival claim didn't sit well with Xue-jie Yu. When he heard that Li's team had submitted its findings to The Lancet, he sent an e-mail to the journal accusing Li of poaching his discovery. Liang says that's not true: “All data in our manuscript belong to us, not anyone else,” she says. On 17 September, The Lancet asked Li's group to withdraw the paper and resubmit it after settling the authorship dispute.

The squabbling has put Wang, the Chinese CDC's director, in an awkward position. There is “no doubt,” he says, that Xue-jie Yu “discovered the novel bunyavirus.” While noting that Yu's results are not as “rich” as Li's team's, Wang says, “everyone knows what a scientific breakthrough is, and what is accumulating work.” After the NEJM paper is published, he hopes, “other papers can go smoothly.” But it may take Wang's best diplomatic skills to get any collaboration on the emerging virus to go smoothly.

3. ScienceNOW.org

# From Science's Online Daily News Site

## Strong as Silk

The silk produced by a Madagascar spider that spins webs as large as 2.8 square meters is the toughest biomaterial yet discovered, according to a new study in PLoS ONE. Ingi Agnarsson, an entomologist at the University of Puerto Rico in San Juan, and colleagues collected Darwin's bark spiders (Caerostris darwini) from Madagascar's Andasibe-Mantadia National Park and brought them to a nearby greenhouse where the spiders could spin fresh webs.
When the researchers compared Darwin's bark spiders' silk with other spiders' silk, they found that it ranked near the top in terms of strength and was twice as elastic as any other known spider silk, making it the toughest known biological material. Understanding the properties of spider silk could help engineers synthesize even tougher, lighter-weight materials, the researchers say.

## Primordial Magnetic Field

Two physicists say they have found an extremely weak magnetic field stretching across the universe, a possible remnant of the big bang. If scientists confirm the finding, it could help reveal the origins of magnetism in the cosmos. Physicists Shin'ichiro Ando of the California Institute of Technology in Pasadena and Alexander Kusenko of the University of California, Los Angeles, knew that a vast magnetic field could scatter high-energy photons emitted by distant supermassive black holes, blurring the resulting telescope images. The effect would be too minuscule to see in a single image, so the pair created a composite image from data on 170 different black holes collected by the Fermi Gamma-ray Space Telescope. When they compared the composite with the product of a mathematical model that assumed no such field exists, the two images didn't match. Their calculations, published in The Astrophysical Journal Letters, suggest that the magnetic field responsible for the discrepancy has about one-quadrillionth the strength of Earth's. Although the researchers speculate that the field is a vestige of the big bang, others caution that it will require additional modeling to determine its source.

## Viking Village

A team of archaeologists announced last week the discovery of a Viking settlement near the village of Annagassan, 70 kilometers north of Dublin. It could be the long-lost town of Linn Duchaill, one of two Irish outposts described in medieval accounts. The other, Dúbh Linn, became Dublin.
Annagassan filmmaker Ruth Cassidy and archaeological consultant Mark Clinton began searching for the town in 2005 and had almost given up when they came across an area ideal for building and repairing ships. A survey turned up a series of defensive ditches not typical of Irish construction, convincing the local Louth County Museum to fund an excavation. In just 3 weeks, the team found 200 objects, including evidence of carpentry, smelting, and ship repair, and hacked coins, which Clinton says were a typical “calling card” of the Vikings. Other Viking experts are cautiously optimistic that the settlement is Linn Duchaill but say it needs to be solidly dated before the case is closed.

Read the full postings, comments, and more at http://news.sciencemag.org/sciencenow.

4. Psychology

# Social Savvy Boosts the Collective Intelligence of Groups

1. Greg Miller

People who are good at solving one type of brainteaser tend to excel at a variety of mental calisthenics—support, many psychologists say, for the concept of general intelligence. A study published online this week in Science (www.sciencemag.org/cgi/content/abstract/science.1193147) extends this concept to groups of people, arguing that groups have a “collective intelligence” that predicts their performance on a range of collaborative tasks. The researchers, led by Anita Woolley, an organizational psychologist at Carnegie Mellon University in Pittsburgh, Pennsylvania, reached this conclusion after studying 699 people working in small groups. They also investigated why some groups appear to be smarter than others. Surprisingly, the average intelligence of the individuals in the group was not the best predictor of a group's performance. The degree to which group members were attuned to social cues and their willingness to take turns speaking were more important, as was the proportion of women in the group.
“This paper really takes all the lessons of 100 years of psychometric research on individual intelligence and applies it in a novel way to look at group decision-making,” says Richard Haier, a neuroscientist at the University of California, Irvine, who studies intelligence. “You can get a lot of interesting ideas out of this.”

In the first part of the study, Woolley and colleagues at the Massachusetts Institute of Technology recruited 120 people from the Boston area and randomly assigned them to teams of three. Whereas most previous research has focused on what makes certain teams excel at a given type of task, Woolley says she wanted to look instead at whether a team's performance on one task generalizes to others. Teams worked on a variety of tasks, including brainstorming to come up with possible uses for a brick and working collaboratively on problems from a test of general intelligence called Raven's Advanced Progressive Matrices. These problems involve evaluating several shapes arranged in a grid and identifying the missing item that would complete the pattern. The groups also worked on more real-world scenarios, such as planning a shopping trip for a group of people who shared a car. The researchers scored these tests according to predetermined rules that considered several factors (awarding points when shoppers got to buy items on their list, for example). Each participant also took an abbreviated version of the Raven's test as a measure of individual intelligence.

These experiments showed that a group's performance on any one task did in fact predict its performance on the others. That suggests that groups have a consistent collective intelligence, Woolley says. She and colleagues calculated a “c factor” for each group, based on its performance across tasks, a direct parallel to the much-debated general intelligence factor, g.
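In the psychometric tradition, such a factor is typically extracted as the leading component of the matrix of scores across tasks. The sketch below is a generic illustration of that idea, not the study's actual procedure, and the group scores are invented:

```python
def first_component(rows, iters=200):
    """Score each row (group) on the leading principal direction of the
    column-centered score matrix, found by power iteration on X^T X."""
    ncols = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(ncols)]
    x = [[r[j] - means[j] for j in range(ncols)] for r in rows]
    v = [1.0] * ncols
    for _ in range(iters):
        # One power-iteration step: v <- normalize(X^T X v).
        xv = [sum(row[j] * v[j] for j in range(ncols)) for row in x]
        w = [sum(x[i][j] * xv[i] for i in range(len(x))) for j in range(ncols)]
        norm = sum(comp * comp for comp in w) ** 0.5
        v = [comp / norm for comp in w]
    # Each group's factor score is its projection on the leading direction.
    return [sum(row[j] * v[j] for j in range(ncols)) for row in x]

# Rows: four hypothetical groups; columns: standardized scores on three tasks.
scores = [
    [2.0, 1.8, 2.2],
    [0.5, 0.7, 0.4],
    [-1.0, -0.9, -1.1],
    [-1.5, -1.6, -1.5],
]
c = first_component(scores)
```

Because the groups' scores move together across tasks, a single number per group captures most of the variation — the statistical sense in which a "c factor" exists at all.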
Neither the average intelligence of group members nor the intelligence of its smartest member correlated with the group's performance.

To investigate further, Woolley and colleagues recruited another 579 people from Boston and Pittsburgh and assigned them to groups of two to five members. This time the researchers did find a weak correlation between both the average and the highest individual intelligence of members of a group and its collective intelligence. But other factors were stronger predictors. One was the group members' average score on a test that required them to infer what was on another person's mind—whether they were annoyed or worried, for instance—by looking at a photograph cropped to show just the eyes. That suggests that “social sensitivity” is a key ingredient of successful teams, Woolley says. The researchers found that the degree to which members took turns speaking also predicted their performance. The proportion of women in a group also correlated with collective intelligence, but Woolley says much of this effect can be explained by the gender difference in social sensitivity: women tend to have more of it.

The “careful, empirical experiments” are a welcome addition to the literature on teams, which is dominated by observational studies, says Brian Uzzi, a sociologist at Northwestern University in Evanston, Illinois. He agrees with the authors' conclusion that the collective intelligence of groups may be more amenable to improvement than general intelligence in individuals, which most research suggests is difficult to change. Coaching to improve social perceptiveness and turn taking, or selecting individuals with those tendencies, might make for smarter groups, for example.

Research on how the gender composition of teams affects their performance has a long and controversial history, says Katherine Phillips, an organizational behaviorist at Northwestern.
Some studies have found that women improve teams by virtue of their social acuity, whereas others have found that women are more likely to remain quiet and let others have their say in team discussions, sometimes to the detriment of the team. In the current experiments, women may have been more likely to speak up because none of the group members had particular expertise in the problems at hand, Phillips says.

However, the random makeup of the groups may limit the reach of the findings, cautions Linda Gottfredson, a sociologist who studies intelligence at the University of Delaware, Newark. She notes that the groups were composed of strangers. “It is possible that turn taking in conversation was so important for that reason,” she says. “They did not know how bright and sensible the others were.” In a more typical workplace setting, Gottfredson says, individuals would be more familiar with their teammates and know whom to listen to and encourage to speak. In that case, she says, the members' individual intelligence may be a more important factor than turn taking.

5. Science Policy

# India's Vision: From Scientific Pipsqueak to Powerhouse

1. Pallava Bagla

NEW DELHI—In 1930, Indian physicist C. V. Raman won a Nobel Prize for his discovery of inelastic photon scattering, known as the Raman effect. The phenomenon became a powerful tool for analyzing matter—but it was other countries that used the basic knowledge to invent Raman scanners. That still causes heartburn here. In a new report, a blue-ribbon panel cites Raman as one egregious example of India's systemic failure to capitalize on basic research findings.
The report, released last week by Prime Minister Manmohan Singh, offers a stinging indictment of India's scientific frailties, noting that science here is “severely hampered by oppressive bureaucratic practices and inflexible administrative and financial controls.” Titled India as a Global Leader in Science, the “vision document” also offers a blueprint for strengthening Indian science—one that will require heaps of money to implement. “We would need to redouble our efforts and hope that the ideas in the vision document will inspire the scientific community,” Singh said in releasing the report.

The first of its kind, the report is getting mixed reviews. Goverdhan Mehta, an organic chemist at the University of Hyderabad and past president of the International Council for Science in Paris, says the recommendations are sound. “If India is to become a formidable force, incremental approaches will just not work. One needs to leapfrog,” Mehta says. Others are unimpressed. It is a “very environment-friendly document since it has recycled so many old ideas,” scoffs one senior scientist.

Commissioned by Singh's office, the panel, led by the chair of the prime minister's scientific advisory council, C. N. R. Rao, a chemist at Jawaharlal Nehru Center for Advanced Scientific Research in Bangalore, flatly declares in its report that “India is yet to become a major force in global science. … Indeed India's relative position in the world of science has declined in the last twenty years.” It blames inadequate investment in science both by the government and by industry. That has led to a disconnect between basic research labs and industry, says chemical engineer Raghunath A. Mashelkar, now president of the Global Research Alliance in Pretoria. The paradigm of “science being born in India but products being born overseas has to be overturned,” he says.
“To begin to contribute significantly to world science,” the report says, India's share of scientific papers should rise from the present 2% or so to “something like 10%” over the next 10 years. It also urges Indian researchers to claim more intellectual property: The panel calls for a 10-fold increase in international patents owned by Indians, from 1900 in 2007 to about 20,000 by 2020. And training scientists should get a major boost: India should produce about 30,000 science Ph.D.s a year by 2025, up from 8420 in 2006.

To meet those targets, the government should double science spending by 2020, the panel says. It also seeks a $250-million-a-year venture capital fund to develop basic research findings. And the panel calls for the creation of a National S&T Council along the lines of the U.S. National Research Council to help India address urgent issues such as water, energy, and food security.

“We will seriously try to implement the vision,” India's Science Minister Prithviraj Chavan told Science. Any new funding initiatives would be considered in the 2011 budget. Rao is not optimistic. “I feel a bit depressed and discouraged by the state of Indian science,” he says.

6. ScienceInsider

# From the Science Policy Blog

Government lawyers asked an appeals court to suspend a federal judge's ruling that froze federally funded research on human embryonic stem cells. The three-judge panel had tough questions for both sides, but at press time, the court had not yet ruled on whether to allow research to continue while the case proceeds.

Weeks after it emerged that four papers written by prominent gene therapy expert Savio Woo had been retracted and two postdocs subsequently fired, two more papers have been retracted. Mount Sinai School of Medicine in New York City has cleared Woo of wrongdoing, and several papers on his techniques have not been retracted.

The outgoing chair of the science committee in the U.S. House of Representatives says he wishes he had pushed harder to pass the reauthorization of the America COMPETES Act, which supports increased U.S. spending on research and science education. But Representative Bart Gordon (D–TN) still hopes the Senate will take up the House-passed version in the lame-duck session after next month's elections. Meanwhile, top U.K. science officials have lobbied the British government to avert spending cuts expected to be part of next year's budget.

Chinese police have detained a top surgeon in an investigation into the assaults on a critic of medical research and an investigative journalist. Xiao Chuanguo, chief urology surgeon at Tongji Hospital in Wuhan, was detained in connection with attacks on misconduct watchdog Fang Shimin and reporter Fang Xuanchang.

Scientists can entice the public to donate to their work on a new philanthropy site called SciFlies. Started by a veteran political fundraiser and scientist with experience as an entrepreneur, the site hopes to raise funds for grants as large as $100,000 to support scientists in all fields.

For more science policy news, visit http://news.sciencemag.org/scienceinsider.

7. Conservation

# Filling Gaps in Global Biodiversity Estimates

1. Elizabeth Pennisi

Data deficient. Biologists use this term to describe a species for which too little is known to determine its conservation status. It's a problem at a global scale as well, as scientists try to assess whether biodiversity loss has been stemmed since the Convention on Biological Diversity (CBD) was enacted almost 20 years ago. Scientists, who hope to get new conservation targets approved later this month (Science, 10 September, p. 1272), have had comprehensive information on just a few groups of organisms, such as birds and mammals, but not on some important large groups—especially plants. Because they are so critical to the survival of other species, “we think plants may be more representative of what's happening to the whole of biodiversity,” says Eimear Nic Lughadha of the Royal Botanic Gardens, Kew, in the United Kingdom.

Earlier this week, researchers led by Nic Lughadha helped fill this data gap by using an innovative statistical approach called the Sampled Red List Index. By taking a random sample of the known species in various plant groups, such as ferns, and compiling existing information on just that subset, they have concluded that 22% of all plant species are threatened. Previous estimates had ranged between 20% and 80%, making it difficult for policymakers to know how much they should emphasize plants in their conservation strategies. Over the past 2 years, other research teams have been applying this approach to poorly understood groups of animals.
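The core sampling idea — estimating a group-wide threatened fraction from a random subset of species — is the same logic as an opinion poll. The sketch below is a back-of-envelope illustration with invented counts, using the standard normal approximation for a proportion's 95% margin of error; it is not the index's actual statistical machinery:

```python
import math

def estimate_threatened(sample_size, threatened_in_sample):
    """Estimate the threatened fraction of a large species group from a
    random sample, with a normal-approximation 95% margin of error."""
    p = threatened_in_sample / sample_size
    margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p, margin

# Hypothetical sample: 330 of 1500 randomly chosen species turn out to be
# threatened, mirroring the 22% figure and the 1500-species sample size.
p, margin = estimate_threatened(1500, 330)
```

With 1500 species sampled, the margin of error on a 22% estimate is only about two percentage points — which is why a sample that small can stand in for a group of hundreds of thousands of species.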
For both plants and animals, researchers plan to look at these same sets of species periodically to see how the groups are faring and thus how well humans are protecting biodiversity.

The new index is a response to growing frustrations with the International Union for Conservation of Nature's (IUCN's) Red List of Threatened Species, which is an ongoing tally of which species are known to be in trouble. Its data are used to tell how well the CBD is working. The Red List now contains entries on about 45,000 species; birds, mammals, amphibians, and corals are especially well-documented. Those groups are relatively easy to track because each contains a small number of species (say, several thousand) and many experts are studying them—especially birds and mammals. The world's 10,000 avian species have been sampled six times since the early 1970s, for instance, providing enough information to conclude that birds are slightly worse off now than they were then. But for many of the rest of the biosphere's 1.75 million known plants, fungi, and animals, the ratio of species to experts is much less favorable. Who can do a detailed analysis of the range, distribution, and ecology of each of the 380,000 estimated plant species, for example, or 75,000 butterflies, just once, let alone periodically to figure out the trends?

In 2005, IUCN recruited experts to see whether a statistical sampling technique, applied to a particular group of flora or fauna, could fill the void—and if so, just how small the sample could be. “It's like a poll; you want to know with which degree of certainty you can determine trends through time,” says Jonathan Baillie, a zoologist at the Zoological Society of London who helped develop this index. A sampled approach would work only if experts could draw on a comprehensive list of species in that group. Using plants and birds as test cases, statisticians decided they needed a minimum sample size of 1500.
That was large enough so that the protocol would work even if 40% of the randomly selected species were busts—with no conservation data available.

Kew took on plants, finishing the new analysis just in time for this month's U.N. Biodiversity Summit. Working with Kew geographic information system specialist Steve Bachman and volunteers, Nic Lughadha and her collaborators looked at ferns and their relatives, conifers and cycads, and also the monocots, the 75,000-strong group of flowering plants that includes grasses, palms, and lilies. They did not have a comprehensive list for the remainder of flowering plants, formerly known as the dicots, but determined they could study 1500 legumes—a major dicot subgroup—as stand-ins. For some of these randomly picked species, the researchers had little to go on, so there was a “mad scramble” to track down any information about their distribution and ranges, says Neil Brummitt, now with the Natural History Museum in London. Often, they had to rely on labels on herbarium specimens to get a sense of how a plant's range and distribution, for instance, changed through time.

Other groups' estimates of the risk of extinction were based primarily on studies of a small group of plants or plants in a particular geographic location, says Nic Lughadha, who adds that the new index provides the first global picture of how plants are doing: in short, about the same as mammals, but worse than birds.

The new technique has already provided insights into the world's fauna. Freshwater crabs and fish, two data-deficient groups, are at higher risk than their marine counterparts. And some trends are emerging—such as the potential advantage of having wings. Researchers already knew that birds are better off than mammals and amphibians; similarly, the newly examined dragonflies are in less danger than crayfish and crabs. “It really has improved our knowledge of some of these groups immensely,” says IUCN's Craig Hilton-Taylor.
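The poll-like logic of the sampled index is easy to illustrate. The sketch below is a toy version only: the dict layout and "status" field are invented for the example, and the real Sampled Red List Index works with the full set of IUCN threat categories rather than a binary threatened/not-threatened call.

```python
import random

def sampled_red_list_index(species_pool, sample_size=1500, seed=42):
    """Estimate the fraction of a large taxonomic group that is threatened
    from a random sample, poll-style. Each species is a dict whose
    (hypothetical) "status" field is None when the species is data deficient."""
    rng = random.Random(seed)
    sample = rng.sample(species_pool, sample_size)
    # Data-deficient picks are "busts" with no usable information;
    # the protocol tolerates up to ~40% of the sample being busts.
    assessed = [s for s in sample if s["status"] is not None]
    threatened = sum(1 for s in assessed if s["status"] == "threatened")
    p = threatened / len(assessed)
    # Standard error of the estimated proportion, as for an opinion poll.
    se = (p * (1 - p) / len(assessed)) ** 0.5
    return p, se
```

On a synthetic pool of 75,000 species in which roughly 22% of the assessable ones are threatened and 30% are data deficient, a 1500-species sample recovers the threatened share with a standard error of roughly one percentage point — which is why a group need not be exhaustively assessed before its trend can be tracked.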
The index approach has its limitations. “By not doing the whole group, you have less information to direct conservation priorities” at the local level, says Hilton-Taylor. And because selection is random, some highly vulnerable species may be left out, says Sacha Spector, an ecologist with the environmental organization Scenic Hudson in Poughkeepsie, New York. Even so, he thinks the sampled approach is providing a broad picture of what's happening: “It will be extremely successful in jump-starting the biological research world [for] doing more assessments and protecting the species that were under the radar before.”

8. Newsmaker Interview: Ian Poiner

# Counting the Ocean's Creatures, Great and Small

1. Dennis Normile

In a massive undertaking, 2700 scientists from 80 nations have spent the past 10 years compiling everything known about life in the world's oceans: from extinct denizens to present-day biota to what future ecosystems might look like. Next week, participants will gather in London to celebrate the fruits of their labors: the first ever Census of Marine Life (CoML).

Launched in 2000 with funding from the Alfred P. Sloan Foundation, CoML gathered some 16 million records into accessible databases. Fieldwork by CoML contributors has added another 2600 publications, including 6000 or so new species, raising the number of known marine species to over 240,000—about 25% of the total presumed to exist. The $670 million effort “has been good value for dollar spent,” says Ian Poiner, chief executive of the Australian Institute of Marine Science in Townsville. The tropical marine ecologist was involved in the census from the beginning and has chaired the scientific steering committee since 2008. Although CoML is a milestone, many questions about ocean life remain unanswered, Poiner notes: “The age of discovery is still with us.”

Q: Has CoML achieved its goals?

I.P.: The census has three key points of focus: One is diversity, or what lives in the ocean; the distribution, where that life is found; and then the abundance, how much life is there. On top of that we looked at what changes have occurred and forecast what [ocean life] might look like in the future.

I think we dealt with diversity very well; distribution and abundance are much more difficult. There is still much to be done, particularly on abundance.

Q: Do you have an idea of what kinds of species are still out there?

I.P.: When you plot the accumulated number of species over time, it's still going upward with no hint of leveling out. Census researchers estimate there will be at least a million species and possibly many more. We know less about the smaller things than the bigger things. And marine microbes are another story. Probably 90% of the ocean biomass will be microbes.

Q: Why is it important to identify those species?

I.P.: Most ecologists view biodiversity as a measure of the health of biological systems. What are all the species doing out there and what's their role in ocean ecosystems? Those questions remain to be sorted out. Getting a better handle on biodiversity and understanding [the implications] of declines in biodiversity and species invasions are things we need to know to effectively manage our oceans for both conservation and for economic benefit.

Q: Aside from the sheer numbers, what are the most significant findings?

I.P.: Things like discovering animals that live without oxygen in the deep Mediterranean. On the Great Barrier Reef, probably a third to a half of the soft corals are new to science. One of the iconic species of the census is the yeti crab from the Pacific near Easter Island. It is not only a new species but a new genus and a new family. There are many of those sorts of discoveries.

The census found life everywhere we looked, and it is much more complex and interconnected than we expected. Probably the other [key finding] is that we humans have had far more impact on the oceans than we had imagined.

Q: The CoML is regularly mentioned in the comic strip Sherman's Lagoon. Was that part of your public outreach efforts?

I.P.: We encouraged and supported it. It's one way to communicate the importance of ocean life. But it is only one of many avenues we've used to enhance the general recognition of the importance of our oceans. National Geographic will also publish a two-sided wall map depicting the major findings of the census.

Q: Are census data being used to understand the impact of the Gulf of Mexico oil spill?

I.P.: The short answer is yes. We released last year the first comprehensive compilation of the biodiversity of the Gulf of Mexico. Sadly, it was timely. It has been and will continue to be used to understand the impact of the Gulf of Mexico oil spill.

Q: Who else will use census data?

I.P.: Census data are already being used for the upcoming Convention on Biodiversity, both by countries reporting their ocean biodiversity and in looking at options for the management of our high seas. [This information] will be used much more extensively in the future. An example is the Great Barrier Reef; it is a large area and one of the most iconic places in the world. The management and zoning and systems to maintain protected areas and multiple-use areas are dependent on the information that has been collected in the census.

Q: What comes next?

I.P.: The census was very much a bottom-up process where the marine science community came together to identify challenges and identify ways to collaborate. Now we are using those networks to help us determine where to go in the future.

This census is the first of, hopefully, many. With the technologies [CoML has] developed, we're now in a much stronger position to continue discovery and exploration and to monitor changes in ocean biological patterns. Challenges [include] getting the balance between discovery and monitoring right and demonstrating the effective use of that information in assessing and managing our oceans.

9.

# Growing Prospects For Life on Mars Divide Astrobiologists

1. Richard A. Kerr

As discoveries on Mars, including warm spells and salty soil, raise the chances of finding life there, scientists are considering how best to look for it within their budget.

The microscopic, wormy-looking things found in a meteorite blasted off Mars certainly reinvigorated NASA's search for extraterrestrial life in the 1990s. But the critterlike sightings and almost all of the other evidence for ancient life claimed for the martian meteorite have since faded away. Replacing them in recent years is a string of encouraging discoveries on Mars, including pervasive ice just beneath polar soils, water-wrought minerals of every sort, and soil benign enough to grow asparagus.

“Mars has met all requirements to support life,” says planetary scientist and astrobiologist Bruce Jakosky of the University of Colorado, Boulder. “Life is possible.” But as the planetary science community draws up its prioritized list of missions for the next decade, astrobiologists have split over the next logical step in the search for life on Mars.

Many researchers are content to follow NASA's lead as it cautiously moves from “following the water” in search of likely habitable or once-habitable environments to “following the carbon”—that is, looking for chemical traces of ancient life. Others are not so patient. “It's time to search for life by searching for life,” says astrobiologist Carol Stoker of NASA's Ames Research Center (ARC) in Mountain View, California. The direct detection of life—living, dormant, or recently deceased—should be a high priority, she and others say. The planetary science decadal survey now being drafted and to be released next March will determine whether they're right.

## A life-friendlier planet

The search for martian life took a body blow in the late 1970s, when life-detection experiments on the Viking 1 and Viking 2 landers failed to stir up clear signs of either living creatures or lifeless organic debris. Mars seemed to be dead, as dead as an eons-old, hyperarid, incredibly cold landscape would appear to be.

It wasn't long, however, before things started looking up again. Orbital observations from Viking onward revealed ever more evidence that water gushed across the surface of early Mars, forming lakes and perhaps even a northern ocean at about the time life was getting started on Earth. Spectra returned by orbiters in the past decade showed that liquid water on early Mars stayed around long enough to corrode rock into alteration minerals under life-friendly conditions. And the Mars rover Spirit found ancient minerals produced by hot ground waters, much as happens in the life-infested hot springs of Yellowstone.

Closer to modern times, water—albeit frozen—is turning up in all manner of places. In recent years, orbiting instruments have spied widespread ice just beneath the surface soils of Mars: centimeters down in polar regions and meters down in immobile glaciers at mid-latitudes. But that water might not always have been frozen. Mars's wobbling on its axis would have warmed parts of the planet. Orbitally induced climate cycles mean that Mars has become “a far more interesting place than we thought,” says astrobiologist David Des Marais of ARC. The fresh-looking gullies of Mars, for example, may have been gushing meltwater in the not-so-distant geologic past.

With things looking up for biology on Mars, NASA sent the Phoenix lander to a landing site in the martian Arctic that turned out to be relatively life-friendly. “We believe we have a habitable environment in modern times,” says Stoker, who is on the Phoenix team. The site is “not necessarily growing organisms today,” she says, but “over the past 10 million years during warm conditions,” life could have survived and perhaps thrived.

To put a number on that assertion, Stoker and 12 co-authors published a “semiquantitative” analysis of the Phoenix site's potential habitability online 16 June in the Journal of Geophysical Research-Planets (JGR). They considered the likelihood of four factors essential for life. One is liquid water. Phoenix's discovery of carbonate requires liquid water in the past to produce the mineral, they noted. The high concentration of perchlorate salt found by Phoenix offers a means of forming brines that could remain liquid even at today's temperatures. And even without brines, orbitally induced warming could have allowed liquid water in the recent past.

Phoenix's perchlorates could also serve as an essential energy source for microbes with a taste for such chemicals, the group said. Nutrients such as nitrogen and phosphorus blow in on dust. And environmental conditions, such as the Phoenix-measured pH, are benign. All that makes martian high latitudes “more tantalizing,” says atmospheric chemist Sushil Atreya of the University of Michigan, Ann Arbor.

## A lively Mars, ever?

An even more tantalizing hint of possible current life showed up in 2003. Researchers reported finding traces of methane gas in the atmosphere of Mars, first in spectroscopic observations returned by Mars Express and then in telescopic observations from Earth. A lifeless, subsurface water-rock reaction called serpentinization could be the source, but living, methane-belching microbes would work, too.

Another encouraging sign just came from the laboratory. Astrobiologist Rafael Navarro-González of the National Autonomous University of Mexico in Mexico City and his colleagues report in a paper in press in JGR their terrestrial version of the Viking search for organic matter on Mars. In the Viking experiments, a martian soil sample was heated to 500°C and the resulting gases analyzed. Instead of organics from the soil, Viking detected only carbon dioxide and two chlorinated methane compounds. The latter were considered to be contaminants brought from Earth, even though they did not appear when the experiment was run without soil while Viking was in space.

Navarro-González and his colleagues, including Christopher McKay of ARC, reran the Viking experiments in the lab using the most Mars-like soil available, one from the heart of the Atacama Desert of Chile containing a trace of organic matter. When they added a bit of perchlorate before the heating, in line with the Phoenix discovery, they duplicated the Viking results: no volatilized organics, some carbon dioxide, and the same two chlorinated methane compounds.

“The simplest explanation is that there was perchlorate and organic matter in the Viking soil samples,” says McKay. When heated, the perchlorate would have oxidized most of the organic matter to carbon dioxide and chlorinated a small amount of it. If the Viking soil did contain organic matter, it could have come from lifeless meteorites and cosmic dust that rains onto Mars. But “we did dispel the idea that there's no point in searching for” molecular traces of life in martian organic matter, McKay says.

## Go for it?

All the upbeat developments have some astrobiologists raring to go. “The theme has been ‘Follow the water,’” says Stoker, “but we understand enough to take the next step.” That step would be attempting to directly detect active or dormant present-day life. “The probability of identifying life is higher for modern” than for ancient life, says Stoker. The best approach, McKay says, is to send modern biochemical sensors to Mars capable of directly detecting complex biomolecules like DNA from living or dormant organisms. Such a prospect has made some astrobiologists “impatient to do something directly relevant to the search for life,” says McKay, “rather than taking ‘more pictures of rocks.’”

NASA's astrobiology plans for Mars are “geared much more toward ancient life than present-day life,” notes Jakosky. In 2011, NASA will send the Mars Science Laboratory, now dubbed Curiosity, to follow the carbon at a low-latitude site. In a joint 2016 effort with the European Space Agency, it will send the Trace Gas Orbiter to check on the methane. And for 2018, NASA is considering a rover that would collect likely samples for eventual return to Earth and detailed laboratory analysis for ancient life.

A search for ancient life has plenty of supporters. “A majority [of astrobiologists] would say ancient life will be easier to find than present-day life,” says Jakosky. To find present-day life, researchers have to first identify a likely site, he says. Then they would have to choose which measurements would detect alien life. In hindsight, Viking scientists erred in designing their life-detection experiments for microbes accustomed to Earth's most benign conditions. And ruling out microbial contamination from Earth would be challenging. Although going after present-day life could yield “the most profound result,” says Jakosky, it would be “very risky. The odds are against it even if life is present.”

A committee of the National Research Council is now balancing the odds of success of the two approaches against the available funding in the United States as it finishes the first draft of the Planetary Science Decadal Survey. Its prioritized list of missions for 2013 to 2022 is “exciting and actually implementable,” says committee chair Steven Squyres of Cornell University. How lively its missions to Mars will be remains to be seen.

10. Archaeology

# Archaeologists See Big Promise In Going Molecular

1. John Travis

Creative applications for ancient DNA emerged at a meeting, where researchers also described studies of other preserved molecules, including ancient RNA and proteins.

COPENHAGEN—As a field, ancient DNA is getting, well, old. It's been more than 2 decades since scientists began analyzing fragments of preserved DNA from tissues that are hundreds, or even tens of thousands, of years old (Science, 17 November 2006, p. 1068); the approach arguably reached a climax earlier this year with the unveiling of the draft nuclear genomes of a 4000-year-old Eskimo and a 38,000-year-old Neandertal.

Yet last month's International Symposium on Biomolecular Archaeology (ISBA)—a biennial meeting devoted to studies of isotopes and DNA in ancient biological materials—revealed that ancient DNA research is poised to undergo a new kind of growth spurt. Flooded with data from next-generation sequencing technologies, researchers are now exploring a host of new applications and addressing an ever-wider circle of questions about ancient ecologies, human behavior, and lifestyle. They're probing ancient medical practices by extracting DNA from Roman herbs, for example.

Some are moving beyond isotopes and DNA to probe new classes of ancient molecules, using ancient RNA to study gene activity and identifying ancient species with bits of collagen from fossil bones. “Everyone is in a rush to find the next methodology,” says bioarchaeologist Greger Larson of Durham University in the United Kingdom. “We are all testing things very quickly without knowing how successful any will be.”

## Seed of an idea

Take ancient RNA, for example. Few thought it made sense to even look for DNA's cousin in ancient biological samples, because RNA is much more fragile than DNA. But M. Thomas Gilbert of the University of Copenhagen's new Centre for Geogenetics (an ambitious ancient DNA lab funded by the Danish government and officially launched at the start of the meeting) became intrigued years ago when he read of a date palm germinating from a 2000-year-old seed from the ancient Jewish fortress of Masada. To Gilbert, that suggested the ancient seed still harbored RNA and that other old seeds might as well. At the meeting, Gilbert's colleague Sarah Fordyce reported that they have been able to sequence many RNAs, including the messenger RNAs (mRNAs) made when genes are transcribed, from Chilean and North American maize seeds that are more than 700 years old. “We think we can do ancient transcriptomics,” she says.

Seeds are a good source for ancient RNA, because they have evolved to keep genetic material intact under harsh conditions. “Seeds are particularly valuable objects for studies of ancient nucleic acids,” says Matti Leino of the Swedish Museum of Cultural History in Stockholm, whose own group recently sequenced DNA in historical seed collections and found evidence challenging the importance of a mutation hypothesized to be central to wheat domestication.

Preserved RNA in seeds could be a boon to studies of ancient plant domestication, as levels of mRNAs reflect changes in gene activity that may be as important to the process as changes in protein-coding DNA sequences. A plant's RNA profile differs from tissue to tissue, but Gilbert's team suspects that seeds, as a crucial developmental stage, nonetheless will offer key insights. Gilbert's is not the only team that has discovered the protective power of seeds; a group led by Robin Allaby of the University of Warwick in the United Kingdom has isolated RNAs from ancient Egyptian barley seeds and hopes to identify a variety of RNAs that regulate gene activity.

RNA may be considered fragile, but the bone protein collagen is among the best survivors of the ravages of time. RNA can last hundreds of years, and DNA tens of thousands, but collagen apparently sometimes survives for millions of years in a bone. That's why Michael Buckley of Bournemouth University, while working in the laboratory of Matthew Collins of the University of York, both in the U.K., spearheaded a technique to use collagen fragments, or peptides, to identify the species of nondescript bits of bone. Called ZooArchaeology by Mass Spectrometry (ZooMS), the method relies on subtle differences among species in the amino acid sequence, and thus mass, of collagen. For less than $10 a test, scientists may one day be able to take any bit of fossilized bone and identify the genus, or even species, it came from, says Buckley. (Collins has nicknamed collagen the “bar code of death” because it can identify fossils, whereas DNA is sometimes used as a bar code to identify living species.)

Collins's team has so far demonstrated ZooMS's ability to identify a wide range of fossil animal bones. It can even distinguish between ancient sheep and goat bones, a notoriously difficult problem that has blurred the picture of how people domesticated each animal. Collins's group is now building up a database of collagen's amino acid sequences that more fully spans the animal kingdom. ZooMS is “fast and cheap and can be a real neat way to ID species,” says Gilbert. And it has other potential uses, such as identifying the collagen sequence of extinct species for which no DNA has been isolated, helping better place those animals within an evolutionary tree. “Every couple of weeks, we think of something new to do with it,” says paleobiologist Ian Barnes of Royal Holloway, University of London, who has begun to collaborate with Collins's group.
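The matching step at the heart of ZooMS can be sketched as a simple fingerprint lookup: count how many measured peptide masses fall within instrument tolerance of each candidate species' reference masses, and report the best scorer. The masses and species table below are invented placeholders, not real collagen markers:

```python
def identify_species(observed_masses, reference, tolerance=0.5):
    """ZooMS-style lookup: the species whose reference collagen-peptide
    masses (in daltons) match the most observed masses, within tolerance,
    is the best identification. `reference` maps species name -> masses."""
    def score(ref_masses):
        # One point per observed peak that lands near a reference peak.
        return sum(1 for m in observed_masses
                   if any(abs(m - r) <= tolerance for r in ref_masses))
    return max(reference, key=lambda sp: score(reference[sp]))
```

In a real workflow, the reference fingerprints come from collagen of securely identified modern or ancient specimens, and the tolerance is set by the mass spectrometer's accuracy; species that share most markers (such as sheep and goat) are separated by the few peptides that differ.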
Other researchers suspect that the future of ancient proteomics may involve more than just collagen, as there's increasing evidence that many proteins in bones and in other biological materials can survive for long periods, at least in a fragmented form that can be analyzed by mass spectrometry. Collins's team is trying to extend the ZooMS technique to keratin proteins, to analyze animal hair and feathers used by ancient peoples, for example.

## An ancient pharmacopeia

As the proteomics researchers looked to the future, ancient DNA researchers presented a few new tricks of their own. Botanist Alain Touwaide of the Smithsonian Institution in Washington, D.C., studies ancient “pharmacological” texts to understand how ancient people treated illnesses. But a sunken 1st century Roman shipwreck containing a host of medical items, including pill-like objects still dry in waterproof clay cylinders, offered him an unusual chance to verify those documents.

Touwaide suspected that the pills or tablets, which may have been applied to skin rather than ingested, were medicines composed of ground-up plants, so he recruited Smithsonian ancient DNA expert Robert Fleischer to analyze two tablets provided by an Italian museum. At the meeting, Fleischer reported that he has indeed found chloroplast DNA in the tablets, so far identifying 10 different types of plants, including celery, carrots, and hibiscus. Touwaide hopes ultimately to match the tablets' contents to one of his texts and identify its possible therapeutic use. It would be the first time that “ancient medicines are analyzed and then interpreted on the basis of ancient scientific literature,” he says. “The results Rob has gotten are something I've been dreaming of for 35 years.” Touwaide predicts that such work will ultimately offer a “new direction” for drug discovery. “We're doing applied history,” he says.
Researchers in Copenhagen noted that Fleischer hasn't yet gotten long enough DNA sequences to identify every plant in the tablets to the species level, and some worry about contamination. But many were impressed. Ancient DNA researcher Ludovic Orlando of the Centre for Geogenetics says he likes “the idea of getting access to ancient pill recipes using DNA. This approach can be really powerful.”

Other ISBA presentations extended ancient DNA studies to new types of preserved biological specimens. For example, Charlotte Oskam and Michael Bunce of Murdoch University in Perth, Australia, have developed a method to tease out DNA sequences from fossil eggshells. They're using it to identify smashed and burned shell fragments left by the hunters who colonized New Zealand about 700 years ago. These people apparently drove large flightless birds such as the moa to extinction, and identifying the species and quantity of eggs at each site may suggest how quickly this happened. And Luise Brandt in Gilbert's group reported the first success at obtaining DNA from ancient wool textiles; researchers had worried that wool processing destroyed genetic material. The technique could reveal where the wool came from, illuminating early trade routes for textiles.

ISBA is a young conference, begun only in 2004, and to date it has always been held in Europe. But in 2012 it moves to China, whose representatives lobbied hard to host the meeting. For China, an ancient country with an increasing passion for cutting-edge science, biomolecular archaeology is apparently an irresistible combination.

11. IEEE International Conference On Computational Intelligence And Games

# Game-Miners Grapple With Massive Data

1. John Bohannon

Researchers have been using massively multiplayer online games as natural laboratories, but the data are a challenge to interpret.
After describing his methods with dozens of mathematical formulae, Christian Thurau shows his next slide: the result. It looks like a pile of fine metal dust in a magnetic field, revealing the invisible lines of force.

This plot comes from the kind of data set that social scientists dream about: a flawless digital record of the social behavior of more than 10 million people interacting in a highly controlled setting over a 4-year period. But Thurau, a computer scientist at the Fraunhofer Institute for Intelligent Analysis and Information Systems in Bonn, Germany, never observed any of his research subjects, at least not in person. He harvested the data from World of Warcraft (WoW), the wildly popular online fantasy game.

It's called game-mining: digging for insights on human behavior in the terabyte-sized data logs generated by computer games. “You have over 10 million people playing World of Warcraft about 4 hours per day, 7 days a week,” says Jaideep Srivastava, a computer scientist at the University of Minnesota, Twin Cities. “And that's on average; some play 80 hours per week!” Because players' interactions are automatically recorded in WoW and many similar virtual worlds, researchers can use these massively multiplayer online games as natural laboratories.

But the data are a challenge to interpret. Thurau's WoW study is a case in point. His goal was to reveal the evolution of WoW's guilds, the groups that players voluntarily form with each other to socialize, share resources, and slay monsters. Just the basic demographic information associated with the guilds amounted to 192 million 70-dimensional data points that represent information on the levels, skills, and activities of the players. “How do we make sense of that?” asks Thurau.
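The archetype idea is easiest to see in one dimension: the most extreme data points serve as archetypes, and every other point is described as a convex blend of them. The sketch below is a deliberately stripped-down, one-dimensional toy; real archetypal analysis, including Thurau's shortcut, fits archetypes and mixing weights jointly over many dimensions by optimization:

```python
def pick_archetypes(points):
    """Degenerate 1-D version of the archetype-selection step:
    the most extreme data points themselves serve as archetypes."""
    return min(points), max(points)

def archetype_weights(points, archetypes):
    """Express each 1-D point as a convex combination of two archetypes:
    x = (1 - t) * a0 + t * a1 with 0 <= t <= 1."""
    a0, a1 = archetypes
    weights = []
    for x in points:
        t = (x - a0) / (a1 - a0)   # exact when x lies between the archetypes
        t = min(1.0, max(0.0, t))  # clip so the weights stay convex
        weights.append((1.0 - t, t))
    return weights
```

The payoff is interpretability: instead of raw 70-dimensional guild records, each guild is summarized by how strongly it resembles each extreme, humanly describable guild type.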
After failing with the classical techniques for finding patterns in high-dimensional data sets, he turned to a mathematical tool called archetype analysis, developed in the 1990s for physics and economics research. The original method failed at first because its computing time grows exponentially with the size of the data set, but Thurau devised a mathematical shortcut. The method works by identifying the most extreme data points—in this case, the guilds that are most different from each other—and describes the rest of the guilds as combinations of these archetypes. “It turns high-dimensional data into something that makes sense to humans,” says Thurau.

The output was an eight-dimensional “shadow” of the WoW data, projected as a simple 2D plot, that evolves over the course of the 4-year period of the study (see figure, p. 30). For the most part, says Thurau, the results confirm many researchers' hunches about the social behavior of WoW players. Only a small fraction of guilds are active, those run by highly organized, ambitious groups of players. In spite of the staggering fraction of their lives spent in the game, most players are “casual rather than hardcore,” he says.

Game-mining isn't just for multiplayer games. A team led by Georgios Yannakakis, a computer scientist at the IT University of Copenhagen, described player behavior in Tomb Raider: Underworld, a single-player game in which a gun-toting female archaeologist steals artifacts from ruins. They analyzed data from 10,000 players on the Xbox Live network, covering 35 different variables such as the use of weapons, the rate of progress, and whether it was tigers, traps, or other hazards that killed them. Their aim was to train a computer to predict the level at which any given player will eventually quit the game out of frustration—one of the hopes of the game industry is to create “personalized” games that adapt to each player's abilities and interests.
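The shape of that prediction task — learn from early-play statistics, then predict where a new player will give up — can be sketched with a deliberately tiny model. The single feature, the threshold rule, and the numbers are invented stand-ins for the study's 35 gameplay variables and its actual machine-learning models:

```python
def train_quit_model(records):
    """Fit a toy one-feature model: players slower than average through an
    early obstacle are predicted to quit at the average level of other slow
    players. `records` are (seconds_through_obstacle, level_quit) pairs."""
    avg = lambda xs: sum(xs) / len(xs)
    threshold = avg([t for t, _ in records])
    slow = [lvl for t, lvl in records if t > threshold]
    fast = [lvl for t, lvl in records if t <= threshold]
    return {"threshold": threshold, "slow_quit": avg(slow), "fast_quit": avg(fast)}

def predict_quit_level(model, seconds_through_obstacle):
    """Predict the level at which a new player will give up."""
    key = "slow_quit" if seconds_through_obstacle > model["threshold"] else "fast_quit"
    return model[key]
```

A real system would combine many such features, richer models, and validation on held-out players — but even this caricature shows why time spent on a single obstacle can carry predictive power.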
The computer wasn't perfect at foretelling the players' fates, but it was far better than random. Just by observing how people played the first two levels of the game, it could predict with 77% accuracy where they would give up. Much of that prediction power came from counting the number of seconds players took to navigate a single obstacle, the jellyfish-filled “flush tunnel.” Yannakakis says the accuracy should improve as he tracks more players for training and obtains “finer granularity” in the data, such as players' exact routes of movement.

12. IEEE International Conference On Computational Intelligence And Games

# Killer Bots Are Getting Human

1. John Bohannon

The bot that won the third annual 2K BotPrize, a competition to create artificially intelligent game-playing agents that can fool a judge into believing they are human, represents a leap forward for game AI because the team used machine consciousness rather than just mimicking human behaviors.

It was standing room only in the computer lab as intense violence played out on a giant screen. The game was Unreal Tournament 2004, the classic multiplayer first-person shooter. But not all of the avatars blasting at each other were controlled by humans. Half of them were bots programmed by scientists in the room, who nervously monitored their programs for crashes.

This was the third annual 2K BotPrize, a competition to create artificially intelligent game-playing agents that can fool a judge into believing they are human. The contest is a variation on a classic test, first proposed in 1950 by computing pioneer Alan Turing, in which a judge has a conversation with a human and a computer and must decide which is which. The Turing test still defeats artificial intelligence (AI) 60 years later; machines largely remain terrible conversation partners. Action-based video games can offer an alternative Turing test.
“They don't require speech, they provide a highly constrained environment but are still a challenge for AI,” says Philip Hingston, the computer scientist at Edith Cowan University in Perth, Australia, who organized the contest.

The rules are simple: Avatars controlled by humans and bots are dropped into a complex environment littered with weapons. It's kill or be killed. Each round, some of the human players—the judges—must decide which of their opponents are machine-controlled, based solely on their behavior. The team that designed the bot best at fooling the judges wins the $5000 prize and a trip to Australia, funded by the game company 2K.

This year's prize was scooped by Conscious Robots, a team of Spanish computer scientists. Its bot represents a leap forward for game AI, says Hingston, because the team used machine consciousness, a technique rarely applied because of its complexity. Rather than just mimicking human behaviors—such as using imperfect aim or introducing randomness into running routes—the team's bot was designed to think like a human. “In our approach, we try to effectively integrate several cognitive skills, like attention and learning,” says Raúl Arrabales Moreno, a computer scientist at the University Carlos III of Madrid. The bot has a set of innate behaviors that are regulated by a higher control system, similar to the role of a conscious mind. It was incorrectly identified as human by the judges 32% of the time. By comparison, one human player was incorrectly identified as a machine 35% of the time. “There is only a slender gap between the humans and bots now,” says Hingston.

“There has been significant progress since the 2009 competition,” says Simon Lucas, a computer scientist at the University of Essex in the United Kingdom and one of the human players in the contest. Besides creating more engaging computer-controlled opponents for mass-market video games, the goal is to create better AI agents for “serious games” that simulate natural disasters and other complex problems (see above). Lucas predicts that a bot will be fully indistinguishable from human players “within the next 2 years.”

13. IEEE International Conference On Computational Intelligence And Games

# Smarts for Serious Games

1. John Bohannon

Games used for teaching or training face big problems because of their simple programming. A computer scientist presented a solution to those problems at the meeting: a new architecture for so-called serious games that uses artificial-intelligence techniques similar to those in some of the latest video games.

You are a firefighter. As a blaze spreads across the factory, a paint canister goes off like a bomb. There are still panicked workers to be cleared. And to make matters worse, one of your crew is injured. How do you proceed?

Don't worry, it's just a game. But playing it could save lives. Games with ulterior motives such as teaching or training people—known among researchers as “serious games”—are on the rise, providing a cheap and safe supplement to on-the-job training. But serious games face “big problems” because of their simple programming, says Joost Westra, a computer scientist at Utrecht University in the Netherlands. In a real fire, there can be hundreds of people making unpredictable decisions all around you. Yet the nonplayer characters (NPCs) in the games usually follow tightly scripted behaviors, so unless you play exactly as the programmer expects, NPCs behave like confused robots. Another flaw in serious games is that they use “fixed scenarios or simple rules to determine the course of the game,” says Westra. “Expert users can quickly estimate how the game will react to their actions” but still must play through the easy levels to reach their proper level. The result, he says, is “disengagement, boredom, and possibly quitting the game before that level is reached.”

To fix these problems, Westra created a new architecture for serious games that uses artificial-intelligence (AI) techniques similar to those in some of the latest video games. He focused on a game called RescueSim, a serious game for firefighters. Rather than following scripts, Westra's code turns each NPC into an autonomous agent with its own nuanced goals, responding to events as they happen. An NPC firefighter, for example, will have the goal of extinguishing a fire but can switch to helping an injured comrade if no one else is near. An NPC's awareness of what the game's player and the other agents are doing is crucial, Westra says, because firefighting requires teamwork. One firefighter must turn on the pump while another keeps doors closed to prevent drafts that feed the fire; yet another must operate the hose.
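RescueSim's internals aren't described in detail here, but the core idea of innate behaviors regulated by a higher-level controller can be sketched in a few lines. Everything below, from the class name to the rules, is invented for illustration.

```python
# Toy sketch of an autonomous NPC: innate behaviors ("extinguish",
# "help comrade", "clear workers") selected by a higher-level
# controller that watches the current state of the scene.

class FirefighterNPC:
    def __init__(self, name):
        self.name = name

    def choose_action(self, scene):
        # Helping an injured comrade pre-empts firefighting,
        # but only if nobody closer can take over.
        if scene["injured_comrade"] and not scene["helper_nearby"]:
            return "help comrade"
        if scene["fire_active"]:
            return "extinguish fire"
        return "clear workers"

npc = FirefighterNPC("unit-7")
print(npc.choose_action({"injured_comrade": True,
                         "helper_nearby": False,
                         "fire_active": True}))   # -> help comrade
print(npc.choose_action({"injured_comrade": True,
                         "helper_nearby": True,
                         "fire_active": True}))   # -> extinguish fire
```

Because each NPC re-evaluates the scene every time it acts, rather than following a fixed script, it keeps behaving sensibly even when the player does something the designer never anticipated.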

In early testing of the system, the AI architecture shows promise. Not only does it make NPCs act reasonably, Westra says, but the entire game can also now adapt to different users. Beginners take on only simple jobs while NPCs take care of the rest; expert players must learn to command a crew in complex situations. “A game needs to be built with this architecture from the beginning,” says Westra, who plans to design a “bush fire team training” game with collaborators in Turkey.

“This is the future of serious games,” says Kyong Jin Shim, a computer scientist at the University of Minnesota, Twin Cities, who is developing such a system for training U.S. soldiers. “We need smarter agents and in-game characters.”