NET NEWS: Search Engines Fall Short

Science  16 Jul 1999:
Vol. 285, Issue 5426, p. 295
DOI: 10.1126/science.285.5426.295b

To find that one snippet of information hidden in the thicket of data on the Web, most users turn to search engines. But a survey of the 11 most widely used search engines, reported in last week's issue of Nature, suggests that these trailblazers are losing the scent. No single search engine, the study found, covered more than 16% of the Web's contents (see table).

Computer scientists Steve Lawrence and C. Lee Giles of the NEC Research Institute in Princeton, New Jersey, first carried out a “census” of the Web's contents. By checking what was behind 3.6 million randomly chosen IP numbers (the unique number for each Web address), the researchers estimated that about 3 million servers host some 800 million pages. Next the duo ran 1050 queries on the 11 search engines and compared each engine's results with those of Northern Light, which produced the most hits and, at 128 million pages in its index, covered about 16% of the Web. The other engines ranged from 15.5% to a measly 2.2% coverage.
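The 16% figure follows directly from the two numbers quoted above. As a minimal sketch of that arithmetic (the function and constant names are illustrative, not taken from the study, and the study's own estimation procedure is not reproduced here):

```python
# Figures quoted in the article; all names below are illustrative.
ESTIMATED_WEB_PAGES = 800_000_000   # researchers' estimate of pages on the Web
NORTHERN_LIGHT_INDEX = 128_000_000  # pages in Northern Light's index


def coverage(indexed_pages: int, total_pages: int) -> float:
    """Return the fraction of the Web that an index of the given size covers."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return indexed_pages / total_pages


northern_light_share = coverage(NORTHERN_LIGHT_INDEX, ESTIMATED_WEB_PAGES)
print(f"Northern Light: {northern_light_share:.0%}")  # prints "Northern Light: 16%"
```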

This situation won't last forever, Lawrence predicts. The “growth rate [of the Web] will presumably slow down, so the search engines should catch up,” he says. Meanwhile, the 11 engines combined cover about 42% of the Web, making software tools like MetaCrawler that harness together several search engines “the way to go” for exhaustive searches, says Lawrence.
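The idea behind harnessing several engines together can be sketched as a merge of their result lists with duplicates dropped. This is only an illustration under stated assumptions: the function name and the sample URLs are made up, and it does not describe how MetaCrawler itself works.

```python
from typing import Iterable


def merge_results(result_lists: Iterable[list[str]]) -> list[str]:
    """Combine URL lists from several engines, dropping duplicates.

    URLs are kept in order of first appearance.
    """
    seen: set[str] = set()
    merged: list[str] = []
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Made-up result lists standing in for three engines' answers to one query.
engine_a = ["http://example.com/a", "http://example.com/b"]
engine_b = ["http://example.com/b", "http://example.com/c"]
engine_c = ["http://example.com/d"]

combined = merge_results([engine_a, engine_b, engine_c])
print(len(combined))  # prints 4
```

Because each engine indexes a different slice of the Web, the merged list can be longer than any single engine's list, which is the effect the combined 42% figure reflects.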

