NET NEWS: Search Engines Fall Short


Science  16 Jul 1999:
Vol. 285, Issue 5426, pp. 295b
DOI: 10.1126/science.285.5426.295b

To find that one snippet of information hidden in the thicket of data on the Web, most users turn to search engines. But a survey of the 11 most widely used search engines, reported in last week's issue of Nature, suggests that these trailblazers are losing the scent. No single search engine, the study found, covered more than 16% of the Web's contents (see table).

Computer scientists Steve Lawrence and C. Lee Giles of the NEC Research Institute in Princeton, New Jersey, first carried out a census of the Web's contents. By checking out what was behind 3.6 million randomly chosen IP numbers (the unique number for each Web address), the researchers calculated there to be about 3 million servers hosting 800 million pages. Next the duo ran 1050 queries on the 11 search engines and compared them to the results for Northern Light, which produced the most hits and, at 128 million pages in its index, covered about 16% of the Web. The other engines ranged from 15.5% to a measly 2.2% coverage.
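The 16% figure follows directly from the two numbers quoted above: the size of Northern Light's index divided by the estimated size of the Web. A minimal sketch of that arithmetic, in which the function name `coverage` is purely illustrative and not taken from the study:

```python
def coverage(indexed_pages: float, total_pages: float) -> float:
    """Return the fraction of the estimated Web that an index covers."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return indexed_pages / total_pages


# Figures reported in the article: a 128-million-page index
# against an estimated 800 million pages on the Web.
northern_light = coverage(128e6, 800e6)
print(f"{northern_light:.0%}")  # prints 16%
```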

This situation won't last forever, Lawrence predicts. "The growth rate [of the Web] will presumably slow down, so the search engines should catch up," he says. Meanwhile, the 11 engines combined cover about 42% of the Web, making software tools like MetaCrawler that harness together several search engines the way to go for exhaustive searches, says Lawrence.
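Combining engines helps because their indexes only partly overlap, so the union of their results reaches pages that any single engine misses. The sketch below illustrates that general idea only; it is not a description of how MetaCrawler works, and the function name, engine names, and URLs are hypothetical placeholders.

```python
from typing import Dict, Iterable, List


def merge_results(results_by_engine: Dict[str, Iterable[str]]) -> List[str]:
    """Union the URLs returned by several engines, keeping first-seen order."""
    seen = set()
    merged = []
    for urls in results_by_engine.values():
        for url in urls:
            if url not in seen:  # drop duplicates found by more than one engine
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical result lists for one query; not real search output.
sample = {
    "engine_a": ["http://example.com/1", "http://example.com/2"],
    "engine_b": ["http://example.com/2", "http://example.com/3"],
}
merged = merge_results(sample)
print(len(merged))  # 3 distinct pages, versus 2 from either engine alone
```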

