Newswise — Every weekday at 5:00 a.m., a nondescript gray van rolls down the underground service road beneath the French National Library, in Paris, and arrives at a svelte glass skyscraper soaring above the bustling Seine River. Here, at the Tower of the Times, the van delivers a tiny but astoundingly rich snapshot of life in a country that takes its cultural heritage very seriously. The van has been stuffed with two copies each of some of the 3000 periodicals printed recently in France that are being sent to the library for preservation.

The Web, however, is unlike a hard-copy publishing platform, not simply because it is amorphous and immeasurably large but because its "documents" are boundless. An "online publication" exists in a perpetual state of being updated, and it cannot be considered complete without everything it is hyperlinked to. Capturing even a trace of the Web requires engineers to build a new, more sophisticated generation of software robots to trawl its vast and varied content.