Library sets out to archive every piece of British content on the Web
April 6, 2013, 12:06 am TWN
LONDON -- Capturing the unruly, ever-changing Internet is like trying to pin down a raging river.
But the British Library is going to try.
For centuries the library has kept a copy of every book, pamphlet, magazine and newspaper published in Britain. Starting Saturday, it will also be bound to record every British website, e-book, online newsletter and blog in a bid to preserve the nation's “digital memory.”
As if that's not a big enough task, the library also has to make this digital archive available to future researchers — come time, tide or technological change.
The library says the work is urgent. Ever since people began switching from paper and ink to computers and mobile phones, material that would fascinate future historians has been disappearing into a digital black hole. The library says firsthand accounts of everything from the 2005 London transit bombings to Britain's 2010 election campaign have already vanished.
“Stuff out there on the Web is ephemeral,” said Lucie Burgess, the library's head of content strategy. “The average life of a web page is only 75 days, because websites change, the contents get taken down.
“If we don't capture this material, a critical piece of the jigsaw puzzle of our understanding of the 21st century will be lost.”
The library is publicizing its new project by showcasing just a sliver of its content — 100 websites, selected to give a snapshot of British online life in 2013 and help people grasp the scope of what the new digital archive will hold.
They range from parenting resource Mumsnet to online bazaar Amazon Marketplace to a blog kept by a 9-year-old girl about her school lunches.
Like reference collections around the world, the British Library has been attempting to archive the Web for years in a piecemeal way and has collected about 10,000 sites. Until now, though, it has had to get permission from website owners before taking a snapshot of their pages.
That requirement began to change with a law passed in 2003, but it has taken a decade of legislative and technological preparation for the library to be ready to begin a vast trawl of all sites ending with the suffix .uk.