Acquiring and providing access to historical web collections

Abstract

Every day, unique and valuable information that describes the present disappears from the web. National archives and libraries have safeguarded cultural heritage for centuries by collecting and preserving objects and printed media from past generations. It is now essential to preserve digital cultural heritage in the form of web content as well. The Portuguese Web Archive project began in 2008. Since then, it has periodically collected live-web content for preservation and has also acquired previously published historical web collections from third parties. However, storing information before it vanishes from the web is not enough to make web archives useful to society. The Portuguese Web Archive therefore developed and made freely available several software tools that enable access to web-archived collections. It provides a full-text search service over 1 131 million files archived from the web since 1996 (www.archive.pt), as well as access methods that support research and development activities over web-archived data.

Details

Creators
Daniel Gomes; David Cruz; João Miranda; Miguel Costa; Simão Fontes
Institutions
Date
Keywords
web archiving; digital preservation; portuguese web archive; lisbon
Publication Type
paper
License
CC BY-SA 2.0 AT