A Risk Analysis of File Formats for Preservation Planning

Abstract

This paper presents an approach for the automatic estimation of preservation risks for file formats. The main contribution of this work is the definition of risk factors with associated severity levels and their automatic computation. Our goal is to use a solid knowledge base, automatically aggregated from linked open data repositories, as the basis for risk analysis in the digital preservation domain. The method is meant to facilitate decision making about the preservation of digital content in libraries and archives. We have developed a tool for aggregating rich and trusted file format descriptions. It exploits available linked data resources and uses expert models to infer knowledge regarding the long-term preservation of digital content. Ontology mapping is employed to collect information from the web of linked data and integrate it into a common representation. Furthermore, we employ AI techniques (i.e. expert rules, clustering) to infer explicit knowledge about the nature and preservation-friendliness of the file formats. The evaluation part of the paper presents a statistical analysis of the aggregated information and a qualitative analysis of the aggregated knowledge. A Web service provides programmatic access to format and risk analysis reports.

Details

Creators
Roman Graf; Sergiu Gordea
Institutions
Date
Keywords
digital preservation; risk analysis; linked open data; preservation planning; ontology matching; information integration; lisbon
Publication Type
paper
License
CC BY-SA 2.0 AT
