On Enhancing the FFMA Knowledge Base

Abstract

Ensuring long-term access to digitized content is a major concern of digital libraries. Document migration and summarization are key activities employed to reach this goal. Evaluating preservation friendliness and making recommendations for long-term preservation require deep domain knowledge, which is currently not available in any integrated knowledge base. In this paper we present an approach for enhancing automatically aggregated knowledge on computer file formats. A clustering algorithm is employed to identify related file formats and to predict missing semantic associations between file formats and software tools. These predictions are used to improve the discovery of software tools supporting less popular file formats.
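
The idea of clustering related formats and then predicting missing format-to-tool associations can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not name the clustering method or the features used, so the Jaccard similarity over known tool sets, the single-linkage grouping, the 0.3 threshold, and the tool names below are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data: file format -> tools already known to support it.
# Tool names are placeholders, not real software.
KNOWN_TOOLS = {
    "tiff": {"viewer_a", "editor_b", "converter_c"},
    "png": {"viewer_a", "editor_b", "converter_c"},
    "pcx": {"viewer_a"},
    "odt": {"office_d"},
    "doc": {"office_d", "office_e"},
}


def jaccard(a, b):
    """Share of tools two formats have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_formats(tools_by_format, threshold=0.3):
    """Group formats whose tool sets overlap by at least `threshold`.

    Assumed technique: single-linkage grouping via union-find.
    """
    parent = {fmt: fmt for fmt in tools_by_format}

    def find(fmt):
        while parent[fmt] != fmt:
            parent[fmt] = parent[parent[fmt]]  # path halving
            fmt = parent[fmt]
        return fmt

    for a, b in combinations(tools_by_format, 2):
        if jaccard(tools_by_format[a], tools_by_format[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for fmt in tools_by_format:
        clusters.setdefault(find(fmt), set()).add(fmt)
    return list(clusters.values())


def suggest_tools(tools_by_format, threshold=0.3):
    """For each format, propose tools that support its cluster peers
    but are not yet associated with the format itself."""
    suggestions = {}
    for cluster in cluster_formats(tools_by_format, threshold):
        pooled = set().union(*(tools_by_format[fmt] for fmt in cluster))
        for fmt in cluster:
            missing = pooled - tools_by_format[fmt]
            if missing:
                suggestions[fmt] = missing
    return suggestions


if __name__ == "__main__":
    for fmt, tools in sorted(suggest_tools(KNOWN_TOOLS).items()):
        print(fmt, "->", sorted(tools))
```

In this toy data the less popular format `pcx` shares one tool with the image formats and so inherits candidate tools from them. Such suggestions are only candidates and would still need to be verified before being added to a knowledge base.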

Details

Creators
Gordea, Sergiu; Graf, Roman
Institutions
Date
Keywords
digital preservation; file format categorization; related file formats; lisbon
Publication Type
paper
License
CC BY-SA 2.0 AT
Direct Download
774803 bytes