MIXED: Repository of Durable File Format Conversions

Abstract

DANS (Data Archiving and Networked Services), the Dutch scientific data archive for the social sciences and humanities, is engaged in the MIXED project to develop open source software that implements the “smart migration” strategy for the long-term archiving of file formats. Smart migration means converting specific kinds of data formats, such as spreadsheets and databases, to an intermediate XML file upon ingest. The assumption is that long-term curation of the XML files is much less problematic than migration of the binary source files, and that the intermediate XML file can be converted efficiently to the file formats that are common in the future. The features of the intermediate XML files are captured in the so-called SDFP schema (Standard Data Formats for Preservation). This XML schema can be regarded as an umbrella, as it incorporates existing formal descriptions of file formats developed by others. SDFP also contains schemas developed by DANS, e.g. a schema for file-oriented databases. That schema can be used, for example, for the binary "DataPerfect" format, which was used on a large scale about twenty years ago and for which no existing XML schema could be found. The software developed in the MIXED project has been set up as a generic framework together with a number of plug-ins. It can be considered a repository of durable file format conversions. The MIXED project is in its final phase, and this paper gives an overview of the results.

Details

Creators
René van Horik; Dirk Roorda
Institutions
Date
Keywords
san francisco
Publication Type
paper
License
CC BY-SA 3.0 AT
Download
934883 bytes
