filedriller: Marrying Siegfried and the National Software Reference Library

Abstract

Working with huge collections of unstructured data is a common yet still challenging task in digital preservation. This paper presents a tool that identifies irrelevant files in large data sets so that the relevant ones are easier to spot. The tool builds on two well-known and frequently used resources, the application Siegfried and the data set of the National Software Reference Library, and combines and extends them in a meaningful way.

Details

Creators
Steffen Fritz
Institutions
German Literature Archive
Date
Keywords
format identification; tool; automation; siegfried; nsrl
Publication Type
paper
License
CC BY 4.0 International