Preserving File Names Across Their Perilous Journey Through File Systems

Abstract

File names are more than mere identifiers; they are artifacts of the digital environments in which they were created. They reflect the technical limitations of specific file systems and, often, the cultural context of their creators. As long as a file remains within its original environment, its name rarely raises concerns. However, when files are migrated to new systems – whether through organizational processes or preservation efforts – problems emerge. One common approach is to normalize file names by restricting them to a limited character set. While technically expedient, this method disregards the intent of the original creator. We advocate for retaining original file names whenever possible, viewing them as critical traces of context. However, this approach is not without challenges. We will discuss specific scenarios that have prompted us to re-examine our commitment to preserving original file names. These include file names containing control characters, Unicode Private Use Area characters, and characters that are difficult to express in XML. Through case studies, we will explore the tension between technical pragmatism and the preservation of cultural context in digital objects. This lightning talk aims to spark a discussion about the significance of file names in digital preservation and to encourage a more nuanced approach that balances technical considerations with respect for the cultural and historical value embedded in these seemingly mundane digital artifacts.
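As an illustration of the three problem classes named above, the following minimal Python sketch flags them in a file name without altering it. It is not the authors' tooling: the function names `is_xml10_char` and `flag_file_name` and the shape of the report are assumptions made for this example. It treats "control characters" as Unicode general category Cc, private-use code points as category Co, and "difficult to express in XML" as falling outside the XML 1.0 `Char` production. That last reading is only one possible interpretation.

```python
import unicodedata


def is_xml10_char(ch: str) -> bool:
    """Return True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def flag_file_name(name: str) -> dict:
    """Report problematic code points in name; the name itself is left untouched.

    Hypothetical helper for illustration only.
    """
    report = {"control": [], "private_use": [], "not_xml10": []}
    for ch in name:
        label = f"U+{ord(ch):04X}"
        category = unicodedata.category(ch)
        if category == "Cc":      # C0/C1 control characters
            report["control"].append(label)
        elif category == "Co":    # private-use code points
            report["private_use"].append(label)
        if not is_xml10_char(ch):  # cannot appear literally in XML 1.0
            report["not_xml10"].append(label)
    return report


if __name__ == "__main__":
    for example in ("plain.txt", "report\x07.txt", "logo\uf8ff.png"):
        print(repr(example), flag_file_name(example))
```

Reporting rather than rewriting keeps the original name intact, so the decision about what to do with a flagged name can be made case by case.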

Details

Creators
Nick Krabbenhoeft; Hilary Shiue
Institutions
Date
2024-09-19 13:25:00 +0100
Keywords
approaches to preservation; from document to data
Publication Type
lightning talk
License
Creative Commons Zero (CC0-1.0)