Email Preservation at Scale: Preliminary Findings Supporting the Use of Predictive Coding

Abstract

Email provides a rich history of an organization yet poses unique challenges to archivists. It is difficult to acquire and process because of its sensitive content and its diverse topics and formats, which inhibits access and research. Predictive coding alleviates these challenges by using supervised machine learning to augment appraisal decisions, identify and prioritize sensitive content for review and redaction, and generate descriptive metadata of themes and trends. Building on the authors’ previous work, which described the project at its inception, this paper reports preliminary findings that support the use of predictive coding as an effective tool to enable digital preservation at scale. Specific tools, methodologies, and human factors that affect their success are discussed.

Details

Creators: Joanne Kaczmarek; Brent West
Institutions
Date
Keywords: boston
Publication Type: paper
License: CC BY 4.0 International
Download: 402222 bytes
