Training AI Models From Within A Digital Preservation System: Ensuring Machine Learning approaches can learn from users

Abstract

Artificial Intelligence and Machine Learning tools depend heavily on the datasets used to train them. Commercial off-the-shelf tools and services have built general-purpose datasets well, but those datasets are not necessarily tuned to domain-specific needs. These services often allow custom datasets to be created, but the process tends to be complex or to impose demanding requirements on the organization attempting it. This paper describes work undertaken to allow datasets to be refined and custom models to be re-trained through simplified user interfaces that can be embedded directly within an existing digital preservation system.

Details

Creators
Jack O'Sullivan
Institutions
Date
2024-09-17 11:50:00 +0100
Keywords
information technology for dp; from document to data
Publication Type
paper
License
Creative Commons Attribution 4.0 (CC-BY-4.0)
