Preserving digital news by partnering with newspapers and their platforms

Abstract

After researching the digital news preservation landscape in the USA, Portico identified a potential gap. While libraries and archives have strategies for preserving print newspapers, hyper-local digital newspapers are less likely to be preserved. This gap comes at a time of rapid loss of hyper-local print news and increased dependence on digital-only news. One mechanism for preserving digital news is web archiving, but because stories turn over rapidly on news websites, it can be difficult to visit a site frequently enough to capture every article. Some newspapers, but not all, provide RSS feeds, and it can be difficult to detect corrected articles in these feeds. Other newspapers implement a subscription model and cannot be harvested without a special arrangement. Some newspapers are aggregated into larger databases, but these often don’t include the smallest digital-only platforms, and they are for-profit subscription services that may not be preserved by a third party. Portico is a community-supported dark archive for scholarly material that forms agreements and works with publishers to preserve their content. Based on this research, Portico initiated a pilot to determine whether digital news articles could be managed in a similar way to journal articles. Portico partnered with a single newspaper and worked with its content management system provider to retrieve an XML export of every article. The XML and supporting files (photos, etc.) were successfully ingested into the archive and were similar to journal articles. To confirm whether this was repeatable, Portico worked with another newspaper on the same platform and reused the workflow with few changes. Portico is repeating this experiment with two more newspapers on different platforms. If the content can be archived from each, Portico will seek to expand the work and develop a business model to support a broader effort in digital news preservation.
An early step will be to reach out to the approximately 3,000 newspapers on the platforms that have already been configured. For the poster, the author will share details of the process used for this project and seek feedback from the community about the value of this approach for preserving digital news.

Details

Creators
Karen Hanson
Institutions
Date
2024-09-17 13:35:00 +0100
Keywords
approaches to preservation; scaling up
Publication Type
poster
License
Creative Commons Attribution 4.0 (CC-BY-4.0)