MULTILINGUAL LABELS FOR DIGITAL PRESERVATION

Abstract

We introduce a technique for finding multilingual translations for lists of words using technologies of the Semantic Web. We present four subsets of data from Wikidata and Wikipedia as sources of multilingual labels. Our sample dataset consists of seven terms related to digital preservation. We compare the number of labels we can source for these terms from other human languages via SPARQL queries using the Wikidata Query Service. After discussing the composition of each subset, we detail their advantages and disadvantages. Providing multilingual labels as additional access points for resources such as ontologies, vocabularies and user interfaces for applications increases the relevance of these resources to a larger percentage of the global population. Increasing multilingual access promotes inclusion for a broader range of people, which leads to greater diversity in the digital preservation community.
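
As a rough illustration of sourcing labels through the Wikidata Query Service, the Python sketch below builds a SPARQL query for a list of item identifiers and groups the returned labels by language. It is not the authors' code: the function names are hypothetical, the item Q42 is only a placeholder rather than one of the seven terms in the paper, and the sketch assumes the standard SPARQL JSON result format, in which language-tagged literals carry an "xml:lang" key.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_label_query(qids):
    """Build a SPARQL query returning every rdfs:label of the given items.

    The wd: and rdfs: prefixes are predefined by the Wikidata Query Service.
    """
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?item ?label WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  ?item rdfs:label ?label .\n"
        "}"
    )


def labels_by_language(results):
    """Group SPARQL JSON bindings as {item id: {language code: label}}."""
    grouped = {}
    for row in results["results"]["bindings"]:
        # Item URIs end in the item identifier, e.g. .../entity/Q42.
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        lang = row["label"]["xml:lang"]
        grouped.setdefault(qid, {})[lang] = row["label"]["value"]
    return grouped


def fetch_labels(qids):
    """Run the query against the live endpoint (requires network access)."""
    params = urllib.parse.urlencode(
        {"query": build_label_query(qids), "format": "json"}
    )
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        # Hypothetical User-Agent string; replace with your own contact details.
        headers={"User-Agent": "multilingual-label-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        return labels_by_language(json.load(response))
```

Calling fetch_labels(["Q42"]) would return a dictionary keyed by item identifier; the number of language codes under each item gives the count of labels available for that term.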

Details

Creators
Thornton, Katherine; Seals-Nutt, Kenneth
Institutions
Date
Keywords
wikidata; semantic web; multilingual data; knowledge graph subsets
Publication Type
paper
License
CC-BY 4.0 International