AI Semantic Credibility and Wikibase Knowledge Graph Self-Verification: Leveraging Large Language Models and Wikidata to Mitigate Agent Hallucination within Conversational Applications for the Digital Preservation Domain

Abstract

We introduce a chat bot tailored to the domain of digital preservation that combines artificial intelligence (AI) with structured data from the Wikidata knowledge base. Drawing on Wikidata to provide verified responses to questions about computing makes our chat bot more transparent. Because large language models (LLMs) frequently hallucinate, that is, provide plausible but inaccurate responses, we generated a list of questions people may have about file formats and software. We posed a series of prompts related to file format identification and software compatibility to our Wikidata AI Digital Preservation Chat Bot (ChatWikiDP). We then compared its responses against statements in the Wikidata knowledge base. We present a chat bot that combines generative language synthesis from LLMs and attributable linked data from Wikidata with an interface that supports interactive verification of claims. The combination of responses from LLMs and data from knowledge graphs ensures that when we move from documents to data, we can retrace our steps from factually accurate information to source documents.
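The checking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the placeholder item identifier are hypothetical. It assumes the question concerns a file format's extension (Wikidata property P1195, "file extension") and that the query would be sent to the Wikidata Query Service, which predefines the `wd:` and `wdt:` prefixes.

```python
# Illustrative sketch only; names and sample values are hypothetical.
FILE_EXTENSION = "P1195"  # Wikidata property "file extension"


def build_query(format_qid: str, prop: str = FILE_EXTENSION) -> str:
    """Return a SPARQL query for the values of one property on one item."""
    return f"SELECT ?value WHERE {{ wd:{format_qid} wdt:{prop} ?value . }}"


def normalise(value: str) -> str:
    """Lower-case a value and drop surrounding spaces and a leading dot."""
    return value.strip().lstrip(".").lower()


def verify_claim(claimed: str, wikidata_values: list[str]) -> bool:
    """True if the claimed value matches any value retrieved from Wikidata."""
    return normalise(claimed) in {normalise(v) for v in wikidata_values}


if __name__ == "__main__":
    # "Q123" is a placeholder, not the identifier of a real file format.
    print(build_query("Q123"))
    # Values as they might come back from the query service (assumed).
    print(verify_claim(".PDF", ["pdf"]))
```

A claim that fails this check would not necessarily be false, since Wikidata may simply lack the statement; a mismatch is better read as a prompt for the user to inspect the sources.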

Details

Creators
Katherine Thornton; Kenneth Seals-Nutt
Institutions
Date
2024-09-18 11:30:00 +0100
Keywords
information technology for dp; from document to data
Publication Type
paper
License
Creative Commons Attribution 4.0 (CC-BY-4.0)