Semi-automated Generation of Linked Data from Unstructured Bibliographic Data for Japanese Historical Rare Books


A large amount of bibliographic data and many images of Japanese historical rare books have been published on the Web. To construct structured bibliographic data, for example in the form of Linked Data, data providers need to extract structured information from the notes in each bibliographic record, which are written in natural-language Japanese text with domain-specific terms. This work has usually been performed by people with specialized knowledge. In this paper, the authors propose a semi-automated method to convert natural-language text in real bibliographic data into Linked Data. As part of the method, a simple ontology of the key elements (named entities) in bibliographic data is constructed in accordance with general bibliographic rules for historical heritage. The ontology can also describe relations between a book and its parts, which allows machine access to information such as the creator of a cover picture, a preface, and so on. Finally, a script links the named entities to URIs for headings or glossary entries provided online by public organizations.
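As a rough illustration of the book-and-part relation and the entity-to-URI linking described above, the following minimal sketch emits N-Triples lines for one part of a book. It is not the authors' script or ontology: the namespaces `EX` and `ONT`, the property names `hasPart`, `partType` and `creator`, and the function `book_with_part` are all hypothetical names chosen for this example.

```python
# Hypothetical namespaces: the abstract does not give the real ontology terms.
EX = "http://example.org/rarebook/"
ONT = "http://example.org/ontology#"


def uri(ns, local):
    """Wrap a namespace and local name as an N-Triples IRI."""
    return f"<{ns}{local}>"


def literal(text):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def book_with_part(book_id, part_kind, creator_name, creator_uri=None):
    """Describe one part of a book (e.g. a preface) and its creator.

    If an authority URI is known for the creator, link to it;
    otherwise fall back to the name as a plain literal.
    """
    book = uri(EX, book_id)
    part = uri(EX, f"{book_id}/{part_kind}")
    creator = f"<{creator_uri}>" if creator_uri else literal(creator_name)
    triples = [
        (book, uri(ONT, "hasPart"), part),
        (part, uri(ONT, "partType"), literal(part_kind)),
        (part, uri(ONT, "creator"), creator),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    for line in book_with_part("book001", "preface", "Example Author"):
        print(line)
```

Modeling the part as its own resource, rather than attaching the creator directly to the book, is what lets a query ask specifically for the creator of a preface or a cover picture.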


Natsuko Yoshiga; Shin-Ichi Tadaki
CC BY-SA 4.0 International
