Enhancement of research methodologies, either by simplifying methods of discovery or by expanding existing avenues of scholarly work, is one of the activities the Primary Source Awards aim to recognize. This year’s award in the Research category goes to Thea Lindquist, History Librarian at the University of Colorado at Boulder, who embarked on a WWI Linked Data project based on an assessment of scholarly needs related to digitized primary sources. Lindquist’s project helps to improve the research experience by applying linked data methods to enhance use of CU’s digitized World War I Collection (WWI) of more than 1,200 pamphlets, books, maps, and speeches spanning the years between 1914 and 1920. This online resource represents a wide range of genres, authors, geopolitical units, and subject matter, and offers full-text searching capability along with visualization tools that facilitate alternate avenues of exploration in the collection.
In collaboration with scientists from the Semantic Computing Research Group (SeCo) at Aalto University (Helsinki), Lindquist selected the WWI collection to demonstrate the types of complex questions that can be answered by employing automated data linking in a specialized subject domain. She chose the topic of the civilian experience in occupied Belgium because it was well represented and reflected current scholarly interest in the impact of “total war” on civilian populations.
The project attempts to establish links among various types of data, including data points relating to the specific collection, additional incorporated datasets, and external data sources such as DBpedia, a linked data version of Wikipedia. A further goal is to create an event-based framework that will encourage linking by a variety of projects. As the project's sponsors describe it, "This framework is meant to be shared, thus providing the 'semantic glue' that binds separate datasets together and allows searching and browsing among various World War I collections." To date, the datasets converted to RDF (Resource Description Framework, a Linked Data format) to facilitate sharing in the WWI Linked Data project include the MARC records for the CU digital collection; specialized vocabulary terms and an authoritative timeline of the war from the Imperial War Museum in London; contextual information on German atrocities in Belgium; and data on the German army hierarchy.
The project recognizes a fundamental barrier between researchers and existing digital collections. Too often relying on Google to identify digitized resources, faculty and students alike are challenged by the decontextualized presentation of content, or by database retrieval tools that are either too complex or not granular enough. The project team notes, "Even with relevant sources and adequate context, users may struggle with challenges inherent to primary-source research: foreign languages, document bias, historical usage, grammar, etc. Although all of these issues make it difficult and time-consuming to find and use online primary sources, participants agreed that these sources present a unique educational and research opportunity."