The Center of Digital Humanities Research (CoDHR, pronounced “Coder”) is participating in the Linked Infrastructure for Networked Cultural Scholarship (LINCS), a $5,000,000 grant project awarded to Canadian higher-education institutions. The project addresses the following problem: humanities scholars have unprecedented quantities of data for addressing complex social processes, but are hampered by the lack of meaningful connections between online materials and by the incompatibility of those materials. Most continue to interact with cultural data only through reading rather than by leveraging algorithmic processes to answer major questions about human culture. Humanities researchers need a smarter, “semantic” web whose links will elucidate the diverse causes, effects, and significance of human action and expression.
The “semantic web,” theorized by Tim Berners-Lee in 2001 (Scientific American), builds upon but transforms the current web. Right now, when we search Google for a topic, the results come back as a list of links to websites. But if the information on those websites is encoded properly, a search can instead return a knowledge graph that gathers vital information about the topic from many web pages. Linked Open Data, data that has been properly encoded and so can be used for building the semantic web, allows just that, transforming the Web into a database. Moreover, it allows inferences to be made. For example, one web page might tell you that poet Robert Southey befriended scientist Humphry Davy while living in Bristol during the period 1780-1800. Another web page might tell you that Humphry Davy started “The Pneumatic Institute” in Bristol in 1780, and that it ran until 1810. An inference engine that combines both pieces of information in an inference chain could return “Robert Southey” as part of the answer to the question, “Were any creative writers involved with the Pneumatic Institute during the eighteenth century?” or even contribute to much broader questions such as, “What percentage of literary authors engage with scientists over time?”
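The Southey and Davy example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how LINCS or any real inference engine works: the Fact structure, the predicate names, the WRITERS set, and the single hand-written rule are all hypothetical, and the dates are simply the ones used in the example.

```python
from typing import NamedTuple

class Fact(NamedTuple):
    """One statement harvested from a web page (hypothetical structure)."""
    subject: str
    predicate: str
    obj: str
    start: int  # first year the statement holds
    end: int    # last year the statement holds

# Facts as they might appear on two separate web pages (dates from the example).
PAGE_ONE = [
    Fact("Robert Southey", "befriended", "Humphry Davy", 1780, 1800),
    Fact("Robert Southey", "lived in", "Bristol", 1780, 1800),
]
PAGE_TWO = [
    Fact("Humphry Davy", "founded", "The Pneumatic Institute", 1780, 1810),
]

# Assumed to come from a third source that classifies people as creative writers.
WRITERS = {"Robert Southey"}

def overlaps(a: Fact, b: Fact) -> bool:
    """True when the two facts hold during at least one common year."""
    return a.start <= b.end and b.start <= a.end

def writers_linked_to(institution, facts, writers, first_year, last_year):
    """Chain two facts: a writer befriended X, and X founded the institution,
    with both facts overlapping each other and the requested period."""
    linked = set()
    founders = [f for f in facts if f.predicate == "founded" and f.obj == institution]
    for founding in founders:
        for f in facts:
            if (
                f.predicate == "befriended"
                and f.obj == founding.subject
                and f.subject in writers
                and overlaps(f, founding)
                and f.start <= last_year
                and f.end >= first_year
            ):
                linked.add(f.subject)
    return sorted(linked)

# Neither page answers the question alone; the combined facts do.
result = writers_linked_to(
    "The Pneumatic Institute", PAGE_ONE + PAGE_TWO, WRITERS, 1701, 1800
)
print(result)
```

Run against either page on its own, the same function returns an empty list, which is the point of linking the data: the answer exists only once the two sources are connected.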
LINCS will convert large datasets into an organized, interconnected, machine-processable set of resources for Canadian cultural research, and, as a LINCS team member, CoDHR @ Texas A&M will extend that work to US researchers. LINCS will provide context for the cultural material that currently floats around online, interlink it, ground it in its sources, and help to make the World Wide Web a trusted resource for scholarly knowledge production.
CoDHR @ TAMU is spending the summer transforming a metadata store of 1.7 million records into Linked Open Data. While doing so is partly an automated process, the project directors and an intern will work together to provide human oversight and disambiguation strategies: the cleaner the data, the better the semantic web will work.
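To make the division of labor between automation and human oversight concrete, here is a minimal sketch of one way such a conversion could be organized. It is not CoDHR's actual pipeline: the record fields, the example.org identifiers, and the AUTHORITY lookup table are all hypothetical. The only real-world names used are the Dublin Core Terms properties for title and creator. Records whose creator matches exactly one known identifier are converted automatically, and everything else is set aside for a person to disambiguate.

```python
# Dublin Core Terms properties (real vocabulary).
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_CREATOR = "http://purl.org/dc/terms/creator"

# Hypothetical authority file: name string -> candidate identifiers.
AUTHORITY = {
    "Southey, Robert": ["http://example.org/person/1"],
    "Smith, John": ["http://example.org/person/3", "http://example.org/person/4"],
}

def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def convert(record):
    """Return (triples, needs_review) for one metadata record.

    triples is a list of N-Triples lines; needs_review lists creators
    that could not be matched to exactly one identifier.
    """
    subject = "http://example.org/record/" + record["id"]  # hypothetical URI scheme
    title = escape_literal(record["title"])
    triples = [f'<{subject}> <{DCT_TITLE}> "{title}" .']
    needs_review = []

    candidates = AUTHORITY.get(record["creator"], [])
    if len(candidates) == 1:
        # Unambiguous match: safe to link automatically.
        triples.append(f"<{subject}> <{DCT_CREATOR}> <{candidates[0]}> .")
    else:
        # Zero or several candidates: leave the link for human disambiguation.
        needs_review.append((record["id"], record["creator"], candidates))
    return triples, needs_review

clear_triples, clear_review = convert(
    {"id": "r1", "title": "Thalaba the Destroyer", "creator": "Southey, Robert"}
)
vague_triples, vague_review = convert(
    {"id": "r2", "title": "Poems", "creator": "Smith, John"}
)

for line in clear_triples + vague_triples:
    print(line)
print("Needs human review:", vague_review)
```

Keeping the ambiguous cases in a separate review list rather than guessing is what keeps the resulting data clean: a wrong link is harder to detect later than a missing one.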