Presentation at CASBS 2010: Muninn Project

Posted on: 1 June 2010
By: warren

Tracking, Transcribing, and Tagging Government: Building Digital Records for Computational Social Science

Tuesday June 22, 2010, 14:15-15:15

Center for Advanced Study in the Behavioral Sciences

Abstract:

The Muninn Project is a multidisciplinary, multinational academic research project investigating millions of records pertaining to the First World War in archives around the world.

In this talk I will review some of the methods being used in the Muninn Project to extract information from the scanned documents of historical archives. Previous data extraction efforts for historical research relied on human review of documents, one at a time. We instead use computing power to collate similar document types and then extract the information from them.

The Great War era produced a mix of handwritten and typewritten documents that require automated extraction methods, assisted by manual review of specific cases by human volunteers. I will contrast this with previous methods that have been used to digitize documents, such as reCAPTCHA, and close with some observations about managing archival data in a high-volume setting.

Tags:

lod

WW1
