Summary: Semantics-Based Content Extraction in Typewritten Historical Documents
A. Antonacopoulos and D. Karatzas
PRImA Lab, School of Computing, Science and Engineering, University of Salford,
Greater Manchester, M5 4WT, United Kingdom
http://www.primaresearch.org

This work has been supported through the EU grant IST-2001-33441.
Abstract
This paper presents a flexible approach to extracting
content from scanned historical documents using
semantic information. The final electronic document is
the result of a "digital historical document lifecycle"
process, where the expert knowledge of the
historian/archivist user is incorporated at different
stages. Results show that such a conversion strategy,
aided by (expert) user-specified semantic information and
enabling the processing of individual parts of the
document in a specialised way, produces results that are
superior (in a variety of significant ways) to those of
document analysis and understanding techniques devised for

  

Source: Antonacopoulos, Apostolos - School of Computing, Science and Engineering, University of Salford

 

Collections: Computer Technologies and Information Sciences