Summary: Aletheia - An Advanced Document Layout and Text Ground-Truthing System
for Production Environments*
C. Clausner, S. Pletschacher and A. Antonacopoulos
PRImA Lab, School of Computing, Science and Engineering, University of Salford,
Greater Manchester, M5 4WT, United Kingdom
http://www.primaresearch.org
* This work has been supported in part through the EU 7th
Framework Programme grant IMPACT (Ref: 215064).
Abstract - Large-scale digitisation has led to a number of new
possibilities with regard to adaptive and learning-based methods
in the field of Document Image Analysis and OCR. For
ground truth production of large corpora, however, there is
still a gap in terms of productivity. Ground truth is not only
crucial for training and evaluation at the development stage of
tools but also for quality assurance in the scope of production
workflows for digital libraries.
This paper describes Aletheia, an advanced system for accurate
yet cost-effective ground truthing of large numbers of
documents. It aids the user with a number of automated and