Thick 2D relations for document understanding

Summary: Thick 2D relations for document understanding
Marco Aiello, Arnold M.W. Smeulders
Intelligent Sensory Information Systems, University of Amsterdam, Kruislaan 403,
1098 SJ Amsterdam, The Netherlands
Received 17 March 2002; received in revised form 15 April 2003; accepted 8 May 2003
We use a propositional language of qualitative rectangle relations to detect the
reading order from document images. To this end, we define the notion of a document
encoding rule and analyze possible formalisms for expressing such rules, such as
LaTeX and SGML. Document encoding rules expressed in the propositional
language of rectangles are used to build a reading order detector for document images.
To achieve robustness and avoid brittleness when applying the system to real-life
document images, the notion of a thick boundary interpretation for a qualitative
relation is introduced. The framework is tested on a collection of heterogeneous
document images, showing recall rates up to 89%.
© 2003 Elsevier Inc. All rights reserved.
Keywords: Document image analysis; Document understanding; Spatial reasoning;
Bidimensional Allen relations; Constraint satisfaction: applications
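
The idea of a thick boundary for bidimensional Allen relations can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the relation labels, the rectangle layout (x1, y1, x2, y2), and the single tolerance parameter `thick` are all assumptions made for illustration. The sketch assigns one of Allen's thirteen interval relations to each axis of two rectangles, treating endpoints that lie within `thick` of each other as coincident.

```python
def cmp_thick(p, q, thick):
    """Three-way comparison that treats points within `thick` as equal."""
    if abs(p - q) <= thick:
        return 0
    return -1 if p < q else 1


def allen(a, b, thick=0.0):
    """Allen relation between intervals a=(a1, a2) and b=(b1, b2).

    Assumes proper intervals that are long compared with `thick`.
    """
    a1, a2 = a
    b1, b2 = b
    s = cmp_thick(a1, b1, thick)  # start of a vs start of b
    e = cmp_thick(a2, b2, thick)  # end of a vs end of b
    if s == 0 and e == 0:
        return "equals"
    if s == 0:
        return "starts" if e < 0 else "started_by"
    if e == 0:
        return "finishes" if s > 0 else "finished_by"
    ab = cmp_thick(a2, b1, thick)  # end of a vs start of b
    ba = cmp_thick(a1, b2, thick)  # start of a vs end of b
    if ab < 0:
        return "precedes"
    if ab == 0:
        return "meets"
    if ba > 0:
        return "preceded_by"
    if ba == 0:
        return "met_by"
    if s < 0 and e < 0:
        return "overlaps"
    if s > 0 and e > 0:
        return "overlapped_by"
    if s > 0 and e < 0:
        return "during"
    return "contains"


def rect_relation(r, q, thick=0.0):
    """Pair of Allen relations (x axis, y axis) between two rectangles.

    Rectangles are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout).
    """
    return (
        allen((r[0], r[2]), (q[0], q[2]), thick),
        allen((r[1], r[3]), (q[1], q[3]), thick),
    )


# Hypothetical layout blocks: a title and a body a few pixels out of alignment.
title = (10, 10, 200, 30)
body = (11, 33, 199, 300)
thin = rect_relation(title, body)              # exact boundaries
thick = rect_relation(title, body, thick=3.0)  # tolerant boundaries
```

With exact boundaries the two blocks are related as ("contains", "precedes"), while a tolerance of 3 yields ("equals", "meets"), so a rule written for aligned, adjacent blocks still fires when segmentation is a few pixels off. A single global tolerance is the simplest choice here; a tolerance scaled to the page or block size would be another reasonable assumption.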


Source: Aiello, Marco - Institute for Mathematics and Computing Science, Rijksuniversiteit Groningen


Collections: Computer Technologies and Information Sciences