U.S. Department of Energy
Office of Scientific and Technical Information

Low-level structural recognition of documents

Technical Report · OSTI ID:68590
Author affiliation: Campus Scientifique, Vandoeuvre-les-Nancy (France)

This paper presents a qualitative approach to low-level structural analysis of documents. The system identifies the logical fields within a document and outputs a structured flow annotated with confidence scores. The strategy is driven by a generic model and by the OCR flow. Logical labels are attached to research areas after hypothesizing and testing typographical, lexical and contextual properties. Because the recognition is qualitative, ambiguities and unrecognized fields are made explicit. The method is illustrated on library references.
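The hypothesize-and-test labeling described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the report's system: the label set, the individual tests, their weights, and the `accept` and `margin` thresholds are all assumptions chosen for the example, and the input is assumed to be a library reference already split into text fields by an earlier OCR step.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# A test inspects the field text and the label of the previous field.
Test = Callable[[str, Optional[str]], bool]


@dataclass
class LabeledField:
    text: str
    label: str
    confidence: float
    alternatives: tuple = ()


# Hypothetical generic model: each label lists weighted property tests.
HYPOTHESES: dict[str, list[tuple[Test, float]]] = {
    "author": [
        (lambda t, prev: bool(re.fullmatch(r"[A-Z][a-z]+, [A-Z]\.", t)), 0.6),  # typographical
        (lambda t, prev: prev is None, 0.3),                                   # contextual
    ],
    "title": [
        (lambda t, prev: len(t.split()) >= 3, 0.4),                            # lexical
        (lambda t, prev: prev == "author", 0.4),                               # contextual
    ],
    "publisher": [
        (lambda t, prev: any(w in t for w in ("Press", "Verlag", "Editions")), 0.6),  # lexical
        (lambda t, prev: prev == "title", 0.3),                                # contextual
    ],
    "year": [
        (lambda t, prev: bool(re.fullmatch(r"(1[5-9]|20)\d{2}", t)), 0.7),     # lexical
        (lambda t, prev: prev in ("title", "publisher"), 0.2),                 # contextual
    ],
}


def label_field(text, prev_label, accept=0.5, margin=0.25):
    """Score every label hypothesis and keep ambiguity or failure visible."""
    scores = {
        label: sum(weight for test, weight in tests if test(text, prev_label))
        for label, tests in HYPOTHESES.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_score), (second, second_score) = ranked[0], ranked[1]
    if best_score < accept:
        return LabeledField(text, "unrecognized", best_score)
    if best_score - second_score < margin:
        return LabeledField(text, "ambiguous", best_score, (best, second))
    return LabeledField(text, best, best_score)


def label_reference(fields):
    """Label the fields of one reference in order, carrying context forward."""
    labeled, prev = [], None
    for text in fields:
        result = label_field(text, prev)
        labeled.append(result)
        # Only a confident label is used as context for the next field.
        if result.label not in ("ambiguous", "unrecognized"):
            prev = result.label
    return labeled


if __name__ == "__main__":
    reference = ["Knuth, D.", "The Art of Computer Programming", "Addison Press", "1968"]
    for field in label_reference(reference):
        print(f"{field.label:12s} {field.confidence:.2f}  {field.text}")
```

In this sketch a field whose best score falls below `accept` is reported as unrecognized, and a field whose two best hypotheses score within `margin` of each other is reported as ambiguous together with both candidate labels, rather than being forced into a single label.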

Research Organization: Nevada Univ., Las Vegas, NV (United States)
OSTI ID: 68590
Report Number(s): CONF-9404212--
Country of Publication: United States
Language: English

Similar Records

Lexicon-based word recognition without word segmentation
Technical Report · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:68574

About the logical partitioning of document images
Technical Report · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:68577

An automated system for numerically rating document image quality
Conference · Mon Mar 31 23:00:00 EST 1997 · OSTI ID:463673