About the logical partitioning of document images
- German Research Center for Artificial Intelligence, Kaiserslautern (Germany)
Logical labeling of documents is one of the key issues in a document analysis system, because recognizing meaningful document constituents, i.e. logical objects, is essential for many subsequent analysis steps, such as text recognition, address block analysis, or even text understanding. In this paper, I propose an extension to the system ANASTASIL that uses a three-step approach for partitioning raster images of business letters into logically labeled area items. Furthermore, practical results are presented. In particular, a training set of 20 letters was used to establish a model, a so-called GTree. This model was then used to logically label 82 unknown business letters from different companies. The results are evaluated with respect to recall and precision of classification, as well as elapsed time.
- Research Organization:
- Nevada Univ., Las Vegas, NV (United States)
- OSTI ID:
- 68577
- Report Number(s):
- CONF-9404212--
- Country of Publication:
- United States
- Language:
- English