Script and language determination from document images

Technical Report · Fri Dec 30 23:00:00 EST 1994

OSTI ID: 68579

Spitz, A L

Fuji Xerox Palo Alto Laboratory, Palo Alto, CA (United States)

We have developed techniques for distinguishing which language is represented in an image of text. This work is restricted to a small but important subset of the world's languages, using techniques that should be applicable across much more comprehensive samples. The method first classifies the script into two broad classes: European and Asian. This classification is based on the spatial relationships of fiducial points related to the upward concavities in character structures. Language identification within the Asian script class (Japanese, Chinese, Korean) is performed by analysis of the optical density distribution of the text images. Within the European script class, language identification is described in separate papers.
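The abstract does not define how optical density is measured, so the following is only an illustrative sketch, not the report's method. It assumes that optical density means the fraction of dark pixels in a character cell and that the distribution is a histogram of those fractions. The function names, the 2x2 cells, and the choice of four bins are all hypothetical, and no thresholds for separating Japanese, Chinese, and Korean are implied.

```python
from collections import Counter


def optical_density(cell):
    """Fraction of dark (1) pixels in a rectangular binary cell.

    Assumption: 1 means ink and 0 means background.
    """
    total = sum(len(row) for row in cell)
    dark = sum(sum(row) for row in cell)
    return dark / total if total else 0.0


def density_histogram(cells, bins=4):
    """Count cells per equal-width density bin over [0, 1]."""
    counts = Counter()
    for cell in cells:
        density = optical_density(cell)
        # A density of exactly 1.0 is placed in the last bin.
        index = min(int(density * bins), bins - 1)
        counts[index] += 1
    return [counts[i] for i in range(bins)]


# Hypothetical character cells, far smaller than real scanned glyphs.
cells = [
    [[1, 0], [0, 0]],  # one dark pixel out of four
    [[1, 1], [1, 0]],  # three dark pixels out of four
    [[1, 1], [1, 1]],  # fully dark cell
]
histogram = density_histogram(cells)
```

A classifier built on such a histogram would still need reference distributions for each language, which this record does not provide.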

Research Organization: Nevada Univ., Las Vegas, NV (United States)

OSTI ID: 68579

Report Number(s): CONF-9404212--

Country of Publication: United States

Language: English

Similar Records

1988 International Conference on Computer Processing of Chinese and Oriental Languages, Toronto, Canada, Aug. 29-Sept. 1, 1988, Proceedings

Conference · Thu Dec 31 23:00:00 EST 1987 · OSTI ID:6996062

Automatic script identification from images using cluster-based templates

Conference · Tue Jan 31 23:00:00 EST 1995 · OSTI ID:62630

Page segmentation using script identification vectors: A first look

Technical Report · Tue Jul 01 00:00:00 EDT 1997 · OSTI ID:495845

Related Subjects

99 GENERAL AND MISCELLANEOUS
ACCURACY
CLASSIFICATION
IMAGES
INFORMATION
MACHINE TRANSLATIONS
OPTICAL SCANNERS
SPATIAL DISTRIBUTION