Script identification from images using cluster-based templates
- Los Alamos, NM
- Santa Fe, NM
A computer-implemented method identifies a script used to create a document. A set of training documents for each script to be identified is scanned into the computer to store a series of exemplary images representing each script. Pixels forming the exemplary images are electronically processed to define a set of textual symbols corresponding to the exemplary images. Each textual symbol is assigned to a cluster of textual symbols that most closely represents the textual symbol. The cluster of textual symbols is processed to form a representative electronic template for each cluster. A document having a script to be identified is scanned into the computer to form one or more document images representing the script to be identified. Pixels forming the document images are electronically processed to define a set of document textual symbols corresponding to the document images. The set of document textual symbols is compared to the electronic templates to identify the script.
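The train-then-compare flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the textual symbols have already been segmented from the page and rescaled to equally sized arrays with values in [0, 1], it swaps in a simple greedy clustering with a mean-absolute-difference distance, and the names `build_templates`, `identify_script`, and `max_dist` are hypothetical.

```python
import numpy as np


def build_templates(symbols, max_dist=0.25):
    """Cluster training symbols for one script and return one template per cluster.

    Assumption: greedy clustering. Each symbol joins the closest existing
    cluster if its mean absolute pixel difference to that cluster's current
    mean is at most max_dist; otherwise it starts a new cluster.
    """
    sums, counts = [], []
    for sym in symbols:
        sym = np.asarray(sym, dtype=float)
        if sums:
            dists = [np.mean(np.abs(s / c - sym)) for s, c in zip(sums, counts)]
            best = int(np.argmin(dists))
            if dists[best] <= max_dist:
                sums[best] += sym
                counts[best] += 1
                continue
        sums.append(sym.copy())
        counts.append(1)
    # The template of a cluster is the pixel-wise mean of its members.
    return [s / c for s, c in zip(sums, counts)]


def identify_script(doc_symbols, templates_by_script):
    """Return (best_script, scores); a lower score means a closer match.

    For each script, every document symbol is compared with all of that
    script's templates and the smallest distance is kept; the script score
    is the average of those best distances.
    """
    scores = {}
    for script, templates in templates_by_script.items():
        stack = np.stack(templates)  # shape: (n_templates, height, width)
        total = 0.0
        for sym in doc_symbols:
            diffs = np.abs(stack - np.asarray(sym, dtype=float))
            total += np.mean(diffs, axis=(1, 2)).min()
        scores[script] = total / len(doc_symbols)
    return min(scores, key=scores.get), scores


# Toy data (invented for illustration): one "script" made of vertical
# strokes and one made of horizontal strokes.
vbar = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
hbar = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]

templates_by_script = {
    "vertical": build_templates([vbar, vbar]),
    "horizontal": build_templates([hbar, hbar]),
}
best_script, scores = identify_script([vbar, vbar], templates_by_script)
```

In this toy setup `best_script` is `"vertical"`, because the document symbols match that script's single template exactly. A real system would need symbol segmentation, size normalization, and a more robust similarity measure, none of which the abstract specifies.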
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM
- DOE Contract Number:
- W-7405-ENG-36
- Assignee:
- Regents of University of California (Oakland, CA)
- Patent Number(s):
- US 5844991
- OSTI ID:
- 872016
- Country of Publication:
- United States
- Language:
- English
Related Subjects
assigned
closely
cluster
cluster-based
compared
computer
computer-implemented
corresponding
create
define
document
documents
electronic
electronically
exemplary
form
forming
identification
identified
identifies
identify
images
method
method identifies
pixels
processed
representative
representing
represents
scanned
series
set
store
symbol
symbols
template
templates
textual
training