DOE Patents
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Script identification from images using cluster-based templates

Abstract

A computer-implemented method identifies a script used to create a document. A set of training documents for each script to be identified is scanned into the computer to store a series of exemplary images representing each script. Pixels forming the exemplary images are electronically processed to define a set of textual symbols corresponding to the exemplary images. Each textual symbol is assigned to a cluster of textual symbols that most closely represents the textual symbol. The cluster of textual symbols is processed to form a representative electronic template for each cluster. A document having a script to be identified is scanned into the computer to form one or more document images representing the script to be identified. Pixels forming the document images are electronically processed to define a set of document textual symbols corresponding to the document images. The set of document textual symbols is compared to the electronic templates to identify the script.
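The pipeline described in the abstract (cluster the training symbols, form one template per cluster, then compare a new document's symbols to the templates) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the patent: symbols are assumed to be already segmented and rescaled to equal-length flattened binary bitmaps, clustering is a simple greedy "leader" scheme with a hypothetical `threshold` parameter, templates are per-pixel means, and the names `distance`, `build_templates`, and `identify_script` are invented.

```python
from typing import Dict, List, Sequence

Symbol = Sequence[int]   # flattened binary bitmap; all symbols share one length (assumed)
Template = List[float]   # per-pixel mean of the symbols in one cluster


def distance(symbol: Sequence[float], template: Sequence[float]) -> float:
    """Mean absolute pixel difference, in the range [0, 1]."""
    return sum(abs(p - t) for p, t in zip(symbol, template)) / len(symbol)


def build_templates(symbols: List[Symbol], threshold: float = 0.25) -> List[Template]:
    """Greedy clustering (an assumed stand-in for the patent's clustering step).

    Each symbol joins the closest existing cluster if it lies within
    `threshold`; otherwise it starts a new cluster. One mean template
    is returned per cluster.
    """
    clusters: List[List[Symbol]] = []
    templates: List[Template] = []
    for sym in symbols:
        best = min(range(len(templates)),
                   key=lambda i: distance(sym, templates[i]),
                   default=None)
        if best is not None and distance(sym, templates[best]) <= threshold:
            clusters[best].append(sym)
            members = clusters[best]
            # Recompute the template as the per-pixel mean of all members.
            templates[best] = [sum(col) / len(members) for col in zip(*members)]
        else:
            clusters.append([sym])
            templates.append([float(p) for p in sym])
    return templates


def identify_script(doc_symbols: List[Symbol],
                    script_templates: Dict[str, List[Template]]) -> str:
    """Return the script whose templates best match the document symbols.

    A script's score is the mean, over document symbols, of the distance
    to that script's closest template; the lowest score wins.
    """
    def score(templates: List[Template]) -> float:
        return sum(min(distance(s, t) for t in templates)
                   for s in doc_symbols) / len(doc_symbols)
    return min(script_templates, key=lambda name: score(script_templates[name]))


# Toy training data: two made-up "scripts" with 4-pixel symbols.
training = {
    "A": [(1, 1, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)],
    "B": [(1, 0, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0)],
}
templates = {name: build_templates(syms) for name, syms in training.items()}
print(identify_script([(1, 0, 1, 0), (0, 1, 0, 1)], templates))  # prints "B"
```

In this toy run each script ends up with two templates, and the document symbols match script "B" exactly. A real system would also need the symbol segmentation and size normalization steps that this sketch takes as given.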

Inventors:
 Hochberg, Judith G [1]; Kelly, Patrick M [1]; Thomas, Timothy R [2]
  1. Los Alamos, NM
  2. Santa Fe, NM
Issue Date:
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
OSTI Identifier:
872016
Patent Number(s):
5844991
Assignee:
Regents of University of California (Oakland, CA)
DOE Contract Number:  
W-7405-ENG-36
Resource Type:
Patent
Country of Publication:
United States
Language:
English
Subject:
identification; images; cluster-based; templates; computer-implemented; method; identifies; create; document; set; training; documents; identified; scanned; computer; store; series; exemplary; representing; pixels; forming; electronically; processed; define; textual; symbols; corresponding; symbol; assigned; cluster; closely; represents; form; representative; electronic; template; compared; identify; method identifies; /382/

Citation Formats

Hochberg, Judith G, Kelly, Patrick M, and Thomas, Timothy R. Script identification from images using cluster-based templates. United States: N. p., 1998. Web.
Hochberg, Judith G, Kelly, Patrick M, & Thomas, Timothy R. (1998). Script identification from images using cluster-based templates. United States.
Hochberg, Judith G, Kelly, Patrick M, and Thomas, Timothy R. 1998. "Script identification from images using cluster-based templates". United States. https://www.osti.gov/servlets/purl/872016.
@article{osti_872016,
title = {Script identification from images using cluster-based templates},
author = {Hochberg, Judith G and Kelly, Patrick M and Thomas, Timothy R},
abstractNote = {A computer-implemented method identifies a script used to create a document. A set of training documents for each script to be identified is scanned into the computer to store a series of exemplary images representing each script. Pixels forming the exemplary images are electronically processed to define a set of textual symbols corresponding to the exemplary images. Each textual symbol is assigned to a cluster of textual symbols that most closely represents the textual symbol. The cluster of textual symbols is processed to form a representative electronic template for each cluster. A document having a script to be identified is scanned into the computer to form one or more document images representing the script to be identified. Pixels forming the document images are electronically processed to define a set of document textual symbols corresponding to the document images. The set of document textual symbols is compared to the electronic templates to identify the script.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1998},
month = {1}
}