Validation of simulated OCR data sets
Technical Report · OSTI ID:68570
- Rensselaer Polytechnic Institute, Troy, NY (United States)
Recent interest in synthetic data sets for improving classifier performance raises the question of whether pseudo-random defect models provide a good approximation to live data from an OCR perspective. A proposal is presented to evaluate artificial data sets by comparing the confusion matrices generated on scanned and synthesized data by a given classifier. The proposed measure applies, in principle, both to isolated character recognition and to printed text. It is argued that the proposed method is more practical than direct comparison of synthetic data with real data.
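As an illustration of the idea in the abstract, the following is a minimal sketch of comparing two confusion matrices produced by the same classifier, one on scanned data and one on synthesized data. The abstract does not state how the matrices are compared, so the distance used here (mean total variation distance between row-normalized matrices) is an assumption chosen for simplicity. All function names and the toy labels are hypothetical.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Count how often each true class (row) is recognized as each class (column)."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def row_normalize(matrix):
    """Turn each row of counts into frequencies P(predicted | true).

    A row with no samples is left as all zeros.
    """
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


def confusion_distance(matrix_a, matrix_b):
    """Assumed measure: mean over true classes of the total variation
    distance between corresponding rows (0.0 means identical rows)."""
    a, b = row_normalize(matrix_a), row_normalize(matrix_b)
    per_row = [
        0.5 * sum(abs(x - y) for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(a, b)
    ]
    return sum(per_row) / len(per_row)


# Toy example (made-up labels, not data from the report).
classes = ["c", "e", "o"]
true_labels = ["c", "c", "e", "e", "o", "o"]
scanned_pred = ["c", "e", "e", "e", "o", "c"]    # classifier output on scanned images
synthetic_pred = ["c", "c", "e", "e", "o", "o"]  # classifier output on synthetic images

scanned_cm = confusion_matrix(true_labels, scanned_pred, classes)
synthetic_cm = confusion_matrix(true_labels, synthetic_pred, classes)
distance = confusion_distance(scanned_cm, synthetic_cm)
```

Under this assumed measure, a smaller distance means the classifier makes a more similar pattern of errors on the synthetic data and on the scanned data.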
- Research Organization: Nevada Univ., Las Vegas, NV (United States)
- OSTI ID: 68570
- Report Number(s): CONF-9404212--
- Country of Publication: United States
- Language: English
Similar Records
Prediction of OCR accuracy using simple image features
Technical Report · Fri Mar 31 23:00:00 EST 1995 · OSTI ID:46719
Performance evaluation of two OCR systems
Technical Report · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:68585
An evaluation of information retrieval accuracy with simulated OCR output
Technical Report · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:68569