Asymptotic accuracy of two-class discrimination

Ho, T K; Baird, H S

Technical Report · Dec 31, 1994

OSTI ID:68583

AT&T Bell Laboratories, Murray Hill, NJ (United States)

Poor-quality (e.g., sparse or unrepresentative) training data is widely suspected to be one cause of disappointing accuracy of isolated-character classification in modern OCR machines. We conjecture that, for many trainable classification techniques, it is in fact the dominant factor affecting accuracy. To test this, we have carried out a study of the asymptotic accuracy of three dissimilar classifiers on a difficult two-character recognition problem. We state this problem precisely in terms of high-quality prototype images and an explicit model of the distribution of image defects. So stated, the problem can be represented as a stochastic source of an indefinitely long sequence of simulated images labeled with ground truth. Using this sequence, we were able to train all three classifiers to high and statistically indistinguishable asymptotic accuracies (99.9%). This result suggests that the quality of training data was the dominant factor affecting accuracy. The speed of convergence during training, as well as time/space trade-offs during recognition, differed among the classifiers.
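The experimental setup described in the abstract (prototype images, a defect model, an endless labeled stream, accuracy measured as training data grows) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the report: the 5x5 glyphs, the independent pixel-flip defect model with `flip_prob=0.05`, the 1-nearest-neighbor classifier standing in for the three classifiers studied, and all function names.

```python
import random

# Hypothetical 5x5 prototype glyphs ('#' = ink); not the report's prototypes.
PROTOTYPES = {
    "c": ["#####",
          "#....",
          "#....",
          "#....",
          "#####"],
    "e": ["#####",
          "#....",
          "####.",
          "#....",
          "#####"],
}
N_PIXELS = 25


def to_bits(rows):
    """Pack a glyph into an int, one bit per pixel."""
    return int("".join("1" if ch == "#" else "0" for row in rows for ch in row), 2)


def image_source(rng, flip_prob=0.05):
    """Endless stream of (defective_image, ground_truth_label) pairs.

    Assumed defect model: each pixel flips independently with flip_prob.
    """
    clean = {label: to_bits(rows) for label, rows in PROTOTYPES.items()}
    labels = sorted(clean)
    while True:
        label = rng.choice(labels)
        noise = 0
        for i in range(N_PIXELS):
            if rng.random() < flip_prob:
                noise |= 1 << i
        yield clean[label] ^ noise, label


def take(source, n):
    """Draw the next n labeled images from the stream."""
    return [next(source) for _ in range(n)]


def classify_1nn(train, image):
    """Label of the training image closest in Hamming distance."""
    return min(train, key=lambda s: bin(s[0] ^ image).count("1"))[1]


def accuracy(train, test):
    """Fraction of test images whose predicted label matches ground truth."""
    hits = sum(classify_1nn(train, img) == label for img, label in test)
    return hits / len(test)


def learning_curve(sizes=(1, 10, 100, 1000), n_test=500, seed=0):
    """Accuracy on a fixed test set as the training set grows."""
    rng = random.Random(seed)
    source = image_source(rng)
    test = take(source, n_test)
    train, curve = [], []
    for size in sizes:
        train.extend(take(source, size - len(train)))
        curve.append((size, accuracy(train, test)))
    return curve


if __name__ == "__main__":
    for size, acc in learning_curve():
        print(size, acc)
```

Because the source never runs out, the training set can be extended until the accuracy curve flattens; that plateau is the asymptotic accuracy the abstract refers to. The numbers this toy produces say nothing about the report's own results.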

Research Organization: Nevada Univ., Las Vegas, NV (United States)

Report Number(s): CONF-9404212--

Country of Publication: United States

Language: English

Similar Records

Prediction of OCR accuracy using simple image features

Technical Report · Mar 31, 1995 · OSTI ID:46719

An evaluation of information retrieval accuracy with simulated OCR output

Technical Report · Dec 30, 1994 · OSTI ID:68569

Adaptive image enhancement of text images that contain touching or broken characters

Technical Report · Nov 28, 1994 · OSTI ID:42491

Related Subjects

99 GENERAL AND MISCELLANEOUS
ACCURACY
DECISION TREE ANALYSIS
ERRORS
IMAGE PROCESSING
MACHINE TRANSLATIONS
PATTERN RECOGNITION
