Convergence Rates for Empirical Estimation of Binary Classification Bounds
- Univ. of Maine, Orono, ME (United States). School of Computing and Information Science
- Univ. of Michigan, Ann Arbor, MI (United States). Dept. of Electrical Engineering and Computer Science
- Utah State Univ., Logan, UT (United States). Dept. of Mathematics and Statistics
Bounding the best achievable error probability for binary classification problems is relevant to many applications, including machine learning, signal processing, and information theory. Many bounds on the Bayes binary classification error rate depend on information divergences between the pair of class distributions. Recently, the Henze–Penrose (HP) divergence has been proposed for bounding classification error probability. We consider the problem of empirically estimating the HP-divergence from random samples. We derive a bound on the convergence rate for the Friedman–Rafsky (FR) estimator of the HP-divergence, an estimator related to a multivariate runs statistic for testing between two distributions. The FR estimator is derived from a multicolored Euclidean minimal spanning tree (MST) that spans the merged samples. We also obtain a concentration inequality for the FR estimator of the HP-divergence. We validate our results experimentally and illustrate their application to real datasets.
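The MST construction described in the abstract can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation. It assumes the commonly stated form of the FR estimator, 1 - R(m + n)/(2mn), where R is the number of MST edges joining points from different samples and m, n are the sample sizes. The function name `fr_hp_divergence`, the use of Prim's algorithm, and the clamping of the estimate at zero are choices made for this sketch only.

```python
import math
import random


def fr_hp_divergence(xs, ys):
    """Sketch of an FR-style estimate of the HP-divergence.

    xs, ys: lists of points (tuples of floats) of equal dimension.
    Returns (estimate, cross_edges), where cross_edges counts MST edges
    that connect a point of xs to a point of ys.
    """
    m, n = len(xs), len(ys)
    points = list(xs) + list(ys)
    labels = [0] * m + [1] * n  # the two "colors" of the merged sample
    total = m + n

    # Prim's algorithm on the complete Euclidean graph, O(total^2) time.
    in_tree = [False] * total
    best_dist = [math.inf] * total
    parent = [-1] * total
    best_dist[0] = 0.0
    cross_edges = 0
    for _ in range(total):
        # Pick the vertex outside the tree that is closest to the tree.
        u = min((i for i in range(total) if not in_tree[i]),
                key=best_dist.__getitem__)
        in_tree[u] = True
        # The edge (parent[u], u) joins the MST; count it if it is bichromatic.
        if parent[u] >= 0 and labels[u] != labels[parent[u]]:
            cross_edges += 1
        for v in range(total):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    parent[v] = u

    # Assumed estimator form; clamping at 0 is a choice made for this sketch.
    estimate = 1.0 - cross_edges * (m + n) / (2.0 * m * n)
    return max(0.0, estimate), cross_edges


# Hypothetical usage: two samples drawn from the same uniform distribution,
# for which the cross-edge count is typically large and the estimate small.
rng = random.Random(0)
same_a = [(rng.random(), rng.random()) for _ in range(50)]
same_b = [(rng.random(), rng.random()) for _ in range(50)]
print(fr_hp_divergence(same_a, same_b))
```

When the two samples form well-separated clusters, the MST contains exactly one cross edge, so the estimate approaches 1 as the sample sizes grow. When the samples come from the same distribution, cross edges are frequent and the estimate stays near 0.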
- Research Organization:
- Georgia Institute of Technology, Atlanta, GA (United States); Univ. of Michigan, Ann Arbor, MI (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- NA0002534; NA0003921
- OSTI ID:
- 1801119
- Journal Information:
- Journal Name: Entropy; Journal Volume: 21; Journal Issue: 12; ISSN ENTRFG; ISSN 1099-4300
- Publisher:
- MDPI
- Country of Publication:
- United States
- Language:
- English