Class-specific Error Bounds for Ensemble Classifiers
The generalization error, or probability of misclassification, of ensemble classifiers has been shown to be bounded above by a function of the mean correlation between the constituent (i.e., base) classifiers and their average strength. This bound suggests that increasing the strength and/or decreasing the correlation of an ensemble's base classifiers may yield improved performance under the assumption of equal error costs. However, this and other existing bounds do not directly address application spaces in which error costs are inherently unequal. For applications involving binary classification, Receiver Operating Characteristic (ROC) curves, performance curves that explicitly trade off false alarms and missed detections, are often utilized to support decision making. To address performance optimization in this context, we have developed a lower bound for the entire ROC curve that can be expressed in terms of the class-specific strength and correlation of the base classifiers. We present empirical analyses demonstrating the efficacy of these bounds in predicting relative classifier performance. In addition, we specify performance regions of the ROC curve that are naturally delineated by the class-specific strengths of the base classifiers and show that each of these regions can be associated with a unique set of guidelines for performance optimization of binary classifiers within unequal error cost regimes.
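The strength/correlation relationship described in the abstract can be illustrated with a small sketch. It assumes the bound referred to is the well-known random-forest form, PE* <= rho_bar * (1 - s^2) / s^2, where rho_bar is the mean correlation between base classifiers and s > 0 is their strength. The abstract does not state the formula, so this form, the function name `ensemble_error_bound`, and the sample values are illustrative assumptions. The class-specific ROC bound developed in the paper is not reproduced here.

```python
def ensemble_error_bound(mean_correlation: float, strength: float) -> float:
    """Upper bound on ensemble generalization error.

    Assumed form (not given in the abstract): rho_bar * (1 - s**2) / s**2.
    The bound is only defined for positive strength, and it can exceed 1,
    in which case it says nothing useful about the error.
    """
    if strength <= 0:
        raise ValueError("the bound requires strength > 0")
    return mean_correlation * (1.0 - strength**2) / strength**2


# Hypothetical values, chosen only to show the direction of each effect.
baseline = ensemble_error_bound(mean_correlation=0.2, strength=0.5)
less_correlated = ensemble_error_bound(mean_correlation=0.1, strength=0.5)
stronger = ensemble_error_bound(mean_correlation=0.2, strength=0.8)

print(f"baseline bound:        {baseline:.4f}")
print(f"lower correlation:     {less_correlated:.4f}")
print(f"higher strength:       {stronger:.4f}")
```

Under these assumed values, halving the correlation halves the bound, and raising the strength from 0.5 to 0.8 tightens it further, which matches the abstract's remark that increasing strength and/or decreasing correlation may improve performance when error costs are equal.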
- Research Organization: Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: W-7405-ENG-48
- OSTI ID: 978906
- Report Number(s): LLNL-CONF-417778; TRN: US201010%%266
- Resource Relation: Conference: Presented at: SIAM Conference on Data Mining (SDM10), Columbus, OH, United States, Apr 29 - May 01, 2010
- Country of Publication: United States
- Language: English
Similar Records
TU-G-BRD-08: In-Vivo EPID Dosimetry: Quantifying the Detectability of Four Classes of Errors
Creation of a Robust and Generalizable Machine Learning Classifier for Patient Ventilator Asynchrony