U.S. Department of Energy
Office of Scientific and Technical Information

Selecting a Classification Ensemble and Detecting Process Drift in an Evolving Data Stream

Conference
OSTI ID:1334906
We characterize the commercial behavior of a group of companies in a common line of business using a small ensemble of classifiers on a stream of records containing commercial activity information. This approach effectively finds a subset of classifiers that can be used to predict company labels with reasonable accuracy. The ensemble's performance, measured as its error rate under stable conditions, can be characterized using an exponentially weighted moving average (EWMA) statistic. The behavior of the EWMA statistic can be used to monitor a record stream from the commercial network and determine when significant changes have occurred. Results indicate that larger classification ensembles are not necessarily optimal, pointing to the need to search the combinatorial classifier space in a systematic way. Results also show that the current and past performance of an ensemble can be used to detect when statistically significant changes in the activity of the network have occurred. The dataset used in this work contains tens of thousands of high-level commercial activity records with continuous and categorical variables and hundreds of labels, making classification challenging.
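As a rough illustration of the monitoring idea described in the abstract, the following is a minimal sketch of a textbook EWMA control chart applied to a stream of 0/1 misclassification indicators. It is not the authors' implementation: the function name `ewma_drift_monitor`, the smoothing weight `lam`, the limit multiplier `width`, and the assumption that a stable-period error rate `baseline_rate` is known in advance are all hypothetical choices made for this example.

```python
import math


def ewma_drift_monitor(errors, baseline_rate, lam=0.1, width=3.0):
    """Track an EWMA of 0/1 error indicators and flag the first upward drift.

    errors        -- iterable of 0 (correct) / 1 (misclassified) outcomes
    baseline_rate -- assumed error rate of the ensemble under stable conditions
    lam           -- assumed smoothing weight in (0, 1]
    width         -- assumed control-limit multiplier (in standard deviations)

    Returns (ewma_values, alarm_index), where alarm_index is the 0-based
    position of the first record whose EWMA exceeds the upper control
    limit, or None if the limit is never exceeded.
    """
    # Standard deviation of a single Bernoulli error indicator.
    sigma = math.sqrt(baseline_rate * (1.0 - baseline_rate))
    z = baseline_rate  # start the statistic at the stable-period mean
    values = []
    alarm = None
    for t, e in enumerate(errors, start=1):
        # EWMA update: z_t = lam * e_t + (1 - lam) * z_{t-1}
        z = lam * e + (1.0 - lam) * z
        # Time-varying standard deviation of the EWMA after t observations.
        ewma_sd = sigma * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        values.append(z)
        if alarm is None and z > baseline_rate + width * ewma_sd:
            alarm = t - 1
    return values, alarm


# Usage: a synthetic stable stream (one error in every five records),
# followed by a period in which every record is misclassified.
stable = [1, 0, 0, 0, 0] * 40
drifted = stable + [1] * 50
_, stable_alarm = ewma_drift_monitor(stable, baseline_rate=0.2)
_, drift_alarm = ewma_drift_monitor(drifted, baseline_rate=0.2)
print(stable_alarm, drift_alarm)
```

Only the upper limit is checked here, on the assumption that an increase in error rate is the change of interest; a two-sided chart would also compare the statistic against the corresponding lower limit.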
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1334906
Report Number(s):
PNNL-SA-109834
Country of Publication:
United States
Language:
English

Similar Records

Modality-Driven Classification and Visualization of Ensemble Variance
Journal Article · Sat Oct 01 00:00:00 EDT 2016 · IEEE Transactions on Visualization and Computer Graphics · OSTI ID:1330296

Real-time detection and classification of anomalous events in streaming data
Patent · Tue Apr 19 00:00:00 EDT 2016 · OSTI ID:1247988

Evolving Ensembles of Spiking Neural Networks for Neuromorphic Systems
Conference · Mon Nov 30 23:00:00 EST 2020 · OSTI ID:1760126