OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Builtin vs. auxiliary detection of extrapolation risk.

Technical Report · DOI: https://doi.org/10.2172/1095941 · OSTI ID: 1095941

A key assumption in supervised machine learning is that future data will be similar to historical data. This assumption is often false in real-world applications, and as a result, prediction models often return predictions that are extrapolations. We compare four approaches to estimating extrapolation risk for machine learning predictions. Two builtin methods use information available from the classification model to decide whether the model would be extrapolating for an input data point. The other two build auxiliary models that supplement the classification model and explicitly model extrapolation risk. Experiments with synthetic and real data sets show that the auxiliary models are more reliable risk detectors. To best safeguard against extrapolating predictions, however, we recommend combining builtin and auxiliary diagnostics.
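The abstract does not name the four methods, so the following is only an illustrative sketch of the builtin/auxiliary distinction, not the report's implementation. It assumes a toy nearest-centroid classifier whose own confidence serves as the builtin signal, and a separate nearest-neighbor distance check as the auxiliary model. All function names, the grid data, and the 0.3 risk cutoff are hypothetical choices made for this example.

```python
import math

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def class_probabilities(x, centroids):
    """Toy classifier: softmax over negative distances to class centroids."""
    scores = [-math.dist(x, c) for c in centroids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def builtin_risk(x, centroids):
    """Builtin-style signal (assumed): one minus the top class probability."""
    return 1.0 - max(class_probabilities(x, centroids))

def knn_distance(x, train, k=1, skip_self=False):
    """Distance from x to its k-th nearest training point."""
    dists = sorted(math.dist(x, p) for p in train)
    if skip_self:
        dists = dists[1:]  # drop the zero distance of a training point to itself
    return dists[k - 1]

def fit_auxiliary_threshold(train, k=1):
    """Auxiliary model (assumed): largest leave-one-out k-NN distance in training."""
    return max(knn_distance(p, train, k, skip_self=True) for p in train)

def auxiliary_flag(x, train, threshold, k=1):
    """Flag x as an extrapolation if it lies farther from the data than any training point did."""
    return knn_distance(x, train, k) > threshold

# Hypothetical synthetic data: two 5x5 grids, class A around (0, 0), class B around (4, 0).
grid = [-1 + 0.5 * i for i in range(5)]
class_a = [(gx, gy) for gx in grid for gy in grid]
class_b = [(gx + 4, gy) for gx in grid for gy in grid]
train = class_a + class_b
centroids = [centroid(class_a), centroid(class_b)]
threshold = fit_auxiliary_threshold(train)

for x in [(0.2, 0.1), (100.0, 0.0)]:
    risk = builtin_risk(x, centroids)
    far = auxiliary_flag(x, train, threshold)
    combined = risk > 0.3 or far  # combining both diagnostics, as the abstract recommends
    print(x, round(risk, 3), far, combined)
```

In this toy setup the point (100, 0) lies far outside the training data, yet the classifier remains confident about it, so the builtin signal alone misses the extrapolation while the auxiliary check flags it. This is consistent with the abstract's finding, but the example itself is not evidence for it.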

Research Organization:
Sandia National Lab. (SNL-CA), Livermore, CA (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
AC04-94AL85000
OSTI ID:
1095941
Report Number(s):
SAND2013-2534; 463450
Country of Publication:
United States
Language:
English

Similar Records

An ensemble model of QSAR tools for regulatory risk assessment
Journal Article · Sep 22, 2016 · Journal of Cheminformatics · OSTI ID:1095941

Process Anomaly Detection for Sparsely Labeled Events in Nuclear Power Plants
Technical Report · Sep 1, 2021 · OSTI ID:1095941

A data-centric weak supervised learning for highway traffic incident detection
Journal Article · Aug 19, 2022 · Accident Analysis and Prevention · OSTI ID:1095941
