DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Statistical model selection for better prediction and discovering science mechanisms that affect reliability

Abstract

Understanding the impact of production, environmental exposure and age characteristics on the reliability of a population is frequently based on underlying science and empirical assessment. When there is incomplete science to prescribe which inputs should be included in a model of reliability to predict future trends, statistical model/variable selection techniques can be leveraged on a stockpile or population of units to improve reliability predictions as well as suggest new mechanisms affecting reliability to explore. We describe a five-step process for exploring relationships between available summaries of age, usage and environmental exposure and reliability. The process involves first identifying potential candidate inputs, and second organizing data for the analysis. Third, a variety of models with different combinations of the inputs are estimated, and fourth, flexible metrics are used to compare them. Fifth, plots of the predicted relationships are examined to distill leading model contenders into a prioritized list for subject matter experts to understand and compare. The complexity of the model, quality of prediction and cost of future data collection are all factors to be considered by the subject matter experts when selecting a final model.
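
As a rough illustration of the third and fourth steps (estimating models over different combinations of inputs, then comparing them with a metric), the sketch below fits a logistic regression for every subset of three candidate inputs and ranks the fits by BIC, with AIC reported alongside. Everything specific here is an assumption made for illustration and is not taken from the article: the pass/fail response, the logistic model, the choice of AIC and BIC as the comparison metrics, the input names `age`, `usage` and `exposure`, and the synthetic data.

```python
import itertools
import numpy as np


def fit_logistic(X, y, n_iter=50, ridge=1e-8):
    """Fit a logistic regression by Newton's method.

    Returns the coefficient vector and the maximized log-likelihood.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        # A tiny ridge term keeps the Hessian invertible.
        hess = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    eta = np.clip(X @ beta, -30, 30)
    loglik = float(np.sum(y * eta - np.log1p(np.exp(eta))))
    return beta, loglik


def compare_subsets(inputs, y):
    """Fit one model per subset of inputs and rank the fits by BIC."""
    names = list(inputs)
    n = len(y)
    results = []
    for size in range(len(names) + 1):
        for subset in itertools.combinations(names, size):
            # Every model includes an intercept column.
            cols = [np.ones(n)] + [inputs[name] for name in subset]
            X = np.column_stack(cols)
            _, loglik = fit_logistic(X, y)
            k = X.shape[1]  # number of estimated parameters
            results.append({
                "inputs": subset,
                "aic": 2 * k - 2 * loglik,
                "bic": k * np.log(n) - 2 * loglik,
            })
    return sorted(results, key=lambda r: r["bic"])


# Synthetic data (hypothetical): reliability depends on age and exposure only.
rng = np.random.default_rng(0)
n = 400
inputs = {
    "age": rng.normal(size=n),
    "usage": rng.normal(size=n),
    "exposure": rng.normal(size=n),
}
eta_true = 1.5 - 1.2 * inputs["age"] - 0.6 * inputs["exposure"]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

ranked = compare_subsets(inputs, y)
for r in ranked[:3]:
    print(r["inputs"], round(r["aic"], 1), round(r["bic"], 1))
```

A ranked list like this would only be a starting point for the fifth step: the leading contenders still need to be plotted and reviewed by subject matter experts, who weigh model complexity, prediction quality and the cost of collecting each input.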

Authors:
Anderson-Cook, Christine M. [1]; Morzinski, Jerome [1]; Blecker, Kenneth D. [2]
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States). Statistical Sciences Group
  2. ARDEC, Picatinny Arsenal, Township, NJ (United States)
Publication Date:
2015-08-19
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1224056
Report Number(s):
LA-UR-15-21723
Journal ID: ISSN 2079-8954; PII: systems3030109
Grant/Contract Number:  
AC52-06NA25396
Resource Type:
Accepted Manuscript
Journal Name:
Systems
Additional Journal Information:
Journal Volume: 3; Journal Issue: 3; Journal ID: ISSN 2079-8954
Publisher:
MDPI
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; automated model evaluation; variable selection; environmental exposure; system usage; advancing underlying theory

Citation Formats

Anderson-Cook, Christine M., Morzinski, Jerome, and Blecker, Kenneth D. Statistical model selection for better prediction and discovering science mechanisms that affect reliability. United States: N. p., 2015. Web. doi:10.3390/systems3030109.
Anderson-Cook, Christine M., Morzinski, Jerome, & Blecker, Kenneth D. Statistical model selection for better prediction and discovering science mechanisms that affect reliability. United States. https://doi.org/10.3390/systems3030109
Anderson-Cook, Christine M., Morzinski, Jerome, and Blecker, Kenneth D. 2015. "Statistical model selection for better prediction and discovering science mechanisms that affect reliability". United States. https://doi.org/10.3390/systems3030109. https://www.osti.gov/servlets/purl/1224056.
@article{osti_1224056,
title = {Statistical model selection for better prediction and discovering science mechanisms that affect reliability},
author = {Anderson-Cook, Christine M. and Morzinski, Jerome and Blecker, Kenneth D.},
abstractNote = {Understanding the impact of production, environmental exposure and age characteristics on the reliability of a population is frequently based on underlying science and empirical assessment. When there is incomplete science to prescribe which inputs should be included in a model of reliability to predict future trends, statistical model/variable selection techniques can be leveraged on a stockpile or population of units to improve reliability predictions as well as suggest new mechanisms affecting reliability to explore. We describe a five-step process for exploring relationships between available summaries of age, usage and environmental exposure and reliability. The process involves first identifying potential candidate inputs, and second organizing data for the analysis. Third, a variety of models with different combinations of the inputs are estimated, and fourth, flexible metrics are used to compare them. Fifth, plots of the predicted relationships are examined to distill leading model contenders into a prioritized list for subject matter experts to understand and compare. The complexity of the model, quality of prediction and cost of future data collection are all factors to be considered by the subject matter experts when selecting a final model.},
doi = {10.3390/systems3030109},
journal = {Systems},
number = 3,
volume = 3,
place = {United States},
year = {2015},
month = {8}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Figures / Tables:

Figure 1: Flowchart of process for investigating the relationship between potential inputs and reliability.

Works referenced in this record:

System Health Assessment
journal, March 2011

  • Collins, David H.; Anderson-Cook, Christine M.; Huzurbazar, Aparna V.
  • Quality Engineering, Vol. 23, Issue 2
  • DOI: 10.1080/08982112.2010.529484

A new look at the statistical model identification
journal, December 1974


Estimating the Dimension of a Model
journal, March 1978


Bayesian measures of model complexity and fit
journal, October 2002

  • Spiegelhalter, David J.; Best, Nicola G.; Carlin, Bradley P.
  • Journal of the Royal Statistical Society: Series B (Statistical Methodology), Vol. 64, Issue 4
  • DOI: 10.1111/1467-9868.00353

Optimal predictive model selection
journal, June 2004


Model Selection for Good Estimation and Prediction over a User-Specified Covariate Distribution for Linear Models under the Frequentist Paradigm
journal, October 2011

  • Pintar, Adam; Anderson-Cook, Christine M.; Wu, Huaiqing
  • Quality and Reliability Engineering International, Vol. 28, Issue 7
  • DOI: 10.1002/qre.1273

Optimization of Designed Experiments Based on Multiple Criteria Utilizing a Pareto Frontier
journal, November 2011

  • Lu, Lu; Anderson-Cook, Christine M.; Robinson, Timothy J.
  • Technometrics, Vol. 53, Issue 4
  • DOI: 10.1198/TECH.2011.10087

Incorporating response variability and estimation uncertainty into Pareto front optimization
journal, October 2014

  • Chapman, Jessica L.; Lu, Lu; Anderson-Cook, Christine M.
  • Computers & Industrial Engineering, Vol. 76
  • DOI: 10.1016/j.cie.2014.07.028

Least angle regression
journal, April 2004


Optimal predictive model selection
text, January 2004


Works referencing / citing this record:

Comparing the Reliability of Related Populations With the Probability of Agreement
journal, July 2016


Comparing the Reliability of Related Populations With the Probability of Agreement
dataset, April 2017


Comparing the Reliability of Related Populations With the Probability of Agreement [Supplemental Data]
dataset, April 2017

  • Stevens, Nathaniel; Anderson-Cook, Christine
  • figshare-Supplementary information for journal article at DOI: 10.1080/00401706.2016.1214180, 1 PDF file (1.08 MB)
  • DOI: 10.6084/m9.figshare.3502052