Title: Conservative Estimation of Tail Probabilities from Limited Sample Data

Abstract

Several sparse-sample uncertainty quantification (UQ) methods are compared for conservative, but not overly conservative, estimation of small tail probabilities involving responses that lie beyond specified thresholds in the tails of probability distributions. Sixteen very differently shaped distributions (or probability density functions, PDFs) and tail probability magnitudes ranging from 10^-5 to 10^-1 are considered so that the study is relevant to a wide range of risk analysis and quantification of margins and uncertainty (QMU) problems. The emphasis of the study is on limited data regimes ranging from N = 2 to 20 samples, reflective of most experimental and some expensive computational situations. Relatively simple sparse-sample UQ methods tested for this regime involve statistical tolerance interval "Equivalent Normal" and related "Ensemble of Normals" and "Superdistribution" (SD) approaches. (The independently derived SD is effectively equivalent to the Bayesian posterior predictive distribution given the assumptions of the derivation.) The performance of the methods was generally improved for N ≥ 5 samples with a generalized Jackknife resampling technique, which determines a tail probability estimate by averaging estimates from smaller sub-samples. Several quantitative metrics for method conservatism and accuracy of tail probability estimation are used to assess and rank the methods' performance over many random trials for each test PDF and probability magnitude. A variant of Bootstrap resampling was also tried, but did not significantly improve tail probability estimates in most cases. Detailed results are presented from over 100 million tests over the above factors, providing useful granular information on which methods or combinations of methods perform best in various areas of the factor space.
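
The sub-sample averaging idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the abstract does not give the details of the generalized Jackknife or of the tolerance-interval methods, so the sketch assumes a plain normal fit as the per-sample estimator and averages it over every sub-sample of a chosen size. The function names, the data, and the threshold are all hypothetical.

```python
import itertools
import math
import statistics


def normal_tail_probability(samples, threshold):
    """Estimate P(X > threshold) from a normal fitted to the samples.

    Assumption: a plain normal fit stands in for the report's
    tolerance-interval-based estimators, which are not reproduced here.
    Requires at least two samples that are not all identical.
    """
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)  # sample standard deviation (n - 1 divisor)
    z = (threshold - mean) / std
    # Upper-tail probability of the standard normal at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def subsample_averaged_tail_probability(samples, threshold, subsample_size):
    """Average the tail estimate over every sub-sample of the given size."""
    estimates = [
        normal_tail_probability(subset, threshold)
        for subset in itertools.combinations(samples, subsample_size)
    ]
    return sum(estimates) / len(estimates)


# Made-up illustrative data (N = 6) and threshold.
data = [9.8, 10.4, 10.1, 9.5, 10.9, 10.2]
full = normal_tail_probability(data, threshold=12.0)
averaged = subsample_averaged_tail_probability(data, threshold=12.0, subsample_size=4)
print(f"full-sample estimate:     {full:.3e}")
print(f"sub-sample averaged (k=4): {averaged:.3e}")
```

Enumerating all sub-samples is affordable only because N is small (2 to 20 in the study); the choice of sub-sample size here is arbitrary and is not taken from the report.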

Authors:
Jekel, Charles F. [1]; Romero, Vincente J. [1]
  1. Sandia National Laboratories (SNL), Albuquerque, NM, and Livermore, CA (United States)
Publication Date:
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1605343
Report Number(s):
SAND-2020-2828
684818
DOE Contract Number:  
AC04-94AL85000; NA0003525
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Jekel, Charles F., and Romero, Vincente J. Conservative Estimation of Tail Probabilities from Limited Sample Data. United States: N. p., 2020. Web. doi:10.2172/1605343.
Jekel, Charles F., & Romero, Vincente J. Conservative Estimation of Tail Probabilities from Limited Sample Data. United States. https://doi.org/10.2172/1605343
Jekel, Charles F., and Romero, Vincente J. 2020. "Conservative Estimation of Tail Probabilities from Limited Sample Data". United States. https://doi.org/10.2172/1605343. https://www.osti.gov/servlets/purl/1605343.
@techreport{osti_1605343,
title = {Conservative Estimation of Tail Probabilities from Limited Sample Data},
author = {Jekel, Charles F. and Romero, Vincente J.},
abstractNote = {Several sparse-sample uncertainty quantification (UQ) methods are compared for conservative, but not overly conservative, estimation of small tail probabilities involving responses that lie beyond specified thresholds in the tails of probability distributions. Sixteen very differently shaped distributions (or probability density functions, PDFs) and tail probability magnitudes ranging from $10^{-5}$ to $10^{-1}$ are considered so that the study is relevant to a wide range of risk analysis and quantification of margins and uncertainty (QMU) problems. The emphasis of the study is on limited data regimes ranging from N = 2 to 20 samples, reflective of most experimental and some expensive computational situations. Relatively simple sparse-sample UQ methods tested for this regime involve statistical tolerance interval "Equivalent Normal" and related "Ensemble of Normals" and "Superdistribution" (SD) approaches. (The independently derived SD is effectively equivalent to the Bayesian posterior predictive distribution given the assumptions of the derivation.) The performance of the methods was generally improved for N ≥ 5 samples with a generalized Jackknife resampling technique, which determines a tail probability estimate by averaging estimates from smaller sub-samples. Several quantitative metrics for method conservatism and accuracy of tail probability estimation are used to assess and rank the methods' performance over many random trials for each test PDF and probability magnitude. A variant of Bootstrap resampling was also tried, but did not significantly improve tail probability estimates in most cases. Detailed results are presented from over 100 million tests over the above factors, providing useful granular information on which methods or combinations of methods perform best in various areas of the factor space.},
doi = {10.2172/1605343},
url = {https://www.osti.gov/biblio/1605343},
number = {SAND-2020-2828},
institution = {Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)},
place = {United States},
year = {2020},
month = {3}
}