OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Model Calibration with Censored Data

Abstract

The purpose of model calibration is to bring model predictions closer to reality. The classical Kennedy-O'Hagan approach is widely used for this task because it can account for the inadequacy of the computer model while simultaneously estimating the unknown calibration parameters. In many applications, however, censoring occurs: the exact outcome of the physical experiment is not observed and is only known to fall within a certain region. The Kennedy-O'Hagan approach cannot be used directly in such cases, so we propose a method that incorporates the censoring information when performing model calibration. The method is applied to study the compression phenomenon of liquid inside a bottle. The results show significant improvement over traditional calibration methods, especially when the number of censored observations is large.
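
The abstract stays at a high level; the central likelihood modification can be illustrated with a short, purely hypothetical sketch. For an exactly observed response the usual Gaussian density enters the likelihood, while for a censored response only the probability that the latent outcome falls inside the censoring region is used. The predictive means and standard deviations below stand in for whatever an emulator-plus-discrepancy model would produce; all function names and numbers are illustrative and not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

def censored_log_likelihood(y, lower, upper, mu, sigma):
    """Gaussian log-likelihood mixing exact and censored observations.

    y            : recorded values (ignored where the outcome is censored)
    lower, upper : censoring bounds; an exact observation has lower == upper == y,
                   a right-censored one has upper = +inf, a left-censored one lower = -inf
    mu, sigma    : predictive means and standard deviations at the experimental
                   inputs for a trial value of the calibration parameters
    """
    exact = np.isclose(lower, upper)
    ll = np.where(
        exact,
        norm.logpdf(y, loc=mu, scale=sigma),              # density for exact data
        np.log(norm.cdf(upper, mu, sigma)                 # interval probability
               - norm.cdf(lower, mu, sigma) + 1e-300),    # for censored data
    )
    return ll.sum()

# hypothetical example: three exact measurements and two right-censored ones
y     = np.array([1.2, 0.8, 1.5, 2.0, 2.0])      # 2.0 is the measurement ceiling
lower = np.array([1.2, 0.8, 1.5, 2.0, 2.0])
upper = np.array([1.2, 0.8, 1.5, np.inf, np.inf])
mu    = np.array([1.1, 0.9, 1.4, 2.3, 2.1])      # predictions at a trial theta
sigma = np.full(5, 0.2)
print(censored_log_likelihood(y, lower, upper, mu, sigma))
```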

Authors:
 Cao, Fang [1]; Ba, Shan [2]; Brenneman, William A. [2]; Joseph, V. Roshan [1]
  1. Georgia Inst. of Technology, Atlanta, GA (United States)
  2. Procter & Gamble Co., Mason, OH (United States)
Publication Date:
June 28, 2017
Research Org.:
Georgia Tech Research Corp., Atlanta, GA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1405186
Report Number(s):
DOE-GT-0010548-11
Journal ID: ISSN 0040-1706; FG02-13ER26159
Grant/Contract Number:
SC0010548
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Technometrics
Additional Journal Information:
Journal Name: Technometrics; Journal ID: ISSN 0040-1706
Publisher:
Taylor & Francis
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Bayesian calibration; Computer experiments; Gaussian process; Model discrepancy

Citation Formats

Cao, Fang, Ba, Shan, Brenneman, William A., and Joseph, V. Roshan. Model Calibration with Censored Data. United States: N. p., 2017. Web. doi:10.1080/00401706.2017.1345704.
Cao, Fang, Ba, Shan, Brenneman, William A., & Joseph, V. Roshan. Model Calibration with Censored Data. United States. doi:10.1080/00401706.2017.1345704.
Cao, Fang, Ba, Shan, Brenneman, William A., and Joseph, V. Roshan. 2017. "Model Calibration with Censored Data". United States. doi:10.1080/00401706.2017.1345704.
@article{osti_1405186,
title = {Model Calibration with Censored Data},
author = {Cao, Fang and Ba, Shan and Brenneman, William A. and Joseph, V. Roshan},
abstractNote = {Here, the purpose of model calibration is to make the model predictions closer to reality. The classical Kennedy-O'Hagan approach is widely used for model calibration, which can account for the inadequacy of the computer model while simultaneously estimating the unknown calibration parameters. In many applications, the phenomenon of censoring occurs when the exact outcome of the physical experiment is not observed, but is only known to fall within a certain region. In such cases, the Kennedy-O'Hagan approach cannot be used directly, and we propose a method to incorporate the censoring information when performing model calibration. The method is applied to study the compression phenomenon of liquid inside a bottle. The results show significant improvement over the traditional calibration methods, especially when the number of censored observations is large.},
doi = {10.1080/00401706.2017.1345704},
journal = {Technometrics},
place = {United States},
year = {2017},
month = {jun}
}

Journal Article: Accepted Manuscript (freely available to the public on June 28, 2018); Publisher's Version of Record available from Taylor & Francis.

Similar Records:
  • To improve the understanding of the problem of long-range transport and source-receptor relationships for trace-level toxic air contaminants, the authors examine the use of several multiple comparison procedures (MCPs) in the analysis and interpretation of multiply-censored data sets. Censoring is a chronic problem for some of the toxic elements of interest (As, Se, Mn, etc.) because their atmospheric concentrations are often too low to be measured precisely. Such concentrations are commonly reported in a nonquantitative way as below the limit of detection, leaving the data analyst with censored data sets. Since the standard statistical MCPs are not readily applicable to such data sets, the authors use Monte Carlo simulations to evaluate two nonparametric rank-type MCPs for their applicability to the interpretation of censored data. Two methods for ranking censored data are evaluated: the average rank method and substitution with half the detection limit. The results suggest that the Kruskal-Wallis-Dunn MCP with the half-detection-limit replacement for censored data is most appropriate for comparing independent, multiply-censored samples of moderate size (20-100 elements). Applying this method to pollutant clusters at several sites in the northeastern USA enabled the authors to identify potential pollution source regions and atmospheric patterns associated with the long-range transport of air pollutants. (A brief Kruskal-Wallis sketch of the half-detection-limit substitution follows this list.)
  • A recurring difficulty encountered in investigations of many metals and organic contaminants in ambient waters is that a substantial portion of water-sample concentrations are below the limits of detection established by analytical laboratories. Several methods were evaluated for estimating distributional parameters for such censored data sets using only the uncensored observations. Their reliabilities were evaluated by a Monte Carlo experiment in which small samples were generated from a wide range of parent distributions and censored at varying levels. Eight methods were used to estimate the mean, standard deviation, median, and interquartile range. Criteria were developed, based on the distribution of uncensored observations, for determining the best-performing parameter estimation method for any particular data set. The most robust method for minimizing error in censored-sample estimates of the four distributional parameters over all simulation conditions was the log-probability regression method. With this method, censored observations are assumed to follow the zero-to-censoring-level portion of a lognormal distribution obtained by a least squares regression between the logarithms of uncensored concentration observations and their z scores. When method performance was evaluated separately for each distributional parameter over all simulation conditions, the log-probability regression method still had the smallest errors for the mean and standard deviation, but the lognormal maximum likelihood method had the smallest errors for the median and interquartile range. When data sets were classified prior to parameter estimation into groups reflecting their probable parent distributions, the ranking of estimation methods was similar, but the accuracy of error estimates was markedly improved over those without classification. (A simplified log-probability regression sketch follows this list.)
  • Estimates of distributional parameters (mean, standard deviation, median, interquartile range) are often desired for data sets containing censored observations. Eight methods for estimating these parameters have been evaluated by R. J. Gilliom and D. R. Helsel (this issue) using Monte Carlo simulations. To verify those findings, the same methods are now applied to actual water quality data. The best method (lowest root-mean-squared error (rmse)) over all parameters, sample sizes, and censoring levels is log probability regression (LR), the method found best in the Monte Carlo simulations. Best methods for estimating moment or percentile parameters separately are also identical to the simulations. Reliability of these estimates can be expressed as confidence intervals using rmse and bias values taken from the simulation results. Finally, a new simulation study shows that best methods for estimating uncensored sample statistics from censored data sets are identical to those for estimating population parameters. Thus this study and the companion study by Gilliom and Helsel form the basis for making the best possible estimates of either population parameters or sample statistics from censored water quality data, and for assessments of their reliability.
  • Linear rank statistics are described for testing for differences between groups when the data are interval-censored. The statistics are closely related to those described by Prentice for right-censored data. Problems in calculating the statistics are discussed and several approaches to computation including estimation of the efficient rank scores are described. Results from a small simulation study are presented. The methods are applied to data from a study relating tissue levels of PCBs to occupational exposure.
  • Reliability test plans are important for producing precise and accurate assessments of reliability characteristics. This paper explores different strategies for choosing between possible inspection plans for interval-censored data given a fixed testing timeframe and budget. A new general cost structure is proposed for guiding precise quantification of the total cost of an inspection test plan. Multiple summaries of reliability are considered and compared as the criteria for choosing the best plans using an easily adapted method. Different cost structures and representative true underlying reliability curves demonstrate how to assess different strategies given the logistical constraints and nature of the problem. Results show that several general patterns exist across a wide variety of scenarios. Given a fixed total cost, plans that inspect more units with less frequency at equally spaced time points are favored because of their ease of implementation and consistently good performance across a large number of case-study scenarios. Plans with inspection times chosen at equally spaced probabilities offer improved reliability estimates for the shape of the distribution, the mean lifetime, and the failure time of a small fraction of the population, but only for applications with high infant-mortality rates. The paper uses a Monte Carlo simulation-based approach in addition to the common evaluation based on asymptotic variance, and offers comparisons and recommendations for different applications with different objectives. Additionally, the paper outlines a variety of reliability metrics to use as criteria for optimization, presents a general method for evaluating different alternatives, and provides case-study results for common scenarios. (An inspection-plan sketch contrasting equally spaced times with equally spaced probabilities follows this list.)
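
Kruskal-Wallis sketch (referenced in the first item above). A minimal, hypothetical illustration of the half-detection-limit substitution followed by a Kruskal-Wallis test; the site data, detection limit, and function name are invented, and Dunn's pairwise follow-up comparisons are only noted in a comment.

```python
import numpy as np
from scipy.stats import kruskal

def substitute_half_dl(values, censored, detection_limit):
    """Replace below-detection-limit observations with half the detection limit."""
    vals = np.asarray(values, dtype=float).copy()
    vals[np.asarray(censored, dtype=bool)] = detection_limit / 2.0
    return vals

# hypothetical As concentrations (ng/m^3) at three sites, detection limit 0.5
dl = 0.5
site_a = substitute_half_dl([0.9, 1.4, 0.5, 0.7, 0.5], [False, False, True, False, True], dl)
site_b = substitute_half_dl([0.6, 0.5, 0.5, 0.8, 1.1], [False, True, True, False, False], dl)
site_c = substitute_half_dl([2.1, 1.8, 0.9, 1.3, 0.5], [False, False, False, False, True], dl)

h_stat, p_value = kruskal(site_a, site_b, site_c)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_value:.3f}")
# A significant result would normally be followed by Dunn's pairwise
# comparisons (not shown) to identify which sites differ.
```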
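
Log-probability regression sketch (referenced in the second item above). A deliberately simplified version of the log-probability regression (regression-on-order-statistics) idea: regress the logarithms of the uncensored observations on their normal scores, impute the censored values from the fitted line, and compute summary statistics from the completed sample. The plotting-position formula and the data are assumptions for illustration; the published method adjusts the plotting positions for the censoring pattern.

```python
import numpy as np
from scipy.stats import norm

def log_probability_regression(obs, censored):
    """Simplified log-probability regression for singly censored data.

    obs      : measured value for detected samples, detection limit for censored ones
    censored : True where the observation is below the detection limit
    Returns an estimated mean and standard deviation of the underlying distribution.
    """
    obs = np.asarray(obs, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    n = len(obs)

    order = np.argsort(obs)                                     # rank all observations
    z = norm.ppf((np.arange(1, n + 1) - 0.375) / (n + 0.25))    # Blom normal scores
    ranked_obs, ranked_cens = obs[order], censored[order]

    # least-squares fit of log(concentration) on z using uncensored points only
    slope, intercept = np.polyfit(z[~ranked_cens], np.log(ranked_obs[~ranked_cens]), 1)

    # impute censored values from the fitted lognormal line, then summarize
    completed = ranked_obs.copy()
    completed[ranked_cens] = np.exp(intercept + slope * z[ranked_cens])
    return completed.mean(), completed.std(ddof=1)

# hypothetical data: three values below a detection limit of 1.0
obs      = [1.0, 1.0, 1.0, 1.3, 1.8, 2.1, 3.4, 5.2]
censored = [True, True, True, False, False, False, False, False]
print(log_probability_regression(obs, censored))
```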
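
Inspection-plan sketch (referenced in the last item above). A small illustration of the contrast drawn in that abstract between inspection times placed at equally spaced calendar times and times placed at equally spaced failure probabilities under an assumed life distribution. The Weibull parameters, horizon, and number of inspections are hypothetical.

```python
import numpy as np

def equal_probability_times(n_inspections, shape, scale, horizon):
    """Inspection times at equally spaced failure probabilities for an assumed
    Weibull(shape, scale) life distribution, truncated at the test horizon."""
    p_end = 1.0 - np.exp(-(horizon / scale) ** shape)        # P(failure by horizon)
    probs = np.linspace(p_end / n_inspections, p_end, n_inspections)
    return scale * (-np.log(1.0 - probs)) ** (1.0 / shape)   # Weibull quantiles

def equal_spacing_times(n_inspections, horizon):
    """Equally spaced inspection times over the same test horizon."""
    return np.linspace(horizon / n_inspections, horizon, n_inspections)

# hypothetical plan: 5 inspections over 1000 hours, assumed Weibull(1.5, 800)
print(equal_probability_times(5, 1.5, 800.0, 1000.0).round(1))
print(equal_spacing_times(5, 1000.0))
```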