Bayesian Design of Experiments for Logistic Regression to Evaluate Multiple Forensic Algorithms
Abstract
When evaluating the performance of several forensic classification algorithms, it is desirable to construct a design that considers a variety of performance levels for each of the algorithms. We describe a strategy to use Bayesian design of experiments with multiple prior estimates to capture anticipated performance. Our goal is to characterize results from the different classification algorithms as a function of multiple explanatory variables and use this to choose a design about which units to test. Bayesian design of experiments has been successful for generalized linear models, including logistic regression models. Here, we develop methodology for the case where there are several potentially nonoverlapping priors for anticipated performance under consideration. The weighted priors method performs well for a broad range of true underlying model parameter choices and is more robust when compared to other candidate design choices. Additionally, we show how this can be applied in the multivariate input case and provide some useful summary measures. The shared information plot is used to evaluate design point allocation, and the D-value difference plot allows for the comparison of design performance across multiple potential parameter values in higher dimensions. Here, we illustrate the methods with several examples.
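The weighted priors approach described in the abstract can be sketched in code: for each candidate design, average the log-determinant of the expected Fisher information of the logistic model over draws from each prior, then combine those Bayesian D-criteria with weights across the priors. This is a minimal illustration, not the authors' implementation; the prior means, weights, and candidate designs below are assumptions chosen for the example.

```python
# Sketch of a Bayesian D-criterion for a logistic regression design,
# combined across several (possibly nonoverlapping) priors via weights.
import numpy as np

rng = np.random.default_rng(0)

def fisher_info(X, beta):
    """Fisher information of a logistic model at parameter beta.
    X: (n, p) model matrix; beta: (p,) coefficient vector."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)                      # Bernoulli response variance
    return (X * w[:, None]).T @ X

def bayes_d_criterion(X, prior_draws):
    """Average log-determinant of the information over prior draws."""
    return np.mean([np.linalg.slogdet(fisher_info(X, b))[1]
                    for b in prior_draws])

def weighted_priors_criterion(X, priors, weights):
    """Weighted combination of the Bayesian D-criterion across priors."""
    return sum(w * bayes_d_criterion(X, d) for w, d in zip(weights, priors))

# Two hypothetical expert priors for (intercept, slope).
draws1 = rng.normal([0.0, 2.0], 0.3, size=(200, 2))
draws2 = rng.normal([1.0, 0.5], 0.3, size=(200, 2))

def design(xs):
    """Model matrix (intercept column + x) for runs at points xs."""
    xs = np.asarray(xs, float)
    return np.column_stack([np.ones_like(xs), xs])

# Two candidate 6-run designs on x in [-1, 1]; the better design has
# the larger weighted criterion value.
cand_a = design([-1, -1, 0, 0, 1, 1])
cand_b = design([-1, -0.5, -0.2, 0.2, 0.5, 1])
for name, X in [("a", cand_a), ("b", cand_b)]:
    print(name, round(weighted_priors_criterion(X, [draws1, draws2],
                                                [0.5, 0.5]), 3))
```

In practice the paper pairs a criterion like this with a search over candidate designs (e.g., a point-exchange or coordinate-exchange algorithm) rather than comparing two fixed designs by hand.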
- Authors:
- Quinlan, Kevin R.; Anderson-Cook, Christine Michaela
- Pennsylvania State Univ., University Park, PA (United States). Dept. of Statistics
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Publication Date:
- July 2018
- Research Org.:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Org.:
- USDOE National Nuclear Security Administration (NNSA)
- OSTI Identifier:
- 1481136
- Report Number(s):
- LA-UR-17-30911; Journal ID: ISSN 1524-1904
- Grant/Contract Number:
- AC52-06NA25396
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Applied Stochastic Models in Business and Industry
- Additional Journal Information:
- Journal Volume: 34; Journal Issue: 6; Journal ID: ISSN 1524-1904
- Publisher:
- Wiley
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; Mathematics; Bayesian D‐optimal designs; design of experiments; D‐value difference plot; forensic statistics; shared information plot
Citation Formats
Quinlan, Kevin R., and Anderson-Cook, Christine Michaela. Bayesian Design of Experiments for Logistic Regression to Evaluate Multiple Forensic Algorithms. United States: N. p., 2018. Web. doi:10.1002/asmb.2359. https://www.osti.gov/servlets/purl/1481136.
@article{osti_1481136,
title = {Bayesian Design of Experiments for Logistic Regression to Evaluate Multiple Forensic Algorithms},
author = {Quinlan, Kevin R. and Anderson-Cook, Christine Michaela},
abstractNote = {When evaluating the performance of several forensic classification algorithms, it is desirable to construct a design that considers a variety of performance levels for each of the algorithms. We describe a strategy to use Bayesian design of experiments with multiple prior estimates to capture anticipated performance. Our goal is to characterize results from the different classification algorithms as a function of multiple explanatory variables and use this to choose a design about which units to test. Bayesian design of experiments has been successful for generalized linear models, including logistic regression models. Here, we develop methodology for the case where there are several potentially nonoverlapping priors for anticipated performance under consideration. The weighted priors method performs well for a broad range of true underlying model parameter choices and is more robust when compared to other candidate design choices. Additionally, we show how this can be applied in the multivariate input case and provide some useful summary measures. The shared information plot is used to evaluate design point allocation, and the D-value difference plot allows for the comparison of design performance across multiple potential parameter values in higher dimensions. Here, we illustrate the methods with several examples.},
doi = {10.1002/asmb.2359},
journal = {Applied Stochastic Models in Business and Industry},
number = 6,
volume = 34,
place = {United States},
year = {2018},
month = {7}
}
Works referenced in this record:
Selecting an Informative/Discriminating Multivariate Response for Inverse Prediction
journal, July 2017
- Thomas, Edward V.; Lewis, John R.; Anderson-Cook, Christine M.
- Journal of Quality Technology, Vol. 49, Issue 3
Optimal Bayesian design applied to logistic regression experiments
journal, February 1989
- Chaloner, Kathryn; Larntz, Kinley
- Journal of Statistical Planning and Inference, Vol. 21, Issue 2
The Coordinate-Exchange Algorithm for Constructing Exact Optimal Experimental Designs
journal, February 1995
- Meyer, Ruth K.; Nachtsheim, Christopher J.
- Technometrics, Vol. 37, Issue 1
Fingerprint-Based Recognition
journal, August 2007
- Dass, Sarat C.; Jain, Anil K.
- Technometrics, Vol. 49, Issue 3
Experimental Design for Binary Data
journal, March 1983
- Abdelbasit, K. M.; Plackett, R. L.
- Journal of the American Statistical Association, Vol. 78, Issue 381
Design of experiments and data analysis challenges in calibration for forensics applications
journal, December 2015
- Anderson-Cook, Christine; Burr, Tom; Hamada, Michael S.
- Chemometrics and Intelligent Laboratory Systems, Vol. 149
Quantifying the weight of evidence from a forensic fingerprint comparison: a new paradigm
journal, March 2012
- Neumann, C.; Evett, I. W.; Skerrett, J.
- Journal of the Royal Statistical Society: Series A (Statistics in Society), Vol. 175, Issue 2
Monte Carlo and quasi-Monte Carlo methods
journal, January 1998
- Caflisch, Russel E.
- Acta Numerica, Vol. 7
Optimum Designs in Regression Problems
journal, June 1959
- Kiefer, J.; Wolfowitz, J.
- The Annals of Mathematical Statistics, Vol. 30, Issue 2
The weighted priors approach for combining expert opinions in logistic regression experiments
journal, April 2017
- Quinlan, Kevin R.; Anderson-Cook, Christine M.; Myers, Kary L.
- Quality Engineering, Vol. 29, Issue 3
Works referencing / citing this record:
How to Host An Effective Data Competition: Statistical Advice for Competition Design and Analysis
journal, February 2019
- Anderson‐Cook, Christine M.; Myers, Kary L.; Lu, Lu
- Statistical Analysis and Data Mining: The ASA Data Science Journal, Vol. 12, Issue 4