Identifying Nuclear Data Correlated Through Predicting Bias in Integral Experiments via Applying Principal Component Analysis to Random Forest
Journal Article · Statistical Analysis and Data Mining
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); Univ. of Arizona, Tucson, AZ (United States)
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Nuclear data (ND) are the input data for neutron‐transport simulations used to answer questions related to nuclear technologies. Subsets of ND, here more than 20,000 data points, are validated against thousands of criticality experiments that represent various applications on a small scale. The aim of validation with these experiments is to find errors in ND or methods. The key challenge is that several hundred ND are used to simulate a single integral value. Hence, one cannot clearly identify which ND are leading to bias in criticality measurements. In fact, an error in one nuclear‐data observable can be compensated by an error in another, so that the simulated criticality value still agrees with experimental data. Random forest (RF) was previously employed to predict bias in criticality measurements using sensitivities of simulated criticality experiments to ND. The SHapley Additive exPlanations (SHAP) metric was then applied to attribute the importance of each ND experiment and observable to bias prediction. This, however, did not highlight which ND were jointly related to predicting bias. That information is important because it could indicate where compensating errors in ND could hide. We address this shortcoming by first decomposing the ND sensitivities of integral‐experiment simulations into principal components. We then use the principal component projections to predict bias via RF and SHAP. The SHAP values and principal components are employed to reconstruct detailed SHAP values for each ND observable. We demonstrate that these extended SHAP bias predictions are more robust, less noisy, and more efficient. In addition, we show that this approach accounts for covariance in ND sensitivities and automates the identification of where compensating errors could hide in ND.
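The workflow described in the abstract (decompose sensitivities into principal components, attribute bias to the components, then map the attributions back to individual ND observables) can be illustrated with a small numerical sketch. This is not the authors' implementation: the data are synthetic, all names (`S`, `bias`, `phi_pc`, `phi_nd`) are hypothetical, and a linear least-squares surrogate stands in for the RF so that the Shapley values have an exact closed form and no SHAP library is needed. The back-projection rule, which splits each component's attribution in proportion to each observable's contribution to that component's score, is one possible choice assumed here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a sensitivity matrix: rows are experiments,
# columns are ND observables, driven by a few latent factors (assumption).
n_exp, n_nd, n_pc = 200, 12, 3
latent = rng.normal(size=(n_exp, n_pc))
mixing = rng.normal(size=(n_pc, n_nd))
S = latent @ mixing + 0.05 * rng.normal(size=(n_exp, n_nd))

# Synthetic bias in which observables 0 and 1 act with opposite signs,
# mimicking a pair of compensating errors.
bias = S[:, 0] - S[:, 1] + 0.01 * rng.normal(size=n_exp)

# Step 1: principal components of the centered sensitivities via SVD.
mu = S.mean(axis=0)
_, _, Vt = np.linalg.svd(S - mu, full_matrices=False)
V = Vt[:n_pc]                 # loadings, shape (n_pc, n_nd)
Z = (S - mu) @ V.T            # projections (scores), shape (n_exp, n_pc)

# Step 2: predict bias from the projections. A linear surrogate replaces
# the RF here; for a linear model the Shapley value of component j is
# w_j * (z_j - mean(z_j)), and the scores already have zero mean.
w, *_ = np.linalg.lstsq(Z, bias - bias.mean(), rcond=None)
phi_pc = Z * w                # component-level attributions

# Step 3: reconstruct observable-level attributions. Each score is
# z_ij = sum_k V[j, k] * (S[i, k] - mu[k]); split phi_pc[i, j] across the
# observables k in proportion to their terms in that sum.
contrib = (S - mu)[:, None, :] * V[None, :, :]      # (n_exp, n_pc, n_nd)
denom = Z[:, :, None]
share = np.divide(contrib, denom, out=np.zeros_like(contrib),
                  where=np.abs(denom) > 1e-12)
phi_nd = np.einsum("ij,ijk->ik", phi_pc, share)     # (n_exp, n_nd)

# Rank observables by mean absolute attribution across experiments.
ranking = np.argsort(-np.abs(phi_nd).mean(axis=0))
print("ND observables ranked by mean |attribution|:", ranking)
```

Because the shares for each component sum to one, the observable-level attributions for an experiment add up to the same total as the component-level ones. With an RF in place of the linear surrogate, `phi_pc` would instead come from a tree-based SHAP explainer, and the same back-projection step could be applied unchanged.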
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE; USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- 89233218CNA000001
- OSTI ID:
- 2560960
- Alternate ID(s):
- OSTI ID: 2560961; OSTI ID: 2570776
- Report Number(s):
- LA-UR--23-26360; 1932-1872; 10.1002/sam.70014
- Journal Information:
- Statistical Analysis and Data Mining, Vol. 18, Issue 2; ISSN 1932-1864; ISSN 1932-1872
- Publisher:
- Wiley
- Country of Publication:
- United States
- Language:
- English
Similar Records
Application of Machine Learning Algorithms to Identify Problematic Nuclear Data
Technical Report · Tue Jan 19 23:00:00 EST 2021 · OSTI ID: 1906466