U.S. Department of Energy
Office of Scientific and Technical Information

Identifying Nuclear Data Correlated Through Predicting Bias in Integral Experiments via Applying Principal Component Analysis to Random Forest

Journal Article · Statistical Analysis and Data Mining
DOI: https://doi.org/10.1002/sam.70014 · OSTI ID: 2560960
Nuclear data (ND) are the input data for neutron-transport simulations that answer questions related to nuclear technologies. Subsets of ND, here more than 20,000 data points, are validated against thousands of criticality experiments that represent various applications on a small scale. The aim of validation with these experiments is to find errors in ND or methods. The key challenge is that several hundred ND are used to simulate one integral value. Hence, one cannot clearly identify which ND lead to bias in criticality measurements. In fact, a mistake in one nuclear-data observable can be compensated by an error in another, and the simulated criticality value would still agree with experimental data. Random forest (RF) was previously employed to predict bias in criticality measurements using sensitivities of simulated criticality experiments to ND. The SHapley Additive exPlanations (SHAP) metric was then applied to attribute the importance of each ND experiment and observable to bias prediction. This, however, did not highlight which ND were jointly related to predicting bias. That information is important because it could indicate where compensating errors in ND could hide. We tackle this shortcoming here by first decomposing the sensitivities of integral-experiment simulations to ND into principal components. We then use the principal component projections to predict bias via the RF and SHAP. The SHAP values and principal components are employed to reconstruct detailed SHAP values for each ND observable. We demonstrate that these extended SHAP bias predictions are more robust, less noisy, and more efficient. In addition, we show that this approach accounts for covariance in ND sensitivities and automates the identification of where compensating errors could hide in ND.
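The workflow described in the abstract (decompose sensitivities into principal components, predict bias from the projections, attribute importance per component, then map the attributions back to individual ND observables) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the article: the data are synthetic, a linear least-squares model stands in for the random forest so that Shapley values have a closed form, and the back-projection rule is one plausible allocation, since the abstract does not state the reconstruction formula. The names S, bias, phi_pc, and phi_obs are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical): 200 experiments x 6 ND observables.
# Two hidden factors make the sensitivity columns correlated.
n_exp, n_obs, k = 200, 6, 2
latent = rng.normal(size=(n_exp, k))
mixing = rng.normal(size=(k, n_obs))
S = latent @ mixing + 0.05 * rng.normal(size=(n_exp, n_obs))  # sensitivities
bias = 0.8 * latent[:, 0] - 0.3 * latent[:, 1] + 0.01 * rng.normal(size=n_exp)

# 1) PCA via SVD of the column-centered sensitivity matrix.
Sc = S - S.mean(axis=0)
_, _, Vt = np.linalg.svd(Sc, full_matrices=False)
components = Vt[:k]        # (k, n_obs) loadings of each principal component
Z = Sc @ components.T      # (n_exp, k) principal component projections

# 2) Bias model on the projections. ASSUMPTION: linear least squares is used
#    here as a simple stand-in for the random forest named in the abstract.
A = np.column_stack([Z, np.ones(n_exp)])
coef, *_ = np.linalg.lstsq(A, bias, rcond=None)
w, b = coef[:k], coef[k]
pred = A @ coef

# 3) Shapley values per component. For a linear model with independent
#    features they have the closed form w_k * (z_k - mean(z_k)).
phi_pc = w * (Z - Z.mean(axis=0))          # (n_exp, k)

# 4) Back-project to observables. ASSUMPTION: because
#    z_k = sum_j Sc_j * components[k, j], each component's attribution is
#    split across observables in proportion to those terms.
phi_obs = Sc * (w @ components)            # (n_exp, n_obs)

# Rank observables by mean absolute reconstructed attribution.
ranking = np.argsort(-np.abs(phi_obs).mean(axis=0))
print("observables ranked by mean |attribution|:", ranking)
```

In this sketch the per-observable attributions of each experiment sum to the same total as the per-component attributions, so the additive property of the Shapley decomposition is preserved. Observables that load strongly on the same component receive linked attributions, which is the kind of joint relation the abstract associates with possible compensating errors. With an actual random forest, the closed form in step 3 would be replaced by a tree-based SHAP computation.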
Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE; USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
89233218CNA000001
OSTI ID:
2560960
Alternate ID(s):
OSTI ID: 2560961
OSTI ID: 2570776
Report Number(s):
LA-UR--23-26360; 1932-1872; 10.1002/sam.70014
Journal Information:
Journal Name: Statistical Analysis and Data Mining, Vol. 18, Issue 2; ISSN 1932-1864; ISSN 1932-1872
Publisher:
Wiley
Country of Publication:
United States
Language:
English
