Identifying Nuclear Data Correlated Through Predicting Bias in Integral Experiments via Applying Principal Component Analysis to Random Forest

Bell, Brian; Neudecker, Denise; Grosskopf, Michael; Hutchinson, Jesson

doi:10.1002/sam.70014

Journal Article · April 21, 2025 · Statistical Analysis and Data Mining

DOI: https://doi.org/10.1002/sam.70014 · OSTI ID: 2560960

Affiliations (in author order): [1]; [2]; [2]; [2]

[1] Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); Univ. of Arizona, Tucson, AZ (United States)
[2] Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)

Nuclear data (ND) are the input data for neutron-transport simulations that answer questions related to nuclear technologies. Subsets of ND, here more than 20,000 data points, are validated against thousands of criticality experiments that represent various applications on a small scale. The aim of validation with these experiments is to find errors in ND or methods. The key challenge is that several hundred ND are used to simulate a single integral value, so one cannot clearly identify which ND lead to bias in criticality measurements. In fact, an error in one nuclear-data observable can be compensated by an error in another, and the simulated criticality value would still agree with experimental data. Random forest (RF) was previously employed to predict bias in criticality measurements using the sensitivities of simulated criticality experiments to ND. The SHapley Additive exPlanations (SHAP) metric was then applied to attribute the importance of each ND experiment and observable to bias prediction. This, however, did not highlight which ND were jointly related to predicting bias. That information is important because it could indicate where compensating errors in ND could hide. We address this shortcoming by first decomposing the ND sensitivities of integral-experiment simulations into principal components. We then use the principal component projections to predict bias via the RF and SHAP. The SHAP values and principal components are combined to reconstruct detailed SHAP values for each ND observable. We demonstrate that these extended SHAP bias predictions are more robust, less noisy, and more efficient. In addition, we show that this approach accounts for covariance in ND sensitivities and automates the identification of where compensating errors could hide in ND.
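The pipeline described in the abstract (project sensitivities onto principal components, predict bias from the projections, attribute the prediction to components, then map the attributions back to individual ND observables) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the data are synthetic, an ordinary least-squares fit stands in for the random forest, closed-form Shapley values of a linear model stand in for SHAP on trees, and the back-projection rule (splitting each component's attribution by each feature's share of the component score) is only one plausible choice, not necessarily the authors' reconstruction. The names `pca_fit` and `attribute_via_pcs` are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature means and the leading principal axes (as rows)."""
    mean = X.mean(axis=0)
    # SVD of the centered matrix; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def attribute_via_pcs(X, y, n_components):
    """Attribute a prediction of y to PCs, then back-project to features."""
    mean, axes = pca_fit(X, n_components)
    Xc = X - mean
    Z = Xc @ axes.T  # PC scores, shape (n_samples, k); columns have zero mean

    # Stand-in model (assumption): least squares on the PC scores.
    A = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    intercept, beta = coef[0], coef[1:]

    # For a linear model with independent inputs, the Shapley value of
    # component k for sample i is beta_k * (z_ik - mean(z_k)); the mean is 0.
    phi_pc = Z * beta

    # Back-projection (assumption): feature j contributes Xc[i, j] * axes[k, j]
    # to score z_ik, so it inherits that share of component k's attribution.
    # For the linear stand-in this collapses to the closed form below.
    phi_feat = Xc * (axes.T @ beta)
    return intercept, phi_pc, phi_feat, axes

# Synthetic stand-in for a sensitivity matrix: 200 "experiments", 30 "ND features".
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
w = np.zeros(30)
w[:3] = [0.8, -0.5, 0.3]  # hypothetical features that drive the "bias"
bias = X @ w + 0.01 * rng.normal(size=200)

intercept, phi_pc, phi_feat, axes = attribute_via_pcs(X, bias, n_components=10)
print(phi_pc.shape, phi_feat.shape)
```

In this sketch the per-feature attributions of each sample sum to the same total as its per-component attributions, so the back-projection redistributes importance without creating or losing any. Features that load on the same component receive attributions together, which is the sense in which a component-based view can point to jointly relevant ND.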

Research Organization: Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)

Sponsoring Organization: USDOE; USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA)

Grant/Contract Number: 89233218CNA000001

OSTI ID: 2560960

Alternate ID(s): OSTI ID: 2560961; OSTI ID: 2570776

Report Number(s): LA-UR--23-26360; 1932-1872; 10.1002/sam.70014

Journal Information: Statistical Analysis and Data Mining, Vol. 18, Issue 2; ISSN 1932-1864; ISSN 1932-1872

Publisher: Wiley

Country of Publication: United States

Language: English

Related Subjects

73 NUCLEAR PHYSICS AND RADIATION PHYSICS
compensating errors
nuclear data validation
principal component analysis
random forest
