Replicating Machine Learning Experiments in Materials Science
- Brookhaven National Lab. (BNL), Upton, NY (United States). Computational Science Center
Transparency and reproducibility are important aspects of validating Machine Learning (ML) models that are not fully understood, and this applies independently of the application domain. We offer a case study of reproducibility that highlights the challenges encountered when attempting to reproduce analyses obtained with ML methods in materials informatics. Our study explores prediction results obtained with ML models and issues in the training data that serve as their input. We discuss challenges related to theory-driven and numerical errors in training data, lack of reproducibility across platforms and versions, and effects of randomness when varying hyperparameters. Our results show that, in addition to model accuracy, a main metric of interest in the ML community, model sensitivity may be equally important for applying ML in domain applications such as materials science.
- Research Organization: Brookhaven National Lab. (BNL), Upton, NY (United States)
- Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
- DOE Contract Number: SC0012704
- OSTI ID: 1635098
- Report Number(s): BNL-216071-2020-BOOK
- Country of Publication: United States
- Language: English