Advances in machine learning (ML) have enabled the development of interatomic potentials that promise the accuracy of first principles methods and the low-cost, parallel efficiency of empirical potentials. However, ML-based potentials struggle to achieve transferability, i.e., provide consistent accuracy across configurations that differ from those used during training. In order to realize the promise of ML-based potentials, systematic and scalable approaches to generate diverse training sets need to be developed. This work creates a diverse training set for tungsten in an automated manner using an entropy optimization approach. Subsequently, multiple polynomial and neural network potentials are trained on the entropy-optimized dataset. A corresponding set of potentials are trained on an expert-curated dataset for tungsten for comparison. The models trained to the entropy-optimized data exhibited superior transferability compared to the expert-curated models. Furthermore, the models trained to the expert-curated set exhibited a significant decrease in performance when evaluated on out-of-sample configurations.
Montes de Oca Zapiain, David, et al. "Training data selection for accuracy and transferability of interatomic potentials." npj Computational Materials, vol. 8, no. 1, Sep. 2022. https://doi.org/10.1038/s41524-022-00872-x
Montes de Oca Zapiain, David, Wood, Mitchell A., Lubbers, Nicholas, Pereyra, Carlos Z., Thompson, Aidan P., & Perez, Danny (2022). Training data selection for accuracy and transferability of interatomic potentials. npj Computational Materials, 8(1). https://doi.org/10.1038/s41524-022-00872-x
Montes de Oca Zapiain, David, Wood, Mitchell A., Lubbers, Nicholas, et al., "Training data selection for accuracy and transferability of interatomic potentials," npj Computational Materials 8, no. 1 (2022), https://doi.org/10.1038/s41524-022-00872-x
@article{osti_1885072,
author = {Montes de Oca Zapiain, David and Wood, Mitchell A. and Lubbers, Nicholas and Pereyra, Carlos Z. and Thompson, Aidan P. and Perez, Danny},
title = {Training data selection for accuracy and transferability of interatomic potentials},
annote = {Advances in machine learning (ML) have enabled the development of interatomic potentials that promise the accuracy of first principles methods and the low-cost, parallel efficiency of empirical potentials. However, ML-based potentials struggle to achieve transferability, i.e., provide consistent accuracy across configurations that differ from those used during training. In order to realize the promise of ML-based potentials, systematic and scalable approaches to generate diverse training sets need to be developed. This work creates a diverse training set for tungsten in an automated manner using an entropy optimization approach. Subsequently, multiple polynomial and neural network potentials are trained on the entropy-optimized dataset. A corresponding set of potentials are trained on an expert-curated dataset for tungsten for comparison. The models trained to the entropy-optimized data exhibited superior transferability compared to the expert-curated models. Furthermore, the models trained to the expert-curated set exhibited a significant decrease in performance when evaluated on out-of-sample configurations.},
doi = {10.1038/s41524-022-00872-x},
url = {https://www.osti.gov/biblio/1885072},
journal = {npj Computational Materials},
issn = {2057-3960},
number = {1},
volume = {8},
place = {United Kingdom},
publisher = {Nature Publishing Group},
year = {2022},
month = {09}}