This Methods/Protocols article is intended for materials scientists interested in performing machine learning-centered research. Herein, we cover broad guidelines and best practices regarding the obtaining and treatment of data, feature engineering, model training, validation, evaluation and comparison, popular repositories for materials data and benchmarking data sets, model and architecture sharing, and finally publication. In addition, we include interactive Jupyter notebooks with example Python code to demonstrate some of the concepts, workflows, and best practices discussed. Overall, the data-driven methods and machine learning workflows and considerations are presented in a simple way, allowing interested readers to more intelligently guide their machine learning research using the suggested references, best practices, and their own materials domain expertise.
Wang, Anthony Yu-Tung, et al. "Machine Learning for Materials Scientists: An Introductory Guide toward Best Practices." Chemistry of Materials, vol. 32, no. 12, May 2020. https://doi.org/10.1021/acs.chemmater.0c01907
Wang, Anthony Yu-Tung, Murdock, Ryan J., Kauwe, Steven K., Oliynyk, Anton O., Gurlo, Aleksander, Brgoch, Jakoah, Persson, Kristin A., & Sparks, Taylor D. (2020). Machine Learning for Materials Scientists: An Introductory Guide toward Best Practices. Chemistry of Materials, 32(12). https://doi.org/10.1021/acs.chemmater.0c01907
Wang, Anthony Yu-Tung, Murdock, Ryan J., Kauwe, Steven K., et al., "Machine Learning for Materials Scientists: An Introductory Guide toward Best Practices," Chemistry of Materials 32, no. 12 (2020), https://doi.org/10.1021/acs.chemmater.0c01907
@article{osti_1766496,
author = {Wang, Anthony Yu-Tung and Murdock, Ryan J. and Kauwe, Steven K. and Oliynyk, Anton O. and Gurlo, Aleksander and Brgoch, Jakoah and Persson, Kristin A. and Sparks, Taylor D.},
title = {Machine Learning for Materials Scientists: An Introductory Guide toward Best Practices},
doi = {10.1021/acs.chemmater.0c01907},
url = {https://www.osti.gov/biblio/1766496},
journal = {Chemistry of Materials},
issn = {0897-4756},
number = {12},
volume = {32},
place = {United States},
publisher = {American Chemical Society (ACS)},
year = {2020},
month = {05}}
Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
National Science Foundation (NSF); USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE Office of Science (SC), Basic Energy Sciences (BES); Welch Foundation
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1766496
Journal Information:
Journal Name: Chemistry of Materials; Journal Volume: 32; Journal Issue: 12; ISSN 0897-4756