DOE PAGES

Title: Structure classification and melting temperature prediction in octet AB solids via machine learning

Machine learning methods are being increasingly used in condensed matter physics and materials science to classify crystal structures and predict material properties. However, the reliability of these methods for a given problem, especially when large data sets are unavailable, has not been well studied. By addressing the tasks of classifying crystal structure and predicting melting temperatures of the octet subset of AB solids, we performed such a study and found potential problems with using machine learning methods on relatively small data sets. At the same time, however, we can reaffirm the potential power of such methods for these tasks. In particular, we uncovered an important new material feature, the excess Born effective charge, that significantly increased the accuracy of the predictions for the classification problem we defined. This discovery leads us to propose a new scale for the degree of ionicity and covalency in these solids. More specifically, we partitioned the crystal structures of a set of 75 octet solids into those that are ionically bonded and those that are covalently bonded and thus performed a binary classification task. We found that using the standard indices (rσ, rπ), suggested by St. John and Bloch several decades ago, enabled an average success in classification of 92%. Using just rσ and the excess Born effective charge ΔZA of the A atom enabled an average success of 97%, but we also found relatively large variations about these averages that were dependent on how certain machine learning methods were used and for which a standard deviation was not a proper measure of the degree of confidence we can place in either average. Instead, we calculated and report with 95% confidence that the traditional classification pair predicts an accuracy in the interval [89%, 95%] and the accuracy of the new pair lies in the interval [96%, 99%]. For melting temperature predictions, the size of our data set was 46. We estimate the root-mean-square error of our resulting model to be 11% of the mean melting temperature of the data, but we note that if the accuracy of this predicted error is itself measured, our estimated fitting error itself has a root-mean-square error of 50%. In short, what we illustrate is that classification and regression predictions can vary significantly, depending on the details of how machine learning methods are applied to small data sets. This variation makes it important, if not essential, to average the predictions and compute confidence intervals about these averages to report results meaningfully. However, when properly used, these statistical methods can advance our understanding and improve predictions of material properties even for small data sets.
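The abstract's recommendation to average predictions and report an interval rather than a single score can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic two-dimensional points (not the octet AB data set), the nearest-centroid classifier is a stand-in (the record does not say which learning methods were used), and the helper names `make_synthetic_data`, `split_accuracy`, `repeated_accuracies`, and `percentile_interval` are hypothetical. The interval computed here is an empirical percentile interval of per-split accuracies, which may differ from the procedure the authors used.

```python
import random
import statistics


def make_synthetic_data(n=75, seed=0):
    """Two overlapping 2-D Gaussian clusters (hypothetical stand-in data)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are well represented
        cx, cy = (0.0, 0.0) if label == 0 else (2.0, 2.0)
        data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data


def split_accuracy(data, rng, test_fraction=0.2):
    """Accuracy of a nearest-centroid classifier on one random train/test split."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]

    # Fit: one centroid per class, computed from the training points only.
    centroids = {}
    for label in (0, 1):
        pts = [p for p, l in train if l == label]
        centroids[label] = (
            statistics.fmean([x for x, _ in pts]),
            statistics.fmean([y for _, y in pts]),
        )

    # Predict: assign each test point to the class with the closest centroid.
    correct = 0
    for (x, y), label in test:
        predicted = min(
            centroids,
            key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
        )
        correct += predicted == label
    return correct / len(test)


def repeated_accuracies(data, n_repeats=200, seed=1):
    """Repeat the random split many times to expose split-to-split variation."""
    rng = random.Random(seed)
    return [split_accuracy(data, rng) for _ in range(n_repeats)]


def percentile_interval(values, confidence=0.95):
    """Empirical percentile interval; makes no normality assumption."""
    ordered = sorted(values)
    tail = (1.0 - confidence) / 2.0
    lo_idx = int(tail * len(ordered))
    hi_idx = min(len(ordered) - 1, int((1.0 - tail) * len(ordered)))
    return ordered[lo_idx], ordered[hi_idx]


if __name__ == "__main__":
    accs = repeated_accuracies(make_synthetic_data())
    lo, hi = percentile_interval(accs)
    print(f"mean accuracy: {statistics.fmean(accs):.3f}")
    print(f"95% percentile interval: [{lo:.3f}, {hi:.3f}]")
```

With only 15 test points per split, each misclassification shifts the accuracy by almost 7 percentage points, so the spread across splits can be wide even when the mean looks stable. Because accuracy is bounded above by 1, that spread also tends to be skewed. The sketch therefore reports percentiles rather than a mean plus or minus a standard deviation.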
Authors:
Pilania, Ghanshyam [1]; Gubernatis, James E. [1]; Lookman, Turab [1]
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
Report Number(s):
LA-UR-14-28547
Journal ID: ISSN 1098-0121; PRBMDO
Grant/Contract Number:
AC52-06NA25396
Type:
Accepted Manuscript
Journal Name:
Physical Review. B, Condensed Matter and Materials Physics
Additional Journal Information:
Journal Volume: 91; Journal Issue: 21; Journal ID: ISSN 1098-0121
Publisher:
American Physical Society (APS)
Research Org:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org:
USDOE Laboratory Directed Research and Development (LDRD) Program
Country of Publication:
United States
Language:
English
Subject:
36 MATERIALS SCIENCE; 97 MATHEMATICS AND COMPUTING; Information Science; Material Science
OSTI Identifier:
1469521
Alternate Identifier(s):
OSTI ID: 1184701

Pilania, Ghanshyam, Gubernatis, James E., and Lookman, Turab. Structure classification and melting temperature prediction in octet AB solids via machine learning. United States: N. p., 2015. Web. doi:10.1103/PhysRevB.91.214302.
Pilania, Ghanshyam, Gubernatis, James E., & Lookman, Turab. (2015). Structure classification and melting temperature prediction in octet AB solids via machine learning. United States. doi:10.1103/PhysRevB.91.214302.
Pilania, Ghanshyam, Gubernatis, James E., and Lookman, Turab. 2015. "Structure classification and melting temperature prediction in octet AB solids via machine learning". United States. doi:10.1103/PhysRevB.91.214302. https://www.osti.gov/servlets/purl/1469521.
@article{osti_1469521,
title = {Structure classification and melting temperature prediction in octet AB solids via machine learning},
author = {Pilania, Ghanshyam and Gubernatis, James E. and Lookman, Turab},
abstractNote = {Machine learning methods are being increasingly used in condensed matter physics and materials science to classify crystal structures and predict material properties. However, the reliability of these methods for a given problem, especially when large data sets are unavailable, has not been well studied. By addressing the tasks of classifying crystal structure and predicting melting temperatures of the octet subset of AB solids, we performed such a study and found potential problems with using machine learning methods on relatively small data sets. At the same time, however, we can reaffirm the potential power of such methods for these tasks. In particular, we uncovered an important new material feature, the excess Born effective charge, that significantly increased the accuracy of the predictions for the classification problem we defined. This discovery leads us to propose a new scale for the degree of ionicity and covalency in these solids. More specifically, we partitioned the crystal structures of a set of 75 octet solids into those that are ionically bonded and those that are covalently bonded and thus performed a binary classification task. We found that using the standard indices (rσ, rπ), suggested by St. John and Bloch several decades ago, enabled an average success in classification of 92%. Using just rσ and the excess Born effective charge ΔZA of the A atom enabled an average success of 97%, but we also found relatively large variations about these averages that were dependent on how certain machine learning methods were used and for which a standard deviation was not a proper measure of the degree of confidence we can place in either average. Instead, we calculated and report with 95% confidence that the traditional classification pair predicts an accuracy in the interval [89%, 95%] and the accuracy of the new pair lies in the interval [96%, 99%]. For melting temperature predictions, the size of our data set was 46. We estimate the root-mean-square error of our resulting model to be 11% of the mean melting temperature of the data, but we note that if the accuracy of this predicted error is itself measured, our estimated fitting error itself has a root-mean-square error of 50%. In short, what we illustrate is that classification and regression predictions can vary significantly, depending on the details of how machine learning methods are applied to small data sets. This variation makes it important, if not essential, to average the predictions and compute confidence intervals about these averages to report results meaningfully. However, when properly used, these statistical methods can advance our understanding and improve predictions of material properties even for small data sets.},
doi = {10.1103/PhysRevB.91.214302},
journal = {Physical Review. B, Condensed Matter and Materials Physics},
number = 21,
volume = 91,
place = {United States},
year = {2015},
month = {6}
}