Title: Machine learning of molecular properties: Locality and active learning

Journal Article · Journal of Chemical Physics
DOI: https://doi.org/10.1063/1.5005095 · OSTI ID:1433745

In recent years, machine learning techniques have shown great potential in various problems from a multitude of disciplines, including materials design and drug discovery. Their high computational speed on the one hand and accuracy comparable to that of density functional theory on the other make machine learning algorithms efficient for high-throughput screening through chemical and configurational space. However, the machine learning algorithms available in the literature require large training datasets to reach chemical accuracy and also show large errors for the so-called outliers, the out-of-sample molecules that are not well represented in the training set. In the present paper, we propose a new machine learning algorithm for predicting molecular properties that addresses these two issues: it is based on a local model of interatomic interactions, which provides high accuracy when trained on relatively small training sets, and an active learning algorithm for optimally choosing the training set, which significantly reduces the errors for the outliers. We compare our model to other state-of-the-art algorithms from the literature on widely used benchmark tests.
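The two ideas in the abstract, a locally additive model and active selection of the training set, can be illustrated with a minimal sketch. The abstract does not specify the descriptors or the selection criterion, so everything below is an assumption made for illustration: the radial polynomial descriptor, the `cutoff` and `n_basis` parameters, the function names, and the greedy D-optimality rule are hypothetical stand-ins, not the authors' method.

```python
import numpy as np

def atom_descriptor(positions, i, cutoff=3.0, n_basis=4):
    """Hypothetical local descriptor of atom i: radial sums over neighbors within cutoff."""
    d = np.linalg.norm(positions - positions[i], axis=1)
    d = d[(d > 0) & (d < cutoff)]  # drop the atom itself and everything beyond the cutoff
    # Each basis function vanishes smoothly at the cutoff distance.
    return np.array([np.sum((cutoff - d) ** 2 * d ** k) for k in range(n_basis)])

def molecule_features(positions, cutoff=3.0, n_basis=4):
    """Locality assumption: a molecular property is a sum of per-atom contributions,
    so the molecule's feature vector is the sum of its atomic descriptors."""
    return sum(atom_descriptor(positions, i, cutoff, n_basis)
               for i in range(len(positions)))

def greedy_select(features, n_select, ridge=1e-6):
    """Assumed selection rule (greedy D-optimality): repeatedly pick the candidate x
    with the largest x^T A^{-1} x, where A = ridge * I + sum of chosen x x^T."""
    n, m = features.shape
    a_inv = np.eye(m) / ridge
    available = np.ones(n, dtype=bool)
    chosen = []
    for _ in range(n_select):
        scores = np.einsum("ij,jk,ik->i", features, a_inv, features)
        scores = np.where(available, scores, -np.inf)
        j = int(np.argmax(scores))
        chosen.append(j)
        available[j] = False
        # Sherman-Morrison rank-one update of A^{-1} after adding x x^T.
        ax = a_inv @ features[j]
        a_inv -= np.outer(ax, ax) / (1.0 + features[j] @ ax)
    return chosen

def fit_linear(features, targets):
    """Least-squares fit of a model that is linear in its parameters."""
    theta, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return theta
```

Under these assumptions, a candidate whose feature vector lies far outside the span of the already selected molecules receives a high score and is added to the training set early, which is one simple way to reduce errors on outliers. The selected rows and their reference values would then be passed to `fit_linear`.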

Sponsoring Organization: USDOE
OSTI ID: 1433745
Journal Information: Journal of Chemical Physics, Vol. 148, Issue 24; ISSN 0021-9606
Publisher: American Institute of Physics
Country of Publication: United States
Language: English
