DOE PAGES: U.S. Department of Energy, Office of Scientific and Technical Information

Title: Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space

Abstract

Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. The same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.
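
The abstract names the Bag of Bonds representation without spelling it out. The following is a minimal Python sketch of one common way to build such a vector, under assumptions not stated in this record: each atom pair contributes the entry Z_i * Z_j / |R_i - R_j|, entries are grouped into "bags" by element pair, each bag is sorted in descending order and zero-padded to a fixed length, and the bags are concatenated. The function name `bag_of_bonds`, the `bag_sizes` argument, and the toy geometry are hypothetical and chosen for illustration; single-atom terms are omitted.

```python
import math
from collections import defaultdict
from itertools import combinations


def bag_of_bonds(charges, coords, bag_sizes):
    """Build a fixed-length pairwise feature vector for one molecule.

    charges   -- nuclear charges, one per atom
    coords    -- 3D positions, one tuple per atom
    bag_sizes -- hypothetical helper: maps each (Z_low, Z_high) element
                 pair to the padded length of its bag
    """
    bags = defaultdict(list)
    for (zi, ri), (zj, rj) in combinations(zip(charges, coords), 2):
        key = (min(zi, zj), max(zi, zj))
        # Assumed entry: Coulomb-like term for the atom pair.
        bags[key].append(zi * zj / math.dist(ri, rj))

    vector = []
    for key in sorted(bag_sizes):
        entries = sorted(bags.get(key, []), reverse=True)
        size = bag_sizes[key]
        if len(entries) > size:
            raise ValueError(f"bag {key} holds {len(entries)} entries, limit is {size}")
        # Zero-pad so every molecule yields a vector of the same length.
        vector.extend(entries + [0.0] * (size - len(entries)))
    return vector


# Toy, non-physical geometry: one O atom and two H atoms on the axes.
features = bag_of_bonds(
    charges=[8, 1, 1],
    coords=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    bag_sizes={(1, 1): 2, (1, 8): 3},
)
```

Because every bag is sorted and padded, molecules with different atom counts map to vectors of equal length that do not depend on atom ordering, which is what lets a standard regression model compare them.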

Authors:
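Hansen, Katja [1]; Biegler, Franziska [2]; Ramakrishnan, Raghunathan [3]; Pronobis, Wiktor [2]; von Lilienfeld, O. Anatole [4]; Müller, Klaus-Robert [5]; Tkatchenko, Alexandre [1]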
  1. Max-Planck-Gesellschaft, Berlin (Germany)
  2. Technical Univ. of Berlin, Berlin (Germany)
  3. Univ. of Basel, Basel (Switzerland)
  4. Univ. of Basel, Basel (Switzerland); Argonne National Lab. (ANL), Argonne, IL (United States)
  5. Technical Univ. of Berlin, Berlin (Germany); Korea Univ., Seoul (Korea)
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1221601
Grant/Contract Number:  
AC02-06CH11357; NSF PP00P2_138932
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Physical Chemistry Letters
Additional Journal Information:
Journal Volume: 6; Journal Issue: 12; Journal ID: ISSN 1948-7185
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
Subject:
74 ATOMIC AND MOLECULAR PHYSICS; chemical compound space; machine learning; atomization energies; molecular properties; many-body potentials

Citation Formats

Hansen, Katja, Biegler, Franziska, Ramakrishnan, Raghunathan, Pronobis, Wiktor, von Lilienfeld, O. Anatole, Müller, Klaus-Robert, and Tkatchenko, Alexandre. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space. United States: N. p., 2015. Web. doi:10.1021/acs.jpclett.5b00831.
Hansen, Katja, Biegler, Franziska, Ramakrishnan, Raghunathan, Pronobis, Wiktor, von Lilienfeld, O. Anatole, Müller, Klaus-Robert, & Tkatchenko, Alexandre. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space. United States. doi:10.1021/acs.jpclett.5b00831.
Hansen, Katja, Biegler, Franziska, Ramakrishnan, Raghunathan, Pronobis, Wiktor, von Lilienfeld, O. Anatole, Müller, Klaus-Robert, and Tkatchenko, Alexandre. Thu . "Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space". United States. doi:10.1021/acs.jpclett.5b00831. https://www.osti.gov/servlets/purl/1221601.
@article{osti_1221601,
title = {Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space},
author = {Hansen, Katja and Biegler, Franziska and Ramakrishnan, Raghunathan and Pronobis, Wiktor and von Lilienfeld, O. Anatole and Müller, Klaus-Robert and Tkatchenko, Alexandre},
abstractNote = {Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. The same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.},
doi = {10.1021/acs.jpclett.5b00831},
journal = {Journal of Physical Chemistry Letters},
number = 12,
volume = 6,
place = {United States},
year = {2015},
month = {6}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 49 works
Citation information provided by
Web of Science

Works referencing / citing this record:

Quantifying Chemical Structure and Machine-Learned Atomic Energies in Amorphous and Liquid Silicon
journal, April 2019

  • Bernstein, Noam; Bhattarai, Bishal; Csányi, Gábor
  • Angewandte Chemie International Edition, Vol. 58, Issue 21
  • DOI: 10.1002/anie.201902625

Band Gap Prediction for Large Organic Crystal Structures with Machine Learning
journal, July 2019

  • Olsthoorn, Bart; Geilhufe, R. Matthias; Borysov, Stanislav S.
  • Advanced Quantum Technologies, Vol. 2, Issue 7-8
  • DOI: 10.1002/qute.201900023

Representing molecular and materials data for unsupervised machine learning
journal, April 2018


Enumeration of de novo inorganic complexes for chemical discovery and machine learning
journal, January 2020

  • Gugler, Stefan; Janet, Jon Paul; Kulik, Heather J.
  • Molecular Systems Design & Engineering, Vol. 5, Issue 1
  • DOI: 10.1039/c9me00069k

Mapping and classifying molecules from a high-throughput structural database
journal, February 2017


Dataset’s chemical diversity limits the generalizability of machine learning predictions
journal, November 2019

  • Glavatskikh, Marta; Leguy, Jules; Hunault, Gilles
  • Journal of Cheminformatics, Vol. 11, Issue 1
  • DOI: 10.1186/s13321-019-0391-2
