OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Modeling of molecular atomization energies using machine learning

Journal Article · Journal of Cheminformatics
 [1];  [2];  [1];  [3]
  1. Technical Univ. of Berlin (Germany). Machine Learning Group
  2. Max-Planck Society, Berlin (Germany). Fritz-Haber-Inst.
  3. Argonne National Lab. (ANL), Argonne, IL (United States)

Atomization energies are an important measure of chemical stability. Machine learning is used to model atomization energies of a diverse set of organic molecules, based on nuclear charges and atomic positions only. Our scheme maps the problem of solving the molecular time-independent Schrödinger equation onto a non-linear statistical regression problem. Kernel ridge regression models are trained on and compared to reference atomization energies computed using density functional theory (PBE0 approximation to Kohn-Sham level of theory). We use a diagonalized matrix representation of molecules based on the inter-nuclear Coulomb repulsion operator in conjunction with a Gaussian kernel. Validation on a set of over 7000 small organic molecules from the GDB database yields a mean absolute error of ~10 kcal/mol, while reducing computational effort by several orders of magnitude. Applicability is demonstrated for prediction of binding energy curves using augmentation samples based on physical limits.
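
The pipeline described in the abstract (a diagonalized Coulomb-repulsion matrix fed to kernel ridge regression with a Gaussian kernel) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions not stated in the abstract: the diagonal term 0.5 Z^2.4 (a commonly used Coulomb-matrix convention), sorting eigenvalues by descending magnitude, zero-padding to a fixed length, and all function names and hyperparameters (`sigma`, `lam`). The toy target values are synthetic placeholders, not atomization energies.

```python
import numpy as np

def coulomb_eigenvalues(charges, positions, size):
    """Eigenvalue spectrum of a Coulomb-style matrix, zero-padded to `size`."""
    z = np.asarray(charges, dtype=float)
    r = np.asarray(positions, dtype=float)
    n = len(z)
    # Pairwise inter-nuclear distances
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)          # placeholder, avoids division by zero
    m = np.outer(z, z) / dist            # off-diagonal: Z_i * Z_j / |R_i - R_j|
    np.fill_diagonal(m, 0.5 * z ** 2.4)  # ASSUMED diagonal convention
    eig = np.linalg.eigvalsh(m)          # m is symmetric
    eig = eig[np.argsort(-np.abs(eig))]  # ASSUMED ordering: descending magnitude
    out = np.zeros(size)
    out[:n] = eig
    return out

def gaussian_kernel(a, b, sigma):
    """K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma ** 2))

def krr_fit(x, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression coefficients."""
    k = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new, sigma):
    """Prediction is a kernel-weighted sum over the training molecules."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

# Toy usage: a two-atom system at several separations with a SYNTHETIC target.
inv_d = np.array([0.4, 0.8, 1.2, 1.6, 2.0])  # inverse separations (arbitrary units)
X = np.array([coulomb_eigenvalues([1, 1], [[0, 0, 0], [0, 0, 1 / u]], size=3)
              for u in inv_d])
y = -inv_d                                   # placeholder target, not real energies
alpha = krr_fit(X, y, sigma=0.5, lam=1e-6)
y_fit = krr_predict(X, alpha, X, sigma=0.5)
```

Because the descriptor is built from eigenvalues of a matrix that depends only on nuclear charges and inter-nuclear distances, it is unchanged by translating or rotating the molecule or by reordering its atoms, which is why it can serve as a fixed-length input to the kernel. In practice `sigma` and `lam` would be chosen by cross-validation on the reference energies.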

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
OSTI ID:
1629374
Journal Information:
Journal of Cheminformatics, Vol. 4, Issue S1; Conference: 7th German Conference on Chemoinformatics: 25 CIC-Workshop, Goslar (Germany), 6-8 Nov 2011; ISSN 1758-2946
Publisher:
Chemistry Central Ltd.
Country of Publication:
United States
Language:
English