Modeling of molecular atomization energies using machine learning
Abstract
Atomization energies are an important measure of chemical stability. Machine learning is used to model atomization energies of a diverse set of organic molecules, based on nuclear charges and atomic positions only. Our scheme maps the problem of solving the molecular time-independent Schrödinger equation onto a non-linear statistical regression problem. Kernel ridge regression models are trained on and compared to reference atomization energies computed using density functional theory (PBE0 approximation to Kohn-Sham level of theory). We use a diagonalized matrix representation of molecules based on the inter-nuclear Coulomb repulsion operator in conjunction with a Gaussian kernel. Validation on a set of over 7000 small organic molecules from the GDB database yields a mean absolute error of ~10 kcal/mol, while reducing computational effort by several orders of magnitude. Applicability is demonstrated for prediction of binding energy curves using augmentation samples based on physical limits.
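The pipeline described in the abstract (a diagonalized Coulomb-type matrix as the molecular descriptor, fed to kernel ridge regression with a Gaussian kernel) can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, and the diagonal term 0.5 Z^2.4 is an assumption taken from the Physical Review Letters paper listed under the works referenced rather than from the abstract itself. Positions are assumed to be in atomic units, and the kernel width `sigma` and regularization strength `lam` are free hyperparameters that would normally be chosen by cross-validation.

```python
import numpy as np


def coulomb_matrix(charges, positions):
    """Matrix with Z_i*Z_j/|R_i - R_j| off the diagonal.

    Assumption: the diagonal is 0.5*Z_i**2.4, as in the referenced PRL paper.
    """
    z = np.asarray(charges, dtype=float)
    r = np.asarray(positions, dtype=float)
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder, the diagonal is overwritten below
    m = np.outer(z, z) / dist
    np.fill_diagonal(m, 0.5 * z ** 2.4)
    return m


def eigen_descriptor(charges, positions, size):
    """Eigenvalues in descending order, zero-padded to a fixed length `size`."""
    eig = np.sort(np.linalg.eigvalsh(coulomb_matrix(charges, positions)))[::-1]
    out = np.zeros(size)
    out[: len(eig)] = eig
    return out


def gaussian_kernel(a, b, sigma):
    """k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)) for all row pairs of a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def krr_fit(x_train, y_train, sigma, lam):
    """Solve (K + lam*I) alpha = y for the regression coefficients alpha."""
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)


def krr_predict(x_train, alpha, x_new, sigma):
    """Predicted value is the kernel-weighted sum over training molecules."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha
```

Because the descriptor uses only the eigenvalue spectrum, it does not change when the atoms of a molecule are listed in a different order, and zero-padding lets molecules of different sizes share one input dimension. Training energies would be passed as `y_train`; none are supplied here, since the reference values in the record come from density functional theory calculations.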
- Authors:
- Rupp, Matthias; Tkatchenko, Alexandre; Müller, Klaus-Robert; von Lilienfeld, O. Anatole
- Technical Univ. of Berlin (Germany). Machine Learning Group
- Max-Planck Society, Berlin (Germany). Fritz-Haber-Inst.
- Argonne National Lab. (ANL), Argonne, IL (United States)
- Publication Date:
- 2012-05-01
- Research Org.:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1629374
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Cheminformatics
- Additional Journal Information:
- Journal Volume: 4; Journal Issue: S1; Conference: 7th German Conference on Chemoinformatics: 25 CIC-Workshop, Goslar (Germany), 6-8 Nov 2011; Journal ID: ISSN 1758-2946
- Publisher:
- Chemistry Central Ltd.
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; Density Functional Theory; Machine Learning; Organic Molecule; Gaussian Kernel; Energy Curve
Citation Formats
Rupp, Matthias, Tkatchenko, Alexandre, Müller, Klaus-Robert, and von Lilienfeld, O. Anatole. Modeling of molecular atomization energies using machine learning. United States: N. p., 2012. Web. doi:10.1186/1758-2946-4-S1-P33.
Rupp, Matthias, Tkatchenko, Alexandre, Müller, Klaus-Robert, & von Lilienfeld, O. Anatole (2012). Modeling of molecular atomization energies using machine learning. United States. https://doi.org/10.1186/1758-2946-4-S1-P33
Rupp, Matthias, Tkatchenko, Alexandre, Müller, Klaus-Robert, and von Lilienfeld, O. Anatole. 2012. "Modeling of molecular atomization energies using machine learning". United States. https://doi.org/10.1186/1758-2946-4-S1-P33. https://www.osti.gov/servlets/purl/1629374.
@article{osti_1629374,
title = {Modeling of molecular atomization energies using machine learning},
author = {Rupp, Matthias and Tkatchenko, Alexandre and Müller, Klaus-Robert and von Lilienfeld, O. Anatole},
abstractNote = {Atomization energies are an important measure of chemical stability. Machine learning is used to model atomization energies of a diverse set of organic molecules, based on nuclear charges and atomic positions only. Our scheme maps the problem of solving the molecular time-independent Schrödinger equation onto a non-linear statistical regression problem. Kernel ridge regression models are trained on and compared to reference atomization energies computed using density functional theory (PBE0 approximation to Kohn-Sham level of theory). We use a diagonalized matrix representation of molecules based on the inter-nuclear Coulomb repulsion operator in conjunction with a Gaussian kernel. Validation on a set of over 7000 small organic molecules from the GDB database yields mean absolute error of ~10 kcal/mol, while reducing computational effort by several orders of magnitude. Applicability is demonstrated for prediction of binding energy curves using augmentation samples based on physical limits.},
doi = {10.1186/1758-2946-4-S1-P33},
journal = {Journal of Cheminformatics},
number = {S1},
volume = {4},
place = {United States},
year = {2012},
month = {5}
}
Works referenced in this record:
Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning
journal, January 2012
- Rupp, Matthias; Tkatchenko, Alexandre; Müller, Klaus-Robert; von Lilienfeld, O. Anatole
- Physical Review Letters, Vol. 108, Issue 5
Rationale for mixing exact exchange with density functional approximations
journal, December 1996
- Perdew, John P.; Ernzerhof, Matthias; Burke, Kieron
- The Journal of Chemical Physics, Vol. 105, Issue 22, p. 9982-9985
The Elements of Statistical Learning
book, January 2009
- Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome
- Springer Series in Statistics
970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database GDB-13
journal, July 2009
- Blum, Lorenz C.; Reymond, Jean-Louis
- Journal of the American Chemical Society, Vol. 131, Issue 25
Inhomogeneous Electron Gas
journal, November 1964
- Hohenberg, P.; Kohn, W.
- Physical Review, Vol. 136, Issue 3B, p. B864-B871
Self-Consistent Equations Including Exchange and Correlation Effects
journal, November 1965
- Kohn, W.; Sham, L. J.
- Physical Review, Vol. 140, Issue 4A, p. A1133-A1138