DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Machine learning of parameters for accurate semiempirical quantum chemical calculations

Abstract

We investigate possible improvements in the accuracy of semiempirical quantum chemistry (SQC) methods through the use of machine learning (ML) models for the parameters. For a given class of compounds, ML techniques require sufficiently large training sets to develop ML models that can be used for adapting SQC parameters to reflect changes in molecular composition and geometry. The ML-SQC approach allows the automatic tuning of SQC parameters for individual molecules, thereby improving the accuracy without deteriorating transferability to molecules with molecular descriptors very different from those in the training set. The performance of this approach is demonstrated for the semiempirical OM2 method using a set of 6095 constitutional isomers C7H10O2, for which accurate ab initio atomization enthalpies are available. The ML-OM2 results show improved average accuracy and a much reduced error range compared with those of standard OM2 results, with mean absolute errors in atomization enthalpies dropping from 6.3 to 1.7 kcal/mol. They are also found to be superior to the results from specific OM2 reparameterizations (rOM2) for the same set of isomers. The ML-SQC approach thus holds promise for fast and reasonably accurate high-throughput screening of materials and molecules.
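
The idea in the abstract, letting an ML model adapt a semiempirical parameter to each molecule instead of using one fixed value, can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the one-parameter model `toy_sqc`, the `reference` function, the scalar descriptor `x`, and the use of Gaussian-kernel ridge regression as the regressor (the abstract does not name the ML model). It is not OM2, and its numbers are unrelated to the errors reported above.

```python
import math

P0 = 1.0  # hypothetical fixed "standard" parameter value


def toy_sqc(x, p):
    """Hypothetical stand-in for a semiempirical calculation with one parameter p."""
    return p * x


def reference(x):
    """Hypothetical stand-in for the accurate reference value of molecule x."""
    return (1.0 + 0.3 * math.sin(x)) * x


def kernel(a, b, sigma=0.5):
    """Gaussian kernel between two scalar descriptors."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_krr(xs, ys, lam=1e-4):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    K = [[kernel(a, b) + (lam if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    return solve(K, ys)


def predict(x, xs, alpha):
    """Predicted parameter correction for descriptor x."""
    return sum(a * kernel(x, xi) for a, xi in zip(alpha, xs))


def mean_abs_error(xs, param_fn):
    """Mean absolute error of the toy model against the reference."""
    return sum(abs(toy_sqc(x, param_fn(x)) - reference(x)) for x in xs) / len(xs)


# Training set: for each molecule, the parameter that reproduces the reference
# exactly, stored as a correction relative to the fixed value P0.
train_x = [1.0 + 4.0 * i / 29 for i in range(30)]
train_dp = [reference(x) / x - P0 for x in train_x]
alpha = fit_krr(train_x, train_dp)

# Test molecules lie inside the descriptor range covered by the training set.
test_x = [1.05 + 3.9 * i / 49 for i in range(50)]
mae_fixed = mean_abs_error(test_x, lambda x: P0)
mae_ml = mean_abs_error(test_x, lambda x: P0 + predict(x, train_x, alpha))

print("MAE with fixed parameter:     ", mae_fixed)
print("MAE with ML-adapted parameter:", mae_ml)
```

In this sketch the regressor learns a per-molecule parameter correction rather than the target property itself, so the toy model is still evaluated for every prediction. The test descriptors are deliberately kept inside the training range; the sketch says nothing about behaviour outside it.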

Authors:
 Dral, Pavlo O. [1]; von Lilienfeld, O. Anatole [2]; Thiel, Walter [3]
  1. Max-Planck-Institut für Kohlenforschung, Mülheim an der Ruhr (Germany)
  2. Univ. of Basel, Basel (Switzerland); Argonne National Lab. (ANL), Argonne, IL (United States)
  3. Max-Planck-Institut für Kohlenforschung, Mülheim an der Ruhr (Germany)
Publication Date:
2015-04-14
Research Org.:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1214088
Grant/Contract Number:  
AC02-06CH11357
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Chemical Theory and Computation
Additional Journal Information:
Journal Volume: 11; Journal Issue: 5; Journal ID: ISSN 1549-9618
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
Subject:
37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY

Citation Formats

Dral, Pavlo O., von Lilienfeld, O. Anatole, and Thiel, Walter. Machine learning of parameters for accurate semiempirical quantum chemical calculations. United States: N. p., 2015. Web. doi:10.1021/acs.jctc.5b00141.
Dral, Pavlo O., von Lilienfeld, O. Anatole, & Thiel, Walter. (2015). Machine learning of parameters for accurate semiempirical quantum chemical calculations. United States. https://doi.org/10.1021/acs.jctc.5b00141
Dral, Pavlo O., von Lilienfeld, O. Anatole, and Thiel, Walter. 2015. "Machine learning of parameters for accurate semiempirical quantum chemical calculations". United States. https://doi.org/10.1021/acs.jctc.5b00141. https://www.osti.gov/servlets/purl/1214088.
@article{osti_1214088,
title = {Machine learning of parameters for accurate semiempirical quantum chemical calculations},
author = {Dral, Pavlo O. and von Lilienfeld, O. Anatole and Thiel, Walter},
abstractNote = {We investigate possible improvements in the accuracy of semiempirical quantum chemistry (SQC) methods through the use of machine learning (ML) models for the parameters. For a given class of compounds, ML techniques require sufficiently large training sets to develop ML models that can be used for adapting SQC parameters to reflect changes in molecular composition and geometry. The ML-SQC approach allows the automatic tuning of SQC parameters for individual molecules, thereby improving the accuracy without deteriorating transferability to molecules with molecular descriptors very different from those in the training set. The performance of this approach is demonstrated for the semiempirical OM2 method using a set of 6095 constitutional isomers C7H10O2, for which accurate ab initio atomization enthalpies are available. The ML-OM2 results show improved average accuracy and a much reduced error range compared with those of standard OM2 results, with mean absolute errors in atomization enthalpies dropping from 6.3 to 1.7 kcal/mol. They are also found to be superior to the results from specific OM2 reparameterizations (rOM2) for the same set of isomers. The ML-SQC approach thus holds promise for fast and reasonably accurate high-throughput screening of materials and molecules.},
doi = {10.1021/acs.jctc.5b00141},
journal = {Journal of Chemical Theory and Computation},
number = 5,
volume = 11,
place = {United States},
year = {2015},
month = {4}
}

Citation Metrics:
Cited by: 73 works
Citation information provided by
Web of Science

Works referenced in this record:

The Harvard Clean Energy Project: Large-Scale Computational Screening and Design of Organic Photovoltaics on the World Community Grid
journal, August 2011

  • Hachmann, Johannes; Olivares-Amaya, Roberto; Atahan-Evrenk, Sule
  • The Journal of Physical Chemistry Letters, Vol. 2, Issue 17
  • DOI: 10.1021/jz200866s

Data-mined similarity function between material compositions
journal, December 2013


Commentary: The Materials Project: A materials genome approach to accelerating materials innovation
journal, July 2013

  • Jain, Anubhav; Ong, Shyue Ping; Hautier, Geoffroy
  • APL Materials, Vol. 1, Issue 1
  • DOI: 10.1063/1.4812323

The Many Roles of Computation in Drug Discovery
journal, March 2004


Towards the computational design of solid catalysts
journal, April 2009

  • Nørskov, J.; Bligaard, T.; Rossmeisl, J.
  • Nature Chemistry, Vol. 1, Issue 1, p. 37-46
  • DOI: 10.1038/nchem.121

Virtual screening: an endless staircase?
journal, April 2010

  • Schneider, Gisbert
  • Nature Reviews Drug Discovery, Vol. 9, Issue 4
  • DOI: 10.1038/nrd3139

Identification and design principles of low hole effective mass p-type transparent conducting oxides
journal, August 2013

  • Hautier, Geoffroy; Miglio, Anna; Ceder, Gerbrand
  • Nature Communications, Vol. 4, Issue 1
  • DOI: 10.1038/ncomms3292

High-Throughput Virtual Screening Using Quantum Mechanical Probes: Discovery of Selective Kinase Inhibitors
journal, June 2010


MNDO-Like Semiempirical Molecular Orbital Theory and Its Application to Large Systems
book, July 2011

  • Clark, Timothy; Stewart, James J. P.
  • Computational Methods for Large Systems: Electronic Structure Approaches for Biotechnology and Nanotechnology
  • DOI: 10.1002/9780470930779.ch8

A MNDO study of carbon clusters with specifically fitted parameters
journal, November 1995

  • Tseng, Shiuh-ping; Shen, Min-yi; Yu, Chin-hui
  • Theoretica Chimica Acta, Vol. 92, Issue 5
  • DOI: 10.1007/BF01113867

Direct dynamics calculations with NDDO (neglect of diatomic differential overlap) molecular orbital theory with specific reaction parameters
journal, June 1991

  • Gonzalez-Lafont, Angels; Truong, Thanh N.; Truhlar, Donald G.
  • The Journal of Physical Chemistry, Vol. 95, Issue 12
  • DOI: 10.1021/j100165a009

Specific Reaction Path Hamiltonian for Proton Transfer in Water: Reparameterized Semiempirical Models
journal, May 2013

  • Wu, Xin; Thiel, Walter; Pezeshki, Soroosh
  • Journal of Chemical Theory and Computation, Vol. 9, Issue 6
  • DOI: 10.1021/ct400224n

Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning
journal, January 2012


First principles view on chemical compound space: Gaining rigorous atomistic control of molecular properties
journal, February 2013

  • von Lilienfeld, O. Anatole
  • International Journal of Quantum Chemistry, Vol. 113, Issue 12
  • DOI: 10.1002/qua.24375

Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies
journal, July 2013

  • Hansen, Katja; Montavon, Grégoire; Biegler, Franziska
  • Journal of Chemical Theory and Computation, Vol. 9, Issue 8
  • DOI: 10.1021/ct400195d

Comment on “Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning”
journal, August 2012


Rupp et al. Reply:
journal, August 2012


Gaussian-4 theory using reduced order perturbation theory
journal, September 2007

  • Curtiss, Larry A.; Redfern, Paul C.; Raghavachari, Krishnan
  • The Journal of Chemical Physics, Vol. 127, Issue 12
  • DOI: 10.1063/1.2770701

Quantum chemistry structures and properties of 134 kilo molecules
journal, August 2014

  • Ramakrishnan, Raghunathan; Dral, Pavlo O.; Rupp, Matthias
  • Scientific Data, Vol. 1, Issue 1
  • DOI: 10.1038/sdata.2014.22

Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17
journal, November 2012

  • Ruddigkeit, Lars; van Deursen, Ruud; Blum, Lorenz C.
  • Journal of Chemical Information and Modeling, Vol. 52, Issue 11
  • DOI: 10.1021/ci300415d

Orthogonalization corrections for semiempirical methods
journal, April 2000

  • Weber, Wolfgang; Thiel, Walter
  • Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta), Vol. 103, Issue 6
  • DOI: 10.1007/s002149900083

A method for the solution of certain non-linear problems in least squares
journal, January 1944

  • Levenberg, Kenneth
  • Quarterly of Applied Mathematics, Vol. 2, Issue 2
  • DOI: 10.1090/qam/10666

An Algorithm for Least-Squares Estimation of Nonlinear Parameters
journal, June 1963

  • Marquardt, Donald W.
  • Journal of the Society for Industrial and Applied Mathematics, Vol. 11, Issue 2
  • DOI: 10.1137/0111030

Challenges for Density Functional Theory
journal, December 2011

  • Cohen, Aron J.; Mori-Sánchez, Paula; Yang, Weitao
  • Chemical Reviews, Vol. 112, Issue 1
  • DOI: 10.1021/cr200107z

“Learn on the Fly”: A Hybrid Classical and Quantum-Mechanical Molecular Dynamics Simulation
journal, October 2004


Quantifying and Assessing the Effect of Chemical Symmetry in Metabolic Pathways
journal, September 2012

  • Zhou, Wanding; Nakhleh, Luay
  • Journal of Chemical Information and Modeling, Vol. 52, Issue 10
  • DOI: 10.1021/ci300259u

Quantum chemistry structures and properties of 134 kilo molecules
text, January 2014


Works referencing / citing this record:

Machine learning prediction of interaction energies in rigid water clusters
journal, January 2018

  • Bose, Samik; Dhawan, Diksha; Nandi, Sutanu
  • Physical Chemistry Chemical Physics, Vol. 20, Issue 35
  • DOI: 10.1039/c8cp03138j

Structure-based sampling and self-correcting machine learning for accurate calculations of potential energy surfaces and vibrational levels
journal, June 2017

  • Dral, Pavlo O.; Owens, Alec; Yurchenko, Sergei N.
  • The Journal of Chemical Physics, Vol. 146, Issue 24
  • DOI: 10.1063/1.4989536

A new approach for the prediction of partition functions using machine learning techniques
journal, July 2018

  • Desgranges, Caroline; Delhommelle, Jerome
  • The Journal of Chemical Physics, Vol. 149, Issue 4
  • DOI: 10.1063/1.5037098

MLatom: A program package for quantum chemical research assisted by machine learning
journal, June 2019

  • Dral, Pavlo O.
  • Journal of Computational Chemistry, Vol. 40, Issue 26
  • DOI: 10.1002/jcc.26004

Semiempirical molecular orbital models based on the neglect of diatomic differential overlap approximation
journal, October 2018

  • Husch, Tamara; Vaucher, Alain C.; Reiher, Markus
  • International Journal of Quantum Chemistry, Vol. 118, Issue 24
  • DOI: 10.1002/qua.25799

Machine Learning a General-Purpose Interatomic Potential for Silicon
journal, December 2018


Deep learning for computational chemistry
journal, March 2017

  • Goh, Garrett B.; Hodas, Nathan O.; Vishnu, Abhinav
  • Journal of Computational Chemistry, Vol. 38, Issue 16
  • DOI: 10.1002/jcc.24764

From DFT to machine learning: recent approaches to materials science–a review
journal, May 2019

  • Schleder, Gabriel R.; Padilha, Antonio C. M.; Acosta, Carlos Mera
  • Journal of Physics: Materials, Vol. 2, Issue 3
  • DOI: 10.1088/2515-7639/ab084b

Machine Learning a General-Purpose Interatomic Potential for Silicon
text, January 2018

  • Bartók, A. P.; Kermode, J.; Bernstein, N.
  • Apollo - University of Cambridge Repository
  • DOI: 10.17863/cam.34909

Deep Learning for Computational Chemistry
preprint, January 2017


Semiempirical Quantum-Chemical Methods with Orthogonalization and Dispersion Corrections
journal, January 2019

  • Dral, Pavlo O.; Wu, Xin; Thiel, Walter
  • Journal of Chemical Theory and Computation, Vol. 15, Issue 3
  • DOI: 10.1021/acs.jctc.8b01265

Read between the Molecules: Computational Insights into Organic Semiconductors
journal, November 2018

  • Gryn’ova, Ganna; Lin, Kun-Han; Corminboeuf, Clémence
  • Journal of the American Chemical Society, Vol. 140, Issue 48
  • DOI: 10.1021/jacs.8b07985

Endothelin-1 and cell Proliferation in lung Organ Cultures
journal, November 1996


Deep Learning for Deep Chemistry: Optimizing the Prediction of Chemical Patterns
journal, November 2019


A Density Functional Tight Binding Layer for Deep Learning of Chemical Hamiltonians
preprint, January 2018


Deterministic and Statistical Approaches to Quantum Chemistry
text, January 2020