Title: Quantum-Chemically Informed Machine Learning: Prediction of Energies of Organic Molecules with 10 to 14 Non-hydrogen Atoms

Journal Article · Journal of Physical Chemistry. A, Molecules, Spectroscopy, Kinetics, Environment, and General Theory

High-fidelity quantum-chemical calculations can provide accurate predictions of molecular energies, but their high computational costs limit their utility, especially for larger molecules. We have shown in previous work that machine learning models trained on high-level quantum-chemical calculations (G4MP2) for organic molecules with one to nine non-hydrogen atoms can provide accurate predictions for other molecules of comparable size at much lower costs. Here we demonstrate that such models can also be used to effectively predict energies of molecules larger than those in the training set. To implement this strategy, we first established a set of 191 molecules with 10–14 non-hydrogen atoms having reliable experimental enthalpies of formation. We then assessed the accuracy of computed G4MP2 enthalpies of formation for these 191 molecules. The error in the G4MP2 results was somewhat larger than that for smaller molecules, and the reason for this increase is discussed. Two density functional methods, B3LYP and ωB97X-D, were also used on this set of molecules, with ωB97X-D found to perform better than B3LYP at predicting energies. The G4MP2 energies for the 191 molecules were then predicted using these two functionals with two machine learning methods, the FCHL-Δ and SchNet-Δ models, with the learning done on calculated energies of the one to nine non-hydrogen atom molecules. The better-performing model, FCHL-Δ, gave atomization energies of the 191 organic molecules with 10–14 non-hydrogen atoms within 0.4 kcal/mol of their G4MP2 energies. Thus, this work demonstrates that quantum-chemically informed machine learning can be used to successfully predict the energies of large organic molecules whose size is beyond that in the training set.
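
The Δ-learning idea named in the abstract (predicting a high-level energy as a low-level energy plus a learned correction) can be illustrated with a minimal sketch. This is not the authors' FCHL-Δ or SchNet-Δ model: it assumes a plain Gaussian kernel ridge regression, a one-dimensional toy descriptor, and synthetic "low-level" and "high-level" energies invented purely for illustration. All function names and parameters (`gaussian_kernel`, `fit_delta_model`, `predict_high`, `sigma`, `reg`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between row-wise descriptors a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_delta_model(x_train, e_low, e_high, sigma=1.0, reg=1e-6):
    """Fit kernel ridge weights to the correction delta = e_high - e_low."""
    delta = e_high - e_low
    k = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(k + reg * np.eye(len(x_train)), delta)

def predict_high(x_new, e_low_new, x_train, alpha, sigma=1.0):
    """Estimate the high-level energy as low-level energy plus learned correction."""
    return e_low_new + gaussian_kernel(x_new, x_train, sigma) @ alpha

# Synthetic stand-ins (assumption): a cheap "low-level" energy and a
# "high-level" energy that differs from it by a smooth correction.
def low_level(x):
    return 10.0 * x[:, 0]

def high_level(x):
    return low_level(x) + 0.5 * np.sin(x[:, 0])

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 3.0, size=(40, 1))
x_test = rng.uniform(0.0, 3.0, size=(20, 1))

alpha = fit_delta_model(x_train, low_level(x_train), high_level(x_train))
e_pred = predict_high(x_test, low_level(x_test), x_train, alpha)

# Mean absolute errors against the synthetic high-level reference.
mae_low = np.mean(np.abs(low_level(x_test) - high_level(x_test)))
mae_delta = np.mean(np.abs(e_pred - high_level(x_test)))
print(f"MAE of low-level energies alone: {mae_low:.4f}")
print(f"MAE after the learned correction: {mae_delta:.4f}")
```

In this toy setting the test points come from the same range as the training points, so the sketch shows only the correction step itself, not the extrapolation from small to larger molecules that the article reports.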

Research Organization:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1656872
Journal Information:
Journal of Physical Chemistry. A, Molecules, Spectroscopy, Kinetics, Environment, and General Theory, Vol. 124, Issue 28; ISSN 1089-5639
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 18 works
Citation information provided by
Web of Science
