DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Big Data Meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach

Abstract

Chemically accurate and comprehensive studies of the virtual space of all possible molecules are severely limited by the computational cost of quantum chemistry. Here we introduce a composite strategy that adds machine learning corrections to computationally inexpensive approximate legacy quantum methods. After training, highly accurate predictions of enthalpies, free energies, entropies, and electron correlation energies are possible, for significantly larger molecular sets than used for training. For thermochemical properties of up to 16k isomers of C7H10O2 we present numerical evidence that chemical accuracy can be reached. We also predict electron correlation energy in post-Hartree-Fock methods, at the computational cost of Hartree-Fock, and we establish a qualitative relationship between molecular entropy and electron correlation. The transferability of our approach is demonstrated, using semiempirical quantum chemistry and machine learning models trained on 1% and 10% of 134k organic molecules, to reproduce enthalpies of all remaining molecules at density functional theory level of accuracy.
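
The composite strategy in the abstract can be illustrated with a minimal sketch: a model is trained on the difference between an expensive target property and a cheap baseline, and a prediction is the baseline plus the learned correction. Everything specific below is an assumption made for illustration and is not taken from this record: the use of kernel ridge regression with a Gaussian kernel, the hyperparameters `sigma` and `lam`, the one-dimensional descriptor, and the synthetic `cheap` and `costly` functions that stand in for real quantum chemistry calculations. The numbers it produces are toy values, not results from the paper.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_delta_model(x_train, baseline_train, target_train, sigma=1.0, lam=1e-6):
    # Learn the correction (target minus baseline), not the target itself.
    delta = target_train - baseline_train
    k = gaussian_kernel(x_train, x_train, sigma)
    # Regularized linear solve for the kernel ridge regression weights.
    return np.linalg.solve(k + lam * np.eye(len(x_train)), delta)

def predict_delta(x_new, baseline_new, x_train, alpha, sigma=1.0):
    # Prediction = cheap baseline + learned correction.
    return baseline_new + gaussian_kernel(x_new, x_train, sigma) @ alpha

# Hypothetical stand-ins for a cheap baseline method and a costly target method.
def cheap(x):
    return np.sin(x[:, 0])

def costly(x):
    return np.sin(x[:, 0]) + 0.3 * x[:, 0] ** 2

rng = np.random.default_rng(0)
x_train = rng.uniform(-2.0, 2.0, size=(50, 1))   # small training set
x_test = rng.uniform(-2.0, 2.0, size=(200, 1))   # larger set to predict

alpha = fit_delta_model(x_train, cheap(x_train), costly(x_train))
pred = predict_delta(x_test, cheap(x_test), x_train, alpha)

mae_baseline = np.abs(cheap(x_test) - costly(x_test)).mean()
mae_delta = np.abs(pred - costly(x_test)).mean()
print(f"baseline MAE: {mae_baseline:.4f}, delta-corrected MAE: {mae_delta:.4f}")
```

In this toy setting the correction is a smooth function of the descriptor, so a small training set suffices; whether that holds for a given pair of real methods depends on the chemistry and the representation.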

Authors:
Ramakrishnan, Raghunathan [1]; Dral, Pavlo O. [2]; Rupp, Matthias [1]; von Lilienfeld, O. Anatole [3]
  1. University of Basel (Switzerland)
  2. Max-Planck-Institut für Kohlenforschung, Mülheim an der Ruhr (Germany); Friedrich-Alexander University Erlangen-Nuremberg, Bamberg (Germany)
  3. University of Basel (Switzerland); Argonne National Laboratory (ANL), Argonne, IL (United States). Argonne Leadership Computing Facility (ALCF)
Publication Date:
2015-04-10
Research Org.:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC); Swiss National Science Foundation (SNSF)
OSTI Identifier:
1392925
Grant/Contract Number:  
AC02-06CH11357; PP00P2_138932
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Chemical Theory and Computation
Additional Journal Information:
Journal Volume: 11; Journal Issue: 5; Journal ID: ISSN 1549-9618
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
Subject:
37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY

Citation Formats

Ramakrishnan, Raghunathan, Dral, Pavlo O., Rupp, Matthias, and von Lilienfeld, O. Anatole. Big Data Meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach. United States: N. p., 2015. Web. doi:10.1021/acs.jctc.5b00099.
Ramakrishnan, Raghunathan, Dral, Pavlo O., Rupp, Matthias, & von Lilienfeld, O. Anatole. Big Data Meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach. United States. https://doi.org/10.1021/acs.jctc.5b00099
Ramakrishnan, Raghunathan, Dral, Pavlo O., Rupp, Matthias, and von Lilienfeld, O. Anatole. 2015. "Big Data Meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach". United States. https://doi.org/10.1021/acs.jctc.5b00099. https://www.osti.gov/servlets/purl/1392925.
@article{osti_1392925,
title = {Big Data Meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach},
author = {Ramakrishnan, Raghunathan and Dral, Pavlo O. and Rupp, Matthias and von Lilienfeld, O. Anatole},
abstractNote = {Chemically accurate and comprehensive studies of the virtual space of all possible molecules are severely limited by the computational cost of quantum chemistry. Here we introduce a composite strategy that adds machine learning corrections to computationally inexpensive approximate legacy quantum methods. After training, highly accurate predictions of enthalpies, free energies, entropies, and electron correlation energies are possible, for significantly larger molecular sets than used for training. For thermochemical properties of up to 16k isomers of C7H10O2 we present numerical evidence that chemical accuracy can be reached. We also predict electron correlation energy in post-Hartree-Fock methods, at the computational cost of Hartree-Fock, and we establish a qualitative relationship between molecular entropy and electron correlation. The transferability of our approach is demonstrated, using semiempirical quantum chemistry and machine learning models trained on 1% and 10% of 134k organic molecules, to reproduce enthalpies of all remaining molecules at density functional theory level of accuracy.},
doi = {10.1021/acs.jctc.5b00099},
journal = {Journal of Chemical Theory and Computation},
number = 5,
volume = 11,
place = {United States},
year = {2015},
month = {4}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 486 works
Citation information provided by
Web of Science

Works referenced in this record:

Chemical space
journal, December 2004

  • Kirkpatrick, Peter; Ellis, Clare
  • Nature, Vol. 432, Issue 7019
  • DOI: 10.1038/432823a

Stochastic Voyages into Uncharted Chemical Space Produce a Representative Library of All Possible Drug-Like Compounds
journal, May 2013

  • Virshup, Aaron M.; Contreras-García, Julia; Wipf, Peter
  • Journal of the American Chemical Society, Vol. 135, Issue 19
  • DOI: 10.1021/ja401184g

First principles view on chemical compound space: Gaining rigorous atomistic control of molecular properties
journal, February 2013

  • von Lilienfeld, O. Anatole
  • International Journal of Quantum Chemistry, Vol. 113, Issue 12
  • DOI: 10.1002/qua.24375

Combinatorial thinking in chemistry and biology
journal, April 1997

  • Ellman, J.; Stoddard, B.; Wells, J.
  • Proceedings of the National Academy of Sciences, Vol. 94, Issue 7
  • DOI: 10.1073/pnas.94.7.2779

Click Chemistry: Diverse Chemical Function from a Few Good Reactions
journal, June 2001


Towards the computational design of solid catalysts
journal, April 2009

  • Nørskov, J.; Bligaard, T.; Rossmeisl, J.
  • Nature Chemistry, Vol. 1, Issue 1, p. 37-46
  • DOI: 10.1038/nchem.121

The Many Roles of Computation in Drug Discovery
journal, March 2004


De novo design: balancing novelty and confined chemical space
journal, June 2010


The Harvard Clean Energy Project: Large-Scale Computational Screening and Design of Organic Photovoltaics on the World Community Grid
journal, August 2011

  • Hachmann, Johannes; Olivares-Amaya, Roberto; Atahan-Evrenk, Sule
  • The Journal of Physical Chemistry Letters, Vol. 2, Issue 17
  • DOI: 10.1021/jz200866s

Commentary: The Materials Project: A materials genome approach to accelerating materials innovation
journal, July 2013

  • Jain, Anubhav; Ong, Shyue Ping; Hautier, Geoffroy
  • APL Materials, Vol. 1, Issue 1
  • DOI: 10.1063/1.4812323

Inverse Strategies for Molecular Design
journal, January 1996

  • Kuhn, Christoph; Beratan, David N.
  • The Journal of Physical Chemistry, Vol. 100, Issue 25
  • DOI: 10.1021/jp960518i

The inverse band-structure problem of finding an atomic configuration with given electronic properties
journal, November 1999

  • Franceschetti, Alberto; Zunger, Alex
  • Nature, Vol. 402, Issue 6757
  • DOI: 10.1038/46995

Variational Particle Number Approach for Rational Compound Design
journal, October 2005

  • von Lilienfeld, O. Anatole; Lins, Roberto D.; Rothlisberger, Ursula
  • Physical Review Letters, Vol. 95, Issue 15
  • DOI: 10.1103/PhysRevLett.95.153002

Designing Molecules by Optimizing Potentials
journal, March 2006

  • Wang, Mingliang; Hu, Xiangqian; Beratan, David N.
  • Journal of the American Chemical Society, Vol. 128, Issue 10
  • DOI: 10.1021/ja0572046

A comprehensive chemical kinetic combustion model for the four butanol isomers
journal, June 2012


Combustion and pyrolysis of iso-butanol: Experimental and chemical kinetic modeling study
journal, October 2013


Chemical Kinetic Data Base for Combustion Chemistry. Part I. Methane and Related Compounds
journal, July 1986

  • Tsang, W.; Hampson, R. F.
  • Journal of Physical and Chemical Reference Data, Vol. 15, Issue 3
  • DOI: 10.1063/1.555759

Ab initio quantum chemistry: Methodology and applications
journal, May 2005


Gaussian‐1 theory: A general procedure for prediction of molecular energies
journal, May 1989

  • Pople, John A.; Head‐Gordon, Martin; Fox, Douglas J.
  • The Journal of Chemical Physics, Vol. 90, Issue 10
  • DOI: 10.1063/1.456415

Gaussian‐2 theory for molecular energies of first‐ and second‐row compounds
journal, June 1991

  • Curtiss, Larry A.; Raghavachari, Krishnan; Trucks, Gary W.
  • The Journal of Chemical Physics, Vol. 94, Issue 11
  • DOI: 10.1063/1.460205

Gaussian-4 theory
journal, February 2007

  • Curtiss, Larry A.; Redfern, Paul C.; Raghavachari, Krishnan
  • The Journal of Chemical Physics, Vol. 126, Issue 8
  • DOI: 10.1063/1.2436888

Gaussian-4 theory using reduced order perturbation theory
journal, September 2007

  • Curtiss, Larry A.; Redfern, Paul C.; Raghavachari, Krishnan
  • The Journal of Chemical Physics, Vol. 127, Issue 12
  • DOI: 10.1063/1.2770701

Quantum chemistry structures and properties of 134 kilo molecules
journal, August 2014

  • Ramakrishnan, Raghunathan; Dral, Pavlo O.; Rupp, Matthias
  • Scientific Data, Vol. 1, Issue 1
  • DOI: 10.1038/sdata.2014.22

Neural network approach to quantum-chemistry data: Accurate prediction of density functional theory energies
journal, August 2009

  • Balabin, Roman M.; Lomakina, Ekaterina I.
  • The Journal of Chemical Physics, Vol. 131, Issue 7
  • DOI: 10.1063/1.3206326

Combined first-principles calculation and neural-network correction approach for heat of formation
journal, December 2003

  • Hu, LiHong; Wang, XiuJun; Wong, LaiHo
  • The Journal of Chemical Physics, Vol. 119, Issue 22
  • DOI: 10.1063/1.1630951

First-principles energetics of water clusters and ice: A many-body analysis
journal, December 2013

  • Gillan, M. J.; Alfè, D.; Bartók, A. P.
  • The Journal of Chemical Physics, Vol. 139, Issue 24
  • DOI: 10.1063/1.4852182

Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning
journal, January 2012


Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies
journal, July 2013

  • Hansen, Katja; Montavon, Grégoire; Biegler, Franziska
  • Journal of Chemical Theory and Computation, Vol. 9, Issue 8
  • DOI: 10.1021/ct400195d

Machine learning of molecular electronic properties in chemical compound space
journal, September 2013


Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17
journal, November 2012

  • Ruddigkeit, Lars; van Deursen, Ruud; Blum, Lorenz C.
  • Journal of Chemical Information and Modeling, Vol. 52, Issue 11
  • DOI: 10.1021/ci300415d

Challenges for Density Functional Theory
journal, December 2011

  • Cohen, Aron J.; Mori-Sánchez, Paula; Yang, Weitao
  • Chemical Reviews, Vol. 112, Issue 1
  • DOI: 10.1021/cr200107z

Orthogonalization corrections for semiempirical methods
journal, April 2000

  • Weber, Wolfgang; Thiel, Walter
  • Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta), Vol. 103, Issue 6
  • DOI: 10.1007/s002149900083

Discovering chemistry with an ab initio nanoreactor
journal, November 2014

  • Wang, Lee-Ping; Titov, Alexey; McGibbon, Robert
  • Nature Chemistry, Vol. 6, Issue 12
  • DOI: 10.1038/nchem.2099

Nearsightedness of electronic matter
journal, August 2005

  • Prodan, E.; Kohn, W.
  • Proceedings of the National Academy of Sciences, Vol. 102, Issue 33
  • DOI: 10.1073/pnas.0505436102

Self-consistent-charge density-functional tight-binding method for simulations of complex materials properties
journal, September 1998

  • Elstner, M.; Porezag, D.; Jungnickel, G.
  • Physical Review B, Vol. 58, Issue 11, p. 7260-7268
  • DOI: 10.1103/PhysRevB.58.7260

Functional designed to include surface effects in self-consistent density functional theory
journal, August 2005


SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules
journal, February 1988

  • Weininger, David
  • Journal of Chemical Information and Modeling, Vol. 28, Issue 1
  • DOI: 10.1021/ci00057a005

Open Babel: An open chemical toolbox
journal, October 2011

  • O'Boyle, Noel M.; Banck, Michael; James, Craig A.
  • Journal of Cheminformatics, Vol. 3, Issue 1
  • DOI: 10.1186/1758-2946-3-33

From atoms and bonds to three-dimensional atomic coordinates: automatic model builders
journal, November 1993

  • Sadowski, Jens; Gasteiger, Johann
  • Chemical Reviews, Vol. 93, Issue 7
  • DOI: 10.1021/cr00023a012

Ab Initio Calculation of Vibrational Absorption and Circular Dichroism Spectra Using Density Functional Force Fields
journal, November 1994

  • Stephens, P. J.; Devlin, F. J.; Chabalowski, C. F.
  • The Journal of Physical Chemistry, Vol. 98, Issue 45, p. 11623-11627
  • DOI: 10.1021/j100096a001

Data-mined similarity function between material compositions
journal, December 2013


Quantum chemistry structures and properties of 134 kilo molecules
text, January 2014


Works referencing / citing this record:

Computational Approach to Molecular Catalysis by 3d Transition Metals: Challenges and Opportunities
journal, October 2018

  • Vogiatzis, Konstantinos D.; Polynski, Mikhail V.; Kirkland, Justin K.
  • Chemical Reviews, Vol. 119, Issue 4
  • DOI: 10.1021/acs.chemrev.8b00361

Boosting Quantum Machine Learning Models with a Multilevel Combination Technique: Pople Diagrams Revisited
journal, December 2018

  • Zaspel, Peter; Huang, Bing; Harbrecht, Helmut
  • Journal of Chemical Theory and Computation, Vol. 15, Issue 3
  • DOI: 10.1021/acs.jctc.8b00832

Self-Parametrizing System-Focused Atomistic Models
journal, January 2020

  • Brunken, Christoph; Reiher, Markus
  • Journal of Chemical Theory and Computation, Vol. 16, Issue 3
  • DOI: 10.1021/acs.jctc.9b00855

Regression Clustering for Improved Accuracy and Training Costs with Molecular-Orbital-Based Machine Learning
journal, October 2019

  • Cheng, Lixue; Kovachki, Nikola B.; Welborn, Matthew
  • Journal of Chemical Theory and Computation, Vol. 15, Issue 12
  • DOI: 10.1021/acs.jctc.9b00884

Breaking the Coupled Cluster Barrier for Machine-Learned Potentials of Large Molecules: The Case of 15-Atom Acetylacetone
journal, May 2021

  • Qu, Chen; Houston, Paul L.; Conte, Riccardo
  • The Journal of Physical Chemistry Letters, Vol. 12, Issue 20
  • DOI: 10.1021/acs.jpclett.1c01142

Learning a Local-Variable Model of Aromatic and Conjugated Systems
journal, December 2017


Read between the Molecules: Computational Insights into Organic Semiconductors
journal, November 2018

  • Gryn’ova, Ganna; Lin, Kun-Han; Corminboeuf, Clémence
  • Journal of the American Chemical Society, Vol. 140, Issue 48
  • DOI: 10.1021/jacs.8b07985

Quantum-chemical insights from deep tensor neural networks
journal, January 2017

  • Schütt, Kristof T.; Arbabzadah, Farhad; Chmiela, Stefan
  • Nature Communications, Vol. 8, Issue 1
  • DOI: 10.1038/ncomms13890

Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning
journal, July 2019


Energy refinement and analysis of structures in the QM9 database via a highly accurate quantum chemical method
journal, July 2019


Energy-free machine learning force field for aluminum
journal, August 2017


Deep elastic strain engineering of bandgap through machine learning
journal, February 2019

  • Shi, Zhe; Tsymbalov, Evgenii; Dao, Ming
  • Proceedings of the National Academy of Sciences, Vol. 116, Issue 10
  • DOI: 10.1073/pnas.1818555116

Machine learning unifies the modeling of materials and molecules
journal, December 2017

  • Bartók, Albert P.; De, Sandip; Poelking, Carl
  • Science Advances, Vol. 3, Issue 12
  • DOI: 10.1126/sciadv.1701816

Electronic structure at coarse-grained resolutions from supervised machine learning
journal, March 2019

  • Jackson, Nicholas E.; Bowen, Alec S.; Antony, Lucas W.
  • Science Advances, Vol. 5, Issue 3
  • DOI: 10.1126/sciadv.aav1190

Application of Computational Biology and Artificial Intelligence Technologies in Cancer Precision Drug Discovery
journal, November 2019

  • Nagarajan, Nagasundaram; Yapp, Edward K. Y.; Le, Nguyen Quoc Khanh
  • BioMed Research International, Vol. 2019
  • DOI: 10.1155/2019/8427042

Deep Learning for Deep Chemistry: Optimizing the Prediction of Chemical Patterns
journal, November 2019


Modelling Chemical Reasoning to Predict Reactions
text, January 2016


A Density Functional Tight Binding Layer for Deep Learning of Chemical Hamiltonians
preprint, January 2018


Resolution limit of data-driven coarse-grained models spanning chemical space
text, January 2019


Inverse Design of Potential Singlet Fission Molecules using a Transfer Learning Based Approach
preprint, January 2020