DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A universal density matrix functional from molecular orbital-based machine learning: Transferability across organic molecules

Abstract

We address the degree to which machine learning (ML) can be used to accurately and transferably predict post-Hartree-Fock correlation energies. Refined strategies for feature design and selection are presented, and the molecular-orbital-based machine learning (MOB-ML) method is applied to several test systems. Strikingly, for the second-order Møller-Plesset perturbation theory, coupled cluster with singles and doubles (CCSD), and CCSD with perturbative triples levels of theory, it is shown that the thermally accessible (350 K) potential energy surface for a single water molecule can be described to within 1 mhartree using a model that is trained from only a single reference calculation at a randomized geometry. To explore the breadth of chemical diversity that can be described, MOB-ML is also applied to a new dataset of thermalized (350 K) geometries of 7211 organic molecules with up to seven heavy atoms. In comparison with the previously reported Δ-ML method, MOB-ML is shown to reach chemical accuracy with threefold fewer training geometries. Finally, a transferability test in which models trained for seven-heavy-atom systems are used to predict energies for thirteen-heavy-atom systems reveals that MOB-ML reaches chemical accuracy with 36-fold fewer training calculations than Δ-ML (140 vs 5000 training calculations).
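
As a quick sanity check on the figures quoted in the abstract, the short sketch below converts the 1 mhartree error bound to kcal/mol and recomputes the training-set ratio. The function names and the rounded conversion factor (1 hartree ≈ 627.509 kcal/mol) are illustrative assumptions and do not come from the record itself; likewise, the abstract does not define "chemical accuracy", which is commonly taken to mean about 1 kcal/mol.

```python
# Illustrative arithmetic only; names and the rounded constant are assumptions.
HARTREE_TO_KCAL_PER_MOL = 627.509  # approximate standard conversion factor


def mhartree_to_kcal_per_mol(e_mhartree: float) -> float:
    """Convert an energy in millihartree to kcal/mol."""
    return e_mhartree * 1e-3 * HARTREE_TO_KCAL_PER_MOL


def fold_reduction(n_baseline: int, n_new: int) -> float:
    """Ratio of baseline to new training-set sizes."""
    return n_baseline / n_new


# 1 mhartree error bound quoted for the water potential energy surface.
water_error = mhartree_to_kcal_per_mol(1.0)
# 5000 (Δ-ML) vs 140 (MOB-ML) training calculations quoted in the abstract.
ratio = fold_reduction(5000, 140)

print(f"1 mhartree = {water_error:.3f} kcal/mol")
print(f"5000 / 140 = {ratio:.1f}-fold")
```

Under these assumptions, 1 mhartree corresponds to roughly 0.63 kcal/mol, and 5000/140 is about 35.7, which rounds to the 36-fold figure stated above.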

Authors:
Cheng, Lixue [1]; Welborn, Matthew [1]; Christensen, Anders S. [2]; Miller, Thomas F. [1]
  1. California Inst. of Technology (CalTech), Pasadena, CA (United States)
  2. Univ. of Basel (Switzerland)
Publication Date:
2019-04-04
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1529903
Alternate Identifier(s):
OSTI ID: 1505017
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Chemical Physics
Additional Journal Information:
Journal Volume: 150; Journal Issue: 13; Journal ID: ISSN 0021-9606
Publisher:
American Institute of Physics (AIP)
Country of Publication:
United States
Language:
English
Subject:
37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY

Citation Formats

Cheng, Lixue, Welborn, Matthew, Christensen, Anders S., and Miller, Thomas F. A universal density matrix functional from molecular orbital-based machine learning: Transferability across organic molecules. United States: N. p., 2019. Web. doi:10.1063/1.5088393.
Cheng, Lixue, Welborn, Matthew, Christensen, Anders S., & Miller, Thomas F. A universal density matrix functional from molecular orbital-based machine learning: Transferability across organic molecules. United States. https://doi.org/10.1063/1.5088393
Cheng, Lixue, Welborn, Matthew, Christensen, Anders S., and Miller, Thomas F. 2019. "A universal density matrix functional from molecular orbital-based machine learning: Transferability across organic molecules". United States. https://doi.org/10.1063/1.5088393. https://www.osti.gov/servlets/purl/1529903.
@article{osti_1529903,
title = {A universal density matrix functional from molecular orbital-based machine learning: Transferability across organic molecules},
author = {Cheng, Lixue and Welborn, Matthew and Christensen, Anders S. and Miller, Thomas F.},
abstractNote = {We address the degree to which machine learning (ML) can be used to accurately and transferably predict post-Hartree-Fock correlation energies. Refined strategies for feature design and selection are presented, and the molecular-orbital-based machine learning (MOB-ML) method is applied to several test systems. Strikingly, for the second-order Møller-Plesset perturbation theory, coupled cluster with singles and doubles (CCSD), and CCSD with perturbative triples levels of theory, it is shown that the thermally accessible (350 K) potential energy surface for a single water molecule can be described to within 1 mhartree using a model that is trained from only a single reference calculation at a randomized geometry. To explore the breadth of chemical diversity that can be described, MOB-ML is also applied to a new dataset of thermalized (350 K) geometries of 7211 organic molecules with up to seven heavy atoms. In comparison with the previously reported Δ-ML method, MOB-ML is shown to reach chemical accuracy with threefold fewer training geometries. Finally, a transferability test in which models trained for seven-heavy-atom systems are used to predict energies for thirteen-heavy-atom systems reveals that MOB-ML reaches chemical accuracy with 36-fold fewer training calculations than Δ-ML (140 vs 5000 training calculations).},
doi = {10.1063/1.5088393},
journal = {Journal of Chemical Physics},
number = 13,
volume = 150,
place = {United States},
year = {2019},
month = {4}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 67 works
Citation information provided by
Web of Science
