Self-Evolving Machine: A Continuously Improving Model for Molecular Thermochemistry
Abstract
Because collecting precise and accurate chemistry data is often challenging, chemistry data sets usually only span a small region of chemical space, which limits the performance and the scope of applicability of data-driven models. To address this issue, we integrated an active learning machine with automatic ab initio calculations to form a self-evolving model that can continuously adapt to new species appointed by the users. In the present work, we demonstrate the self-evolving concept by modeling the formation enthalpies of stable closed-shell polycyclic species calculated at the B3LYP/6-31G(2df,p) level of theory. By combining a molecular graph convolutional neural network with a dropout training strategy, the model we developed can predict density functional theory (DFT) enthalpies for a broad range of polycyclic species and assess the quality of each predicted value. For the species which the current model is uncertain about, the automatic ab initio calculations provide additional training data to improve the performance of the model. For a test set composed of 2858 cyclic and polycyclic hydrocarbons and oxygenates, the enthalpies predicted by the model agree with the reference DFT values with a root-mean-square error of 2.62 kcal/mol. Finally, we found that a model originally trained on hydrocarbons and oxygenates can broaden its prediction coverage to nitrogen-containing species via an active learning process, suggesting that the continuous learning strategy is not only able to improve the model accuracy but is also capable of expanding the predictive capacity of a model to unseen species domains.
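The uncertainty-guided selection step described in the abstract can be illustrated with a minimal Python sketch. It assumes that a dropout-enabled model returns a slightly different enthalpy on each call, so that the spread of repeated predictions serves as an uncertainty estimate, and that species whose spread exceeds a threshold are sent for new ab initio calculations. All names (`rmse`, `predict_with_uncertainty`, `select_uncertain`, `make_toy_model`), the toy model itself, and the threshold value are hypothetical and are not taken from the authors' implementation.

```python
import math
import random
import statistics


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences (same units)."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def predict_with_uncertainty(stochastic_model, species, n_samples=50):
    """Query a stochastic (dropout-like) model repeatedly.

    Returns the mean prediction and the sample standard deviation,
    which is used here as the uncertainty estimate.
    """
    samples = [stochastic_model(species) for _ in range(n_samples)]
    return statistics.fmean(samples), statistics.stdev(samples)


def select_uncertain(stochastic_model, candidates, threshold, n_samples=50):
    """Return the candidates whose predictive spread exceeds `threshold`.

    In a self-evolving loop these species would be passed to automatic
    ab initio calculations and then added to the training set.
    """
    selected = []
    for species in candidates:
        _, spread = predict_with_uncertainty(stochastic_model, species, n_samples)
        if spread > threshold:
            selected.append(species)
    return selected


def make_toy_model(known_species, seed=0):
    """Hypothetical stand-in for a dropout network (illustration only).

    Predictions are nearly repeatable for species in `known_species`
    and noisy for everything else.
    """
    rng = random.Random(seed)

    def model(species):
        noise = 0.1 if species in known_species else 5.0
        return -10.0 * len(species) + rng.gauss(0.0, noise)

    return model
```

With `model = make_toy_model({"C1CC1"})`, calling `select_uncertain(model, ["C1CC1", "C1CCNC1"], threshold=1.0)` flags only the nitrogen-containing ring, mirroring how a model trained on hydrocarbons and oxygenates would request data for an unseen species domain. The `rmse` helper corresponds to the error metric quoted for the test set.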
- Authors:
- Li, Yi-Pei; Han, Kehang; Grambow, Colin A.; Green, William H.
- Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Dept. of Chemical Engineering
- Publication Date:
- 2019-02-13
- Research Org.:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1530407
- Grant/Contract Number:
- AC02-05CH11231
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Physical Chemistry. A, Molecules, Spectroscopy, Kinetics, Environment, and General Theory
- Additional Journal Information:
- Journal Volume: 123; Journal Issue: 10; Journal ID: ISSN 1089-5639
- Publisher:
- American Chemical Society
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY
Citation Formats
Li, Yi-Pei, Han, Kehang, Grambow, Colin A., and Green, William H. Self-Evolving Machine: A Continuously Improving Model for Molecular Thermochemistry. United States: N. p., 2019. Web. doi:10.1021/acs.jpca.8b10789.
Li, Yi-Pei, Han, Kehang, Grambow, Colin A., & Green, William H. Self-Evolving Machine: A Continuously Improving Model for Molecular Thermochemistry. United States. https://doi.org/10.1021/acs.jpca.8b10789
Li, Yi-Pei, Han, Kehang, Grambow, Colin A., and Green, William H. 2019. "Self-Evolving Machine: A Continuously Improving Model for Molecular Thermochemistry". United States. https://doi.org/10.1021/acs.jpca.8b10789. https://www.osti.gov/servlets/purl/1530407.
@article{osti_1530407,
title = {Self-Evolving Machine: A Continuously Improving Model for Molecular Thermochemistry},
author = {Li, Yi-Pei and Han, Kehang and Grambow, Colin A. and Green, William H.},
abstractNote = {Because collecting precise and accurate chemistry data is often challenging, chemistry data sets usually only span a small region of chemical space, which limits the performance and the scope of applicability of data-driven models. To address this issue, we integrated an active learning machine with automatic ab initio calculations to form a self-evolving model that can continuously adapt to new species appointed by the users. In the present work, we demonstrate the self-evolving concept by modeling the formation enthalpies of stable closed-shell polycyclic species calculated at the B3LYP/6-31G(2df,p) level of theory. By combining a molecular graph convolutional neural network with a dropout training strategy, the model we developed can predict density functional theory (DFT) enthalpies for a broad range of polycyclic species and assess the quality of each predicted value. For the species which the current model is uncertain about, the automatic ab initio calculations provide additional training data to improve the performance of the model. For a test set composed of 2858 cyclic and polycyclic hydrocarbons and oxygenates, the enthalpies predicted by the model agree with the reference DFT values with a root-mean-square error of 2.62 kcal/mol. Finally, we found that a model originally trained on hydrocarbons and oxygenates can broaden its prediction coverage to nitrogen-containing species via an active learning process, suggesting that the continuous learning strategy is not only able to improve the model accuracy but is also capable of expanding the predictive capacity of a model to unseen species domains.},
doi = {10.1021/acs.jpca.8b10789},
journal = {Journal of Physical Chemistry. A, Molecules, Spectroscopy, Kinetics, Environment, and General Theory},
number = 10,
volume = 123,
place = {United States},
year = {2019},
month = {2}
}