OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Self-Evolving Machine: A Continuously Improving Model for Molecular Thermochemistry

Journal Article · Journal of Physical Chemistry. A, Molecules, Spectroscopy, Kinetics, Environment, and General Theory

Because collecting precise and accurate chemistry data is often challenging, chemistry data sets usually only span a small region of chemical space, which limits the performance and the scope of applicability of data-driven models. To address this issue, we integrated an active learning machine with automatic ab initio calculations to form a self-evolving model that can continuously adapt to new species appointed by the users. In the present work, we demonstrate the self-evolving concept by modeling the formation enthalpies of stable closed-shell polycyclic species calculated at the B3LYP/6-31G(2df,p) level of theory. By combining a molecular graph convolutional neural network with a dropout training strategy, the model we developed can predict density functional theory (DFT) enthalpies for a broad range of polycyclic species and assess the quality of each predicted value. For the species which the current model is uncertain about, the automatic ab initio calculations provide additional training data to improve the performance of the model. For a test set composed of 2858 cyclic and polycyclic hydrocarbons and oxygenates, the enthalpies predicted by the model agree with the reference DFT values with a root-mean-square error of 2.62 kcal/mol. Finally, we found that a model originally trained on hydrocarbons and oxygenates can broaden its prediction coverage to nitrogen-containing species via an active learning process, suggesting that the continuous learning strategy is not only able to improve the model accuracy but is also capable of expanding the predictive capacity of a model to unseen species domains.
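The loop described in the abstract (predict, estimate uncertainty, send uncertain species to a reference calculation, retrain) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy linear model stands in for the graph convolutional network, and the names `predict_with_dropout`, `select_uncertain`, `active_learning_round`, `featurize`, `run_reference`, and `fit` are all hypothetical. The only ideas taken from the abstract are that repeated dropout passes yield a spread that flags uncertain predictions, that flagged species receive new reference data, and that accuracy is reported as a root-mean-square error.

```python
import math
import random
import statistics


def predict_with_dropout(weights, features, drop_prob=0.2, n_passes=50, seed=0):
    """Return (mean, std) of a toy linear model's output over random dropout masks.

    Assumption: a linear model stands in for the neural network in the abstract.
    """
    rng = random.Random(seed)
    keep = 1.0 - drop_prob
    samples = []
    for _ in range(n_passes):
        # Drop each weight with probability drop_prob; rescale survivors by 1/keep.
        total = sum(
            w * f / keep
            for w, f in zip(weights, features)
            if rng.random() < keep
        )
        samples.append(total)
    return statistics.fmean(samples), statistics.pstdev(samples)


def select_uncertain(pool, weights, featurize, threshold):
    """Return the species whose predictive spread exceeds the threshold."""
    picked = []
    for species in pool:
        _, std = predict_with_dropout(weights, featurize(species))
        if std > threshold:
            picked.append(species)
    return picked


def active_learning_round(pool, training_set, weights, featurize,
                          run_reference, fit, threshold):
    """One round: flag uncertain species, label them, and retrain if needed.

    run_reference is a hypothetical stand-in for an automatic ab initio job;
    fit is a hypothetical retraining routine returning new weights.
    """
    uncertain = select_uncertain(pool, weights, featurize, threshold)
    for species in uncertain:
        training_set[species] = run_reference(species)
    if uncertain:
        weights = fit(training_set)
    return weights, uncertain


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference values."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

In this sketch the threshold controls how much new reference data each round requests: a low threshold sends most of the pool to the reference calculation, while a high one leaves the model unchanged.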

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1530407
Journal Information:
Journal of Physical Chemistry. A, Molecules, Spectroscopy, Kinetics, Environment, and General Theory, Vol. 123, Issue 10; ISSN 1089-5639
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 39 works
Citation information provided by Web of Science

