DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Diagnostics of Data-Driven Models: Uncertainty Quantification of PM7 Semi-Empirical Quantum Chemical Method

Abstract

We report an evaluation of a semi-empirical quantum chemical method PM7 from the perspective of uncertainty quantification. Specifically, we apply Bound-to-Bound Data Collaboration, an uncertainty quantification framework, to characterize (a) variability of PM7 model parameter values consistent with the uncertainty in the training data and (b) uncertainty propagation from the training data to the model predictions. Experimental heats of formation of a homologous series of linear alkanes are used as the property of interest. The training data are chemically accurate, i.e., they have very low uncertainty by the standards of computational chemistry. The analysis does not find evidence of PM7 consistency with the entire data set considered as no single set of parameter values is found that captures the experimental uncertainties of all training data. A set of parameter values for PM7 was able to capture the training data within ±1 kcal/mol, but not to the smaller level of uncertainty in the reported data. Nevertheless, PM7 was found to be consistent for subsets of the training data. In such cases, uncertainty propagation from the chemically accurate training data to the predicted values preserves error within bounds of chemical accuracy if predictions are made for the molecules of comparable size. Otherwise, the error grows linearly with the relative size of the molecules.
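The two questions raised in the abstract (is there any parameter set consistent with all data bounds, and how do data bounds propagate to prediction bounds) can be illustrated with a minimal sketch. It is not the PM7 model and not the Bound-to-Bound Data Collaboration software: it assumes a hypothetical two-parameter linear model `y(n) = a + b*n` of a property versus chain length `n`, and every number in it is synthetic and chosen for illustration only. The function names `feasible_vertices` and `predict_interval` are likewise invented for this example.

```python
from itertools import combinations


def feasible_vertices(data, tol=1e-9):
    """Return vertices (a, b) of the set of parameters consistent with the data.

    data: list of (n, lo, hi), meaning lo <= a + b*n <= hi must hold.
    The feasible set is a convex polygon, so its vertices lie where two
    constraint boundaries (with different n) intersect.
    Assumes at least two distinct n values, so the polygon is bounded.
    """
    boundaries = [(n, c) for n, lo, hi in data for c in (lo, hi)]
    vertices = []
    for (n1, c1), (n2, c2) in combinations(boundaries, 2):
        if n1 == n2:
            continue  # parallel boundaries never intersect
        b = (c2 - c1) / (n2 - n1)
        a = c1 - b * n1
        # Keep the intersection only if it satisfies every data bound.
        if all(lo - tol <= a + b * n <= hi + tol for n, lo, hi in data):
            vertices.append((a, b))
    return vertices


def predict_interval(data, n_new):
    """Propagate data bounds to a prediction interval at n_new.

    Returns None when no parameter pair satisfies all bounds
    (the data set is inconsistent with the toy model).
    A linear function on a polygon attains its extremes at vertices.
    """
    vertices = feasible_vertices(data)
    if not vertices:
        return None
    values = [a + b * n_new for a, b in vertices]
    return min(values), max(values)


# Synthetic training bounds (NOT experimental values): centers on
# y = -10 - 5*n with a half-width of 0.5 for n = 1, 2, 3.
train = [(1, -15.5, -14.5), (2, -20.5, -19.5), (3, -25.5, -24.5)]

# Adding a synthetic bound that no straight line can meet makes the
# full data set inconsistent, while the subset above stays consistent.
inconsistent = train + [(4, -27.0, -26.5)]

for n_new in (2, 4, 10):
    print(n_new, predict_interval(train, n_new))
print("full set:", predict_interval(inconsistent, 4))
```

In this toy setting the predicted interval stays as tight as the data for a chain length inside the training range and widens steadily as the prediction moves farther from it. That is the qualitative behavior the abstract reports, here reproduced only for a linear stand-in model.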

Authors:
Oreluk, James; Liu, Zhenyuan; Hegde, Arun; Li, Wenyu; Packard, Andrew; Frenklach, Michael; Zubarev, Dmitry
Publication Date:
Research Org.:
Univ. of Utah, Salt Lake City, UT (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1467754
Alternate Identifier(s):
OSTI ID: 1545322
Grant/Contract Number:  
NA0002375
Resource Type:
Published Article
Journal Name:
Scientific Reports
Additional Journal Information:
Journal Name: Scientific Reports; Journal Volume: 8; Journal Issue: 1; Journal ID: ISSN 2045-2322
Publisher:
Nature Publishing Group
Country of Publication:
United Kingdom
Language:
English
Subject:
42 ENGINEERING

Citation Formats

Oreluk, James, Liu, Zhenyuan, Hegde, Arun, Li, Wenyu, Packard, Andrew, Frenklach, Michael, and Zubarev, Dmitry. Diagnostics of Data-Driven Models: Uncertainty Quantification of PM7 Semi-Empirical Quantum Chemical Method. United Kingdom: N. p., 2018. Web. doi:10.1038/s41598-018-31677-y.
Oreluk, James, Liu, Zhenyuan, Hegde, Arun, Li, Wenyu, Packard, Andrew, Frenklach, Michael, & Zubarev, Dmitry. Diagnostics of Data-Driven Models: Uncertainty Quantification of PM7 Semi-Empirical Quantum Chemical Method. United Kingdom. doi:10.1038/s41598-018-31677-y.
Oreluk, James, Liu, Zhenyuan, Hegde, Arun, Li, Wenyu, Packard, Andrew, Frenklach, Michael, and Zubarev, Dmitry. Wed . "Diagnostics of Data-Driven Models: Uncertainty Quantification of PM7 Semi-Empirical Quantum Chemical Method". United Kingdom. doi:10.1038/s41598-018-31677-y.
@article{osti_1467754,
title = {Diagnostics of Data-Driven Models: Uncertainty Quantification of PM7 Semi-Empirical Quantum Chemical Method},
author = {Oreluk, James and Liu, Zhenyuan and Hegde, Arun and Li, Wenyu and Packard, Andrew and Frenklach, Michael and Zubarev, Dmitry},
abstractNote = {We report an evaluation of a semi-empirical quantum chemical method PM7 from the perspective of uncertainty quantification. Specifically, we apply Bound-to-Bound Data Collaboration, an uncertainty quantification framework, to characterize (a) variability of PM7 model parameter values consistent with the uncertainty in the training data and (b) uncertainty propagation from the training data to the model predictions. Experimental heats of formation of a homologous series of linear alkanes are used as the property of interest. The training data are chemically accurate, i.e., they have very low uncertainty by the standards of computational chemistry. The analysis does not find evidence of PM7 consistency with the entire data set considered as no single set of parameter values is found that captures the experimental uncertainties of all training data. A set of parameter values for PM7 was able to capture the training data within ±1 kcal/mol, but not to the smaller level of uncertainty in the reported data. Nevertheless, PM7 was found to be consistent for subsets of the training data. In such cases, uncertainty propagation from the chemically accurate training data to the predicted values preserves error within bounds of chemical accuracy if predictions are made for the molecules of comparable size. Otherwise, the error grows linearly with the relative size of the molecules.},
doi = {10.1038/s41598-018-31677-y},
journal = {Scientific Reports},
number = {1},
volume = {8},
place = {United Kingdom},
year = {2018},
month = {9}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
DOI: 10.1038/s41598-018-31677-y
