Learning reduced kinetic Monte Carlo models of complex chemistry from molecular dynamics
Here, we propose a novel statistical learning framework for automatically and efficiently building reduced kinetic Monte Carlo (KMC) models of large-scale elementary reaction networks from data generated by a single or few molecular dynamics (MD) simulations. Existing approaches for identifying species and reactions from molecular dynamics typically use bond length and duration criteria, where bond duration is a fixed parameter motivated by an understanding of bond vibrational frequencies. Conversely, we show that for highly reactive systems, bond duration should be a model parameter that is chosen to maximize the predictive power of the resulting statistical model. We demonstrate our method on a high-temperature, high-pressure system of reacting liquid methane, and show that the learned KMC model is able to extrapolate more than an order of magnitude in time for key molecules. Additionally, our KMC model of elementary reactions enables us to isolate the most important set of reactions governing the behavior of key molecules found in the MD simulation. We develop a new data-driven algorithm to reduce the chemical reaction network, which can be solved either as an integer program or efficiently using L1 regularization, and compare our results with simple count-based reduction. For our liquid methane system, we discover that rare reactions do not play a significant role in the system, and find that less than 7% of the approximately 2000 reactions observed from molecular dynamics are necessary to reproduce the molecular concentration over time of methane. Furthermore, we describe a framework in this work that paves the way towards a genomic approach to studying complex chemical systems, where expensive MD simulation data can be reused to contribute to an increasingly large and accurate genome of elementary reactions and rates.
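The abstract refers to a KMC model whose reaction rates are learned from MD data. As a rough illustration only, and not the authors' implementation, the sketch below shows a standard event-count rate estimate and a basic Gillespie-style KMC loop. The function names `estimate_rate` and `gillespie`, the two-species network A <-> B, and all numerical values are hypothetical and chosen purely for demonstration.

```python
import random


def estimate_rate(event_count, exposure):
    """Maximum-likelihood rate estimate for a Poisson process: the number of
    observed reaction events divided by the time-integrated number of
    reactant combinations available (the 'exposure'). Assumed estimator."""
    return event_count / exposure


def gillespie(counts, reactions, t_end, rng):
    """Simulate a well-mixed reaction network with the Gillespie algorithm.

    counts    : dict mapping species name -> molecule count
    reactions : list of (rate_constant, reactants, products), where reactants
                and products are tuples of species names
    t_end     : stop once the next event would occur after this time
    rng       : a random.Random instance (passed in for reproducibility)
    Returns a list of (time, counts_snapshot) pairs.
    """
    counts = dict(counts)
    t = 0.0
    trajectory = [(t, dict(counts))]
    while True:
        # Propensity = rate constant times the product of reactant counts.
        # This is exact for distinct reactants; it is a simplification for
        # reactions with repeated reactant species.
        propensities = []
        for k, reactants, _ in reactions:
            a = k
            for species in reactants:
                a *= counts[species]
            propensities.append(a)
        total = sum(propensities)
        if total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional
        # to its propensity.
        _, reactants, products = rng.choices(reactions, weights=propensities, k=1)[0]
        for species in reactants:
            counts[species] -= 1
        for species in products:
            counts[species] = counts.get(species, 0) + 1
        trajectory.append((t, dict(counts)))
    return trajectory


if __name__ == "__main__":
    # Hypothetical event counts and exposures, for illustration only.
    k_forward = estimate_rate(event_count=50, exposure=25.0)
    k_backward = estimate_rate(event_count=10, exposure=20.0)
    network = [(k_forward, ("A",), ("B",)), (k_backward, ("B",), ("A",))]
    run = gillespie({"A": 10, "B": 0}, network, t_end=5.0, rng=random.Random(0))
    print(len(run), run[-1])
```

In a sketch like this, the rate constants are the only quantities taken from MD, so a reduced model amounts to running the same loop over a shorter `reactions` list.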
 Authors:
 Yang, Qian [1]; Sing-Long, Carlos A. [2]; Reed, Evan J. [3]
 [1] Stanford Univ., CA (United States). Inst. for Computational and Mathematical Engineering
 [2] Pontifical Catholic Univ. of Chile, Santiago (Chile). Mathematical and Computational Engineering
 [3] Stanford Univ., CA (United States). Inst. for Computational and Mathematical Engineering and Dept. of Materials Science and Engineering
 Publication Date:
 Grant/Contract Number:
 NA0002007
 Type:
 Accepted Manuscript
 Journal Name:
 Chemical Science
 Additional Journal Information:
 Journal Volume: 679; Journal ID: ISSN 2041-6520
 Publisher:
 Royal Society of Chemistry
 Research Org:
 Washington State Univ., Pullman, WA (United States)
 Sponsoring Org:
 USDOE National Nuclear Security Administration (NNSA)
 Country of Publication:
 United States
 Language:
 English
 Subject:
 37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY
 OSTI Identifier:
 1367888
Yang, Qian, Sing-Long, Carlos A., and Reed, Evan J. Learning reduced kinetic Monte Carlo models of complex chemistry from molecular dynamics. United States: N. p., 2017. Web. doi:10.1039/c7sc01052d.
Yang, Qian, Sing-Long, Carlos A., & Reed, Evan J. Learning reduced kinetic Monte Carlo models of complex chemistry from molecular dynamics. United States. doi:10.1039/c7sc01052d.
Yang, Qian, Sing-Long, Carlos A., and Reed, Evan J. 2017.
"Learning reduced kinetic Monte Carlo models of complex chemistry from molecular dynamics". United States.
doi:10.1039/c7sc01052d. https://www.osti.gov/servlets/purl/1367888.
@article{osti_1367888,
title = {Learning reduced kinetic Monte Carlo models of complex chemistry from molecular dynamics},
author = {Yang, Qian and Sing-Long, Carlos A. and Reed, Evan J.},
abstractNote = {Here, we propose a novel statistical learning framework for automatically and efficiently building reduced kinetic Monte Carlo (KMC) models of large-scale elementary reaction networks from data generated by a single or few molecular dynamics (MD) simulations. Existing approaches for identifying species and reactions from molecular dynamics typically use bond length and duration criteria, where bond duration is a fixed parameter motivated by an understanding of bond vibrational frequencies. Conversely, we show that for highly reactive systems, bond duration should be a model parameter that is chosen to maximize the predictive power of the resulting statistical model. We demonstrate our method on a high-temperature, high-pressure system of reacting liquid methane, and show that the learned KMC model is able to extrapolate more than an order of magnitude in time for key molecules. Additionally, our KMC model of elementary reactions enables us to isolate the most important set of reactions governing the behavior of key molecules found in the MD simulation. We develop a new data-driven algorithm to reduce the chemical reaction network, which can be solved either as an integer program or efficiently using L1 regularization, and compare our results with simple count-based reduction. For our liquid methane system, we discover that rare reactions do not play a significant role in the system, and find that less than 7% of the approximately 2000 reactions observed from molecular dynamics are necessary to reproduce the molecular concentration over time of methane. Furthermore, we describe a framework in this work that paves the way towards a genomic approach to studying complex chemical systems, where expensive MD simulation data can be reused to contribute to an increasingly large and accurate genome of elementary reactions and rates.},
doi = {10.1039/c7sc01052d},
journal = {Chemical Science},
number = {},
volume = 679,
place = {United States},
year = {2017},
month = {6}
}