OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Markov state models from short non-equilibrium simulations—Analysis and correction of estimation bias

Journal Article · Journal of Chemical Physics
DOI: https://doi.org/10.1063/1.4976518 · OSTI ID: 1565592

Many state-of-the-art methods for the thermodynamic and kinetic characterization of large and complex biomolecular systems by simulation rely on ensemble approaches, where data from large numbers of relatively short trajectories are integrated. In this context, Markov state models (MSMs) are extremely popular because they can be used to compute stationary quantities and long-time kinetics from ensembles of short simulations, provided that these short simulations are in “local equilibrium” within the MSM states. However, in the 15 years since the inception of MSMs, it has remained controversial and unresolved how deviations from local equilibrium can be detected, whether these deviations induce a practical bias in MSM estimation, and how to correct for them. In this paper, we address these issues: We systematically analyze the estimation of MSMs from short non-equilibrium simulations, and we provide an expression for the error between unbiased transition probabilities and the expected estimate from many short simulations. We show that the unbiased MSM estimate can be obtained even from relatively short non-equilibrium simulations in the limit of long lag times and good discretization. Further, we exploit observable operator model (OOM) theory to derive an unbiased estimator for the MSM transition matrix that corrects for the effect of starting out of equilibrium, even when short lag times are used. Finally, we show how the OOM framework can be used to estimate the exact eigenvalues or relaxation time scales of the system without estimating an MSM transition matrix, which allows us to practically assess the discretization quality of the MSM. Applications to model systems and molecular dynamics simulation data of alanine dipeptide are included for illustration. The improved MSM estimator is implemented in PyEMMA as of version 2.3.
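The local-equilibrium bias described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's error expression or its OOM estimator. It assumes a hypothetical three-microstate Markov chain `P`, lumped into two MSM states A = {0, 1} and B = {2}, and computes the expected sliding-window estimate of the A→B transition probability for an infinite ensemble of short trajectories. All names (`P`, `SET_A`, `SET_B`, `expected_t_ab`, and so on) are assumptions made for this example. The estimate from trajectories started in microstate 0 is compared with the estimate from trajectories started in the stationary distribution, which serves as the unbiased reference.

```python
# Toy illustration only: the 3-microstate chain, the A/B lumping, and all
# function names are assumptions made for this sketch, not taken from the paper.

def propagate(rho, p):
    """One step of the chain: row vector rho times transition matrix p."""
    n = len(p)
    return [sum(rho[i] * p[i][j] for i in range(n)) for j in range(n)]


def matpow(p, k):
    """k-th power of a square matrix by repeated multiplication."""
    n = len(p)
    out = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        out = [propagate(row, p) for row in out]
    return out


def expected_t_ab(p, rho0, lag, n_steps, set_a, set_b):
    """Expected sliding-window estimate of the A->B transition probability.

    Models an infinite ensemble of trajectories of n_steps steps, all started
    from the microstate distribution rho0, with window starts
    t = 0 .. n_steps - lag.
    """
    p_lag = matpow(p, lag)
    rho = list(rho0)
    from_a = 0.0  # expected number of windows that start in A
    a_to_b = 0.0  # expected number of those windows that end in B
    for _ in range(n_steps - lag + 1):
        for i in set_a:
            from_a += rho[i]
            a_to_b += rho[i] * sum(p_lag[i][j] for j in set_b)
        rho = propagate(rho, p)
    return a_to_b / from_a


# Hypothetical microstate dynamics: 0 <-> 1 exchange is faster than 1 <-> 2.
P = [
    [0.90, 0.10, 0.00],
    [0.10, 0.85, 0.05],
    [0.00, 0.05, 0.95],
]
SET_A = (0, 1)  # MSM state A lumps microstates 0 and 1
SET_B = (2,)    # MSM state B is microstate 2
# P is symmetric, hence doubly stochastic, so its stationary distribution is uniform.
PI = [1.0 / 3.0] * 3
START_0 = [1.0, 0.0, 0.0]  # every trajectory starts in microstate 0

if __name__ == "__main__":
    for lag in (1, 5, 20, 50):
        ref = expected_t_ab(P, PI, lag, lag, SET_A, SET_B)
        est = expected_t_ab(P, START_0, lag, lag, SET_A, SET_B)
        rel = abs(est - ref) / ref
        print(f"lag={lag:3d}  reference={ref:.4f}  "
              f"from microstate 0={est:.4f}  relative error={rel:.2f}")
```

In this toy model, trajectories started in microstate 0 cannot reach B within one step, so the lag-1 estimate of the A→B probability is zero even though the equilibrium-weighted value is not. As the lag time grows past the fast relaxation inside A, the relative error shrinks, in line with the abstract's statement about the long-lag-time limit.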

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization:
USDOE Office of Science (SC)
OSTI ID:
1565592
Journal Information:
Journal of Chemical Physics, Vol. 146, Issue 9; ISSN 0021-9606
Publisher:
American Institute of Physics (AIP)
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 42 works
Citation information provided by
Web of Science

