OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Bayesian model aggregation for ensemble-based estimates of protein pKa values

Journal Article · Proteins. Structure, Function, and Bioinformatics, 82(3):354-363
DOI: https://doi.org/10.1002/prot.24390 · OSTI ID: 1129305

This paper investigates an ensemble-based technique called Bayesian Model Averaging (BMA) to improve the performance of protein amino acid pKa predictions. Structure-based pKa calculations play an important role in the mechanistic interpretation of protein structure and are also used to determine a wide range of protein properties. A diverse set of methods currently exists for pKa prediction, ranging from empirical statistical models to ab initio quantum mechanical approaches. However, each of these methods is based on a set of assumptions with inherent biases and sensitivities that can affect a model's accuracy and generalizability for pKa prediction in complicated biomolecular systems. We use BMA to combine eleven diverse prediction methods that each estimate pKa values of amino acids in staphylococcal nuclease. These methods are based on work conducted for the pKa Cooperative, and the pKa measurements are based on experimental work conducted by the García-Moreno lab. Our study demonstrates that the aggregated estimate obtained from BMA outperforms all individual prediction methods in our cross-validation study, with improvements of 40-70% over other method classes. This work illustrates a new possible mechanism for improving the accuracy of pKa prediction and lays the foundation for future work on aggregate models that balance computational cost with prediction accuracy.
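
The abstract does not spell out how the BMA weights are computed, so the following is only a minimal sketch of the general idea, not the paper's procedure. It assumes equal prior probability for every method, Gaussian errors with a per-method variance estimated from the training residuals, and a weighted mean as the aggregated estimate. The function names (`bma_weights`, `bma_predict`) and all pKa numbers are hypothetical and chosen purely for illustration; three made-up methods stand in for the eleven in the study.

```python
import numpy as np

def bma_weights(predictions, observed):
    """Approximate posterior model weights under equal priors.

    predictions: array of shape (n_methods, n_sites), one row per method.
    observed:    array of shape (n_sites,), measured pKa values.
    Assumption (not from the source): Gaussian errors, with each method's
    variance set to its maximum-likelihood estimate on these sites.
    """
    predictions = np.asarray(predictions, dtype=float)
    observed = np.asarray(observed, dtype=float)
    n_sites = observed.size
    residuals = predictions - observed
    # Per-method ML variance, floored to avoid log(0) for a perfect fit.
    sigma2 = np.maximum(np.mean(residuals ** 2, axis=1), 1e-12)
    # Gaussian log-likelihood evaluated at the ML variance.
    log_lik = -0.5 * n_sites * (np.log(2.0 * np.pi * sigma2) + 1.0)
    # Normalize in log space for numerical stability.
    unnormalized = np.exp(log_lik - log_lik.max())
    return unnormalized / unnormalized.sum()

def bma_predict(weights, predictions):
    """Weighted mean of the methods' predictions (the aggregated estimate)."""
    return np.asarray(weights) @ np.asarray(predictions, dtype=float)

# Synthetic, made-up training data: three methods, four titratable sites.
observed = np.array([4.0, 6.5, 10.2, 3.8])
train_predictions = np.array([
    [4.2, 6.3, 10.0, 4.0],   # hypothetical method A, errors of 0.2
    [5.0, 7.5, 9.2, 2.8],    # hypothetical method B, errors of 1.0
    [4.5, 6.0, 10.7, 3.3],   # hypothetical method C, errors of 0.5
])
weights = bma_weights(train_predictions, observed)

# Aggregate the three methods' predictions for one new (made-up) site.
new_site_predictions = np.array([7.1, 8.0, 6.8])
combined = bma_predict(weights, new_site_predictions)
print(weights, combined)
```

Under these assumptions each weight is proportional to the method's error variance raised to the power of minus half the number of sites, so a method with consistently small residuals dominates the aggregate. In a cross-validation setting such as the one the abstract mentions, the weights would be fit on training sites only and evaluated on held-out sites.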

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1129305
Report Number(s):
PNNL-SA-95333; 400412000
Journal Information:
Proteins. Structure, Function, and Bioinformatics, 82(3):354-363
Country of Publication:
United States
Language:
English
