Predicting the Effect of Single and Multiple Mutations on Protein Structural Stability
Abstract
Predicting how a point mutation alters a protein’s stability can guide pharmaceutical drug design initiatives which aim to counter the effects of serious diseases. Conducting mutagenesis studies in physical proteins can give insights about the effects of amino acid substitutions, but such wet-lab work is prohibitive due to the time as well as financial resources needed to assess the effect of even a single amino acid substitution. Computational methods for predicting the effects of a mutation on a protein structure can complement wet-lab work, and varying approaches are available with promising accuracy rates. In this work we compare and assess the utility of several machine learning methods and their ability to predict the effects of single and double mutations. We in silico generate mutant protein structures, and compute several rigidity metrics for each of them. We use these as features for our Support Vector Regression (SVR), Random Forest (RF), and Deep Neural Network (DNN) methods. We validate the predictions of our in silico mutations against experimental ΔΔG stability data, and attain Pearson Correlation values upwards of 0.71 for single mutations, and 0.81 for double mutations. We perform ablation studies to assess which features contribute most to a model’s success, and also introduce a voting scheme to synthesize a single prediction from the individual predictions of the three models.
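The evaluation described in the abstract (scoring each model's ΔΔG predictions against experimental values with the Pearson correlation, then combining the three models into one prediction) can be illustrated with a minimal sketch. The record does not specify how the voting scheme works, so the unweighted average used here is an assumption, as are the function names `pearson` and `vote`; the numbers are made up for illustration and are not data from the paper.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def vote(predictions_by_model):
    """Combine per-model predictions by unweighted averaging.

    ASSUMPTION: plain averaging stands in for the paper's voting scheme,
    whose details are not given in this record.
    """
    return [mean(column) for column in zip(*predictions_by_model.values())]


# Made-up ddG values (kcal/mol) for five hypothetical mutants.
experimental = [-1.2, 0.3, -2.5, 0.8, -0.4]
predictions = {
    "svr": [-1.0, 0.1, -2.0, 0.5, -0.6],
    "rf": [-0.8, 0.4, -2.2, 0.9, -0.1],
    "dnn": [-1.5, 0.2, -2.8, 0.6, -0.5],
}

combined = vote(predictions)
for name, preds in predictions.items():
    print(f"{name}: r = {pearson(preds, experimental):.3f}")
print(f"vote: r = {pearson(combined, experimental):.3f}")
```

In practice the per-model predictions would come from trained SVR, RF, and DNN regressors on held-out mutants; a weighted combination fitted on validation data is another plausible design.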
- Authors:
- Univ. of Massachusetts Boston, MA (United States). Dept. of Computer Science
- Western Washington Univ., Bellingham, WA (United States). Dept. of Computer Science
- Western Washington Univ., Bellingham, WA (United States). Dept. of Computer Science; Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Computing and Analytics Division
- Publication Date:
- 2018-01-27
- Research Org.:
- Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1628486
- Grant/Contract Number:
- AC05-76RL01830
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Molecules
- Additional Journal Information:
- Journal Volume: 23; Journal Issue: 2; Journal ID: ISSN 1420-3049
- Publisher:
- MDPI
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY; 59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; Biochemistry & Molecular Biology; Chemistry; machine learning; protein mutational study; SVR; RF; DNN; rigidity analysis
Citation Formats
Dehghanpoor, Ramin, Ricks, Evan, Hursh, Katie, Gunderson, Sarah, Farhoodi, Roshanak, Haspel, Nurit, Hutchinson, Brian, and Jagodzinski, Filip. Predicting the Effect of Single and Multiple Mutations on Protein Structural Stability. United States: N. p., 2018.
Web. doi:10.3390/molecules23020251.
Dehghanpoor, Ramin, Ricks, Evan, Hursh, Katie, Gunderson, Sarah, Farhoodi, Roshanak, Haspel, Nurit, Hutchinson, Brian, & Jagodzinski, Filip. Predicting the Effect of Single and Multiple Mutations on Protein Structural Stability. United States. https://doi.org/10.3390/molecules23020251
Dehghanpoor, Ramin, Ricks, Evan, Hursh, Katie, Gunderson, Sarah, Farhoodi, Roshanak, Haspel, Nurit, Hutchinson, Brian, and Jagodzinski, Filip. 2018.
"Predicting the Effect of Single and Multiple Mutations on Protein Structural Stability". United States. https://doi.org/10.3390/molecules23020251. https://www.osti.gov/servlets/purl/1628486.
@article{osti_1628486,
title = {Predicting the Effect of Single and Multiple Mutations on Protein Structural Stability},
author = {Dehghanpoor, Ramin and Ricks, Evan and Hursh, Katie and Gunderson, Sarah and Farhoodi, Roshanak and Haspel, Nurit and Hutchinson, Brian and Jagodzinski, Filip},
abstractNote = {Predicting how a point mutation alters a protein’s stability can guide pharmaceutical drug design initiatives which aim to counter the effects of serious diseases. Conducting mutagenesis studies in physical proteins can give insights about the effects of amino acid substitutions, but such wet-lab work is prohibitive due to the time as well as financial resources needed to assess the effect of even a single amino acid substitution. Computational methods for predicting the effects of a mutation on a protein structure can complement wet-lab work, and varying approaches are available with promising accuracy rates. In this work we compare and assess the utility of several machine learning methods and their ability to predict the effects of single and double mutations. We in silico generate mutant protein structures, and compute several rigidity metrics for each of them. We use these as features for our Support Vector Regression (SVR), Random Forest (RF), and Deep Neural Network (DNN) methods. We validate the predictions of our in silico mutations against experimental ΔΔG stability data, and attain Pearson Correlation values upwards of 0.71 for single mutations, and 0.81 for double mutations. We perform ablation studies to assess which features contribute most to a model’s success, and also introduce a voting scheme to synthesize a single prediction from the individual predictions of the three models.},
doi = {10.3390/molecules23020251},
journal = {Molecules},
number = 2,
volume = 23,
place = {United States},
year = {2018},
month = jan
}
Works referenced in this record:
Hydrophobic stabilization in T4 lysozyme determined directly by multiple substitutions of Ile 3
journal, August 1988
- Matsumura, Masazumi; Becktel, Wayne J.; Matthews, Brian W.
- Nature, Vol. 334, Issue 6181
Contributions of hydrogen bonds of Thr 157 to the thermodynamic stability of phage T4 lysozyme
journal, November 1987
- Alber, Tom; Dao-pin, Sun; Wilson, Keith
- Nature, Vol. 330, Issue 6143
Predicting protein stability changes upon mutation using database-derived potentials: solvent accessibility determines the importance of local versus non-local interactions along the sequence
journal, September 1997
- Gilis, Dimitri; Rooman, Marianne
- Journal of Molecular Biology, Vol. 272, Issue 2
Improved prediction of protein side-chain conformations with SCWRL4
journal, December 2009
- Krivov, Georgii G.; Shapovalov, Maxim V.; Dunbrack, Roland L.
- Proteins: Structure, Function, and Bioinformatics, Vol. 77, Issue 4
Subsemble: an ensemble method for combining subset-specific algorithm fits
journal, November 2013
- Sapp, Stephanie; van der Laan, Mark J.; Canny, John
- Journal of Applied Statistics, Vol. 41, Issue 6
Predicting the Effect of Mutations on Protein-Protein Binding Interactions through Structure-Based Interface Profiles
journal, October 2015
- Brender, Jeffrey R.; Zhang, Yang
- PLOS Computational Biology, Vol. 11, Issue 10
Dissection of helix capping in T4 lysozyme by structural and thermodynamic analysis of six amino acid substitutions at Thr 59
journal, April 1992
- Bell, Jeffrey A.; Becktel, Wayne J.; Sauer, Uwe
- Biochemistry, Vol. 31, Issue 14
Scalable molecular dynamics with NAMD
journal, January 2005
- Phillips, James C.; Braun, Rosemary; Wang, Wei
- Journal of Computational Chemistry, Vol. 26, Issue 16, p. 1781-1802
Prediction of protein stability changes for single-site mutations using support vector machines
journal, December 2005
- Cheng, Jianlin; Randall, Arlo; Baldi, Pierre
- Proteins: Structure, Function, and Bioinformatics, Vol. 62, Issue 4
Response of a protein structure to cavity-creating mutations and its relation to the hydrophobic effect
journal, January 1992
- Eriksson, A.; Baase, W.; Zhang, X.
- Science, Vol. 255, Issue 5041, p. 178-183
Prediction of the stability of protein mutants based on structural environment-dependent amino acid substitution and propensity tables
journal, January 1997
- Topham, C. M.; Srinivasan, N.; Blundell, T. L.
- Protein Engineering Design and Selection, Vol. 10, Issue 1
Tertiary templates for proteins
journal, February 1987
- Ponder, Jay W.; Richards, Frederic M.
- Journal of Molecular Biology, Vol. 193, Issue 4
Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools
journal, September 2015
- Jia, Lei; Yarlagadda, Ramya; Reed, Charles C.
- PLOS ONE, Vol. 10, Issue 9
Combining Estimates in Regression and Classification
journal, December 1996
- Leblanc, Michael; Tibshirani, Robert
- Journal of the American Statistical Association, Vol. 91, Issue 436
ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions
journal, January 2006
- Kumar, M. D. S.
- Nucleic Acids Research, Vol. 34, Issue 90001
Contributions of left-handed helical residues to the structure and stability of bacteriophage T4 lysozyme
journal, November 1989
- Nicholson, H.; Söderlind, E.; Tronrud, D. E.
- Journal of Molecular Biology, Vol. 210, Issue 1
Contributions of all 20 amino acids at site 96 to the stability and structure of T4 lysozyme
journal, May 2009
- Mooers, Blaine H. M.; Baase, Walter A.; Wray, Jonathan W.
- Protein Science, Vol. 18, Issue 5
LIBSVM: A library for support vector machines
journal, April 2011
- Chang, Chih-Chung; Lin, Chih-Jen
- ACM Transactions on Intelligent Systems and Technology, Vol. 2, Issue 3
Contribution of the hydrophobic effect to protein stability: analysis based on simulations of the Ile-96→Ala mutation in barnase
journal, December 1991
- Prevost, M.; Wodak, S. J.; Tidor, B.
- Proceedings of the National Academy of Sciences, Vol. 88, Issue 23
Conformation of amino acid side-chains in proteins
journal, November 1978
- Janin, Joël; Wodak, Shoshanna; Levitt, Michael
- Journal of Molecular Biology, Vol. 125, Issue 3
Super Learner
journal, January 2007
- van der Laan, Mark J.; Polley, Eric C.; Hubbard, Alan E.
- Statistical Applications in Genetics and Molecular Biology, Vol. 6, Issue 1
Protein flexibility predictions using graph theory
journal, January 2001
- Jacobs, Donald J.; Rader, A. J.; Kuhn, Leslie A.
- Proteins: Structure, Function, and Genetics, Vol. 44, Issue 2
Accurate prediction of the stability and activity effects of site-directed mutagenesis on a protein core
journal, August 1991
- Lee, Christopher; Levitt, Michael
- Nature, Vol. 352, Issue 6334
SDM--a server for predicting effects of mutations on protein stability and malfunction
journal, May 2011
- Worth, C. L.; Preissner, R.; Blundell, T. L.
- Nucleic Acids Research, Vol. 39, Issue suppl
KINARI-Web: a server for protein rigidity analysis
journal, June 2011
- Fox, N.; Jagodzinski, F.; Li, Y.
- Nucleic Acids Research, Vol. 39, Issue suppl
Conformational analysis of the backbone-dependent rotamer preferences of protein sidechains
journal, May 1994
- Dunbrack, Roland L.; Karplus, Martin
- Nature Structural & Molecular Biology, Vol. 1, Issue 5
Exploiting the Link between Protein Rigidity and Thermostability for Data-Driven Protein Engineering
journal, October 2008
- Radestock, S.; Gohlke, H.
- Engineering in Life Sciences, Vol. 8, Issue 5
HPSLPred: An Ensemble Multi-Label Classifier for Human Protein Subcellular Location Prediction with Imbalanced Source
journal, September 2017
- Wan, Shixiang; Duan, Yucong; Zou, Quan
- PROTEOMICS, Vol. 17, Issue 17-18
PROTS-RF: A Robust Model for Predicting Mutation-Induced Protein Stability Changes
journal, October 2012
- Li, Yunqi; Fang, Jianwen
- PLoS ONE, Vol. 7, Issue 10
The effect of acidic pH on the adsorption and lytic activity of the peptides Polybia-MP1 and its histidine-containing analog in anionic lipid membrane: a biophysical study by molecular dynamics and spectroscopy
journal, April 2021
- Martins, Ingrid Bernardes Santana; Viegas, Taisa Giordano; dos Santos Alvares, Dayane
- Amino Acids, Vol. 53, Issue 5
Structural basis of Fabry disease
journal, September 2002
- Garman, Scott C.; Garboczi, David N.
- Molecular Genetics and Metabolism, Vol. 77, Issue 1-2
Fast Prediction of Protein Methylation Sites Using a Sequence-Based Feature Selection Technique
journal, July 2019
- Wei, Leyi; Xing, Pengwei; Shi, Gaotao
- IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 16, Issue 4
PhosPred-RF: A Novel Sequence-Based Predictor for Phosphorylation Sites Using Sequential Information Only
journal, June 2017
- Wei, Leyi; Xing, Pengwei; Tang, Jijun
- IEEE Transactions on NanoBioscience, Vol. 16, Issue 4
Using Rigidity Analysis to Probe Mutation-Induced Structural Changes in Proteins
journal, June 2012
- Jagodzinski, Filip; Hardy, Jeanne; Streinu, Ileana
- Journal of Bioinformatics and Computational Biology, Vol. 10, Issue 03
ProMuteHT
conference, August 2017
- Andersson, Erik; Jagodzinski, Filip
- Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
A conservation and rigidity based method for detecting critical protein residues
journal, January 2013
- Akbal-Delibas, Bahar; Jagodzinski, Filip; Haspel, Nurit
- BMC Structural Biology, Vol. 13, Issue Suppl 1
Works referencing / citing this record:
Robust Prediction of Single and Multiple Point Protein Mutations Stability Changes
journal, December 2019
- Álvarez-Machancoses, Óscar; De Andrés-Galiana, Enrique J.; Fernández-Martínez, Juan Luis
- Biomolecules, Vol. 10, Issue 1
Amino-Acid Network Clique Analysis of Protein Mutation Non-Additive Effects: A Case Study of Lysozme
journal, May 2018
- Ming, Dengming; Chen, Rui; Huang, He
- International Journal of Molecular Sciences, Vol. 19, Issue 5
Evaluating Protein Engineering Thermostability Prediction Tools Using an Independently Generated Dataset
journal, March 2020
- Huang, Peishan; Chu, Simon K. S.; Frizzo, Henrique N.
- ACS Omega, Vol. 5, Issue 12
PETRA: Drug Engineering via Rigidity Analysis
journal, March 2020
- Herr, Sam; Myers-Dean, Josh; Read, Hunter
- Molecules, Vol. 25, Issue 6