OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Unsupervised and Supervised Learning over the Energy Landscape for Protein Decoy Selection

Journal Article · Biomolecules
DOI: https://doi.org/10.3390/biom9100607 · OSTI ID: 1604000

The energy landscape that organizes microstates of a molecular system and governs the underlying molecular dynamics exposes the relationship between molecular form/structure, changes to form, and biological activity or function in the cell. However, several challenges stand in the way of leveraging energy landscapes to relate structure and structural dynamics to function. Energy landscapes are high-dimensional, multi-modal, and often overly rugged. Deep wells or basins in them do not always correspond to stable structural states; some are instead artifacts of inherent inaccuracies in semi-empirical molecular energy functions. Because of these challenges, energetics is typically ignored in computational approaches to long-standing central questions in computational biology, such as protein decoy selection. In decoy selection, the goal is to identify, among a possibly large number of computationally generated three-dimensional structures of a protein, those that are biologically active (native). In recent work, we have refocused our attention on the protein energy landscape and its role in advancing decoy selection. Here, we summarize some of our successes so far in this direction via unsupervised learning. More importantly, we further advance the argument that the energy landscape holds valuable information for advancing the state of protein decoy selection via novel machine learning methodologies that leverage supervised learning. Our focus in this article is on decoy selection for the purpose of a rigorous, quantitative evaluation of how leveraging protein energy landscapes advances an important problem in protein modeling. However, the ideas and concepts presented here are generally useful for studies aiming to relate molecular structure and structural dynamics to function.

Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA); National Science Foundation (NSF); Jeffress Memorial Trust
Grant/Contract Number:
89233218CNA000001; 1900061; 1821154; 20160317ER; AC52-06NA25396
OSTI ID:
1604000
Report Number(s):
LA-UR-19-27828; BIOMHC
Journal Information:
Biomolecules, Vol. 9, Issue 10; Related Information: This article belongs to the Special Issue An Energy Landscape Perspective of Protein Structure Prediction and Analysis.; ISSN 2218-273X
Publisher:
MDPI
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 4 works (citation information provided by Web of Science)
