DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Two-Fold Structural Classification Method for Determining the Accurate Ensemble of Protein Structures

Abstract

Atomic-level structural characterization of flexible proteins, such as intrinsically disordered proteins and multi-domain proteins connected by flexible linkers, is challenging as they possess distinct conformations in physiological conditions. Significant efforts have been made to develop integrated approaches by combining small angle neutron/X-ray scattering experiments with molecular simulations to reveal the distinct atomic structures and the corresponding populations for these flexible proteins. One widely used method, the basis-set supported ensemble method, classifies the simulation-generated protein conformations into a set of structural basis and then derives the corresponding populations by fitting to the experimental data. This method makes an implicit assumption that protein conformations of similar structures have similar small angle scattering profiles. The present work demonstrates that, for various protein systems ranging from compact globular proteins and flexible multidomain proteins through to intrinsically disordered proteins, this method provides inaccurate assessment of the structural ensemble of the protein molecules due to the breakdown of the assumption made. To alleviate this problem, a two-fold-clustering method is developed to cluster the simulation-generated protein structures using information on both 3D structure and scattering profiles. As benchmarked by both simulation and experimental results, this new method yields much more accurate populations of structural basis of protein molecules.
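
The two-fold clustering idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, the distance measures, or the order of the two passes, so everything below is an assumption made for illustration: a greedy "leader" clustering, Euclidean distances, a structural pass followed by a scattering-profile pass within each structural cluster, and the hypothetical names `leader_cluster`, `two_fold_cluster`, `struct_cutoff`, and `profile_cutoff`. The population fit against experimental data is omitted.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def leader_cluster(items, dist, cutoff):
    """Greedy leader clustering (an illustrative stand-in, not the paper's algorithm).

    Each item joins the first cluster whose leader lies within `cutoff`;
    otherwise it becomes the leader of a new cluster.
    """
    leaders, labels = [], []
    for i, item in enumerate(items):
        for c, lead in enumerate(leaders):
            if dist(items[lead], item) <= cutoff:
                labels.append(c)
                break
        else:
            leaders.append(i)
            labels.append(len(leaders) - 1)
    return labels


def two_fold_cluster(features, profiles, struct_cutoff, profile_cutoff):
    """Cluster by structure first, then split each cluster by scattering profile.

    `features` are per-conformation structural descriptors and `profiles`
    are per-conformation scattering intensities on a shared q-grid
    (both hypothetical inputs for this sketch).
    """
    struct_labels = leader_cluster(features, euclid, struct_cutoff)
    final = [None] * len(features)
    next_id = 0
    for s in sorted(set(struct_labels)):
        members = [i for i, lab in enumerate(struct_labels) if lab == s]
        sub = leader_cluster([profiles[i] for i in members], euclid, profile_cutoff)
        for i, lab in zip(members, sub):
            final[i] = next_id + lab
        next_id += max(sub) + 1
    return final


# Toy data: conformations 0 and 1 are structurally close but scatter differently;
# conformations 2 and 3 are close in both structure and scattering profile.
features = [[1.0, 0.0], [1.1, 0.0], [5.0, 0.0], [5.1, 0.0]]
profiles = [[1.0, 0.5, 0.1], [1.0, 0.9, 0.6], [1.0, 0.2, 0.05], [1.0, 0.21, 0.05]]

structure_only = leader_cluster(features, euclid, 0.5)
two_fold = two_fold_cluster(features, profiles, 0.5, 0.1)
print(structure_only, two_fold)
```

In this toy case, structure-only clustering merges conformations 0 and 1 even though their profiles differ, which is the failure mode the abstract attributes to the similar-structure, similar-profile assumption; the second pass separates them.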

Authors:
Tan, Pan [1]; Fu, Zuyue [2]; Petridis, Loukas [3]; Qian, Shuo [4]; You, Delin [5]; Wei, Dongqing [5]; Li, Jinglai [6]; Hong, Liang [1]
  1. Shanghai Jiao Tong Univ., Shanghai (China). School of Physics and Astronomy; Shanghai Jiao Tong Univ., Shanghai (China). Inst. of Natural Sciences
  2. Shanghai Jiao Tong Univ., Shanghai (China). Inst. of Natural Sciences; Shanghai Jiao Tong Univ., Shanghai (China). Zhiyuan College
  3. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Univ. of Tennessee, Knoxville, TN (United States). Dept. of Biochemistry, Cellular & Molecular Biology
  4. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  5. Shanghai Jiao Tong Univ., Shanghai (China). School of Life Science and Biotechnology
  6. Shanghai Jiao Tong Univ., Shanghai (China). Inst. of Natural Sciences; Univ. of Liverpool (United Kingdom). Dept. of Mathematical Sciences
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1495955
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
Communications in Computational Physics
Additional Journal Information:
Journal Volume: 25; Journal Issue: 4; Journal ID: ISSN 1815-2406
Publisher:
Global Science Press
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Protein structures; statistical data analysis; Monte Carlo; cluster analysis

Citation Formats

Tan, Pan, Fu, Zuyue, Petridis, Loukas, Qian, Shuo, You, Delin, Wei, Dongqing, Li, Jinglai, and Hong, Liang. A Two-Fold Structural Classification Method for Determining the Accurate Ensemble of Protein Structures. United States: N. p., 2018. Web. doi:10.4208/cicp.OA-2018-0140.
Tan, Pan, Fu, Zuyue, Petridis, Loukas, Qian, Shuo, You, Delin, Wei, Dongqing, Li, Jinglai, & Hong, Liang. A Two-Fold Structural Classification Method for Determining the Accurate Ensemble of Protein Structures. United States. doi:10.4208/cicp.OA-2018-0140.
Tan, Pan, Fu, Zuyue, Petridis, Loukas, Qian, Shuo, You, Delin, Wei, Dongqing, Li, Jinglai, and Hong, Liang. 2018. "A Two-Fold Structural Classification Method for Determining the Accurate Ensemble of Protein Structures". United States. doi:10.4208/cicp.OA-2018-0140. https://www.osti.gov/servlets/purl/1495955.
@article{osti_1495955,
title = {A Two-Fold Structural Classification Method for Determining the Accurate Ensemble of Protein Structures},
author = {Tan, Pan and Fu, Zuyue and Petridis, Loukas and Qian, Shuo and You, Delin and Wei, Dongqing and Li, Jinglai and Hong, Liang},
abstractNote = {Atomic-level structural characterization of flexible proteins, such as intrinsically disordered proteins and multi-domain proteins connected by flexible linkers, is challenging as they possess distinct conformations in physiological conditions. Significant efforts have been made to develop integrated approaches by combining small angle neutron/X-ray scattering experiments with molecular simulations to reveal the distinct atomic structures and the corresponding populations for these flexible proteins. One widely used method, the basis-set supported ensemble method, classifies the simulation-generated protein conformations into a set of structural basis and then derives the corresponding populations by fitting to the experimental data. This method makes an implicit assumption that protein conformations of similar structures have similar small angle scattering profiles. The present work demonstrates that, for various protein systems ranging from compact globular proteins and flexible multidomain proteins through to intrinsically disordered proteins, this method provides inaccurate assessment of the structural ensemble of the protein molecules due to the breakdown of the assumption made. To alleviate this problem, a two-fold-clustering method is developed to cluster the simulation-generated protein structures using information on both 3D structure and scattering profiles. As benchmarked by both simulation and experimental results, this new method yields much more accurate populations of structural basis of protein molecules.},
doi = {10.4208/cicp.OA-2018-0140},
journal = {Communications in Computational Physics},
number = 4,
volume = 25,
place = {United States},
year = {2018},
month = {12}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record