A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins
Abstract
A general formalism to compute configurational properties of proteins and other heteropolymers with an arbitrary sequence of charges and nonuniform excluded volume interaction is presented. A variational approach is utilized to predict the average distance between any two monomers in the chain. The presented analytical model, for the first time, explicitly incorporates the role of sequence charge distribution to determine relative sizes between two sequences that vary not only in total charge composition but also in charge decoration (even when charge composition is fixed). Furthermore, the formalism is general enough to allow variation in excluded volume interactions between two monomers. Model predictions are benchmarked against the all-atom Monte Carlo studies of Das and Pappu [Proc. Natl. Acad. Sci. U. S. A. 110, 13392 (2013)] for 30 different synthetic sequences of polyampholytes. These sequences possess an equal number of glutamic acid (E) and lysine (K) residues but differ in the patterning within the sequence. Without any fit parameter, the model captures the strong sequence dependence of the simulated values of the radius of gyration with a correlation coefficient of R² = 0.9. The model is then applied to real proteins to compare the unfolded state dimensions of 540 orthologous pairs of thermophilic and mesophilic proteins. The excluded volume parameters are assumed similar under denatured conditions, and only electrostatic effects encoded in the sequence are accounted for. With these assumptions, thermophilic proteins are found, with high statistical significance, to have a more compact disordered ensemble than their mesophilic counterparts. The method presented here, due to its analytical nature, is capable of such high-throughput analysis of multiple proteins and will have broad applications in proteomic studies as well as in other heteropolymeric systems.
 Authors:
 Sawle, Lucas; Ghosh, Kingshuk
 Department of Physics and Astronomy, University of Denver, Denver, Colorado 80208 (United States)
 Publication Date:
 August 2015
 OSTI Identifier:
 22493588
 Resource Type:
 Journal Article
 Journal Name:
 Journal of Chemical Physics
 Additional Journal Information:
 Journal Volume: 143; Journal Issue: 8; Other Information: (c) 2015 AIP Publishing LLC; Country of input: International Atomic Energy Agency (IAEA); Journal ID: ISSN 0021-9606
 Country of Publication:
 United States
 Language:
 English
 Subject:
 37 INORGANIC, ORGANIC, PHYSICAL AND ANALYTICAL CHEMISTRY; ATOMS; BENCHMARKS; CHARGE DISTRIBUTION; COMPARATIVE EVALUATIONS; CORRELATIONS; GLUTAMIC ACID; LYSINE; MONOMERS; MONTE CARLO METHOD; POLYMERS; PROTEINS; VARIATIONAL METHODS
Citation Formats
Sawle, Lucas, and Ghosh, Kingshuk. A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins. United States: N. p., 2015. Web. doi:10.1063/1.4929391.
Sawle, Lucas, & Ghosh, Kingshuk. A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins. United States. doi:10.1063/1.4929391.
Sawle, Lucas, and Ghosh, Kingshuk. 2015. "A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins". United States. doi:10.1063/1.4929391.
@article{osti_22493588,
title = {A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins},
author = {Sawle, Lucas and Ghosh, Kingshuk},
abstractNote = {A general formalism to compute configurational properties of proteins and other heteropolymers with an arbitrary sequence of charges and nonuniform excluded volume interaction is presented. A variational approach is utilized to predict the average distance between any two monomers in the chain. The presented analytical model, for the first time, explicitly incorporates the role of sequence charge distribution to determine relative sizes between two sequences that vary not only in total charge composition but also in charge decoration (even when charge composition is fixed). Furthermore, the formalism is general enough to allow variation in excluded volume interactions between two monomers. Model predictions are benchmarked against the all-atom Monte Carlo studies of Das and Pappu [Proc. Natl. Acad. Sci. U. S. A. 110, 13392 (2013)] for 30 different synthetic sequences of polyampholytes. These sequences possess an equal number of glutamic acid (E) and lysine (K) residues but differ in the patterning within the sequence. Without any fit parameter, the model captures the strong sequence dependence of the simulated values of the radius of gyration with a correlation coefficient of $R^{2}$ = 0.9. The model is then applied to real proteins to compare the unfolded state dimensions of 540 orthologous pairs of thermophilic and mesophilic proteins. The excluded volume parameters are assumed similar under denatured conditions, and only electrostatic effects encoded in the sequence are accounted for. With these assumptions, thermophilic proteins are found, with high statistical significance, to have a more compact disordered ensemble than their mesophilic counterparts. The method presented here, due to its analytical nature, is capable of such high-throughput analysis of multiple proteins and will have broad applications in proteomic studies as well as in other heteropolymeric systems.},
doi = {10.1063/1.4929391},
journal = {Journal of Chemical Physics},
issn = {0021-9606},
number = 8,
volume = 143,
place = {United States},
year = {2015},
month = {8}
}