DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Data-Efficient Generation of Protein Conformational Ensembles with Backbone-to-Side-Chain Transformers

Journal Article · Journal of Physical Chemistry. B

Excitement at the prospect of using data-driven generative models to sample configurational ensembles of biomolecular systems stems from the extraordinary success of these models on a diverse set of high-dimensional sampling tasks. Unlike image generation or even the closely related problem of protein structure prediction, there are currently no data sources with sufficient breadth to parametrize generative models for conformational ensembles. To enable discovery, a fundamentally different approach to building generative models is required: models should be able to propose rare, albeit physical, conformations that may not arise in even the largest data sets. In this work, we introduce a modular strategy to generate conformations based on “backmapping” from a fixed protein backbone that (1) maintains conformational diversity of the side chains and (2) couples the side-chain fluctuations using global information about the protein conformation. Our model combines simple statistical models of side-chain conformations based on rotamer libraries with the now ubiquitous transformer architecture to sample with atomistic accuracy. Together, these ingredients provide a strategy for rapid data acquisition and hence a crucial ingredient for scalable physical simulation with generative neural networks.
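
As a rough illustration of the rotamer-library ingredient alone, the sketch below samples a single side-chain dihedral (chi1) per residue on a fixed backbone from a small categorical-plus-Gaussian model. Everything in it is an assumption made for illustration: the `TOY_ROTAMERS` table, its probabilities and angles, and the function names are hypothetical and are not taken from the article or from any published rotamer library. Each residue is sampled independently here, so the transformer-based coupling of side-chain fluctuations described in the abstract is not implemented.

```python
import random

# Hypothetical toy library: residue -> list of (probability, mean chi1 in degrees, std dev).
# The numbers are illustrative placeholders, not values from a published rotamer library.
TOY_ROTAMERS = {
    "SER": [(0.5, 60.0, 10.0), (0.3, 180.0, 10.0), (0.2, -60.0, 10.0)],
    "VAL": [(0.7, 180.0, 8.0), (0.2, -60.0, 8.0), (0.1, 60.0, 8.0)],
}


def wrap_degrees(angle):
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def sample_chi1(residue, rng):
    """Pick a rotamer according to its probability, then draw chi1 around its mean."""
    rotamers = TOY_ROTAMERS[residue]
    weights = [p for p, _, _ in rotamers]
    _, mean, sigma = rng.choices(rotamers, weights=weights, k=1)[0]
    return wrap_degrees(rng.gauss(mean, sigma))


def sample_side_chains(sequence, rng):
    """Sample chi1 independently for each residue; the backbone is treated as fixed."""
    return [sample_chi1(res, rng) for res in sequence]


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
angles = sample_side_chains(["SER", "VAL", "SER"], rng)
```

Calling `sample_side_chains` repeatedly on the same sequence yields different side-chain angle sets for one backbone, which is the sense in which a rotamer-based proposal preserves side-chain diversity; a model of the kind described in the abstract would additionally condition each residue's draw on the rest of the protein.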

Research Organization:
Stanford Univ., CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
Grant/Contract Number:
SC0022917
OSTI ID:
2341329
Alternate ID(s):
OSTI ID: 2474797
Journal Information:
Journal of Physical Chemistry. B, Vol. 128, Issue 9; ISSN 1520-6106
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
