Title: Protein sequence design with a learned potential

Journal Article · Nature Communications

The task of protein sequence design is central to nearly all rational protein engineering problems, and enormous effort has gone into the development of energy functions to guide design. Here, we investigate the capability of a deep neural network model to automate design of sequences onto protein backbones, having learned directly from crystal structure data and without any human-specified priors. The model generalizes to native topologies not seen during training, producing experimentally stable designs. We evaluate the generalizability of our method to a de novo TIM-barrel scaffold. The model produces novel sequences, and high-resolution crystal structures of two designs show excellent agreement with in silico models. Our findings demonstrate the tractability of an entirely learned method for protein sequence design.

Research Organization:
SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Institutes of Health (NIH); National Library of Medicine; USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC02-76SF00515
OSTI ID:
1869814
Journal Information:
Journal Name: Nature Communications; Journal Volume: 13; Journal Issue: 1; ISSN 2041-1723
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
