DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Generative β-hairpin design using a residue-based physicochemical property landscape

Journal Article · Biophysical Journal

De novo peptide design is a new frontier with broad application potential in the biological and biomedical fields. Most existing models for de novo peptide design rely largely on sequence homology, which restricts them to evolutionarily derived protein sequences, and they lack the physicochemical context essential to protein folding. Generative machine learning for de novo peptide design is a promising way to synthesize theoretical data that are based on, but distinct from, the observable universe. In this study, we created and tested a custom peptide generative adversarial network intended to design peptide sequences that can fold into the β-hairpin secondary structure. This deep neural network model is designed to establish a preliminary foundation for the generative approach based on physicochemical and conformational properties of the 20 canonical amino acids, for example, hydrophobicity and residue volume, using extant structure-specific sequence data from the PDB. The beta generative adversarial network model robustly distinguishes the β-hairpin secondary structure from α-helix and intrinsically disordered peptides with an accuracy of up to 96% and generates artificial β-hairpin peptide sequences with minimum sequence identities of around 31% and 50% when compared against the current NCBI PDB and nonredundant databases, respectively. These results highlight the potential of generative models specifically anchored by physicochemical and conformational property features of amino acids to expand the sequence-to-structure landscape of proteins beyond evolutionary limits.
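To make the idea of a residue-based physicochemical representation concrete, the following minimal Python sketch encodes a peptide as one feature pair per residue (hydrophobicity and residue volume, the two properties named in the abstract). It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which property scales or normalization the model uses, so the Kyte-Doolittle hydropathy index, the Zamyatnin residue volumes, the min-max scaling, and the helper name `encode_peptide` are all choices made here for the example.

```python
# Kyte-Doolittle hydropathy index.
# ASSUMPTION: the abstract does not name a hydrophobicity scale; this one is illustrative.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residue volumes in cubic angstroms (Zamyatnin, 1972).
# ASSUMPTION: also an illustrative choice, not taken from the article.
RESIDUE_VOLUME = {
    "A": 88.6, "R": 173.4, "N": 114.1, "D": 111.1, "C": 108.5,
    "Q": 143.8, "E": 138.4, "G": 60.1, "H": 153.2, "I": 166.7,
    "L": 166.7, "K": 168.6, "M": 162.9, "F": 189.9, "P": 112.7,
    "S": 89.0, "T": 116.1, "W": 227.8, "Y": 193.6, "V": 140.0,
}


def _min_max(table):
    """Rescale a property table so its values span [0, 1]."""
    lo, hi = min(table.values()), max(table.values())
    return {aa: (value - lo) / (hi - lo) for aa, value in table.items()}


HYDROPATHY_01 = _min_max(KD_HYDROPATHY)
VOLUME_01 = _min_max(RESIDUE_VOLUME)


def encode_peptide(sequence):
    """Return one [hydropathy, volume] pair per residue, each scaled to [0, 1].

    Hypothetical helper: raises ValueError for anything outside the
    20 canonical amino acids.
    """
    features = []
    for aa in sequence.upper():
        if aa not in HYDROPATHY_01:
            raise ValueError(f"non-canonical residue: {aa!r}")
        features.append([HYDROPATHY_01[aa], VOLUME_01[aa]])
    return features


if __name__ == "__main__":
    # Arbitrary example input; any string of canonical one-letter codes works.
    for residue, row in zip("GEWTYDDATKTFTVTE", encode_peptide("GEWTYDDATKTFTVTE")):
        print(residue, [round(x, 3) for x in row])
```

A matrix of this shape (sequence length by number of properties) is the kind of input a convolutional discriminator or generator could operate on; adding further property columns would only require another lookup table.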

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; National Science Foundation (NSF); Simons Foundation; National Institutes of Health (NIH); Extreme Science and Engineering Discovery Environment (XSEDE)
Grant/Contract Number:
AC05-00OR22725; 1764406; R01-GM148586; ACI-1548562; 1828187; TG-MCB130173
OSTI ID:
2311293
Journal Information:
Biophysical Journal, Vol. 123; ISSN 0006-3495
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
