U.S. Department of Energy
Office of Scientific and Technical Information

Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering

Journal Article · · ACS Central Science
 [1];  [2];  [3]
  1. Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States
  2. Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States
  3. Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, United States; Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California 91125, United States


Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
Grant/Contract Number:
SC0022218
OSTI ID:
2287699
Alternate ID(s):
OSTI ID: 2316027
OSTI ID: 2287639
Journal Information:
Journal Name: ACS Central Science; Journal Volume: 10; Journal Issue: 2; ISSN 2374-7943
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
