DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: UniKP: a unified framework for the prediction of enzyme kinetic parameters

Journal Article · Nature Communications
[1]; [2]; [2]; [3]; [1]
  1. Chinese Academy of Sciences, Shenzhen (China); University of Chinese Academy of Sciences, Beijing (China)
  2. Chinese Academy of Sciences, Shenzhen (China)
  3. Shenzhen (China); Chinese Academy of Sciences, Shenzhen (China); Joint BioEnergy Institute, Emeryville, CA (United States); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States); University of California, Berkeley, CA (United States); Technical University of Denmark, Lyngby (Denmark)

Prediction of enzyme kinetic parameters is essential for designing and optimizing enzymes for biotechnological and industrial applications, but the limited performance of current prediction tools on diverse tasks hinders their practical use. Here, we introduce UniKP, a unified framework based on pretrained language models for predicting enzyme kinetic parameters, including enzyme turnover number (kcat), Michaelis constant (Km), and catalytic efficiency (kcat/Km), from protein sequences and substrate structures. A two-layer framework derived from UniKP (EF-UniKP) is also proposed to allow robust kcat prediction that accounts for environmental factors, including pH and temperature. In addition, four representative re-weighting methods are systematically explored and shown to reduce the prediction error in high-value prediction tasks. We demonstrate the application of UniKP and EF-UniKP in several enzyme discovery and directed evolution tasks, leading to the identification of new enzymes and enzyme mutants with higher activity. UniKP is a valuable tool for deciphering the mechanisms of enzyme kinetics and enables novel insights into enzyme engineering and its industrial applications.
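
As a reading aid, the three quantities named in the abstract are related through the standard Michaelis-Menten rate law, v = kcat [E] [S] / (Km + [S]). The short Python sketch below illustrates that relationship only; it is not the UniKP model or its code. The function names, the units, and the numeric values are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: textbook Michaelis-Menten relations, not UniKP itself.
# All function names, units, and numbers here are assumptions for the example.

def catalytic_efficiency(kcat: float, km: float) -> float:
    """Return kcat / Km (e.g. kcat in 1/s and Km in mM gives 1/(s*mM))."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


def michaelis_menten_rate(kcat: float, km: float,
                          enzyme_conc: float, substrate_conc: float) -> float:
    """Return the initial reaction rate v = kcat * [E] * [S] / (Km + [S])."""
    if km <= 0 or enzyme_conc < 0 or substrate_conc < 0:
        raise ValueError("Km must be positive and concentrations non-negative")
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


# Hypothetical values: kcat = 100 1/s, Km = 0.5 mM, [E] = 0.001 mM.
kcat, km, enzyme = 100.0, 0.5, 0.001

efficiency = catalytic_efficiency(kcat, km)

# At [S] = Km the rate is half of the maximal rate Vmax = kcat * [E].
rate_at_km = michaelis_menten_rate(kcat, km, enzyme, substrate_conc=km)
vmax = kcat * enzyme

print("kcat/Km:", efficiency)
print("v at [S] = Km:", rate_at_km, "(Vmax:", vmax, ")")
```

The sketch shows why catalytic efficiency is reported alongside kcat and Km: at substrate concentrations far below Km, the rate is approximately (kcat/Km) [E] [S], so the ratio alone sets how fast the enzyme works.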

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
2470985
Journal Information:
Journal Name: Nature Communications; Journal Volume: 14; Journal Issue: 1; ISSN 2041-1723
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
