DOE PAGES: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Biosystems Design by Machine Learning

Abstract

Biosystems such as enzymes, pathways, and whole cells are increasingly explored for biotechnological applications. Yet the intricate connectivity and complexity of biosystems pose a major hurdle to designing them with desired features. With the rapid development of -omics and other high-throughput technologies, the promise of applying machine learning (ML) techniques to biosystems design has started to become a reality. ML models can identify patterns in complicated biological data across multiple scales of analysis and can augment biosystems design by predicting new candidates for optimized performance. ML is being used at every stage of biosystems design to help find non-obvious engineering solutions with fewer design iterations. In this review, we first describe commonly used models and modeling paradigms within ML. We then discuss applications of these models that have already shown success in biotechnology, covering all scales of biosystems design: nucleic acids, genetic circuits, proteins, pathways, genomes, and bioprocesses. Lastly, we discuss some limitations of these methods and potential solutions, as well as prospects for combining ML and biosystems design.

Authors:
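Volk, Michael Jeffrey [1]; Lourentzou, Ismini [2]; Mishra, Shekhar [1]; Vo, Lam Tung [3]; Zhai, Chengxiang [2]; Zhao, Huimin [4]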
  1. Center for Advanced Bioenergy and Bioproducts Innovation (CABBI), Urbana, IL (United States); Univ. of Illinois at Urbana-Champaign, IL (United States). Carl R. Woese Inst. for Genomic Biology and Dept. of Chemical and Biomolecular Engineering
  2. Univ. of Illinois at Urbana-Champaign, IL (United States). Dept. of Computer Science; Center for Advanced Bioenergy and Bioproducts Innovation (CABBI), Urbana, IL (United States)
  3. Univ. of Illinois at Urbana-Champaign, IL (United States). Dept. of Chemistry and Carl R. Woese Inst. for Genomic Biology; Center for Advanced Bioenergy and Bioproducts Innovation (CABBI), Urbana, IL (United States)
  4. Center for Advanced Bioenergy and Bioproducts Innovation (CABBI), Urbana, IL (United States); Univ. of Illinois at Urbana-Champaign, IL (United States). Carl R. Woese Inst. for Genomic Biology, Dept. of Chemical and Biomolecular Engineering, and Dept. of Chemistry
Publication Date:
2020-06-02
Research Org.:
Center for Advanced Bioenergy and Bioproducts Innovation (CABBI), Urbana, IL (United States); Univ. of Illinois at Urbana-Champaign, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER); National Institutes of Health (NIH)
OSTI Identifier:
1632114
Grant/Contract Number:  
SC0018420; SC0018260; 1UM1HG009402; 1U54DK107965; AI144967
Resource Type:
Accepted Manuscript
Journal Name:
ACS Synthetic Biology
Additional Journal Information:
Journal Volume: 9; Journal Issue: 7; Journal ID: ISSN 2161-5063
Publisher:
American Chemical Society (ACS)
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Machine learning; Biosystems design; Synthetic biology; Metabolic engineering

Citation Formats

Volk, Michael Jeffrey, Lourentzou, Ismini, Mishra, Shekhar, Vo, Lam Tung, Zhai, Chengxiang, and Zhao, Huimin. Biosystems Design by Machine Learning. United States: N. p., 2020. Web. doi:10.1021/acssynbio.0c00129.
Volk, Michael Jeffrey, Lourentzou, Ismini, Mishra, Shekhar, Vo, Lam Tung, Zhai, Chengxiang, & Zhao, Huimin. Biosystems Design by Machine Learning. United States. https://doi.org/10.1021/acssynbio.0c00129
Volk, Michael Jeffrey, Lourentzou, Ismini, Mishra, Shekhar, Vo, Lam Tung, Zhai, Chengxiang, and Zhao, Huimin. 2020. "Biosystems Design by Machine Learning". United States. https://doi.org/10.1021/acssynbio.0c00129. https://www.osti.gov/servlets/purl/1632114.
@article{osti_1632114,
title = {Biosystems Design by Machine Learning},
author = {Volk, Michael Jeffrey and Lourentzou, Ismini and Mishra, Shekhar and Vo, Lam Tung and Zhai, Chengxiang and Zhao, Huimin},
doi = {10.1021/acssynbio.0c00129},
journal = {ACS Synthetic Biology},
number = 7,
volume = 9,
place = {United States},
year = {2020},
month = {6}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 61 works
Citation information provided by Web of Science
