Evaluating point-prediction uncertainties in neural networks for protein-ligand binding prediction

Fan, Ya Ju; Allen, Jonathan E.; McLoughlin, Kevin S.; Shi, Da; Bennion, Brian J.; Zhang, Xiaohua; Lightstone, Felice C.

doi:10.1016/j.aichem.2023.100004

Journal Article · June 3, 2023 · Artificial Intelligence Chemistry

Fan, Ya Ju ^[1]; Allen, Jonathan E. ^[2]; McLoughlin, Kevin S. ^[2]; Shi, Da ^[3]; Bennion, Brian J. ^[4]; Zhang, Xiaohua ^[4]; Lightstone, Felice C. ^[4]

^[1] Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States). Center for Applied Scientific Computing
^[2] Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States). Biological Science and Security Center
^[3] Frederick National Laboratory for Cancer Research, Frederick, MD (United States)
^[4] Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)

Neural network (NN) models have the potential to speed up the drug discovery process and reduce its failure rates. Their success depends on uncertainty quantification (UQ), because drug discovery explores chemical space beyond the training data distribution. Standard NN models do not provide uncertainty information, and some UQ methods require changing the NN architecture or training procedure, which limits the choice of NN models. Moreover, predictive uncertainty can come from different sources. It is important to be able to model these types of predictive uncertainty separately, because the model can take different actions depending on the source of uncertainty. In this paper, we examine UQ methods that estimate different sources of predictive uncertainty for NN models that predict protein-ligand binding. We use our prior knowledge of chemical compounds to design the experiments. Using a visualization method, we create non-overlapping and chemically diverse partitions from a collection of chemical compounds. These partitions are used as training and test set splits to explore NN model uncertainty. We demonstrate how the uncertainties estimated by the selected methods describe different sources of uncertainty under different partitions and featurization schemes, and how they relate to prediction error.

Research Organization: Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)

Sponsoring Organization: USDOE National Nuclear Security Administration (NNSA); Defense Threat Reduction Agency (DTRA); National Institutes of Health (NIH); Department of Health and Human Services

Grant/Contract Number: AC52-07NA27344; HDTRA1036045; 75N91019D00024; 75N91019F00134

OSTI ID: 1988215

Report Number(s): LLNL-JRNL-839676; 1060646

Journal Information: Artificial Intelligence Chemistry, Vol. 1, Issue 1; ISSN 2949-7477

Publisher: Elsevier

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
uncertainty quantification
neural networks
drug discovery
applicability domain
