Decoding the protein–ligand interactions using parallel graph neural networks
Abstract
Protein–ligand interactions (PLIs) are essential for biochemical functionality, and their identification is crucial for estimating biophysical properties for rational therapeutic design. Experimental characterization of these properties is currently the most accurate method, but it is time-consuming and labor-intensive. A number of computational methods have been developed in this context, but most existing PLI prediction methods depend heavily on 2D protein sequence data. Here, we present a novel parallel graph neural network (GNN) that integrates knowledge representation and reasoning for PLI prediction, performing deep learning guided by expert knowledge and informed by 3D structural data. We develop two distinct GNN architectures: $\mathrm{GNN}_{\mathrm{F}}$ is the base implementation that employs distinct featurization to enhance domain-awareness, while $\mathrm{GNN}_{\mathrm{P}}$ is a novel implementation that can predict with no prior knowledge of the intermolecular interactions. A comprehensive evaluation demonstrated that the GNNs can successfully capture the binary interactions between a ligand and a protein’s 3D structure, with test accuracies of 0.979 for $\mathrm{GNN}_{\mathrm{F}}$ and 0.958 for $\mathrm{GNN}_{\mathrm{P}}$ in predicting the activity of a protein–ligand complex. These models are further adapted for regression tasks to predict experimental binding affinities and $\mathrm{pIC}_{50}$, which are crucial for a compound’s potency and efficacy. We achieve Pearson correlation coefficients of 0.66 and 0.65 on experimental affinity and 0.50 and 0.51 on $\mathrm{pIC}_{50}$ with $\mathrm{GNN}_{\mathrm{F}}$ and $\mathrm{GNN}_{\mathrm{P}}$, respectively, outperforming similar 2D sequence-based models. Our method can serve as an interpretable and explainable artificial intelligence (AI) tool for predicting the activity, potency, and biophysical properties of lead candidates. To this end, we show the utility of $\mathrm{GNN}_{\mathrm{P}}$ on SARS-CoV-2 protein targets by screening a large compound library and comparing the predictions with experimentally measured data.
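The abstract reports test accuracy for the binary activity task and Pearson correlation for the regression tasks. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. The function names are hypothetical, the numbers are invented for illustration, and the conversion assumes the standard definition of pIC50 as the negative base-10 logarithm of IC50 expressed in mol/L.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


def accuracy(y_true, y_pred):
    """Fraction of complexes whose active/inactive label is predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented values for illustration only; these are not data from the paper.
labels = [1, 0, 1, 1, 0]            # 1 = active, 0 = inactive
predicted_labels = [1, 0, 0, 1, 0]
measured_pic50 = [pic50_from_ic50_nm(v) for v in (10.0, 100.0, 1000.0, 10000.0)]
predicted_pic50 = [7.6, 7.2, 6.1, 5.4]

print("accuracy:", accuracy(labels, predicted_labels))
print("Pearson r:", round(pearson_r(measured_pic50, predicted_pic50), 3))
```

Because pIC50 is a logarithmic quantity, a tenfold decrease in IC50 raises pIC50 by one unit, which is why regression targets of this kind are usually correlated on the log scale rather than on raw concentrations.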
- Authors:
- Publication Date:
- Research Org.:
- Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1867072
- Alternate Identifier(s):
- OSTI ID: 1869774
- Report Number(s):
- PNNL-SA-166075
- Journal ID: ISSN 2045-2322; 7624; PII: 10418
- Grant/Contract Number:
- AC05-76RL01830
- Resource Type:
- Published Article
- Journal Name:
- Scientific Reports
- Additional Journal Information:
- Journal Name: Scientific Reports; Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 2045-2322
- Publisher:
- Nature Publishing Group
- Country of Publication:
- United Kingdom
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; computational biophysics; screening; viral infection
Citation Formats
Knutson, Carter, Bontha, Mridula, Bilbrey, Jenna A., and Kumar, Neeraj. Decoding the protein–ligand interactions using parallel graph neural networks. United Kingdom: N. p., 2022. Web. doi:10.1038/s41598-022-10418-2.
Knutson, Carter, Bontha, Mridula, Bilbrey, Jenna A., & Kumar, Neeraj. Decoding the protein–ligand interactions using parallel graph neural networks. United Kingdom. https://doi.org/10.1038/s41598-022-10418-2
Knutson, Carter, Bontha, Mridula, Bilbrey, Jenna A., and Kumar, Neeraj. Tue .
"Decoding the protein–ligand interactions using parallel graph neural networks". United Kingdom. https://doi.org/10.1038/s41598-022-10418-2.
@article{osti_1867072,
title = {Decoding the protein–ligand interactions using parallel graph neural networks},
author = {Knutson, Carter and Bontha, Mridula and Bilbrey, Jenna A. and Kumar, Neeraj},
abstractNote = {Protein–ligand interactions (PLIs) are essential for biochemical functionality, and their identification is crucial for estimating biophysical properties for rational therapeutic design. Experimental characterization of these properties is currently the most accurate method, but it is time-consuming and labor-intensive. A number of computational methods have been developed in this context, but most existing PLI prediction methods depend heavily on 2D protein sequence data. Here, we present a novel parallel graph neural network (GNN) that integrates knowledge representation and reasoning for PLI prediction, performing deep learning guided by expert knowledge and informed by 3D structural data. We develop two distinct GNN architectures: $\mathrm{GNN}_{\mathrm{F}}$ is the base implementation that employs distinct featurization to enhance domain-awareness, while $\mathrm{GNN}_{\mathrm{P}}$ is a novel implementation that can predict with no prior knowledge of the intermolecular interactions. A comprehensive evaluation demonstrated that the GNNs can successfully capture the binary interactions between a ligand and a protein’s 3D structure, with test accuracies of 0.979 for $\mathrm{GNN}_{\mathrm{F}}$ and 0.958 for $\mathrm{GNN}_{\mathrm{P}}$ in predicting the activity of a protein–ligand complex. These models are further adapted for regression tasks to predict experimental binding affinities and $\mathrm{pIC}_{50}$, which are crucial for a compound’s potency and efficacy. We achieve Pearson correlation coefficients of 0.66 and 0.65 on experimental affinity and 0.50 and 0.51 on $\mathrm{pIC}_{50}$ with $\mathrm{GNN}_{\mathrm{F}}$ and $\mathrm{GNN}_{\mathrm{P}}$, respectively, outperforming similar 2D sequence-based models. Our method can serve as an interpretable and explainable artificial intelligence (AI) tool for predicting the activity, potency, and biophysical properties of lead candidates. To this end, we show the utility of $\mathrm{GNN}_{\mathrm{P}}$ on SARS-CoV-2 protein targets by screening a large compound library and comparing the predictions with experimentally measured data.},
doi = {10.1038/s41598-022-10418-2},
journal = {Scientific Reports},
number = 1,
volume = 12,
place = {United Kingdom},
year = {2022},
month = {5}
}
https://doi.org/10.1038/s41598-022-10418-2