DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Decoding the protein–ligand interactions using parallel graph neural networks

Abstract

Protein–ligand interactions (PLIs) are essential for biochemical functionality, and their identification is crucial for estimating biophysical properties for rational therapeutic design. Currently, experimental characterization of these properties is the most accurate method; however, it is very time-consuming and labor-intensive. A number of computational methods have been developed in this context, but most existing PLI prediction methods depend heavily on 2D protein sequence data. Here, we present a novel parallel graph neural network (GNN) to integrate knowledge representation and reasoning for PLI prediction to perform deep learning guided by expert knowledge and informed by 3D structural data. We develop two distinct GNN architectures: $\mathrm{GNN}_{\mathrm{F}}$ is the base implementation that employs distinct featurization to enhance domain-awareness, while $\mathrm{GNN}_{\mathrm{P}}$ is a novel implementation that can predict with no prior knowledge of the intermolecular interactions. The comprehensive evaluation demonstrated that the GNN can successfully capture the binary interactions between a ligand and a protein’s 3D structure, with 0.979 test accuracy for $\mathrm{GNN}_{\mathrm{F}}$ and 0.958 for $\mathrm{GNN}_{\mathrm{P}}$ in predicting the activity of a protein–ligand complex. These models are further adapted for regression tasks to predict experimental binding affinities and $\mathrm{pIC}_{50}$, which are crucial for a compound’s potency and efficacy. We achieve Pearson correlation coefficients of 0.66 and 0.65 on experimental affinity and 0.50 and 0.51 on $\mathrm{pIC}_{50}$ with $\mathrm{GNN}_{\mathrm{F}}$ and $\mathrm{GNN}_{\mathrm{P}}$, respectively, outperforming similar 2D sequence-based models. Our method can serve as an interpretable and explainable artificial intelligence (AI) tool for predicting the activity, potency, and biophysical properties of lead candidates. To this end, we show the utility of $\mathrm{GNN}_{\mathrm{P}}$ on SARS-CoV-2 protein targets by screening a large compound library and comparing the predictions with experimentally measured data.
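The two regression quantities named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions: pIC50 is the negative base-10 logarithm of IC50 expressed in mol/L, and the Pearson correlation coefficient is the covariance of two samples divided by the product of their standard deviations. The function names and the sample values are hypothetical and chosen only for illustration; they are not taken from the paper or its data.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L).

    Assumption: the input unit is nM; the record does not state
    which unit convention the authors used.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up measured IC50 values (nM) and made-up model predictions (pIC50).
measured_pic50 = [pic50_from_ic50_nm(v) for v in (10.0, 250.0, 1000.0, 50000.0)]
predicted_pic50 = [7.4, 6.9, 6.2, 5.1]
print(round(pearson_r(measured_pic50, predicted_pic50), 3))
```

A coefficient near 1 would indicate that predictions rank and scale with the measurements; the values of 0.50 to 0.66 reported in the abstract indicate a moderate linear relationship.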

Authors:
Knutson, Carter; Bontha, Mridula; Bilbrey, Jenna A.; Kumar, Neeraj
Publication Date:
May 2022
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1867072
Alternate Identifier(s):
OSTI ID: 1869774
Report Number(s):
PNNL-SA-166075
Journal ID: ISSN 2045-2322; 7624; PII: 10418
Grant/Contract Number:  
AC05-76RL01830
Resource Type:
Published Article
Journal Name:
Scientific Reports
Additional Journal Information:
Journal Name: Scientific Reports; Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 2045-2322
Publisher:
Nature Publishing Group
Country of Publication:
United Kingdom
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; computational biophysics; screening; viral infection

Citation Formats

Knutson, Carter, Bontha, Mridula, Bilbrey, Jenna A., and Kumar, Neeraj. Decoding the protein–ligand interactions using parallel graph neural networks. United Kingdom: N. p., 2022. Web. doi:10.1038/s41598-022-10418-2.
Knutson, Carter, Bontha, Mridula, Bilbrey, Jenna A., & Kumar, Neeraj. Decoding the protein–ligand interactions using parallel graph neural networks. United Kingdom. https://doi.org/10.1038/s41598-022-10418-2
Knutson, Carter, Bontha, Mridula, Bilbrey, Jenna A., and Kumar, Neeraj. 2022. "Decoding the protein–ligand interactions using parallel graph neural networks". United Kingdom. https://doi.org/10.1038/s41598-022-10418-2.
@article{osti_1867072,
title = {Decoding the protein–ligand interactions using parallel graph neural networks},
author = {Knutson, Carter and Bontha, Mridula and Bilbrey, Jenna A. and Kumar, Neeraj},
abstractNote = {Protein–ligand interactions (PLIs) are essential for biochemical functionality, and their identification is crucial for estimating biophysical properties for rational therapeutic design. Currently, experimental characterization of these properties is the most accurate method; however, it is very time-consuming and labor-intensive. A number of computational methods have been developed in this context, but most existing PLI prediction methods depend heavily on 2D protein sequence data. Here, we present a novel parallel graph neural network (GNN) to integrate knowledge representation and reasoning for PLI prediction to perform deep learning guided by expert knowledge and informed by 3D structural data. We develop two distinct GNN architectures: $\mathrm{GNN}_{\mathrm{F}}$ is the base implementation that employs distinct featurization to enhance domain-awareness, while $\mathrm{GNN}_{\mathrm{P}}$ is a novel implementation that can predict with no prior knowledge of the intermolecular interactions. The comprehensive evaluation demonstrated that the GNN can successfully capture the binary interactions between a ligand and a protein’s 3D structure, with 0.979 test accuracy for $\mathrm{GNN}_{\mathrm{F}}$ and 0.958 for $\mathrm{GNN}_{\mathrm{P}}$ in predicting the activity of a protein–ligand complex. These models are further adapted for regression tasks to predict experimental binding affinities and $\mathrm{pIC}_{50}$, which are crucial for a compound’s potency and efficacy. We achieve Pearson correlation coefficients of 0.66 and 0.65 on experimental affinity and 0.50 and 0.51 on $\mathrm{pIC}_{50}$ with $\mathrm{GNN}_{\mathrm{F}}$ and $\mathrm{GNN}_{\mathrm{P}}$, respectively, outperforming similar 2D sequence-based models. Our method can serve as an interpretable and explainable artificial intelligence (AI) tool for predicting the activity, potency, and biophysical properties of lead candidates. To this end, we show the utility of $\mathrm{GNN}_{\mathrm{P}}$ on SARS-CoV-2 protein targets by screening a large compound library and comparing the predictions with experimentally measured data.},
doi = {10.1038/s41598-022-10418-2},
journal = {Scientific Reports},
number = 1,
volume = 12,
place = {United Kingdom},
year = {2022},
month = {5}
}
