DOE PAGES · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Decoding the protein–ligand interactions using parallel graph neural networks

Journal Article · Scientific Reports

Protein–ligand interactions (PLIs) are essential for biochemical functionality, and their identification is crucial for estimating biophysical properties for rational therapeutic design. Currently, experimental characterization of these properties is the most accurate method; however, it is very time-consuming and labor-intensive. A number of computational methods have been developed in this context, but most existing PLI prediction methods depend heavily on 2D protein sequence data. Here, we present a novel parallel graph neural network (GNN) that integrates knowledge representation and reasoning for PLI prediction, performing deep learning guided by expert knowledge and informed by 3D structural data. We develop two distinct GNN architectures: GNNF is the base implementation that employs distinct featurization to enhance domain-awareness, while GNNP is a novel implementation that can predict with no prior knowledge of the intermolecular interactions. The comprehensive evaluation demonstrated that the GNNs can successfully capture the binary interactions between a ligand and a protein’s 3D structure, with test accuracies of 0.979 for GNNF and 0.958 for GNNP in predicting the activity of a protein–ligand complex. These models are further adapted for regression tasks to predict experimental binding affinities and pIC50, which are crucial for a compound’s potency and efficacy. We achieve Pearson correlation coefficients of 0.66 and 0.65 on experimental affinity and 0.50 and 0.51 on pIC50 with GNNF and GNNP, respectively, outperforming similar 2D sequence-based models. Our method can serve as an interpretable and explainable artificial intelligence (AI) tool for predicting the activity, potency, and biophysical properties of lead candidates. To this end, we show the utility of GNNP on SARS-CoV-2 protein targets by screening a large compound library and comparing the predictions with experimentally measured data.
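
The abstract reports two kinds of metrics: test accuracy for the binary activity classification and the Pearson correlation coefficient for the regression tasks. As a reading aid, the minimal sketch below shows how these two quantities are conventionally computed. It is not the authors' evaluation code; the function names `accuracy` and `pearson_r` and the toy inputs are hypothetical and chosen only for illustration.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of binary labels (e.g. active = 1, inactive = 0) predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy values only; these are not data from the paper.
toy_labels = [1, 0, 1, 1]
toy_predictions = [1, 0, 0, 1]
print(accuracy(toy_labels, toy_predictions))  # 3 of 4 labels match
```

On this scale, an accuracy of 0.979 means that about 98 of every 100 test complexes were labeled correctly, while a Pearson coefficient of 0.66 indicates a moderate positive linear relationship between predicted and measured values.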

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
Grant/Contract Number:
AC05-76RL01830
OSTI ID:
1867072
Alternate ID(s):
OSTI ID: 1869774
Report Number(s):
PNNL-SA-166075
Journal Information:
Scientific Reports, Vol. 12, Issue 1; ISSN 2045-2322
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
