
Title: Genome scale enzyme–metabolite and drug–target interaction predictions using the signature molecular descriptor

Journal Article · Bioinformatics
 [1];  [1];  [2];  [3];  [3]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Computational Biosciences Dept.
  2. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Computer Science & Informatics Department
  3. Sandia National Lab. (SNL-CA), Livermore, CA (United States). Biosystems Research

Motivation: Identifying the enzymatic or pharmacological activities of proteins is an important area of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures, and there is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein–chemical interactions using heterogeneous input consisting of both protein sequence and chemical information.

Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Such predictions cannot be made with current machine-learning techniques that require binding information for individual reactions or individual targets.
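
As a rough illustration of what a shared fragment-count representation for protein–chemical pairs could look like, the sketch below counts local atom environments in a small molecule and overlapping residue windows in a protein sequence, then compares two (protein, chemical) pairs by multiplying the per-part dot products. Everything here is an assumption made for illustration, not the article's implementation: the function names, the neighbourhood depth of one bond, the k-residue windows used as a stand-in protein descriptor, and the product form of the pair similarity are not specified in the abstract.

```python
from collections import Counter


def atom_signatures(atoms, bonds):
    """Count depth-1 atom environments: element plus its sorted neighbour elements."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(atoms[b])
        neighbours[b].append(atoms[a])
    return Counter(
        f"{atoms[i]}({''.join(sorted(neighbours[i]))})" for i in range(len(atoms))
    )


def residue_signatures(sequence, k=3):
    """Hypothetical protein descriptor: counts of overlapping k-residue windows."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def dot(u, v):
    """Dot product of two sparse count vectors stored as Counters."""
    return sum(count * v[key] for key, count in u.items() if key in v)


def pair_kernel(pair_a, pair_b):
    """Similarity of two (protein, chemical) pairs as a product of part similarities."""
    (prot_a, chem_a), (prot_b, chem_b) = pair_a, pair_b
    return dot(prot_a, prot_b) * dot(chem_a, chem_b)


# Toy inputs (heavy atoms only); the sequences are made up for the example.
ethanol = atom_signatures(["C", "C", "O"], [(0, 1), (1, 2)])
methanol = atom_signatures(["C", "O"], [(0, 1)])
protein_1 = residue_signatures("MKVLA")
protein_2 = residue_signatures("KVLAG")

similarity = pair_kernel((protein_1, ethanol), (protein_2, methanol))
```

A pair similarity of this kind could be passed to any kernel-based classifier trained on known interacting and non-interacting pairs, which is one way a model might score a drug and target that never appeared together in training.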

Research Organization:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
AC04-94AL85000
OSTI ID:
1426948
Report Number(s):
SAND-2007-1648J; 526857
Journal Information:
Bioinformatics, Vol. 24, Issue 2; ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 119 works
Citation information provided by
Web of Science

