DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction

Abstract

Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM), which combines output from two PPI prediction tools, GO2PPI and Phyloprof, using the Random Forests algorithm. The performance of PPCM was tested by area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross-species PPCM could achieve competitive and even better prediction accuracy compared to the single-species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using the Random Forests algorithm. Ultimately, this pipeline will be useful for predicting PPI in nonmodel species.
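
As a rough illustration of the AUC evaluation mentioned in the abstract, the sketch below computes the area under the ROC curve as the fraction of (positive, negative) pairs that a classifier ranks correctly. Everything in it is an assumption made for illustration: the labels and scores are made up, `scores_a` and `scores_b` are hypothetical stand-ins for the outputs of two individual classifiers, and the combination step is a plain average rather than the Random Forests merger that PPCM uses. It says nothing about PPCM's actual performance.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example is
    scored higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up gold standard: 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical scores from two individual classifiers (not real tool output).
scores_a = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]
scores_b = [0.3, 0.9, 0.7, 0.2, 0.6, 0.1]
# Stand-in combiner: simple averaging, NOT the Random Forests step in PPCM.
combined = [(a + b) / 2 for a, b in zip(scores_a, scores_b)]

for name, s in [("A", scores_a), ("B", scores_b), ("combined", combined)]:
    print(f"AUC {name}: {auc(labels, s):.3f}")
```

In this toy data each individual score list misranks one positive/negative pair, while the averaged scores rank every pair correctly. A real evaluation would replace the averaging with a trained Random Forests model and score it on held-out pairs.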

Authors:
 Yao, Jianzhuang [1]; Guo, Hong [1]; Yang, Xiaohan [2]
  1. Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
  2. Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Publication Date:
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1234001
Alternate Identifier(s):
OSTI ID: 1247926
Grant/Contract Number:  
DE-AC05-00OR22725; AC05-00OR22725
Resource Type:
Published Article
Journal Name:
International Journal of Genomics
Additional Journal Information:
Journal Name: International Journal of Genomics; Journal Volume: 2015; Journal ID: ISSN 2314-436X
Publisher:
Hindawi Publishing Corporation
Country of Publication:
Egypt
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES

Citation Formats

Yao, Jianzhuang, Guo, Hong, and Yang, Xiaohan. PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction. Egypt: N. p., 2015. Web. doi:10.1155/2015/608042.
Yao, Jianzhuang, Guo, Hong, & Yang, Xiaohan. PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction. Egypt. https://doi.org/10.1155/2015/608042
Yao, Jianzhuang, Guo, Hong, and Yang, Xiaohan. 2015. "PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction". Egypt. https://doi.org/10.1155/2015/608042.
@article{osti_1234001,
title = {PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction},
author = {Yao, Jianzhuang and Guo, Hong and Yang, Xiaohan},
abstractNote = {Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM), and this method combines output from two PPI prediction tools, GO2PPI and Phyloprof, using Random Forests algorithm. The performance of PPCM was tested by area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross species PPCM could achieve competitive and even better prediction accuracy compared to the single species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using Random Forests algorithm. Ultimately, this pipeline will be useful for predicting PPI in nonmodel species.},
doi = {10.1155/2015/608042},
journal = {International Journal of Genomics},
volume = 2015,
place = {Egypt},
year = {2015}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1155/2015/608042

Citation Metrics:
Cited by: 1 work
Citation information provided by
Web of Science

Works referenced in this record:

Proteome survey reveals modularity of the yeast cell machinery
journal, January 2006

  • Gavin, Anne-Claude; Aloy, Patrick; Grandi, Paola
  • Nature, Vol. 440, Issue 7084
  • DOI: 10.1038/nature04532

Phylogenetic profiles for the prediction of protein–protein interactions: How to select reference organisms?
journal, February 2007

  • Sun, Jingchun; Li, Yixue; Zhao, Zhongming
  • Biochemical and Biophysical Research Communications, Vol. 353, Issue 4
  • DOI: 10.1016/j.bbrc.2006.12.146

Predicting protein-protein interactions in unbalanced data using the primary structure of proteins
journal, January 2010


LibD3C: Ensemble classifiers with a clustering and dynamic selection strategy
journal, January 2014


Prediction of protein–protein interactions using random decision forest framework
journal, October 2005


A more complete, complexed and structured interactome
journal, June 2007


Protein complexes take the bait
journal, January 2002

  • Kumar, Anuj; Snyder, Michael
  • Nature, Vol. 415, Issue 6868
  • DOI: 10.1038/415123a

Identification of functional links between genes using phylogenetic profiles
journal, August 2003


Hierarchical Classification of Protein Folds Using a Novel Ensemble Classifier
journal, February 2013


Automatic selection of reference taxa for protein–protein interaction prediction with phylogenetic profiling
journal, January 2012


The Negatome database: a reference set of non-interacting protein pairs
journal, November 2009

  • Smialowski, Pawel; Pagel, Philipp; Wong, Philip
  • Nucleic Acids Research, Vol. 38, Issue suppl_1
  • DOI: 10.1093/nar/gkp1026

DIP: the Database of Interacting Proteins
journal, January 2000


Understanding Protein–Protein Interactions Using Local Structural Features
journal, April 2013

  • Planas-Iglesias, Joan; Bonet, Jaume; García-García, Javier
  • Journal of Molecular Biology, Vol. 425, Issue 7
  • DOI: 10.1016/j.jmb.2013.01.014

Biana: a software framework for compiling biological interactions and analyzing networks
journal, January 2010

  • Garcia-Garcia, Javier; Guney, Emre; Aragues, Ramon
  • BMC Bioinformatics, Vol. 11, Issue 1
  • DOI: 10.1186/1471-2105-11-56

STRING v9.1: protein-protein interaction networks, with increased coverage and integration
journal, November 2012

  • Franceschini, Andrea; Szklarczyk, Damian; Frankild, Sune
  • Nucleic Acids Research, Vol. 41, Issue D1
  • DOI: 10.1093/nar/gks1094

Protein interaction predictions from diverse sources
journal, May 2008


Assigning protein functions by comparative genome analysis: Protein phylogenetic profiles
journal, April 1999

  • Pellegrini, M.; Marcotte, E. M.; Thompson, M. J.
  • Proceedings of the National Academy of Sciences, Vol. 96, Issue 8
  • DOI: 10.1073/pnas.96.8.4285

Comparative assessment of performance and genome dependence among phylogenetic profiling methods
journal, September 2006

  • Snitkin, Evan S.; Gustafson, Adam M.; Mellor, Joseph
  • BMC Bioinformatics, Vol. 7, Issue 1
  • DOI: 10.1186/1471-2105-7-420

Constructing Multigenome Views of Whole Microbial Genomes
journal, January 1998


Using ensemble methods to deal with imbalanced data in predicting protein–protein interactions
journal, February 2012


Integrative Neural Network Approach for Protein Interaction Prediction from Heterogeneous Data
book, January 2008


Discovering functional linkages and uncharacterized cellular pathways using phylogenetic profile comparisons: a comprehensive assessment
journal, May 2007


Learning to predict protein–protein interactions from protein sequences
journal, October 2003


Gene Ontology-driven inference of protein–protein interactions using inducers
journal, November 2011


Discovery of uncharacterized cellular systems by genome-wide analysis of functional linkages
journal, August 2003

  • Date, Shailesh V.; Marcotte, Edward M.
  • Nature Biotechnology, Vol. 21, Issue 9
  • DOI: 10.1038/nbt861

Bio::Homology::InterologWalk - A Perl module to build putative protein-protein interaction networks through interolog mapping
journal, July 2011

  • Gallone, Giuseppe; Simpson, T. Ian; Armstrong, J. Douglas
  • BMC Bioinformatics, Vol. 12, Issue 1
  • DOI: 10.1186/1471-2105-12-289

BIPS: BIANA Interolog Prediction Server. A tool for protein–protein interaction inference
journal, June 2012

  • Garcia-Garcia, Javier; Schleker, Sylvia; Klein-Seetharaman, Judith
  • Nucleic Acids Research, Vol. 40, Issue W1
  • DOI: 10.1093/nar/gks553

Evaluation of different biological data and computational classification methods for use in protein interaction prediction
journal, January 2006

  • Qi, Yanjun; Bar-Joseph, Ziv; Klein-Seetharaman, Judith
  • Proteins: Structure, Function, and Bioinformatics, Vol. 63, Issue 3
  • DOI: 10.1002/prot.20865

An improved method for identifying functionally linked proteins using phylogenetic profiles
journal, May 2007


Functional organization of the yeast proteome by systematic analysis of protein complexes
journal, January 2002

  • Gavin, Anne-Claude; Bösche, Markus; Krause, Roland
  • Nature, Vol. 415, Issue 6868
  • DOI: 10.1038/415141a

Selection of organisms for the co-evolution-based study of protein interactions
journal, September 2011


nDNA-prot: identification of DNA-binding proteins based on unbalanced classification
journal, September 2014


Random Forests
journal, January 2001


Comparative assessment of large-scale data sets of protein–protein interactions
journal, May 2002

  • von Mering, Christian; Krause, Roland; Snel, Berend
  • Nature, Vol. 417, Issue 6887
  • DOI: 10.1038/nature750

Computational Approaches for the Prediction of Protein-Protein Interactions: A Survey
journal, December 2011

  • Theofilatos, Konstantinos A.; Dimitrakopoulos, Christos M.; Tsakalidis, Athanasios K.
  • Current Bioinformatics, Vol. 6, Issue 4
  • DOI: 10.2174/157489311798072981