PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction
Abstract
Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM), which combines output from two PPI prediction tools, GO2PPI and Phyloprof, using the Random Forests algorithm. The performance of PPCM was evaluated by the area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross-species PPCM could achieve competitive or even better prediction accuracy compared to the single-species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using the Random Forests algorithm. Ultimately, this pipeline will be useful for predicting PPI in nonmodel species.
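The idea in the abstract, merging the scores of two base predictors with a Random Forests style ensemble and comparing AUC values, can be illustrated with a small sketch. This is not the PPCM pipeline: the data are synthetic, the two features are hypothetical stand-ins for per-pair scores such as those from GO2PPI and Phyloprof, and the forest is a simplified pure-Python version (bootstrap samples, one randomly chosen feature per split) written only to show the mechanics. No particular AUC values are implied.

```python
import random

def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini(rows):
    """Gini impurity of a list of (features, label) rows with binary labels."""
    p = sum(y for _, y in rows) / len(rows)
    return 2 * p * (1 - p)

def build_tree(rows, rng, depth=0, max_depth=3, min_size=5):
    """Return a leaf (fraction of positives) or a split tuple (feature, threshold, left, right)."""
    frac_pos = sum(y for _, y in rows) / len(rows)
    if depth >= max_depth or len(rows) < min_size or frac_pos in (0.0, 1.0):
        return frac_pos
    feat = rng.randrange(len(rows[0][0]))  # random feature for this split
    best = None
    for thr in sorted({x[feat] for x, _ in rows}):
        left = [r for r in rows if r[0][feat] <= thr]
        right = [r for r in rows if r[0][feat] > thr]
        if not left or not right:
            continue
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or impurity < best[0]:
            best = (impurity, thr, left, right)
    if best is None:  # feature is constant in this node
        return frac_pos
    _, thr, left, right = best
    return (feat, thr,
            build_tree(left, rng, depth + 1, max_depth, min_size),
            build_tree(right, rng, depth + 1, max_depth, min_size))

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf's positive fraction."""
    while isinstance(node, tuple):
        feat, thr, left, right = node
        node = left if x[feat] <= thr else right
    return node

def fit_forest(rows, n_trees=25, seed=0):
    """Grow each tree on a bootstrap sample (drawn with replacement) of the training rows."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(rows) for _ in rows], rng) for _ in range(n_trees)]

def predict_forest(forest, x):
    """Average the per-tree predictions to get a combined interaction score."""
    return sum(predict_tree(t, x) for t in forest) / len(forest)

def make_pairs(n, rng):
    """Synthetic protein pairs: label 1 = interacting; two noisy scores (hypothetical base classifiers)."""
    rows = []
    for _ in range(n):
        y = rng.randrange(2)
        rows.append(((rng.gauss(y, 1.0), rng.gauss(y, 1.0)), y))
    return rows

rng = random.Random(42)
train, test = make_pairs(200, rng), make_pairs(200, rng)
forest = fit_forest(train)
labels = [y for _, y in test]
combined = [predict_forest(forest, x) for x, _ in test]

print("AUC, base score A alone:", round(auc([x[0] for x, _ in test], labels), 3))
print("AUC, base score B alone:", round(auc([x[1] for x, _ in test], labels), 3))
print("AUC, forest on A and B: ", round(auc(combined, labels), 3))
```

Evaluating on pairs held out from training, as done here, keeps the comparison between the individual scores and the merged score fair; adding a further base classifier would simply mean adding another feature to each row.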
- Authors:
- Yao, Jianzhuang; Guo, Hong; Yang, Xiaohan
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Publication Date:
- Research Org.:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1234001
- Alternate Identifier(s):
- OSTI ID: 1247926
- Grant/Contract Number:
- DE-AC05-00OR22725; AC05-00OR22725
- Resource Type:
- Published Article
- Journal Name:
- International Journal of Genomics
- Additional Journal Information:
- Journal Name: International Journal of Genomics; Journal Volume: 2015; Journal ID: ISSN 2314-436X
- Publisher:
- Hindawi Publishing Corporation
- Country of Publication:
- Egypt
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES
Citation Formats
Yao, Jianzhuang, Guo, Hong, and Yang, Xiaohan. PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction. Egypt: N. p., 2015. Web. doi:10.1155/2015/608042.
Yao, Jianzhuang, Guo, Hong, & Yang, Xiaohan. PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction. Egypt. https://doi.org/10.1155/2015/608042
Yao, Jianzhuang, Guo, Hong, and Yang, Xiaohan. 2015. "PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction". Egypt. https://doi.org/10.1155/2015/608042.
@article{osti_1234001,
title = {PPCM: Combing Multiple Classifiers to Improve Protein-Protein Interaction Prediction},
author = {Yao, Jianzhuang and Guo, Hong and Yang, Xiaohan},
abstractNote = {Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM), which combines output from two PPI prediction tools, GO2PPI and Phyloprof, using the Random Forests algorithm. The performance of PPCM was evaluated by the area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross-species PPCM could achieve competitive or even better prediction accuracy compared to the single-species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using the Random Forests algorithm. Ultimately, this pipeline will be useful for predicting PPI in nonmodel species.},
doi = {10.1155/2015/608042},
journal = {International Journal of Genomics},
volume = 2015,
place = {Egypt},
year = {2015}
}
https://doi.org/10.1155/2015/608042