U.S. Department of Energy
Office of Scientific and Technical Information

Sipros Ensemble improves database searching and filtering for complex metaproteomics

Journal Article · Bioinformatics
 [1];  [2];  [3];  [4];  [5];  [6];  [5];  [2]
  1. Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States); Univ. of North Texas, Denton, TX (United States); DOE/OSTI
  2. Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
  3. Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
  4. Oregon State Univ., Corvallis, OR (United States)
  5. Univ. of Washington, Seattle, WA (United States)
  6. Stellenbosch University (South Africa)
Complex microbial communities can be characterized by metagenomics and metaproteomics. However, metagenome assemblies often generate enormous, yet incomplete, protein databases, which undermines the identification of peptides and proteins in metaproteomics. This challenge calls for database searching and filtering algorithms that better discriminate true identifications from false ones. Sipros Ensemble was developed here for metaproteomics using an ensemble approach. Three diverse scoring functions, from MyriMatch, Comet and the original Sipros, were incorporated within a single database searching engine. Supervised classification with logistic regression was used to filter database searching results. Benchmarking with soil and marine microbial communities demonstrated a higher number of peptide and protein identifications by Sipros Ensemble than by MyriMatch/Percolator, Comet/Percolator, MS-GF+/Percolator, Comet & MyriMatch/iProphet and Comet & MyriMatch & MS-GF+/iProphet. Sipros Ensemble was computationally efficient and scalable on supercomputers. Sipros Ensemble is freely available under the GNU GPL license at http://sipros.omicsbio.org.
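The filtering idea in the abstract (several scores per peptide-spectrum match combined by logistic regression, then a cutoff on the combined score) can be illustrated with a minimal sketch. This is not the Sipros Ensemble implementation. The function names, the use of decoy matches as negative training examples, the plain gradient-descent fit, and the decoys/targets estimate of the false discovery rate are all assumptions made for illustration, and the three scores are assumed to be on comparable scales already.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(features, labels, lr=0.1, epochs=300):
    """Fit logistic regression by batch gradient descent.

    Returns a weight list whose last element is the bias term.
    """
    n_feat = len(features[0])
    n = len(features)
    w = [0.0] * (n_feat + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_feat + 1)
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])
            err = p - y
            for j in range(n_feat):
                grad[j] += err * x[j]
            grad[-1] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def predict(w, x):
    # Combined probability-like score for one match.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])


def filter_at_fdr(psms, probs, max_fdr=0.01):
    """Keep target matches from the longest ranked prefix whose
    estimated FDR (decoys / targets) does not exceed max_fdr.

    psms is a list of (scores, is_decoy) pairs (hypothetical layout).
    """
    ranked = sorted(zip(probs, psms), key=lambda t: t[0], reverse=True)
    targets = decoys = best = 0
    for i, (_, (_, is_decoy)) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best = i
    return [psm for _, psm in ranked[:best] if not psm[1]]


if __name__ == "__main__":
    # Synthetic data only: three scores per match, no real spectra.
    random.seed(0)
    psms = []
    for k in range(200):
        mean = 2.0 if k % 2 == 0 else 0.0  # half the targets are "correct"
        psms.append(([random.gauss(mean, 1.0) for _ in range(3)], False))
    for _ in range(200):
        psms.append(([random.gauss(0.0, 1.0) for _ in range(3)], True))

    X = [scores for scores, _ in psms]
    y = [0 if is_decoy else 1 for _, is_decoy in psms]
    w = fit_logistic(X, y)
    probs = [predict(w, x) for x in X]
    accepted = filter_at_fdr(psms, probs, max_fdr=0.05)
    print(len(accepted), "target matches accepted")
```

Training on the target/decoy label is a simplification: some target matches are themselves incorrect, so a practical filter would need a more careful choice of positive examples than this sketch uses.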
Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
Gordon and Betty Moore Foundation (GBMF); USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC05-00OR22725; SC0010566
OSTI ID:
1625293
Journal Information:
Bioinformatics, Vol. 34, Issue 5; ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
