U.S. Department of Energy
Office of Scientific and Technical Information

Support Vector Machine Classification of Probability Models and Peptide Features for Improved Peptide Identification from Shotgun Proteomics

Conference

Proteomics is a rapidly advancing field that offers a new perspective on biological systems. Mass spectrometry (MS) is a popular experimental approach because it allows global characterization of the proteins in a sample in a high-throughput manner. A protein is identified from the spectral signatures of its constituent fragments, i.e., peptides. Peptide identification is typically performed with a computational database search algorithm; however, these algorithms return a large number of false positive identifications. We present a new scoring algorithm that uses a support vector machine (SVM) to integrate database scoring metrics with peptide physicochemical properties, resulting in an improved ability to separate true from false peptide identifications from MS data. The Peptide Identification Classifier SVM (PICS) score, using only five variables, is significantly more accurate than the single best database metric: the area under the Receiver Operating Characteristic (ROC) curve is ~0.94 for PICS versus ~0.90 for that metric.
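The comparison above rests on the area under the ROC curve (AUC), which can be read as the probability that a randomly chosen true identification scores higher than a randomly chosen false one. The following is a minimal sketch of that calculation in Python, not the authors' implementation. The function name `roc_auc`, the variable names, and all score values are hypothetical and synthetic; they are not data or results from this work.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (true, false) pairs in which the true identification scores
    higher. Tied pairs count as half a win."""
    true_scores = [s for s, y in zip(scores, labels) if y == 1]
    false_scores = [s for s, y in zip(scores, labels) if y == 0]
    if not true_scores or not false_scores:
        raise ValueError("need at least one true and one false identification")
    wins = 0.0
    for t in true_scores:
        for f in false_scores:
            if t > f:
                wins += 1.0
            elif t == f:
                wins += 0.5
    return wins / (len(true_scores) * len(false_scores))


# Synthetic toy data (assumption): 1 = true identification, 0 = false.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical scores from a single database search metric.
database_metric = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.5]
# Hypothetical scores from a classifier that combines several variables.
combined_score = [0.9, 0.8, 0.45, 0.7, 0.5, 0.3, 0.2, 0.4]

auc_database = roc_auc(database_metric, labels)
auc_combined = roc_auc(combined_score, labels)
print(f"database metric AUC: {auc_database:.3f}")
print(f"combined score AUC:  {auc_combined:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a higher AUC means fewer false identifications are accepted at any given sensitivity. The pairwise loop is quadratic in the number of identifications; a rank-based computation would be preferable for large result sets.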

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
928287
Report Number(s):
PNNL-SA-58675; KJ0102000
Country of Publication:
United States
Language:
English