Title: Improved Quality Control Processing of Peptide-centric LC-MS Proteomics Data

Journal Article · Bioinformatics

Motivation: In the analysis of differential peptide peak intensities (i.e., abundance measures), LC-MS analyses with poor-quality peptide abundance data can bias downstream statistical analyses, and hence the biological interpretation, of an otherwise high-quality data set. Although considerable effort has gone into assuring the quality of peptide identification with respect to spectral processing, quality assessment of the subsequent peptide abundance data matrix has to date been limited to subjective visual inspection of run-by-run correlation or of individual peptide components. Identifying statistical outliers is a critical step in the processing of proteomics data because many downstream statistical analyses (e.g., ANOVA) rely on accurate estimates of sample variance, and their results are influenced by extreme values.

Results: We describe a multivariate statistical strategy for the identification of LC-MS runs with extreme peptide abundance distributions. Comparison with the current method (run-by-run correlation) demonstrates a significantly better rate of identification of outlier runs by the multivariate strategy. Simulation studies also suggest that this strategy significantly outperforms correlation alone in the identification of statistically extreme LC-MS runs.
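The abstract does not say which statistics the multivariate strategy combines, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a peptides-by-runs matrix of log abundances with NaN for missing values, summarizes each run with three assumed metrics (mean correlation with the other runs, fraction of missing values, and median absolute deviation), and scores each run by a classical squared Mahalanobis distance. The function names `run_metrics` and `mahalanobis_sq`, the choice of metrics, and the simulated data are all hypothetical.

```python
import numpy as np


def run_metrics(X):
    """Per-run summary metrics for a peptides x runs matrix (NaN = missing).

    Assumed metrics, chosen for illustration only: mean pairwise correlation
    with the other runs, fraction missing, and median absolute deviation.
    """
    n_runs = X.shape[1]
    frac_missing = np.isnan(X).mean(axis=0)
    med = np.nanmedian(X, axis=0)
    mad = np.nanmedian(np.abs(X - med), axis=0)

    mean_corr = np.empty(n_runs)
    for j in range(n_runs):
        corrs = []
        for k in range(n_runs):
            if k == j:
                continue
            # Use only peptides observed in both runs.
            ok = ~np.isnan(X[:, j]) & ~np.isnan(X[:, k])
            corrs.append(np.corrcoef(X[ok, j], X[ok, k])[0, 1])
        mean_corr[j] = np.mean(corrs)

    return np.column_stack([mean_corr, frac_missing, mad])


def mahalanobis_sq(M):
    """Squared Mahalanobis distance of each row of M from the column means."""
    centered = M - M.mean(axis=0)
    cov = np.cov(M, rowvar=False)
    inv = np.linalg.pinv(cov)  # pseudo-inverse guards against a singular covariance
    return np.einsum("ij,jk,ik->i", centered, inv, centered)


# Simulated example: 500 peptides, 12 runs, run 7 deliberately degraded.
rng = np.random.default_rng(0)
n_pep, n_runs, bad_run = 500, 12, 7
base = rng.normal(20.0, 2.0, size=(n_pep, 1))            # shared peptide signal
X = base + rng.normal(0.0, 0.3, size=(n_pep, n_runs))    # small run-level noise
X[:, bad_run] = base[:, 0] + rng.normal(0.0, 3.0, n_pep)  # much noisier run

missing = rng.random(X.shape) < 0.05                      # 5% missing overall
missing[:, bad_run] = rng.random(n_pep) < 0.40            # 40% missing in bad run
X[missing] = np.nan

metrics = run_metrics(X)
d2 = mahalanobis_sq(metrics)
most_extreme = int(np.argmax(d2))
print("most extreme run:", most_extreme)
```

In this simulation the degraded run differs from the others on all three metrics at once, which is the situation where a combined multivariate score is expected to be more sensitive than correlation alone. A practical version would likely use a robust estimate of location and covariance and a formal cutoff for the distance; both are omitted here to keep the sketch short.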

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Environmental Molecular Sciences Lab. (EMSL)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1031421
Report Number(s):
PNNL-SA-76540; 33706; 33200; 600306000; TRN: US201201%%596
Journal Information:
Bioinformatics, Vol. 27, Issue 20; ISSN 1367-4803
Country of Publication:
United States
Language:
English