OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: ProRata: A quantitative proteomics program for accurate protein abundance ratio estimation with confidence interval evaluation

Abstract

A profile likelihood algorithm is proposed for quantitative shotgun proteomics to infer the abundance ratios of proteins from the abundance ratios of isotopically labeled peptides derived from proteolysis. Previously, we have shown that the estimation variability and bias of peptide abundance ratios can be predicted from their profile signal-to-noise ratios. Given multiple quantified peptides for a protein, the profile likelihood algorithm probabilistically weighs the peptide abundance ratios by their inferred estimation variability, accounts for their expected estimation bias, and suppresses contribution from outliers. This algorithm yields maximum likelihood point estimation and profile likelihood confidence interval estimation of protein abundance ratios. This point estimator is more accurate than an estimator based on the average of peptide abundance ratios. The confidence interval estimation provides an "error bar" for each protein abundance ratio that reflects its estimation precision and statistical uncertainty. The accuracy of the point estimation and the precision and confidence level of the interval estimation were benchmarked with standard mixtures of isotopically labeled proteomes. The profile likelihood algorithm was integrated into a quantitative proteomics program, called ProRata, freely available at www.MSProRata.org.
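The idea of a maximum likelihood point estimate with a profile likelihood confidence interval can be illustrated with a deliberately simplified sketch. It is not ProRata's actual model, which also corrects for expected bias and suppresses outliers. The assumptions here are hypothetical: each peptide log2 ratio is Gaussian around the protein log2 ratio `mu`, its standard deviation is a known relative value `sigmas[i]` (standing in for variability inferred from the profile signal-to-noise ratio) multiplied by an unknown common scale, and that scale is the nuisance parameter that gets profiled out. The function names and the grid settings are illustrative only.

```python
import math

# 95% quantile of the chi-square distribution with 1 degree of freedom.
CHI2_95_1DF = 3.841458820694124


def profile_loglik(mu, log_ratios, sigmas):
    """Profile log-likelihood of mu (up to a constant).

    Assumed model: r_i ~ Normal(mu, (scale * sigma_i)^2). For a fixed mu the
    unknown scale is replaced by its maximum likelihood value, which leaves
    -n/2 * log(mean weighted squared residual).
    Requires at least one nonzero residual, otherwise log(0) is undefined.
    """
    n = len(log_ratios)
    s2 = sum(((r - mu) / s) ** 2 for r, s in zip(log_ratios, sigmas)) / n
    return -0.5 * n * math.log(s2)


def estimate_protein_ratio(log_ratios, sigmas, half_width=4.0, step=0.001):
    """Return (mu_hat, lower, upper) for the protein log2 ratio.

    mu_hat is the inverse-variance weighted mean, which maximizes the profile
    likelihood under the assumed model. The interval is found by scanning a
    grid around mu_hat and keeping every mu whose likelihood-ratio statistic
    stays below the chi-square cutoff.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    mu_hat = sum(w * r for w, r in zip(weights, log_ratios)) / sum(weights)
    l_max = profile_loglik(mu_hat, log_ratios, sigmas)

    n_steps = int(round(half_width / step))
    inside = []
    for k in range(-n_steps, n_steps + 1):
        mu = mu_hat + k * step
        stat = 2.0 * (l_max - profile_loglik(mu, log_ratios, sigmas))
        if stat <= CHI2_95_1DF:
            inside.append(mu)
    return mu_hat, min(inside), max(inside)


if __name__ == "__main__":
    # Four hypothetical peptide log2 ratios; the first is the most precise.
    ratios = [1.0, 1.2, 0.8, 1.1]
    rel_sd = [0.1, 0.2, 0.2, 0.4]
    mu_hat, lower, upper = estimate_protein_ratio(ratios, rel_sd)
    print(f"log2 ratio = {mu_hat:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

Because precise peptides carry larger weights, the estimate sits closer to the high-quality measurements than a plain average would, and the interval widens when the peptides disagree with each other.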

Authors:
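Pan, Chongle [1]; Kora, Guruprasad H [1]; McDonald, W Hayes [1]; Tabb, Dave L [1]; Verberkmoes, Nathan C [1]; Hurst, Gregory [1]; Pelletier, Dale A [1]; Samatova, Nagiza F [1]; Hettich, Robert [1]
  1. ORNL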
Publication Date:
2006
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE Office of Science (SC)
OSTI Identifier:
930742
DOE Contract Number:
DE-AC05-00OR22725
Resource Type:
Journal Article
Resource Relation:
Journal Name: Analytical Chemistry; Journal Volume: 78; Journal Issue: 20
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; ABUNDANCE; ACCURACY; ALGORITHMS; EVALUATION; MIXTURES; PEPTIDES; PROTEINS; PROTEOLYSIS; SIGNAL-TO-NOISE RATIO

Citation Formats

Pan, Chongle, Kora, Guruprasad H, McDonald, W Hayes, Tabb, Dave L, Verberkmoes, Nathan C, Hurst, Gregory, Pelletier, Dale A, Samatova, Nagiza F, and Hettich, Robert. ProRata: A quantitative proteomics program for accurate protein abundance ratio estimation with confidence interval evaluation. United States: N. p., 2006. Web. doi:10.1021/ac060654b.
Pan, Chongle, Kora, Guruprasad H, McDonald, W Hayes, Tabb, Dave L, Verberkmoes, Nathan C, Hurst, Gregory, Pelletier, Dale A, Samatova, Nagiza F, & Hettich, Robert. ProRata: A quantitative proteomics program for accurate protein abundance ratio estimation with confidence interval evaluation. United States. doi:10.1021/ac060654b.
Pan, Chongle, Kora, Guruprasad H, McDonald, W Hayes, Tabb, Dave L, Verberkmoes, Nathan C, Hurst, Gregory, Pelletier, Dale A, Samatova, Nagiza F, and Hettich, Robert. 2006. "ProRata: A quantitative proteomics program for accurate protein abundance ratio estimation with confidence interval evaluation". United States. doi:10.1021/ac060654b.
@article{osti_930742,
title = {ProRata: A quantitative proteomics program for accurate protein abundance ratio estimation with confidence interval evaluation},
author = {Pan, Chongle and Kora, Guruprasad H and McDonald, W Hayes and Tabb, Dave L and Verberkmoes, Nathan C and Hurst, Gregory and Pelletier, Dale A and Samatova, Nagiza F and Hettich, Robert},
doi = {10.1021/ac060654b},
journal = {Analytical Chemistry},
number = 20,
volume = 78,
place = {United States},
year = {2006}
}
  • The abundance ratio between the light and heavy isotopologues of an isotopically labeled peptide can be estimated from their selected ion chromatograms. However, quantitative shotgun proteomics measurements yield selected ion chromatograms at highly variable signal-to-noise ratios for tens of thousands of peptides. This challenge calls for algorithms that not only robustly estimate the abundance ratios of different peptides but also rigorously score each abundance ratio for the expected estimation bias and variability. Scoring of the abundance ratios, much like scoring of sequence assignment for tandem mass spectra by peptide identification algorithms, enables filtering of unreliable peptide quantification and use of formal statistical inference in the subsequent protein abundance ratio estimation. In this study, a parallel paired covariance algorithm is used for robust peak detection in selected ion chromatograms. A peak profile is generated for each peptide, which is a scatter plot of ion intensities measured for the two isotopologues within their chromatographic peaks. Principal component analysis of the peak profile is proposed to estimate the peptide abundance ratio and to score the estimation with the signal-to-noise ratio of the peak profile (profile signal-to-noise ratio). We demonstrate that the profile signal-to-noise ratio is inversely correlated with the variability and bias of peptide abundance ratio estimation.
  • High-throughput proteomics is rapidly evolving to require high mass measurement accuracy for a variety of different applications. Increased mass measurement accuracy in bottom-up proteomics specifically allows for an improved ability to distinguish and characterize detected MS features, which may in turn be identified by, e.g., matching to entries in a database for both precursor and fragmentation mass identification methods. Many tools exist with which to score the identification of peptides from LC-MS/MS measurements or to assess matches to an accurate mass and time (AMT) tag database, but these two calculations remain distinctly unrelated. Here we present a statistical method, Statistical Tools for AMT tag Confidence (STAC), which extends our previous work incorporating prior probabilities of correct sequence identification from LC-MS/MS, as well as the quality with which LC-MS features match AMT tags, to evaluate peptide identification confidence. Compared to existing tools, we are able to obtain significantly more high-confidence peptide identifications at a given false discovery rate and additionally assign confidence estimates to individual peptide identifications. Software implementations of STAC are freely available as both a command-line tool and a Windows graphical application.
  • Numerical simulations were used to estimate the precision of parameters responsible for NMR spin-lattice relaxation in a coupled two-spin system where two relaxation mechanisms, dipole-dipole and chemical shift anisotropy, are present. Confidence intervals in nonlinear least-squares fitting, necessary for multiexponential relaxation, can be obtained with restrictive assumptions. The simulations allow one to investigate the effect of experimental variables on parameter precision without performing many lengthy experiments. Precision depends upon many factors including the nucleus observed, how the recovery curves are sampled, the sample temperature stability, and the number of simultaneously estimated parameters.
  • Mass spectrometric analysis of Caldicellulosiruptor obsidiansis cultures grown on four different carbon sources identified 65% of the cell's predicted proteins in cell lysates and supernatants. Biological and technical replication together with sophisticated statistical analysis were used to reliably quantify protein abundances and their changes as a function of carbon source. Extracellular, multifunctional glycosidases were significantly more abundant on cellobiose than on the crystalline cellulose substrates Avicel and filter paper, indicating either disaccharide induction or constitutive protein expression. Highly abundant flagellar, chemotaxis, and pilus proteins were detected during growth on insoluble substrates, suggesting motility or specific substrate attachment. The highly abundant extracellular binding protein COB47-0549 together with the COB47-1616 ATPase might comprise the primary ABC-transport system for cellooligosaccharides, while COB47-0096 and COB47-0097 could facilitate monosaccharide uptake. Oligosaccharide degradation can occur either via extracellular hydrolysis by a GH1 β-glycosidase or by intracellular phosphorolysis using two GH94 enzymes. When C. obsidiansis was grown on switchgrass, the abundance of hemicellulases (including GH3, GH5, GH51, and GH67 enzymes) and certain sugar transporters increased significantly. Cultivation on biomass also caused a concerted increase in cytosolic enzymes for xylose and arabinose fermentation.
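
The first related record above describes estimating a peptide abundance ratio by principal component analysis of a peak profile, that is, a scatter plot of paired ion intensities for the two isotopologues. A minimal sketch of that idea follows. Several details are assumptions rather than anything stated in the record: the intensities are mean-centered before the analysis, the ratio is read off as the slope of the first principal component (heavy over light), and the profile signal-to-noise ratio is taken as the ratio of the first eigenvalue to the second. The function name is hypothetical.

```python
import math


def peak_profile_pca(light, heavy):
    """Estimate (ratio, snr) from paired isotopologue intensities.

    ratio: slope of the first principal component in the (light, heavy) plane.
    snr:   first eigenvalue divided by the second (assumed definition).
    """
    n = len(light)
    if n != len(heavy) or n < 2:
        raise ValueError("need at least two paired intensity measurements")

    mean_l = sum(light) / n
    mean_h = sum(heavy) / n
    dl = [x - mean_l for x in light]
    dh = [y - mean_h for y in heavy]

    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x in dl) / (n - 1)
    b = sum(x * y for x, y in zip(dl, dh)) / (n - 1)
    c = sum(y * y for y in dh) / (n - 1)
    if b == 0:
        raise ValueError("intensities are uncorrelated; slope is undefined")

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    lam1 = mid + root
    lam2 = mid - root

    # The eigenvector for lam1 is proportional to (b, lam1 - a).
    ratio = (lam1 - a) / b
    snr = lam1 / lam2 if lam2 > 0 else math.inf
    return ratio, snr


if __name__ == "__main__":
    # Hypothetical intensities with a true heavy/light ratio close to 2.
    light = [0.0, 1.0, 2.0, 3.0, 4.0]
    heavy = [0.1, 1.9, 4.1, 5.9, 8.1]
    ratio, snr = peak_profile_pca(light, heavy)
    print(f"ratio = {ratio:.3f}, profile SNR = {snr:.1f}")
```

Unlike an ordinary regression of heavy on light, the principal component treats noise in both channels symmetrically, and the second eigenvalue gives a direct measure of how tightly the points follow the fitted line.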