DOE PAGES
U.S. Department of Energy
Office of Scientific and Technical Information

This content will become publicly available on August 26, 2020

Title: A five-level classification system for proteoform identifications

Abstract

The term proteoform, introduced in these pages in 2013, has rapidly gained acceptance in the proteomics community. The challenge and importance of comprehensively identifying proteoforms in complex samples have been recognized, and reports of new platforms toward that end have begun to appear. However, one central ambiguity has emerged, namely determining precisely what is meant by a “proteoform identification”. At present, the only practical approaches for establishing the exact primary structure of a proteoform employ mass spectrometry (MS), and a wide range of MS results are reported as “proteoform identifications”. This seemingly small matter has significant impact, as the ambiguity in what is meant by an “identification” makes it difficult to compare results from different laboratories and approaches. This situation hinders the ability of the community to evaluate technological progress and to efficiently expand biological knowledge. To address this issue, we propose a five-level system for classifying proteoform identifications. The classification scheme stems directly from a consideration of the four types of ambiguity possible in a proteoform identification, ranging from the most subtle (i.e., precise localization of a post-translational modification, or PTM) to the most dramatic (i.e., ambiguity in the gene of origin). The five classes then correspond to the level of ambiguity present in the identification, ranging from no ambiguity at all (Level 1) to ambiguity of all four types (Level 5). Details of the scheme are provided in Table 1 and Supplementary Table 1, with specific use cases and examples provided in Supplementary Figure 1.
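The level assignment described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the abstract fixes the two endpoints (Level 1 for no ambiguity, Level 5 for all four types) and names two of the ambiguity types (PTM localization and gene of origin). The other two field names below (`ptm_identity`, `sequence`) and the rule that the level equals one plus the number of ambiguity types present are assumptions, since Table 1 is not reproduced in this record.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Ambiguity:
    """Flags for the four ambiguity types of a proteoform identification.

    ptm_localization and gene_of_origin are named in the abstract;
    ptm_identity and sequence are assumed names for the remaining two.
    """
    ptm_localization: bool = False
    ptm_identity: bool = False      # assumption: not named in the abstract
    sequence: bool = False          # assumption: not named in the abstract
    gene_of_origin: bool = False


def proteoform_level(ambiguity: Ambiguity) -> int:
    """Return a level from 1 (no ambiguity) to 5 (all four types ambiguous).

    Assumes each additional ambiguity type raises the level by one, which
    is consistent with the endpoints stated in the abstract.
    """
    return 1 + sum(astuple(ambiguity))


if __name__ == "__main__":
    fully_characterized = Ambiguity()
    unlocalized_ptm = Ambiguity(ptm_localization=True)
    print(proteoform_level(fully_characterized))
    print(proteoform_level(unlocalized_ptm))
```

Counting flags keeps the sketch independent of which specific ambiguities are present; a fuller version would also need to record which types are ambiguous, as the published tables presumably do.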

Authors:
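Smith, Lloyd M. [1]; Thomas, Paul M. [2]; Shortreed, Michael R. [1]; Schaffer, Leah V. [1]; Fellers, Ryan T. [2]; LeDuc, Richard D. [2]; Tucholski, Trisha [1]; Ge, Ying [1]; Agar, Jeffrey N. [3]; Anderson, Lissa C. [4]; Chamot-Rooke, Julia [5]; Gault, Joseph [6]; Loo, Joseph A. [7]; Paša-Tolić, Ljiljana [8]; Robinson, Carol V. [6]; Schlüter, Hartmut [9]; Tsybin, Yury O. [10]; Vilaseca, Marta [11]; Vizcaíno, Juan Antonio [12]; Danis, Paul O. [13]; Kelleher, Neil L. [2]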
  1. Univ. of Wisconsin-Madison, Madison, WI (United States)
  2. Northwestern Univ., Evanston, IL (United States)
  3. Northeastern Univ., Boston, MA (United States)
  4. National High Magnetic Field Laboratory, Tallahassee, FL (United States)
  5. Inst. Pasteur, Paris (France)
  6. Univ. of Oxford, Oxford (United Kingdom)
  7. Univ. of California, Los Angeles, CA (United States)
  8. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
  9. Univ. Medical Center, Hamburg (Germany)
  10. Spectroswiss CH, Lausanne (Switzerland)
  11. Inst. for Research in Biomedicine, Barcelona (Spain)
  12. European Bioinformatics Inst., Cambridge (United Kingdom)
  13. Consortium for Top Down Proteomics, Cambridge, MA (United States)
Publication Date:
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1573357
Report Number(s):
PNNL-SA-147035
Journal ID: ISSN 1548-7091
Grant/Contract Number:  
AC05-76RL01830
Resource Type:
Accepted Manuscript
Journal Name:
Nature Methods
Additional Journal Information:
Journal Volume: 16; Journal Issue: 10; Journal ID: ISSN 1548-7091
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY

Citation Formats

Smith, Lloyd M., Thomas, Paul M., Shortreed, Michael R., Schaffer, Leah V., Fellers, Ryan T., LeDuc, Richard D., Tucholski, Trisha, Ge, Ying, Agar, Jeffrey N., Anderson, Lissa C., Chamot-Rooke, Julia, Gault, Joseph, Loo, Joseph A., Paša-Tolić, Ljiljana, Robinson, Carol V., Schlüter, Hartmut, Tsybin, Yury O., Vilaseca, Marta, Vizcaíno, Juan Antonio, Danis, Paul O., and Kelleher, Neil L. A five-level classification system for proteoform identifications. United States: N. p., 2019. Web. doi:10.1038/s41592-019-0573-x.
Smith, Lloyd M., Thomas, Paul M., Shortreed, Michael R., Schaffer, Leah V., Fellers, Ryan T., LeDuc, Richard D., Tucholski, Trisha, Ge, Ying, Agar, Jeffrey N., Anderson, Lissa C., Chamot-Rooke, Julia, Gault, Joseph, Loo, Joseph A., Paša-Tolić, Ljiljana, Robinson, Carol V., Schlüter, Hartmut, Tsybin, Yury O., Vilaseca, Marta, Vizcaíno, Juan Antonio, Danis, Paul O., & Kelleher, Neil L. A five-level classification system for proteoform identifications. United States. doi:10.1038/s41592-019-0573-x.
Smith, Lloyd M., Thomas, Paul M., Shortreed, Michael R., Schaffer, Leah V., Fellers, Ryan T., LeDuc, Richard D., Tucholski, Trisha, Ge, Ying, Agar, Jeffrey N., Anderson, Lissa C., Chamot-Rooke, Julia, Gault, Joseph, Loo, Joseph A., Paša-Tolić, Ljiljana, Robinson, Carol V., Schlüter, Hartmut, Tsybin, Yury O., Vilaseca, Marta, Vizcaíno, Juan Antonio, Danis, Paul O., and Kelleher, Neil L. "A five-level classification system for proteoform identifications". United States. doi:10.1038/s41592-019-0573-x.
@article{osti_1573357,
title = {A five-level classification system for proteoform identifications},
author = {Smith, Lloyd M. and Thomas, Paul M. and Shortreed, Michael R. and Schaffer, Leah V. and Fellers, Ryan T. and LeDuc, Richard D. and Tucholski, Trisha and Ge, Ying and Agar, Jeffrey N. and Anderson, Lissa C. and Chamot-Rooke, Julia and Gault, Joseph and Loo, Joseph A. and Paša-Tolić, Ljiljana and Robinson, Carol V. and Schlüter, Hartmut and Tsybin, Yury O. and Vilaseca, Marta and Vizcaíno, Juan Antonio and Danis, Paul O. and Kelleher, Neil L.},
abstractNote = {The term proteoform, introduced in these pages in 2013, has rapidly gained acceptance in the proteomics community. The challenge and importance of comprehensively identifying proteoforms in complex samples has been recognized, and reports have begun to appear of new platforms towards that end. However, one interesting central ambiguity has emerged, namely determining precisely what is meant by a “proteoform identification”. At present, the only practical approaches for establishing the exact primary structure of a proteoform employ mass spectrometry (MS), and there is a wide range of MS results that provide “proteoform identifications”. This seemingly small matter has significant impact, as the ambiguity in knowing what is meant by an “identification” makes it difficult to compare results from different laboratories and approaches. This situation hinders the ability of the community to evaluate technological progress and to efficiently expand biological knowledge. To address this issue, we propose a five-level system for classifying proteoform identifications. Here, the classification scheme stems directly from a consideration of the four types of possible ambiguity possible for a proteoform identification, ranging from the most subtle (i.e., precise localization of a post-translational modification, or PTM) to the most dramatic (i.e., ambiguity in the gene of origin). The five classes then correspond to the level of ambiguity present in the identification, ranging from no ambiguity at all (Level 1), to ambiguity of all four types (Level 5). Details of the scheme are provided in Table 1 and Supplementary Table 1, with specific use cases and examples provided in Supplementary Figure 1.},
doi = {10.1038/s41592-019-0573-x},
journal = {Nature Methods},
number = 10,
volume = 16,
place = {United States},
year = {2019},
month = {8}
}


Works referenced in this record:

Proteoform: a single term describing protein complexity
journal, February 2013

  • Smith, Lloyd M.; Kelleher, Neil L.
  • Nature Methods, Vol. 10, Issue 3
  • DOI: 10.1038/nmeth.2369

Characterization of Proteoforms with Unknown Post-translational Modifications Using the MIScore
journal, July 2016


Protein Ontology (PRO): enhancing and scaling up the representation of protein entities
journal, November 2016

  • Natale, Darren A.; Arighi, Cecilia N.; Blake, Judith A.
  • Nucleic Acids Research, Vol. 45, Issue D1
  • DOI: 10.1093/nar/gkw1075

ProForma: A Standard Proteoform Notation
journal, February 2018


Top or Middle? Up or Down? Toward a Standard Lexicon for Protein Top-Down and Allied Mass Spectrometry Approaches
journal, May 2019

  • Lermyte, Frederik; Tsybin, Yury O.; O’Connor, Peter B.
  • Journal of The American Society for Mass Spectrometry, Vol. 30, Issue 7
  • DOI: 10.1007/s13361-019-02201-x

High Resolution CZE-MS Quantitative Characterization of Intact Biopharmaceutical Proteins: Proteoforms of Interferon-β1
journal, December 2015


The C-Score: A Bayesian Framework to Sharply Improve Proteoform Scoring in High-Throughput Top Down Proteomics
journal, June 2014

  • LeDuc, Richard D.; Fellers, Ryan T.; Early, Bryan P.
  • Journal of Proteome Research, Vol. 13, Issue 7
  • DOI: 10.1021/pr401277r

Identification and Quantification of Murine Mitochondrial Proteoforms Using an Integrated Top-Down and Intact-Mass Strategy
journal, September 2018

  • Schaffer, Leah V.; Rensvold, Jarred W.; Shortreed, Michael R.
  • Journal of Proteome Research, Vol. 17, Issue 10
  • DOI: 10.1021/acs.jproteome.8b00469

Identification and Characterization of Human Proteoforms by Top-Down LC-21 Tesla FT-ICR Mass Spectrometry
journal, December 2016