DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity

Abstract

This dataset deposition announces the submission to public repositories of the PNNL Biodiversity Library, a large collection of global proteomics data for 112 bacterial and archaeal organisms. The data comprises 35,162 tandem mass spectrometry (MS/MS) datasets from ~10 years of research. All data has been searched, annotated and organized in a consistent manner to promote reuse by the community. Protein identifications were cross-referenced with KEGG functional annotations, which allows for pathway-oriented investigation. We present the data as a freely available community resource. A variety of data re-use options are described for computational modeling, proteomics assay design and bioengineering. Instrument data and analysis files are available at ProteomeXchange via the MassIVE partner repository under the identifiers PXD001860 and MSV000079053.

Authors:
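Payne, Samuel H. [1]; Monroe, Matthew E. [1]; Overall, Christopher C. [1]; Kiebel, Gary R. [1]; Degan, Michael G. [1]; Gibbons, Bryson C. [1]; Fujimoto, Grant M. [1]; Purvine, Samuel O. [2]; Adkins, Joshua N. [1]; Lipton, Mary S. [1]; Smith, Richard D. [1]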
  1. Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Biological Sciences Division
  2. Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Environmental Molecular Sciences Laboratory
Publication Date:
2015-08-18
Research Org.:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States). Environmental Molecular Sciences Laboratory (EMSL)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
OSTI Identifier:
1213004
Report Number(s):
PNNL-SA-108944
48582; KP1601010
Grant/Contract Number:  
AC05-76RL01830
Resource Type:
Accepted Manuscript
Journal Name:
Scientific Data, 2:Art. No. 150041
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Environmental Molecular Sciences Laboratory

Citation Formats

Payne, Samuel H., Monroe, Matthew E., Overall, Christopher C., Kiebel, Gary R., Degan, Michael G., Gibbons, Bryson C., Fujimoto, Grant M., Purvine, Samuel O., Adkins, Joshua N., Lipton, Mary S., and Smith, Richard D. The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity. United States: N. p., 2015. Web. doi:10.1038/sdata.2015.41.
Payne, Samuel H., Monroe, Matthew E., Overall, Christopher C., Kiebel, Gary R., Degan, Michael G., Gibbons, Bryson C., Fujimoto, Grant M., Purvine, Samuel O., Adkins, Joshua N., Lipton, Mary S., & Smith, Richard D. (2015). The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity. United States. https://doi.org/10.1038/sdata.2015.41
Payne, Samuel H., Monroe, Matthew E., Overall, Christopher C., Kiebel, Gary R., Degan, Michael G., Gibbons, Bryson C., Fujimoto, Grant M., Purvine, Samuel O., Adkins, Joshua N., Lipton, Mary S., and Smith, Richard D. 2015. "The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity". United States. https://doi.org/10.1038/sdata.2015.41. https://www.osti.gov/servlets/purl/1213004.
@article{osti_1213004,
title = {The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity},
author = {Payne, Samuel H. and Monroe, Matthew E. and Overall, Christopher C. and Kiebel, Gary R. and Degan, Michael G. and Gibbons, Bryson C. and Fujimoto, Grant M. and Purvine, Samuel O. and Adkins, Joshua N. and Lipton, Mary S. and Smith, Richard D.},
abstractNote = {This dataset deposition announces the submission to public repositories of the PNNL Biodiversity Library, a large collection of global proteomics data for 112 bacterial and archaeal organisms. The data comprises 35,162 tandem mass spectrometry (MS/MS) datasets from ~10 years of research. All data has been searched, annotated and organized in a consistent manner to promote reuse by the community. Protein identifications were cross-referenced with KEGG functional annotations which allows for pathway oriented investigation. We present the data as a freely available community resource. A variety of data re-use options are described for computational modeling, proteomics assay design and bioengineering. Instrument data and analysis files are available at ProteomeXchange via the MassIVE partner repository under the identifiers PXD001860 and MSV000079053.},
doi = {10.1038/sdata.2015.41},
journal = {Scientific Data},
volume = {2},
pages = {150041},
place = {United States},
year = {2015},
month = {8}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 10 works
Citation information provided by
Web of Science

Works referenced in this record:

Gene Expression Omnibus: NCBI gene expression and hybridization array data repository
journal, January 2002


PRISM: A data management system for high-throughput proteomics
journal, March 2006


Spectral archives: extending spectral libraries to analyze both identified and unidentified spectra
journal, May 2011

  • Frank, Ari M.; Monroe, Matthew E.; Shah, Anuj R.
  • Nature Methods, Vol. 8, Issue 7
  • DOI: 10.1038/nmeth.1609

Comparative proteogenomics: Combining mass spectrometry and comparative genomics to analyze multiple genomes
journal, July 2008


A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics
journal, June 2010


Proteogenomic Analysis of Bacteria and Archaea: A 46 Organism Case Study
journal, November 2011


Comparative Bacterial Proteomics: Analysis of the Core Genome Concept
journal, February 2008


Does Trypsin Cut Before Proline?
journal, January 2008

  • Rodriguez, Jesse; Gupta, Nitin; Smith, Richard D.
  • Journal of Proteome Research, Vol. 7, Issue 1
  • DOI: 10.1021/pr0705035

The Proteomics Identifications (PRIDE) database and associated tools: status in 2013
journal, November 2012

  • Vizcaíno, Juan Antonio; Côté, Richard G.; Csordas, Attila
  • Nucleic Acids Research, Vol. 41, Issue D1
  • DOI: 10.1093/nar/gks1262

mzML—a Community Standard for Mass Spectrometry Data
journal, August 2010

  • Martens, Lennart; Chambers, Matthew; Sturm, Marc
  • Molecular & Cellular Proteomics, Vol. 10, Issue 1
  • DOI: 10.1074/mcp.R110.000133

Mass spectrometry-based proteomics
journal, March 2003


The Generating Function of CID, ETD, and CID/ETD Pairs of Tandem Mass Spectra: Applications to Database Search
journal, September 2010

  • Kim, Sangtae; Mischerikow, Nikolai; Bandeira, Nuno
  • Molecular & Cellular Proteomics, Vol. 9, Issue 12
  • DOI: 10.1074/mcp.M110.003731

The mzIdentML Data Standard for Mass Spectrometry-Based Proteomics Results
journal, February 2012

  • Jones, Andrew R.; Eisenacher, Martin; Mayer, Gerhard
  • Molecular & Cellular Proteomics, Vol. 11, Issue 7
  • DOI: 10.1074/mcp.M111.014381

Estimating probabilities of correct identification from results of mass spectral library searches
journal, April 1994


Influence of Peptide Composition, Gas-Phase Basicity, and Chemical Modification on Fragmentation Efficiency:  Evidence for the Mobile Proton Model
journal, January 1996

  • Dongré, Ashok R.; Jones, Jennifer L.; Somogyi, Árpád
  • Journal of the American Chemical Society, Vol. 118, Issue 35
  • DOI: 10.1021/ja9542193

Bifurcating Fragmentation Behavior of Gas-Phase Tryptic Peptide Dications in Collisional Activation
journal, December 2008

  • Savitski, M.; Falth, M.; Fung, Y.
  • Journal of the American Society for Mass Spectrometry, Vol. 19, Issue 12
  • DOI: 10.1016/j.jasms.2008.08.003

Computational prediction of proteotypic peptides for quantitative proteomics
journal, December 2006

  • Mallick, Parag; Schirle, Markus; Chen, Sharon S.
  • Nature Biotechnology, Vol. 25, Issue 1
  • DOI: 10.1038/nbt1275

Open Mass Spectrometry Search Algorithm
journal, October 2004

  • Geer, Lewis Y.; Markey, Sanford P.; Kowalak, Jeffrey A.
  • Journal of Proteome Research, Vol. 3, Issue 5
  • DOI: 10.1021/pr0499491

MyriMatch:  Highly Accurate Tandem Mass Spectral Peptide Identification by Multivariate Hypergeometric Analysis
journal, February 2007

  • Tabb, David L.; Fernando, Christopher G.; Chambers, Matthew C.
  • Journal of Proteome Research, Vol. 6, Issue 2
  • DOI: 10.1021/pr0604054

Phosphorylation-Specific MS/MS Scoring for Rapid and Accurate Phosphoproteome Analysis
journal, August 2008

  • Payne, Samuel H.; Yau, Margaret; Smolka, Marcus B.
  • Journal of Proteome Research, Vol. 7, Issue 8
  • DOI: 10.1021/pr800129m

Mapping protein post-translational modifications with mass spectrometry
journal, September 2007

  • Witze, Eric S.; Old, William M.; Resing, Katheryn A.
  • Nature Methods, Vol. 4, Issue 10
  • DOI: 10.1038/nmeth1100

Top-down proteomics reveals a unique protein S-thiolation switch in Salmonella Typhimurium in response to infection-like conditions
journal, May 2013

  • Ansong, C.; Wu, S.; Meng, D.
  • Proceedings of the National Academy of Sciences, Vol. 110, Issue 25
  • DOI: 10.1073/pnas.1221210110

Accurate Annotation of Peptide Modifications through Unrestrictive Database Search
journal, January 2008

  • Tanner, Stephen; Payne, Samuel H.; Dasari, Surendra
  • Journal of Proteome Research, Vol. 7, Issue 1
  • DOI: 10.1021/pr070444v

Phosphoproteome Analysis of E. coli Reveals Evolutionary Conservation of Bacterial Ser/Thr/Tyr Phosphorylation
journal, October 2007


A proteogenomic update to Yersinia: enhancing genome annotation
journal, August 2010


Ortho-proteogenomics: Multiple proteomes investigation through orthology and a new MS-based protocol
journal, October 2008


Using BiblioSpec for Creating and Searching Tandem MS Peptide Libraries
journal, December 2007


Building and Searching Tandem Mass Spectral Libraries for Peptide Identification
journal, September 2011


Works referencing / citing this record:

The Skyline ecosystem: Informatics for quantitative mass spectrometry proteomics
journal, July 2017

  • Pino, Lindsay K.; Searle, Brian C.; Bollinger, James G.
  • Mass Spectrometry Reviews
  • DOI: 10.1002/mas.21540

Proteomics of natural bacterial isolates powered by deep learning-based de novo identification
journal, November 2020

  • Lee, Joon-Yong; Mitchell, Hugh D.; Burnet, Meagan C.
  • Journal of Proteome Research
  • DOI: 10.1101/428334

A rapid methods development workflow for high-throughput quantitative proteomic applications
journal, February 2019


Fast Open Modification Spectral Library Searching Through Approximate Nearest Neighbor Indexing
text, January 2018


Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks
journal, November 2016

  • Na, Seungjin; Payne, Samuel H.; Bandeira, Nuno
  • Molecular & Cellular Proteomics, Vol. 15, Issue 11
  • DOI: 10.1074/mcp.o116.060913
