Title: Proteogenomic analysis of bacteria and archaea: A 46 organism case study

Journal Article · PLoS One

Experimental evidence is increasingly being used to reassess the quality and accuracy of genome annotation. The use of proteomics data for this purpose, called proteogenomics, can alleviate many of the problematic areas of genome annotation, e.g., short protein validation and start site assignment. We performed a proteogenomic analysis of 46 genomes spanning eight bacterial and archaeal phyla across the tree of life. These diverse datasets facilitated the development of a robust approach for proteogenomics that is functional across genomes varying in %GC, gene content, proteomic sampling depth, phylogeny, and genome size. In addition to finding evidence for 701 novel proteins, 1365 new start sites, and numerous dubious genes, we discovered sites of post-translational maturation in the form of proteolytic cleavage of 1095 signal peptides. Proteomics provides a powerful experimental data type to assess and improve the quality of genome annotation. A key advantage is the direct correlation between protein annotation and a protein-based assay. With the adoption of new sequencing technologies, which have higher error rates than Sanger-based methods, and with advances in proteomics, proteogenomics may become even more important in the future.

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1033067
Report Number(s):
PNNL-SA-75723; KP1601010; TRN: US201202%%569
Journal Information:
PLoS One, Vol. 6, Issue 11
Country of Publication:
United States
Language:
English