DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger

Abstract

Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate among a set of candidate gene models. Here we apply this approach to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI) and a second predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1D gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). A total of 405 identified peptide sequences were mapped to 214 different A. niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison of the published genome from another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.
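
The reverse database searching mentioned in the abstract can be illustrated with a minimal sketch. It assumes one common convention, in which the FDR at a score threshold is estimated as the number of reversed-sequence (decoy) hits divided by the number of forward (target) hits at or above that threshold. The abstract does not state the exact estimator or the APS scoring formula, so the function names, the generic `score` values, and the estimator itself are illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Optional, Tuple

# Each identification is (score, is_decoy). The score is a generic,
# hypothetical match score; it is NOT the APS score from the paper.
Match = Tuple[float, bool]


def estimate_fdr(matches: List[Match], threshold: float) -> float:
    """Estimate FDR at a threshold as decoy hits / target hits (assumed convention)."""
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    # With no target hits above the threshold there is nothing to report.
    return decoys / targets if targets else 0.0


def threshold_for_fdr(matches: List[Match], max_fdr: float) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr."""
    best = None
    # Walk candidate thresholds from the strictest to the most permissive
    # and remember the most permissive one that still meets the FDR limit.
    for threshold in sorted({score for score, _ in matches}, reverse=True):
        if estimate_fdr(matches, threshold) <= max_fdr:
            best = threshold
    return best
```

Calling `threshold_for_fdr(matches, 0.05)` on a list of scored target and decoy hits would give the score cut-off to apply before mapping the accepted peptides back to genomic loci; it returns `None` when the list is empty.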

Authors:
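Wright, James C. [1]; Sugden, Deana [2]; Francis-McIntyre, Sue [2]; Riba-Garcia, Isabel [2]; Gaskell, Simon J. [2]; Grigoriev, Igor V. [3]; Baker, Scott E. [4]; Beynon, Robert J. [5]; Hubbard, Simon J. [2]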
  1. Univ. of Liverpool (United Kingdom); Univ. of Manchester (United Kingdom)
  2. Univ. of Manchester (United Kingdom)
  3. USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
  4. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
  5. Univ. of Liverpool (United Kingdom)
Publication Date:
February 2009
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1556877
Report Number(s):
PNNL-SA-65830
Journal ID: ISSN 1471-2164
Grant/Contract Number:  
AC05-76RL01830
Resource Type:
Accepted Manuscript
Journal Name:
BMC Genomics
Additional Journal Information:
Journal Volume: 10; Journal Issue: 1; Journal ID: ISSN 1471-2164
Publisher:
Springer
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; proteomics; annotation; Aspergillus niger; fungi; ascomycete

Citation Formats

Wright, James C., Sugden, Deana, Francis-McIntyre, Sue, Riba-Garcia, Isabel, Gaskell, Simon J., Grigoriev, Igor V., Baker, Scott E., Beynon, Robert J., and Hubbard, Simon J. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger. United States: N. p., 2009. Web. doi:10.1186/1471-2164-10-61.
Wright, James C., Sugden, Deana, Francis-McIntyre, Sue, Riba-Garcia, Isabel, Gaskell, Simon J., Grigoriev, Igor V., Baker, Scott E., Beynon, Robert J., & Hubbard, Simon J. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger. United States. doi:10.1186/1471-2164-10-61.
Wright, James C., Sugden, Deana, Francis-McIntyre, Sue, Riba-Garcia, Isabel, Gaskell, Simon J., Grigoriev, Igor V., Baker, Scott E., Beynon, Robert J., and Hubbard, Simon J. 2009. "Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger". United States. doi:10.1186/1471-2164-10-61. https://www.osti.gov/servlets/purl/1556877.
@article{osti_1556877,
title = {Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger},
author = {Wright, James C. and Sugden, Deana and Francis-McIntyre, Sue and Riba-Garcia, Isabel and Gaskell, Simon J. and Grigoriev, Igor V. and Baker, Scott E. and Beynon, Robert J. and Hubbard, Simon J.},
abstractNote = {Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate among a set of candidate gene models. Here we apply this approach to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI) and a second predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1D gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). A total of 405 identified peptide sequences were mapped to 214 different A. niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison of the published genome from another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.},
doi = {10.1186/1471-2164-10-61},
journal = {BMC Genomics},
number = 1,
volume = 10,
place = {United States},
year = {2009},
month = {2}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 25 works
Citation information provided by
Web of Science
