DOE PAGES
U.S. Department of Energy
Office of Scientific and Technical Information

Title: MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization

Abstract

Background: MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data.
Description: MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from GenBank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the GenBank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO.
Conclusion: MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports. Access to MannDB is freely available at http://manndb.llnl.gov/.
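
Illustrative note: the abstract describes a relational database that stores proteins and the outputs of many analysis tools, then lets a user query them. The following is a minimal sketch of that general idea, not the MannDB schema, which this record does not give. The table names, column names, function names, accession, and tool labels are all hypothetical, and SQLite stands in for whatever database engine the authors used.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout.

    Assumption: one row per protein, and one row per (protein, tool) result.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE protein (
            protein_id INTEGER PRIMARY KEY,
            accession  TEXT UNIQUE NOT NULL,
            organism   TEXT NOT NULL,
            sequence   TEXT NOT NULL
        );
        CREATE TABLE analysis_result (
            result_id  INTEGER PRIMARY KEY,
            protein_id INTEGER NOT NULL REFERENCES protein(protein_id),
            tool       TEXT NOT NULL,   -- hypothetical tool label
            category   TEXT NOT NULL,   -- e.g. "motifs", "secondary structure"
            summary    TEXT NOT NULL
        );
        """
    )
    return conn


def add_protein(conn, accession, organism, sequence):
    """Insert one protein and return its generated primary key."""
    cur = conn.execute(
        "INSERT INTO protein (accession, organism, sequence) VALUES (?, ?, ?)",
        (accession, organism, sequence),
    )
    return cur.lastrowid


def add_result(conn, protein_id, tool, category, summary):
    """Attach one tool result to an existing protein."""
    conn.execute(
        "INSERT INTO analysis_result (protein_id, tool, category, summary) "
        "VALUES (?, ?, ?, ?)",
        (protein_id, tool, category, summary),
    )


def results_for(conn, accession):
    """Return (tool, category, summary) rows for one accession, sorted by tool."""
    return conn.execute(
        "SELECT r.tool, r.category, r.summary "
        "FROM analysis_result AS r "
        "JOIN protein AS p ON p.protein_id = r.protein_id "
        "WHERE p.accession = ? "
        "ORDER BY r.tool",
        (accession,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # Placeholder data only; none of these values come from MannDB.
    pid = add_protein(conn, "DEMO_0001", "Example organism", "MKTAYIAK")
    add_result(conn, pid, "tool_a", "secondary structure", "placeholder result")
    add_result(conn, pid, "tool_b", "motifs", "placeholder result")
    for row in results_for(conn, "DEMO_0001"):
        print(row)
```

Keeping tool results in a separate table keyed by protein means a new analysis tool adds rows rather than columns, which is one plausible way to integrate evidence from many tools under a single query.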

Authors:
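Zhou, Carol L. Ecale [1]; Lam, Marisa W. [1]; Smith, Jason R. [1]; Zemla, Adam T. [1]; Dyer, Matthew D. [2]; Kuczmarski, Thomas A. [1]; Vitalis, Elizabeth A. [1]; Slezak, Thomas R. [1]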
  1. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Pathogen Bio-informatics
  2. Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States). Virginia Bioinformatics Inst.
Publication Date:
2006-10-17
Research Org.:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division; US Department of Homeland Security (DHS)
OSTI Identifier:
1626328
Grant/Contract Number:  
AC52-07NA27344; W-7405-ENG-48
Resource Type:
Accepted Manuscript
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Volume: 7; Journal Issue: 1; Journal ID: ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Mathematical & Computational Biology

Citation Formats

Zhou, Carol L. Ecale, Lam, Marisa W., Smith, Jason R., Zemla, Adam T., Dyer, Matthew D., Kuczmarski, Thomas A., Vitalis, Elizabeth A., and Slezak, Thomas R. MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization. United States: N. p., 2006. Web. doi:10.1186/1471-2105-7-459.
Zhou, Carol L. Ecale, Lam, Marisa W., Smith, Jason R., Zemla, Adam T., Dyer, Matthew D., Kuczmarski, Thomas A., Vitalis, Elizabeth A., & Slezak, Thomas R. MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization. United States. https://doi.org/10.1186/1471-2105-7-459
Zhou, Carol L. Ecale, Lam, Marisa W., Smith, Jason R., Zemla, Adam T., Dyer, Matthew D., Kuczmarski, Thomas A., Vitalis, Elizabeth A., and Slezak, Thomas R. 2006. "MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization". United States. https://doi.org/10.1186/1471-2105-7-459. https://www.osti.gov/servlets/purl/1626328.
@article{osti_1626328,
title = {MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization},
author = {Zhou, Carol L. Ecale and Lam, Marisa W. and Smith, Jason R. and Zemla, Adam T. and Dyer, Matthew D. and Kuczmarski, Thomas A. and Vitalis, Elizabeth A. and Slezak, Thomas R.},
abstractNote = {Background: MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. Description: MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from GenBank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the GenBank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO.
Conclusion: MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports. Access to MannDB is freely available at http://manndb.llnl.gov/.},
doi = {10.1186/1471-2105-7-459},
journal = {BMC Bioinformatics},
number = 1,
volume = 7,
place = {United States},
year = {2006},
month = {10}
}
