MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization
Abstract
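Background: MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data.
Description: MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO.
Conclusion: MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports. Access to MannDB is freely available at http://manndb.llnl.gov/.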
- Authors:
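- Zhou, Carol L. Ecale; Lam, Marisa W.; Smith, Jason R.; Zemla, Adam T.; Dyer, Matthew D.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A.; Slezak, Thomas R.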
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Pathogen Bio-informatics
- Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States). Virginia Bioinformatics Inst.
- Publication Date:
- 2006-10-17
- Research Org.:
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division; US Department of Homeland Security (DHS)
- OSTI Identifier:
- 1626328
- Grant/Contract Number:
- AC52-07NA27344; W-7405-ENG-48
- Resource Type:
- Accepted Manuscript
- Journal Name:
- BMC Bioinformatics
- Additional Journal Information:
- Journal Volume: 7; Journal Issue: 1; Journal ID: ISSN 1471-2105
- Publisher:
- BioMed Central
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Mathematical & Computational Biology
Citation Formats
Zhou, Carol L. Ecale, Lam, Marisa W., Smith, Jason R., Zemla, Adam T., Dyer, Matthew D., Kuczmarski, Thomas A., Vitalis, Elizabeth A., and Slezak, Thomas R. MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization. United States: N. p., 2006.
Web. doi:10.1186/1471-2105-7-459.
Zhou, Carol L. Ecale, Lam, Marisa W., Smith, Jason R., Zemla, Adam T., Dyer, Matthew D., Kuczmarski, Thomas A., Vitalis, Elizabeth A., & Slezak, Thomas R. MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization. United States. https://doi.org/10.1186/1471-2105-7-459
Zhou, Carol L. Ecale, Lam, Marisa W., Smith, Jason R., Zemla, Adam T., Dyer, Matthew D., Kuczmarski, Thomas A., Vitalis, Elizabeth A., and Slezak, Thomas R. 2006.
"MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization". United States. https://doi.org/10.1186/1471-2105-7-459. https://www.osti.gov/servlets/purl/1626328.
@article{osti_1626328,
title = {MannDB – A microbial database of automated protein sequence analyses and evidence integration for protein characterization},
author = {Zhou, Carol L. Ecale and Lam, Marisa W. and Smith, Jason R. and Zemla, Adam T. and Dyer, Matthew D. and Kuczmarski, Thomas A. and Vitalis, Elizabeth A. and Slezak, Thomas R.},
abstractNote = {Background: MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. Description: MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO.
Conclusion: MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports. Access to MannDB is freely available at http://manndb.llnl.gov/.},
doi = {10.1186/1471-2105-7-459},
journal = {BMC Bioinformatics},
number = 1,
volume = 7,
place = {United States},
year = {2006},
month = {10}
}
Works referenced in this record:
Comparative genomics tools applied to bioterrorism defence
journal, January 2003
- Slezak, T.
- Briefings in Bioinformatics, Vol. 4, Issue 2
GenDB--an open source genome annotation system for prokaryote genomes
journal, April 2003
- Meyer, F.
- Nucleic Acids Research, Vol. 31, Issue 8
NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins
journal, December 2004
- Pruitt, K. D.
- Nucleic Acids Research, Vol. 33, Issue Database issue
Predicting Subcellular Localization of Proteins Based on their N-terminal Amino Acid Sequence
journal, July 2000
- Emanuelsson, Olof; Nielsen, Henrik; Brunak, Søren
- Journal of Molecular Biology, Vol. 300, Issue 4
BRIGEP--the BRIDGE-based genome-transcriptome-proteome browser
journal, July 2005
- Goesmann, A.; Linke, B.; Bartels, D.
- Nucleic Acids Research, Vol. 33, Issue Web Server
The Comprehensive Microbial Resource
journal, January 2001
- Peterson, J. D.
- Nucleic Acids Research, Vol. 29, Issue 1
MaGe: a microbial genome annotation system supported by synteny results
journal, January 2006
- Vallenet, D.
- Nucleic Acids Research, Vol. 34, Issue 1
Automated annotation of microbial proteomes in SWISS-PROT
journal, February 2003
- Gattiker, Alexandre; Michoud, Karine; Rivoire, Catherine
- Computational Biology and Chemistry, Vol. 27, Issue 1
PSORTb v.2.0: Expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis
journal, October 2004
- Gardy, J. L.; Laird, M. R.; Chen, F.
- Bioinformatics, Vol. 21, Issue 5
BASys: a web server for automated bacterial genome annotation
journal, July 2005
- Van Domselaar, G. H.; Stothard, P.; Shrivastava, S.
- Nucleic Acids Research, Vol. 33, Issue Web Server
Principles governing amino acid composition of integral membrane proteins: application to topology prediction
journal, October 1998
- Tusnády, Gábor E.; Simon, István
- Journal of Molecular Biology, Vol. 283, Issue 2
Automated genome sequence analysis and annotation
journal, May 1999
- Andrade, M. A.; Brown, N. P.; Leroy, C.
- Bioinformatics, Vol. 15, Issue 5
Cleavage site analysis in picornaviral polyproteins: Discovering cellular targets by neural networks
journal, November 1996
- Blom, Nikolaj; Hansen, Jan; Brunak, Søren
- Protein Science, Vol. 5, Issue 11
Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes
journal, January 2001
- Krogh, Anders; Larsson, Björn; von Heijne, Gunnar
- Journal of Molecular Biology, Vol. 305, Issue 3
Functional and structural genomics using PEDANT
journal, January 2001
- Frishman, D.; Albermann, K.; Hani, J.
- Bioinformatics, Vol. 17, Issue 1
Improved Prediction of Signal Peptides: SignalP 3.0
journal, July 2004
- Dyrløv Bendtsen, Jannick; Nielsen, Henrik; von Heijne, Gunnar
- Journal of Molecular Biology, Vol. 340, Issue 4, p. 783-795
TopPred II: an improved software for membrane protein structure predictions
journal, January 1994
- Claros, Manuel G.; Heijne, Gunnar von
- Bioinformatics, Vol. 10, Issue 6
The integrated microbial genomes (IMG) system
journal, January 2006
- Markowitz, V. M.
- Nucleic Acids Research, Vol. 34, Issue 90001