Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations
Abstract
In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. In conclusion, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.
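- Authors:
- Leung, Elo; Huang, Amy; Cadag, Eithon; Montana, Aldrin; Soliman, Jan Lorenz; Zhou, Carol L. Ecale
- Publication Date:
- 2016-01-20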
- Research Org.:
- Lawrence Livermore National Security; Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1618521
- Alternate Identifier(s):
- OSTI ID: 1238774; OSTI ID: 1305875
- Report Number(s):
- LLNL-JRNL-664411; Journal ID: ISSN 1471-2105; 43; PII: 887
- Grant/Contract Number:
- AC52-07NA27344; PE0603384BP-B0946791; SCW1039
- Resource Type:
- Journal Article: Published Article
- Journal Name:
- BMC Bioinformatics
- Additional Journal Information:
- Journal Name: BMC Bioinformatics; Journal Volume: 17; Journal Issue: 1; Journal ID: ISSN 1471-2105
- Publisher:
- Springer Science + Business Media
- Country of Publication:
- United Kingdom
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; 97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE
Citation Formats
Leung, Elo, Huang, Amy, Cadag, Eithon, Montana, Aldrin, Soliman, Jan Lorenz, and Zhou, Carol L. Ecale. Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations. United Kingdom: N. p., 2016. Web. doi:10.1186/s12859-016-0887-y.
Leung, Elo, Huang, Amy, Cadag, Eithon, Montana, Aldrin, Soliman, Jan Lorenz, & Zhou, Carol L. Ecale. Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations. United Kingdom. https://doi.org/10.1186/s12859-016-0887-y
Leung, Elo, Huang, Amy, Cadag, Eithon, Montana, Aldrin, Soliman, Jan Lorenz, and Zhou, Carol L. Ecale. 2016. "Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations". United Kingdom. https://doi.org/10.1186/s12859-016-0887-y.
@article{osti_1618521,
title = {Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations},
author = {Leung, Elo and Huang, Amy and Cadag, Eithon and Montana, Aldrin and Soliman, Jan Lorenz and Zhou, Carol L. Ecale},
abstractNote = {In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. In conclusion, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.},
doi = {10.1186/s12859-016-0887-y},
url = {https://www.osti.gov/biblio/1618521},
journal = {BMC Bioinformatics},
issn = {1471-2105},
number = 1,
volume = 17,
place = {United Kingdom},
year = {2016},
month = {1}
}