OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations

Abstract

In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integrating multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. In conclusion, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA format. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.

Authors:
Leung, Elo; Huang, Amy; Cadag, Eithon; Montana, Aldrin; Soliman, Jan Lorenz; Zhou, Carol L. Ecale
Publication Date:
2016-01-20
Research Org.:
Lawrence Livermore National Security; Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1618521
Alternate Identifier(s):
OSTI ID: 1238774; OSTI ID: 1305875
Report Number(s):
LLNL-JRNL-664411
Journal ID: ISSN 1471-2105; 43; PII: 887
Grant/Contract Number:  
AC52-07NA27344; PE0603384BP-B0946791; SCW1039
Resource Type:
Journal Article: Published Article
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Name: BMC Bioinformatics Journal Volume: 17 Journal Issue: 1; Journal ID: ISSN 1471-2105
Publisher:
Springer Science + Business Media
Country of Publication:
United Kingdom
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; 97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE

Citation Formats

Leung, Elo, Huang, Amy, Cadag, Eithon, Montana, Aldrin, Soliman, Jan Lorenz, and Zhou, Carol L. Ecale. Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations. United Kingdom: N. p., 2016. Web. doi:10.1186/s12859-016-0887-y.
Leung, Elo, Huang, Amy, Cadag, Eithon, Montana, Aldrin, Soliman, Jan Lorenz, & Zhou, Carol L. Ecale. Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations. United Kingdom. https://doi.org/10.1186/s12859-016-0887-y
Leung, Elo, Huang, Amy, Cadag, Eithon, Montana, Aldrin, Soliman, Jan Lorenz, and Zhou, Carol L. Ecale. 2016. "Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations". United Kingdom. https://doi.org/10.1186/s12859-016-0887-y.
@article{osti_1618521,
title = {Protein Sequence Annotation Tool (PSAT): a centralized web-based meta-server for high-throughput sequence annotations},
author = {Leung, Elo and Huang, Amy and Cadag, Eithon and Montana, Aldrin and Soliman, Jan Lorenz and Zhou, Carol L. Ecale},
abstractNote = {In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integrating multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. In conclusion, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA format. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.},
doi = {10.1186/s12859-016-0887-y},
url = {https://www.osti.gov/biblio/1618521},
journal = {BMC Bioinformatics},
issn = {1471-2105},
number = 1,
volume = 17,
place = {United Kingdom},
year = {2016},
month = jan
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record at https://doi.org/10.1186/s12859-016-0887-y

Citation Metrics:
Cited by: 6 works
Citation information provided by Web of Science

