In search of genome annotation consistency: solid gene clusters and how to use them
- Univ. of Illinois at Urbana-Champaign, IL (United States). Inst. for Genomic Biology; OSTI
- Univ. of Illinois at Urbana-Champaign, IL (United States). Inst. for Genomic Biology; Univ. of Illinois at Urbana-Champaign, IL (United States). Dept. of Microbiology
- Fellowship for Interpretation of Genomes, Burr Ridge, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States). Mathematics and Computer Science
- Fellowship for Interpretation of Genomes, Burr Ridge, IL (United States)
- Argonne National Lab. (ANL), Argonne, IL (United States). Mathematics and Computer Science
Maintaining consistency in genome annotations is important for supporting many computational tasks, particularly metabolic modeling. The SEED project has implemented a process that improves annotation consistency across microbial genomes for proteins with conserved sequences and genomic context. In this research report, we describe this process and show how this effort has resulted in improvements to microbial genome annotations in the SEED. We also compare SEED annotation consistency with that of other commonly used resources such as IMG (the Joint Genome Institute’s Integrated Microbial Genomes system), RefSeq (the National Center for Biotechnology Information’s Reference Sequence Database), Swiss-Prot (the annotated protein sequence database of the Swiss Institute of Bioinformatics, European Molecular Biology Laboratory and the European Bioinformatics Institute) and TrEMBL (Translated European Molecular Biology Laboratory nucleotide sequence data Library). Our analysis indicates that manual and computational efforts are paying off for the databases where consistency is a major goal.
- Research Organization:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Organization:
- National Institutes of Health (NIH); USDOE Office of Science (SC), Biological and Environmental Research (BER)
- Grant/Contract Number:
- AC02-06CH11357
- OSTI ID:
- 1815515
- Journal Information:
- Journal Name: 3 Biotech; Vol. 4; Journal Issue: 3; ISSN 2190-572X
- Publisher:
- Springer
- Country of Publication:
- United States
- Language:
- English
PATtyFams: Protein Families for the Microbial Genomes in the PATRIC Database | journal | February 2016
RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes | journal | February 2015
The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST) | journal | November 2013
Similar Records
- Sentra, a database of signal transduction proteins. · Journal Article · Mon Dec 31 23:00:00 EST 2001 · Nucleic Acids Res. · OSTI ID:949424
- The DOE-JGI Standard Operating Procedure for the Annotations of the Microbial Genomes · Journal Article · Wed May 20 00:00:00 EDT 2009 · Standards in Genomic Sciences · OSTI ID:974530
- The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4) · Journal Article · Tue Feb 23 19:00:00 EST 2016 · Standards in Genomic Sciences · OSTI ID:1618964