DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4)

Abstract

© 2016 Huntemann et al. The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.

Authors:
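Huntemann, Marcel; Ivanova, Natalia N.; Mavromatis, Konstantinos; Tripp, H. James; Paez-Espino, David; Palaniappan, Krishnaveni; Szeto, Ernest; Pillay, Manoj; Chen, I-Min A.; Pati, Amrita; Nielsen, Torben; Markowitz, Victor M.; Kyrpides, Nikos C.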
Publication Date:
2015-10-26
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
OSTI Identifier:
1618953
Alternate Identifier(s):
OSTI ID: 1256940; OSTI ID: 1378607
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Published Article
Journal Name:
Standards in Genomic Sciences
Additional Journal Information:
Journal Name: Standards in Genomic Sciences; Journal Volume: 10; Journal Issue: 1; Journal ID: ISSN 1944-3277
Publisher:
Springer Science + Business Media
Country of Publication:
United Kingdom
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Microbial Genome Annotation; SOP; IMG; JGI

Citation Formats

Huntemann, Marcel, Ivanova, Natalia N., Mavromatis, Konstantinos, Tripp, H. James, Paez-Espino, David, Palaniappan, Krishnaveni, Szeto, Ernest, Pillay, Manoj, Chen, I-Min A., Pati, Amrita, Nielsen, Torben, Markowitz, Victor M., and Kyrpides, Nikos C. The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4). United Kingdom: N. p., 2015. Web. doi:10.1186/s40793-015-0077-y.
Huntemann, Marcel, Ivanova, Natalia N., Mavromatis, Konstantinos, Tripp, H. James, Paez-Espino, David, Palaniappan, Krishnaveni, Szeto, Ernest, Pillay, Manoj, Chen, I-Min A., Pati, Amrita, Nielsen, Torben, Markowitz, Victor M., & Kyrpides, Nikos C. (2015). The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4). United Kingdom. https://doi.org/10.1186/s40793-015-0077-y
Huntemann, Marcel, Ivanova, Natalia N., Mavromatis, Konstantinos, Tripp, H. James, Paez-Espino, David, Palaniappan, Krishnaveni, Szeto, Ernest, Pillay, Manoj, Chen, I-Min A., Pati, Amrita, Nielsen, Torben, Markowitz, Victor M., and Kyrpides, Nikos C. 2015. "The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4)". United Kingdom. https://doi.org/10.1186/s40793-015-0077-y.
@article{osti_1618953,
title = {The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4)},
author = {Huntemann, Marcel and Ivanova, Natalia N. and Mavromatis, Konstantinos and Tripp, H. James and Paez-Espino, David and Palaniappan, Krishnaveni and Szeto, Ernest and Pillay, Manoj and Chen, I-Min A. and Pati, Amrita and Nielsen, Torben and Markowitz, Victor M. and Kyrpides, Nikos C.},
abstractNote = {© 2016 Huntemann et al. The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.},
doi = {10.1186/s40793-015-0077-y},
journal = {Standards in Genomic Sciences},
number = 1,
volume = 10,
place = {United Kingdom},
year = {2015},
month = {10}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1186/s40793-015-0077-y

Citation Metrics:
Cited by: 103 works
Citation information provided by
Web of Science
