GenomePeek—an online tool for prokaryotic genome and metagenome analysis
Abstract
As prokaryotic sequencing increases, a method is needed to analyze these data quickly and accurately. Previous tools are designed mainly for metagenomic analysis and have limitations, such as long runtimes and significant false-positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single-genome and metagenome sequencing files quickly and with low error rates. GenomePeek uses a sequence assembly approach in which reads matching a set of conserved genes are extracted, assembled, and then aligned against a highly specific reference database. GenomePeek was found to be faster than traditional approaches while keeping error rates low, and it offers unique data visualization options.
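The extract, assemble, and align steps named in the abstract can be illustrated with a toy sketch. This is not GenomePeek's implementation: shared k-mers stand in for read-to-gene matching and for alignment, a greedy suffix-prefix merge stands in for a real assembler, and every name, sequence, and parameter below (`extract_reads`, `greedy_assemble`, `classify`, `taxon_A`, `k=8`, `min_len=5`) is a hypothetical chosen for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def extract_reads(reads, marker, k=8):
    """Keep reads that share at least one k-mer with the marker gene."""
    marker_kmers = kmers(marker, k)
    return [r for r in reads if kmers(r, k) & marker_kmers]


def overlap(a, b, min_len):
    """Longest suffix of a equal to a prefix of b, if at least min_len long; else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=5):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


def classify(contig, references, k=8):
    """Return the reference name sharing the most k-mers with contig, or None."""
    contig_kmers = kmers(contig, k)
    scores = {name: len(contig_kmers & kmers(seq, k))
              for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical toy data: two made-up "marker gene" sequences and four reads.
references = {
    "taxon_A": "ATGGCTAGCTTACGGATCCGTAAGCTTGCA",
    "taxon_B": "ATGCCGTTAGACCTGAATCGGTACCATGAA",
}
marker = references["taxon_A"]
reads = [marker[0:16], marker[8:24], marker[16:30], "GGGGGGGGCCCCCCCCAAAA"]

extracted = extract_reads(reads, marker)   # the unrelated read is dropped
contigs = greedy_assemble(extracted)       # overlapping reads are merged
labels = [classify(c, references) for c in contigs]
```

Assembling before comparing means each comparison against the reference set uses a longer sequence than a single read, which is the intuition behind aligning contigs rather than raw reads.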
- Authors:
- McNair, Katelyn; Edwards, Robert A.
- San Diego State University, San Diego, CA (United States). Department of Computer Science; San Diego State University, San Diego, CA (United States). Department of Biology
- San Diego State University, San Diego, CA (United States). Department of Computer Science; San Diego State University, San Diego, CA (United States). Department of Biology; San Diego State University, San Diego, CA (United States). Computational Sciences Research Center; Argonne National Lab. (ANL), Argonne, IL (United States)
- Publication Date:
- 2015-06-16
- Research Org.:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- USDOE
- Contributing Org.:
- National Science Foundation (NSF), Washington, DC (United States)
- OSTI Identifier:
- 1221899
- Grant/Contract Number:
- AC02-06CH11357
- Resource Type:
- Accepted Manuscript
- Journal Name:
- PeerJ
- Additional Journal Information:
- Journal Volume: 3; Journal ID: ISSN 2167-8359
- Publisher:
- PeerJ Inc.
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; Genome; Metagenome; Taxonomic; Bacteria; Sequencing; Population; Distribution; Archaea; Abundance
Citation Formats
McNair, Katelyn, and Edwards, Robert A. GenomePeek—an online tool for prokaryotic genome and metagenome analysis. United States: N. p., 2015. Web. doi:10.7717/peerj.1025.
McNair, Katelyn, & Edwards, Robert A. (2015). GenomePeek—an online tool for prokaryotic genome and metagenome analysis. United States. https://doi.org/10.7717/peerj.1025
McNair, Katelyn, and Edwards, Robert A. 2015. "GenomePeek—an online tool for prokaryotic genome and metagenome analysis". United States. https://doi.org/10.7717/peerj.1025. https://www.osti.gov/servlets/purl/1221899.
@article{osti_1221899,
title = {GenomePeek—an online tool for prokaryotic genome and metagenome analysis},
author = {McNair, Katelyn and Edwards, Robert A.},
abstractNote = {As prokaryotic sequencing increases, a method is needed to analyze these data quickly and accurately. Previous tools are designed mainly for metagenomic analysis and have limitations, such as long runtimes and significant false-positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single-genome and metagenome sequencing files quickly and with low error rates. GenomePeek uses a sequence assembly approach in which reads matching a set of conserved genes are extracted, assembled, and then aligned against a highly specific reference database. GenomePeek was found to be faster than traditional approaches while keeping error rates low, and it offers unique data visualization options.},
doi = {10.7717/peerj.1025},
journal = {PeerJ},
volume = 3,
place = {United States},
year = {2015},
month = jun
}
Works referencing / citing this record:
Diel population and functional synchrony of microbial communities on coral reefs
journal, April 2019
- Kelly, Linda Wegley; Nelson, Craig E.; Haas, Andreas F.
- Nature Communications, Vol. 10, Issue 1
Correcting for 16S rRNA gene copy numbers in microbiome surveys remains an unsolved problem
journal, February 2018
- Louca, Stilianos; Doebeli, Michael; Parfrey, Laura Wegener
- Microbiome, Vol. 6, Issue 1
Intermediate-Salinity Systems at High Altitudes in the Peruvian Andes Unveil a High Diversity and Abundance of Bacteria and Viruses
journal, November 2019
- Castelán-Sánchez, Hugo Gildardo; Elorrieta, Paola; Romoacca, Pedro
- Genes, Vol. 10, Issue 11
Correcting for 16S rRNA gene copy numbers in microbiome surveys remains an unsolved problem
text, January 2018
- Louca, Stilianos; Doebeli, Michael; Parfrey, Laura W.
- BioMed Central