Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance
Abstract
Motivation: Metagenomic sequencing of clinical samples is a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can resolve the serotypes of a pathogen. Sigma was developed for strain-level identification and quantification of pathogens from metagenomic data using their reference genomes.
Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. The algorithm's performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains.
Availability and Implementation: Sigma was implemented in C++, with source code and binaries freely available at http://sigma.omicsbio.org.
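The idea of quantifying genomes and assigning each read to its most likely reference genome can be illustrated with a small sketch. This is not Sigma's implementation: the record does not describe Sigma's statistical model or optimizer, so the code below assumes a plain mixture model fitted by expectation-maximization (a swapped-in, generic technique), and the function names and the toy likelihood values are hypothetical.

```python
# Illustrative sketch only; a generic mixture model, not Sigma's actual method.
# read_likelihoods[i][j] is an assumed value for P(read i | genome j), e.g.
# derived from alignment mismatches. Each read must have a nonzero likelihood
# for at least one genome.

def estimate_abundances(read_likelihoods, n_iter=500, tol=1e-12):
    """Estimate relative abundances (mixture weights) with EM."""
    n_reads = len(read_likelihoods)
    n_genomes = len(read_likelihoods[0])
    theta = [1.0 / n_genomes] * n_genomes  # start from a uniform guess
    for _ in range(n_iter):
        totals = [0.0] * n_genomes
        for row in read_likelihoods:
            # E-step: posterior probability that this read came from each genome
            weighted = [p * t for p, t in zip(row, theta)]
            norm = sum(weighted)
            for j, w in enumerate(weighted):
                totals[j] += w / norm
        # M-step: new abundance is the average posterior over all reads
        new_theta = [s / n_reads for s in totals]
        converged = max(abs(a - b) for a, b in zip(new_theta, theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta


def assign_reads(read_likelihoods, theta):
    """Assign each read to the genome with the highest posterior probability."""
    assignments = []
    for row in read_likelihoods:
        posterior = [p * t for p, t in zip(row, theta)]
        assignments.append(max(range(len(row)), key=posterior.__getitem__))
    return assignments


# Toy input: four reads, two candidate strains (made-up numbers).
reads = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.5, 0.5]]
theta = estimate_abundances(reads)
labels = assign_reads(reads, theta)
print(theta, labels)
```

In this toy case the fourth read matches both strains equally well, so its assignment is decided by the estimated abundances alone. Reads assigned to a genome could then be used for variant calling against that genome, which is the use the abstract describes.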
- Authors:
- Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division
- Publication Date:
- 2014-09-29
- Research Org.:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1185410
- Grant/Contract Number:
- DE-AC05-00OR22725
- Resource Type:
- Journal Article: Accepted Manuscript
- Journal Name:
- Bioinformatics
- Additional Journal Information:
- Journal Volume: 31; Journal Issue: 2; Journal ID: ISSN 1367-4803
- Publisher:
- Oxford University Press
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 60 APPLIED LIFE SCIENCES; 59 BASIC BIOLOGICAL SCIENCES
Citation Formats
Ahn, Tae-Hyuk, Chai, Juanjuan, and Pan, Chongle. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance. United States: N. p., 2014. Web. doi:10.1093/bioinformatics/btu641.
Ahn, Tae-Hyuk, Chai, Juanjuan, & Pan, Chongle. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance. United States. https://doi.org/10.1093/bioinformatics/btu641
Ahn, Tae-Hyuk, Chai, Juanjuan, and Pan, Chongle. 2014. "Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance". United States. https://doi.org/10.1093/bioinformatics/btu641. https://www.osti.gov/servlets/purl/1185410.
@article{osti_1185410,
title = {Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance},
author = {Ahn, Tae-Hyuk and Chai, Juanjuan and Pan, Chongle},
abstractNote = {Motivation: Metagenomic sequencing of clinical samples is a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can resolve the serotypes of a pathogen. Sigma was developed for strain-level identification and quantification of pathogens from metagenomic data using their reference genomes. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. The algorithm's performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++, with source code and binaries freely available at http://sigma.omicsbio.org.},
doi = {10.1093/bioinformatics/btu641},
url = {https://www.osti.gov/biblio/1185410},
journal = {Bioinformatics},
issn = {1367-4803},
number = 2,
volume = 31,
place = {United States},
year = {2014},
month = {9}
}
Works referenced in this record:
Genomic Comparison of Escherichia coli O104:H4 Isolates from 2009 and 2011 Reveals Plasmid, and Prophage Heterogeneity, Including Shiga Toxin Encoding Phage stx2
journal, November 2012
- Ahmed, Sanaa A.; Awosika, Joy; Baldwin, Carson
- PLoS ONE, Vol. 7, Issue 11
PhymmBL expanded: confidence scores, custom databases, parallelization and more
journal, April 2011
- Brady, Arthur; Salzberg, Steven
- Nature Methods, Vol. 8, Issue 5
TACOA – Taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach
journal, February 2009
- Diaz, Naryttza N.; Krause, Lutz; Goesmann, Alexander
- BMC Bioinformatics, Vol. 10, Issue 1
Biosurveillance plan unveiled
journal, November 2012
- Fox, Jeffrey L.
- Nature Biotechnology, Vol. 30, Issue 11
Pathoscope: Species identification and strain attribution with unassembled sequencing data
journal, July 2013
- Francis, O. E.; Bendall, M.; Manimaran, S.
- Genome Research, Vol. 23, Issue 10
DNA–DNA hybridization values and their relationship to whole-genome sequence similarities
journal, January 2007
- Klappenbach, Joel A.; Goris, Johan; Vandamme, Peter
- International Journal of Systematic and Evolutionary Microbiology, Vol. 57, Issue 1
MEGAN analysis of metagenomic data
journal, February 2007
- Huson, D. H.; Auch, A. F.; Qi, J.
- Genome Research, Vol. 17, Issue 3
Fast gapped-read alignment with Bowtie 2
journal, March 2012
- Langmead, Ben; Salzberg, Steven L.
- Nature Methods, Vol. 9, Issue 4
The Sequence Alignment/Map format and SAMtools
journal, June 2009
- Li, H.; Handsaker, B.; Wysoker, A.
- Bioinformatics, Vol. 25, Issue 16
Metagenomic abundance estimation and diagnostic testing on species level
journal, August 2012
- Lindner, Martin S.; Renard, Bernhard Y.
- Nucleic Acids Research, Vol. 41, Issue 1
Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences
journal, January 2011
- Liu, Bo; Gibbons, Theodore; Ghodsi, Mohammad
- BMC Genomics, Vol. 12, Issue Suppl 2
Performance comparison of benchtop high-throughput sequencing platforms
journal, April 2012
- Loman, Nicholas J.; Misra, Raju V.; Dallman, Timothy J.
- Nature Biotechnology, Vol. 30, Issue 5
SOrt-ITEMS: Sequence orthology based approach for improved taxonomic estimation of metagenomic sequences
journal, May 2009
- Monzoorul Haque, M.; Ghosh, Tarini Shankar; Komanduri, Dinakar
- Bioinformatics, Vol. 25, Issue 14
Taxonomic metagenome sequence assignment with structured output models
journal, February 2011
- Patil, Kaustubh R.; Haider, Peter; Pope, Phillip B.
- Nature Methods, Vol. 8, Issue 3
NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy
journal, November 2011
- Pruitt, K. D.; Tatusova, T.; Brown, G. R.
- Nucleic Acids Research, Vol. 40, Issue D1
MetaSim—A Sequencing Simulator for Genomics and Metagenomics
journal, October 2008
- Richter, Daniel C.; Ott, Felix; Auch, Alexander F.
- PLoS ONE, Vol. 3, Issue 10
Metagenome Fragment Classification Using N-Mer Frequency Profiles
journal, January 2008
- Rosen, Gail; Garbarine, Elaine; Caseiro, Diamantino
- Advances in Bioinformatics, Vol. 2008
Escherichia coli (STEC) serotype O104 outbreak causing haemolytic syndrome (HUS) in Germany and France
journal, July 2011
- Rubino, Salvatore; Cappuccinelli, Piero; Kelvin, David J.
- The Journal of Infection in Developing Countries, Vol. 5, Issue 06
Metagenomic microbial community profiling using unique clade-specific marker genes
journal, June 2012
- Segata, Nicola; Waldron, Levi; Ballarini, Annalisa
- Nature Methods, Vol. 9, Issue 8
On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming
journal, April 2005
- Wächter, Andreas; Biegler, Lorenz T.
- Mathematical Programming, Vol. 106, Issue 1
Phylogenomic analysis of bacterial and archaeal sequences with AMPHORA2
journal, February 2012
- Wu, Martin; Scott, Alexandra J.
- Bioinformatics, Vol. 28, Issue 7
Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads
journal, December 2011
- Xia, Li C.; Cram, Jacob A.; Chen, Ting
- PLoS ONE, Vol. 6, Issue 12
Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences
journal, September 2011
- Liu, Bo; Gibbons, Theodore; Ghodsi, Mohammad
- Genome Biology, Vol. 12, Issue S1
The human microbiome: there is much left to do
journal, June 2022
- Ley, Ruth
- Nature, Vol. 606, Issue 7914
Read and assembly metrics inconsequential for clinical utility of whole-genome sequencing in mapping outbreaks
journal, July 2013
- Harris, Simon R.; Török, M. E.; Cartwright, Edward J. P.
- Nature Biotechnology, Vol. 31, Issue 7
Works referencing / citing this record:
ConStrains identifies microbial strains in metagenomic datasets
journal, September 2015
- Luo, Chengwei; Knight, Rob; Siljander, Heli
- Nature Biotechnology, Vol. 33, Issue 10
Strain profiling and epidemiology of bacterial species from metagenomic sequencing
journal, December 2017
- Albanese, Davide; Donati, Claudio
- Nature Communications, Vol. 8, Issue 1
Widespread RNA editing dysregulation in brains from autistic individuals
journal, December 2018
- Tran, Stephen S.; Jun, Hyun-Ik; Bahn, Jae Hoon
- Nature Neuroscience, Vol. 22, Issue 1
Regulation of RNA editing by RNA-binding proteins in human cells
journal, January 2019
- Quinones-Valdez, Giovanni; Tran, Stephen S.; Jun, Hyun-Ik
- Communications Biology, Vol. 2, Issue 1
MetaMLST: multi-locus strain-level bacterial typing from metagenomic samples
journal, September 2016
- Zolfo, Moreno; Tett, Adrian; Jousson, Olivier
- Nucleic Acids Research, Vol. 45, Issue 2
QuantTB – A method to classify mixed Mycobacterium tuberculosis infections within whole genome sequencing data
posted_content, June 2019
- Anyansi, Christine; Keo, Arlin; Walker, Bruce
- BMC Genomics
Genomic Microdiversity of Bifidobacterium pseudocatenulatum Underlying Differential Strain-Level Responses to Dietary Carbohydrate Intervention
journal, February 2017
- Wu, Guojun; Zhang, Chenhong; Wu, Huan
- mBio, Vol. 8, Issue 1
Cluster oligonucleotide signatures for rapid identification by sequencing
journal, October 2018
- Zahariev, Manuel; Chen, Wen; Visagie, Cobus M.
- BMC Bioinformatics, Vol. 19, Issue 1
QuantTB – a method to classify mixed Mycobacterium tuberculosis infections within whole genome sequencing data
journal, January 2020
- Anyansi, Christine; Keo, Arlin; Walker, Bruce J.
- BMC Genomics, Vol. 21, Issue 1
Experimental design and quantitative analysis of microbial community multiomics
journal, November 2017
- Mallick, Himel; Ma, Siyuan; Franzosa, Eric A.
- Genome Biology, Vol. 18, Issue 1
Massive metagenomic data analysis using abundance-based machine learning
journal, August 2019
- Harris, Zachary N.; Dhungel, Eliza; Mosior, Matthew
- Biology Direct, Vol. 14, Issue 1
Multi-scale characterization of symbiont diversity in the pea aphid complex through metagenomic approaches
journal, October 2018
- Guyomar, Cervin; Legeai, Fabrice; Jousselin, Emmanuelle
- Microbiome, Vol. 6, Issue 1
Comprehensive analysis of chromosomal mobile genetic elements in the gut microbiome reveals phylum-level niche-adaptive gene pools
journal, December 2019
- Jiang, Xiaofang; Hall, Andrew Brantley; Xavier, Ramnik J.
- PLOS ONE, Vol. 14, Issue 12
Beyond 16S rRNA Community Profiling: Intra-Species Diversity in the Gut Microbiota
journal, September 2016
- Ellegaard, Kirsten M.; Engel, Philipp
- Frontiers in Microbiology, Vol. 7
Metagenomics: The Next Culture-Independent Game Changer
journal, July 2017
- Forbes, Jessica D.; Knox, Natalie C.; Ronholm, Jennifer
- Frontiers in Microbiology, Vol. 8
Gaining comprehensive biological insight into the transcriptome by performing a broad-spectrum RNA-seq analysis
journal, July 2017
- Sahraeian, Sayed Mohammad Ebrahim; Mohiyuddin, Marghoob; Sebra, Robert
- Nature Communications, Vol. 8, Issue 1
PAIPline: pathogen identification in metagenomic and clinical next generation sequencing samples
text, January 2018
- Andrusch, Andreas; Dabrowski, Piotr Wojtek; Klenner, Jeanette
- Robert Koch-Institut
Tracking Strains in the Microbiome: Insights from Metagenomics and Models
journal, May 2016
- Brito, Ilana L.; Alm, Eric J.
- Frontiers in Microbiology, Vol. 7
StrainSeeker: fast identification of bacterial strains from raw sequencing reads using user-provided guide trees
journal, January 2017
- Roosaare, Märt; Vaher, Mihkel; Kaplinski, Lauris
- PeerJ, Vol. 5
imGLAD: accurate detection and quantification of target organisms in metagenomes
journal, November 2018
- Castro, Juan C.; Rodriguez-R, Luis M.; Harvey, William T.
- PeerJ, Vol. 6