Title: Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

Abstract

Motivation: Metagenomic sequencing of clinical samples is a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can resolve the serotypes of a pathogen. Sigma was developed to identify and quantify pathogens at the strain level from metagenomic data using their reference genomes.

Results: Sigma provides accurate strain-level inferences along with three distinctive capabilities: (i) it quantifies the statistical uncertainty of its inferences through hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) it enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) it supports parallel computing for fast analysis of large datasets. The algorithm's performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains.

Availability and Implementation: Sigma was implemented in C++, with source code and binaries freely available at http://sigma.omicsbio.org.
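
The abstract mentions estimating relative abundances of reference genomes and assigning reads to their most likely genome, but this record does not describe how Sigma does either. As an illustration only, the sketch below fits a simple mixture model with expectation-maximization, which is one common way to approach this kind of problem and is an assumption here, not Sigma's documented method. The function names, the toy likelihood table, and the iteration count are all hypothetical.

```python
def estimate_abundances(read_likelihoods, n_iter=200):
    """Estimate relative genome abundances with a basic EM loop.

    read_likelihoods[i][j] is an assumed P(read i | genome j).
    This is an illustrative mixture model, not Sigma's implementation.
    """
    n_genomes = len(read_likelihoods[0])
    weights = [1.0 / n_genomes] * n_genomes  # start from uniform abundances
    for _ in range(n_iter):
        totals = [0.0] * n_genomes
        for row in read_likelihoods:
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # read matches no genome under current weights
            # E-step: share the read among genomes by posterior probability
            for j, value in enumerate(joint):
                totals[j] += value / norm
        total = sum(totals)
        if total == 0.0:
            raise ValueError("no read matches any reference genome")
        # M-step: abundances are the normalized expected read counts
        weights = [t / total for t in totals]
    return weights


def assign_reads(read_likelihoods, weights):
    """Assign each read to the genome with the highest posterior weight."""
    assignments = []
    for row in read_likelihoods:
        joint = [w * p for w, p in zip(weights, row)]
        assignments.append(max(range(len(joint)), key=joint.__getitem__))
    return assignments


# Toy data (hypothetical): three reads unique to genome 0, one read unique
# to genome 1, and two reads that match both genomes equally well.
reads = [
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.5, 0.5],
]
weights = estimate_abundances(reads)
assignments = assign_reads(reads, weights)
```

On this toy input the abundances converge to 0.75 and 0.25, because the two ambiguous reads are shared in proportion to the current abundance estimates, and both ambiguous reads are then assigned to genome 0. Uncertainty quantification, which the abstract lists as a separate capability, is not attempted in this sketch.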

Authors:
Ahn, Tae-Hyuk [1]; Chai, Juanjuan [1]; Pan, Chongle [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division
Publication Date:
2014-09-29
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1185410
Grant/Contract Number:  
DE-AC05-00OR22725
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Bioinformatics
Additional Journal Information:
Journal Volume: 31; Journal Issue: 2; Journal ID: ISSN 1367-4803
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
Subject:
60 APPLIED LIFE SCIENCES; 59 BASIC BIOLOGICAL SCIENCES

Citation Formats

Ahn, Tae-Hyuk, Chai, Juanjuan, and Pan, Chongle. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance. United States: N. p., 2014. Web. doi:10.1093/bioinformatics/btu641.
Ahn, Tae-Hyuk, Chai, Juanjuan, & Pan, Chongle. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance. United States. https://doi.org/10.1093/bioinformatics/btu641
Ahn, Tae-Hyuk, Chai, Juanjuan, and Pan, Chongle. 2014. "Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance". United States. https://doi.org/10.1093/bioinformatics/btu641. https://www.osti.gov/servlets/purl/1185410.
@article{osti_1185410,
title = {Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance},
author = {Ahn, Tae-Hyuk and Chai, Juanjuan and Pan, Chongle},
abstractNote = {Motivation: Metagenomic sequencing of clinical samples is a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can resolve the serotypes of a pathogen. Sigma was developed to identify and quantify pathogens at the strain level from metagenomic data using their reference genomes. Results: Sigma provides accurate strain-level inferences along with three distinctive capabilities: (i) it quantifies the statistical uncertainty of its inferences through hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) it enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) it supports parallel computing for fast analysis of large datasets. The algorithm's performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++, with source code and binaries freely available at http://sigma.omicsbio.org.},
doi = {10.1093/bioinformatics/btu641},
url = {https://www.osti.gov/biblio/1185410},
journal = {Bioinformatics},
issn = {1367-4803},
number = 2,
volume = 31,
place = {United States},
year = {2014},
month = {9}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 59 works
Citation information provided by
Web of Science

Works referenced in this record:

PhymmBL expanded: confidence scores, custom databases, parallelization and more
journal, April 2011


TACOA – Taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach
journal, February 2009


Biosurveillance plan unveiled
journal, November 2012


Pathoscope: Species identification and strain attribution with unassembled sequencing data
journal, July 2013


DNA–DNA hybridization values and their relationship to whole-genome sequence similarities
journal, January 2007


MEGAN analysis of metagenomic data
journal, February 2007


Fast gapped-read alignment with Bowtie 2
journal, March 2012


The Sequence Alignment/Map format and SAMtools
journal, June 2009


Metagenomic abundance estimation and diagnostic testing on species level
journal, August 2012


Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences
journal, January 2011


Performance comparison of benchtop high-throughput sequencing platforms
journal, April 2012


SOrt-ITEMS: Sequence orthology based approach for improved taxonomic estimation of metagenomic sequences
journal, May 2009


Taxonomic metagenome sequence assignment with structured output models
journal, February 2011


NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy
journal, November 2011


MetaSim—A Sequencing Simulator for Genomics and Metagenomics
journal, October 2008


Metagenome Fragment Classification Using N-Mer Frequency Profiles
journal, January 2008


Escherichia coli (STEC) serotype O104 outbreak causing haemolytic syndrome (HUS) in Germany and France
journal, July 2011


Metagenomic microbial community profiling using unique clade-specific marker genes
journal, June 2012


On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming
journal, April 2005


Phylogenomic analysis of bacterial and archaeal sequences with AMPHORA2
journal, February 2012


Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads
journal, December 2011


Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences
journal, September 2011


The human microbiome: there is much left to do
journal, June 2022


Read and assembly metrics inconsequential for clinical utility of whole-genome sequencing in mapping outbreaks
journal, July 2013


Works referencing / citing this record:

ConStrains identifies microbial strains in metagenomic datasets
journal, September 2015


Strain profiling and epidemiology of bacterial species from metagenomic sequencing
journal, December 2017


Widespread RNA editing dysregulation in brains from autistic individuals
journal, December 2018


Regulation of RNA editing by RNA-binding proteins in human cells
journal, January 2019


MetaMLST: multi-locus strain-level bacterial typing from metagenomic samples
journal, September 2016


Cluster oligonucleotide signatures for rapid identification by sequencing
journal, October 2018


QuantTB – a method to classify mixed Mycobacterium tuberculosis infections within whole genome sequencing data
journal, January 2020


Experimental design and quantitative analysis of microbial community multiomics
journal, November 2017


Massive metagenomic data analysis using abundance-based machine learning
journal, August 2019


Multi-scale characterization of symbiont diversity in the pea aphid complex through metagenomic approaches
journal, October 2018


Beyond 16S rRNA Community Profiling: Intra-Species Diversity in the Gut Microbiota
journal, September 2016


Metagenomics: The Next Culture-Independent Game Changer
journal, July 2017


Gaining comprehensive biological insight into the transcriptome by performing a broad-spectrum RNA-seq analysis
journal, July 2017


PAIPline: pathogen identification in metagenomic and clinical next generation sequencing samples
text, January 2018


Tracking Strains in the Microbiome: Insights from Metagenomics and Models
journal, May 2016


imGLAD: accurate detection and quantification of target organisms in metagenomes
journal, November 2018