OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores

Journal Article · Nucleic Acids Research
DOI: https://doi.org/10.1093/nar/gkw900 · OSTI ID: 1356171
ORCiD logo [1]; ORCiD logo [2]; ORCiD logo [3]
  1. Georgia Inst. of Technology, Atlanta, GA (United States). School of Civil and Environmental Engineering
  2. Georgia Inst. of Technology, Atlanta, GA (United States). Center for Bioinformatics and Computational Genomics. School of Biological Sciences
  3. Georgia Inst. of Technology, Atlanta, GA (United States). School of Civil and Environmental Engineering. Center for Bioinformatics and Computational Genomics. School of Biological Sciences

Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds, resulting in an unknown number of false positive and false negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize the false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences, and thus differs from HMM profiles and related methods. We showcase ROCker using the ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, which mediate the oxidation of ammonia and the reduction of the potent greenhouse gas N2O to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes, and the abundance of bacterial amoA was quantified against the highly related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.
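The idea of position-specific thresholds described in the abstract can be illustrated with a minimal sketch. This is not ROCker's implementation: the hit representation, the function names `best_threshold` and `window_thresholds`, the window and step sizes, and the use of Youden's J statistic to pick the "most discriminant" point on the ROC curve are all assumptions made for illustration. The sketch takes simulated read hits whose true origin is known, groups them by position along the reference protein, and chooses one bitscore cutoff per window.

```python
from typing import List, Optional, Tuple

# Hypothetical hit record: (alignment midpoint on the reference protein,
# bitscore, True if the simulated read really came from the target gene).
Hit = Tuple[int, float, bool]


def best_threshold(hits: List[Hit]) -> Optional[float]:
    """Pick the bitscore cutoff that maximizes Youden's J (TPR - FPR).

    A hit is called positive when its bitscore is >= the cutoff.
    Returns None when the window lacks either positives or negatives,
    because no ROC curve can be drawn in that case.
    """
    positives = sum(1 for _, _, is_target in hits if is_target)
    negatives = len(hits) - positives
    if positives == 0 or negatives == 0:
        return None

    best_j, best_cut = -1.0, None
    # Every observed bitscore is a candidate cutoff (one ROC point each).
    for cut in sorted({score for _, score, _ in hits}):
        tp = sum(1 for _, s, t in hits if t and s >= cut)
        fp = sum(1 for _, s, t in hits if not t and s >= cut)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut


def window_thresholds(
    hits: List[Hit], protein_len: int, window: int = 20, step: int = 10
) -> List[Tuple[int, Optional[float]]]:
    """Slide a window along the protein and compute one cutoff per window."""
    result = []
    for start in range(0, protein_len, step):
        in_window = [h for h in hits if start <= h[0] < start + window]
        result.append((start, best_threshold(in_window)))
    return result


# Toy data (invented for illustration, not from the paper).
toy_hits: List[Hit] = [
    (5, 80.0, True), (6, 75.0, True), (7, 40.0, False), (8, 30.0, False),
    (25, 60.0, True), (26, 55.0, False), (27, 58.0, True),
]
print(window_thresholds(toy_hits, protein_len=40, window=20, step=20))
```

In this toy example the first window separates targets from non-targets at a higher bitscore than the second, which is the behavior a single fixed e-value or bitscore cutoff cannot capture.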

Research Organization:
Georgia Institute of Technology, Atlanta, GA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER); National Science Foundation (NSF)
Grant/Contract Number:
SC0006662; 1241046; 1356288
OSTI ID:
1356171
Alternate ID(s):
OSTI ID: 1362281
Journal Information:
Nucleic Acids Research, Vol. 45, Issue 3; ISSN 0305-1048
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 31 works
Citation information provided by
Web of Science
