ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores
- Georgia Inst. of Technology, Atlanta, GA (United States). School of Civil and Environmental Engineering
- Georgia Inst. of Technology, Atlanta, GA (United States). Center for Bioinformatics and Computational Genomics. School of Biological Sciences
- Georgia Inst. of Technology, Atlanta, GA (United States). School of Civil and Environmental Engineering. Center for Bioinformatics and Computational Genomics. School of Biological Sciences
Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds, resulting in an unknown number of false positive and false negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize the false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences, and thus differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, which mediate the oxidation of ammonia and the reduction of the potent greenhouse gas N2O to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes, and the abundance of bacterial amoA was quantified against the highly related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.
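The idea of position-specific thresholds can be illustrated with a minimal Python sketch. It is not ROCker's implementation. It assumes each simulated read has already been aligned to a reference protein and is summarized as a hypothetical tuple (alignment midpoint, bitscore, whether the read truly derives from the target gene), and it assumes Youden's J (true positive rate minus false positive rate) as the criterion for choosing the most discriminant point on each window's ROC curve. The function names, window size, and step are illustrative only.

```python
def best_threshold(scored_hits):
    """Pick the bitscore cutoff that maximizes Youden's J = TPR - FPR.

    scored_hits: list of (bitscore, is_target) pairs for one window.
    Returns None when the window lacks either true or false hits,
    because no ROC curve can be drawn in that case.
    """
    n_pos = sum(1 for _, is_target in scored_hits if is_target)
    n_neg = len(scored_hits) - n_pos
    if n_pos == 0 or n_neg == 0:
        return None
    best_j, best_cut = -1.0, None
    # Every observed bitscore is a candidate cutoff (hits >= cutoff are kept).
    for cut in sorted({score for score, _ in scored_hits}):
        tp = sum(1 for s, t in scored_hits if t and s >= cut)
        fp = sum(1 for s, t in scored_hits if not t and s >= cut)
        j = tp / n_pos - fp / n_neg
        if j > best_j:  # ties keep the lowest (most sensitive) cutoff
            best_j, best_cut = j, cut
    return best_cut


def window_thresholds(hits, protein_len, window=20, step=10):
    """Compute one bitscore threshold per sliding window along the protein.

    hits: list of (midpoint_position, bitscore, is_target) tuples, where
    midpoint_position is a 0-based residue index on the reference protein.
    Returns a dict mapping window start -> threshold (or None).
    """
    thresholds = {}
    for start in range(0, max(protein_len - window, 0) + 1, step):
        in_window = [(s, t) for p, s, t in hits if start <= p < start + window]
        thresholds[start] = best_threshold(in_window)
    return thresholds


# Toy data (made up): the second half of the protein mimics a domain shared
# with non-target proteins, so false hits score closer to true hits there.
toy_hits = [
    (5, 60.0, True), (8, 55.0, True), (12, 30.0, False), (15, 25.0, False),
    (25, 40.0, True), (28, 45.0, True), (27, 42.0, False), (22, 20.0, False),
]
print(window_thresholds(toy_hits, protein_len=40, window=20, step=20))
# {0: 55.0, 20: 40.0}
```

In this toy case the first window separates true from false hits cleanly, while the second cannot, which is why a single fixed cutoff for the whole protein would be a poor fit for either region.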
- Research Organization:
- Georgia Institute of Technology, Atlanta, GA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Biological and Environmental Research (BER); National Science Foundation (NSF)
- Grant/Contract Number:
- SC0006662; 1241046; 1356288
- OSTI ID:
- 1356171
- Alternate ID(s):
- OSTI ID: 1362281
- Journal Information:
- Nucleic Acids Research, Vol. 45, Issue 3; ISSN 0305-1048
- Publisher:
- Oxford University Press
- Country of Publication:
- United States
- Language:
- English
Web of Science
- Quantification of nosZ genes and transcripts in activated sludge microbiomes with novel group-specific qPCR methods validated with metagenomic analyses (journal, January 2020)
- Comparing DNA, RNA and protein levels for measuring microbial dynamics in soil microcosms amended with nitrogen fertilizer (journal, November 2019)
- Review, Evaluation, and Directions for Gene-Targeted Assembly for Ecological Analyses of Metagenomes (journal, October 2019)
- Niche differentiation among annually recurrent coastal Marine Group II Euryarchaeota (journal, August 2019)
Similar Records
Year-Round Shotgun Metagenomes Reveal Stable Microbial Communities in Agricultural Soils and Novel Ammonia Oxidizers Responding to Fertilization
MetaFunPrimer: an Environment-Specific, High-Throughput Primer Design Tool for Improved Quantification of Target Genes