DOE Data Explorer, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Impact of BBDuk metagenomic read trimming and decontamination

Abstract

Background: Investigators using metagenomic sequencing to study their microbiomes are often provided data that have already been trimmed and decontaminated, or they perform these steps themselves, without knowing the effect these procedures can have on downstream analyses. Here we evaluated the impact that JGI trimming and decontamination procedures had on assembly and binning metrics, placement of metagenome-assembled genomes (MAGs) into species trees, and functional profiles of MAGs extracted from twenty-three complex rhizosphere metagenomes. We also investigated how more aggressive trimming impacts these binning metrics.

Results: We found that JGI trimming and decontamination of input reads had some significant impacts on assembly and binning metrics compared to raw reads, and that differences in placement of MAGs in species trees increased with decreasing completeness and contamination thresholds. Trimming more aggressive than that used by JGI was found to reduce MAG counts.

Conclusions: Mild trimming and decontamination of metagenomic reads prior to assembly can change an investigator’s answer to the questions, “Who is there and what are they doing?” However, for those who elect to trim and decontaminate, mild trimming and decontamination of metagenomic reads with high quality scores is recommended.

Authors:
Whitham, Jason [1]
  1. North Carolina State Univ., Raleigh, NC (United States)
Publication Date:
2021-01-01
Research Org.:
North Carolina State University, Raleigh, NC (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Subject:
59 BASIC BIOLOGICAL SCIENCES
Keywords:
metagenomics, decontamination, assembly, binning, phylogenomics, functional analysis
OSTI Identifier:
1779218
DOI:
https://doi.org/10.25982/77705.1341/1779218

Citation Formats

Whitham, Jason. Impact of BBDuk metagenomic read trimming and decontamination. United States: N. p., 2021. Web. doi:10.25982/77705.1341/1779218.
Whitham, Jason. Impact of BBDuk metagenomic read trimming and decontamination. United States. doi:https://doi.org/10.25982/77705.1341/1779218
Whitham, Jason. 2021. "Impact of BBDuk metagenomic read trimming and decontamination". United States. doi:https://doi.org/10.25982/77705.1341/1779218. https://www.osti.gov/servlets/purl/1779218. Pub date: Fri Jan 01 00:00:00 EST 2021
@article{osti_1779218,
title = {Impact of BBDuk metagenomic read trimming and decontamination},
author = {Whitham, Jason},
abstractNote = {Background: Investigators using metagenomic sequencing to study their microbiomes are often provided data that have already been trimmed and decontaminated, or they perform these steps themselves, without knowing the effect these procedures can have on downstream analyses. Here we evaluated the impact that JGI trimming and decontamination procedures had on assembly and binning metrics, placement of metagenome-assembled genomes (MAGs) into species trees, and functional profiles of MAGs extracted from twenty-three complex rhizosphere metagenomes. We also investigated how more aggressive trimming impacts these binning metrics. Results: We found that JGI trimming and decontamination of input reads had some significant impacts on assembly and binning metrics compared to raw reads, and that differences in placement of MAGs in species trees increased with decreasing completeness and contamination thresholds. Trimming more aggressive than that used by JGI was found to reduce MAG counts. Conclusions: Mild trimming and decontamination of metagenomic reads prior to assembly can change an investigator’s answer to the questions, “Who is there and what are they doing?” However, for those who elect to trim and decontaminate, mild trimming and decontamination of metagenomic reads with high quality scores is recommended.},
doi = {10.25982/77705.1341/1779218},
place = {United States},
year = {2021},
month = {1}
}
