DOE PAGES · U.S. Department of Energy
Office of Scientific and Technical Information

Title: MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies

Journal Article · PeerJ
DOI: https://doi.org/10.7717/peerj.7359 · OSTI ID:1559796
 [1]; [2]; [1]; [1]; [1]; [2]; [3]
  1. USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
  2. Univ. of Science and Technology of China, Hefei (China)
  3. USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California at Merced, Merced, CA (United States)

We previously reported on MetaBAT, an automated metagenome binning software tool that reconstructs single genomes from microbial communities for subsequent analyses of uncultivated microbial species. MetaBAT has become one of the most popular binning tools, largely because of its computational efficiency and ease of use, especially in binning experiments with a large number of samples and a large assembly. However, MetaBAT requires users to choose parameters to fine-tune its sensitivity and specificity. If those parameters are not chosen properly, binning accuracy can suffer, especially on assemblies of poor quality. Here, we developed MetaBAT 2 to overcome this problem. MetaBAT 2 uses a new adaptive binning algorithm to eliminate manual parameter tuning. We also performed extensive software engineering optimization to increase both computational and memory efficiency. In comparisons with alternative software tools on over 100 real-world metagenome assemblies, MetaBAT 2 showed superior accuracy and computing speed. Binning a typical metagenome assembly takes only a few minutes on a single commodity workstation. We therefore recommend that the community adopt MetaBAT 2 for metagenome binning experiments. MetaBAT 2 is open-source software and is available at https://bitbucket.org/berkeleylab/metabat.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1559796
Journal Information:
PeerJ, Vol. 7; ISSN 2167-8359
Publisher:
PeerJ Inc.
Country of Publication:
United States
Language:
English

