OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Recovering complete and draft population genomes from metagenome datasets

Journal Article · Microbiome
[1];  [2];  [3]
  1. Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, IL (United States)
  2. Argonne National Lab. (ANL), Argonne, IL (United States)
  3. Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, IL (United States); Marine Biological Lab., Woods Hole, MA (United States)

Assembly of metagenomic sequence data into microbial genomes is of fundamental value for improving our understanding of microbial ecology and metabolism, because it elucidates the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite its computational cost when applied to deeply sequenced complex metagenomes (e.g., soil), using covarying patterns of contig coverage across multiple datasets significantly improves the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins that contain sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on genome-wide evolution.
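
The idea of binning by covarying coverage, mentioned in the abstract above, can be illustrated with a minimal sketch. It assumes that each contig has a mean read coverage in each of several samples and that contigs from the same population rise and fall together across samples. The function names (`pearson`, `bin_by_covariation`), the greedy seed-based grouping, the `min_r` threshold, and the coverage values are all hypothetical choices made for illustration. This is not the method of the article or of any published binning tool, which typically combine coverage with sequence composition and use more robust statistics.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no covariation signal
    return cov / (sx * sy)


def bin_by_covariation(coverage, min_r=0.95):
    """Greedily group contigs whose coverage covaries with a bin's seed contig.

    coverage maps contig name -> list of mean coverages, one per sample.
    min_r is an assumed correlation cutoff, not a recommended value.
    """
    bins = []  # each bin is a list of contig names; the first one is the seed
    for name in sorted(coverage):
        for members in bins:
            if pearson(coverage[members[0]], coverage[name]) >= min_r:
                members.append(name)
                break
        else:
            bins.append([name])  # no existing bin matched, start a new one
    return bins


# Hypothetical mean coverage per contig in four samples (illustrative only).
coverage = {
    "contig_a": [10.0, 40.0, 5.0, 20.0],
    "contig_b": [11.0, 42.0, 6.0, 19.0],
    "contig_c": [30.0, 5.0, 25.0, 8.0],
    "contig_d": [29.0, 6.0, 27.0, 7.0],
}
bins = bin_by_covariation(coverage)
print(bins)
```

With a single sample every profile has length one and no correlation can be computed, which is one way to see why coverage from multiple datasets adds information that a single dataset cannot provide.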

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1258642
Journal Information:
Microbiome, Vol. 4, Issue 1; ISSN 2049-2618
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 143 works
Citation information provided by
Web of Science

