U.S. Department of Energy
Office of Scientific and Technical Information

Accurate and complete genomes from metagenomes

Journal Article · Genome Research
 [1];  [2];  [3];  [4];  [5]
  1. Univ. of California, Berkeley, CA (United States). Dept. of Earth and Planetary Sciences; DOE/OSTI
  2. Univ. of California, Berkeley, CA (United States). Dept. of Earth and Planetary Sciences; Univ. of Wisconsin, Madison, WI (United States). Dept. of Bacteriology
  3. Univ. of Chicago, IL (United States). Graduate Program in Biophysical Sciences; Univ. of Chicago, IL (United States). Dept. of Medicine
  4. Univ. of Chicago, IL (United States). Dept. of Medicine; Marine Biological Laboratory, Woods Hole, MA (United States). Bay Paul Center
  5. Univ. of California, Berkeley, CA (United States). Dept. of Earth and Planetary Sciences; Univ. of California, Berkeley, CA (United States). Dept. of Environmental Science, Policy, and Management; Univ. of California, Berkeley, CA (United States). Earth and Environmental Sciences

Genomes are an integral component of the biological information about an organism; thus, the more complete the genome, the more informative it is. Historically, bacterial and archaeal genomes were reconstructed from pure (monoclonal) cultures, and the first reported sequences were manually curated to completion. However, the bottleneck imposed by the requirement for isolates precluded genomic insights for the vast majority of microbial life. Shotgun sequencing of microbial communities, referred to initially as community genomics and subsequently as genome-resolved metagenomics, can circumvent this limitation by obtaining metagenome-assembled genomes (MAGs); but gaps, local assembly errors, chimeras, and contamination by fragments from other genomes limit the value of these genomes. Here, we discuss genome curation to improve and, in some cases, achieve complete (circularized, no gaps) MAGs (CMAGs). To date, few CMAGs have been generated, although notably some are from very complex systems such as soil and sediment. Through analysis of about 7000 published complete bacterial isolate genomes, we verify the value of cumulative GC skew in combination with other metrics to establish bacterial genome sequence accuracy. The analysis of cumulative GC skew identified potential misassemblies in some reference genomes of isolated bacteria and the repeat sequences that likely gave rise to them. We discuss methods that could be implemented in bioinformatic approaches for curation to ensure that metabolic and evolutionary analyses can be based on very high-quality genomes.
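The cumulative GC skew check mentioned in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the common definition of skew, (G - C) / (G + C) per window, summed along the sequence. The function names, window size, and step size are hypothetical choices made for this example. For a complete bacterial chromosome replicated bidirectionally from a single origin, the cumulative curve is generally expected to show one minimum (near the origin) and one maximum (near the terminus); extra peaks or abrupt reversals can flag candidate misassemblies for closer inspection.

```python
def gc_skew_windows(seq, window=1000, step=1000):
    """Return (G - C) / (G + C) for successive windows along seq.

    window and step are illustrative defaults, not values from the paper.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows with no G or C contribute zero skew.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_skew(skews):
    """Running sum of the per-window skew values."""
    total = 0.0
    out = []
    for s in skews:
        total += s
        out.append(total)
    return out


def extrema(cumulative):
    """Return (index of minimum, index of maximum) of the cumulative curve."""
    lo = min(range(len(cumulative)), key=cumulative.__getitem__)
    hi = max(range(len(cumulative)), key=cumulative.__getitem__)
    return lo, hi
```

On a toy sequence whose first half is C-rich and second half is G-rich, the cumulative curve falls steadily and then rises, so the minimum lands at the switch point. On real genomes the curve is noisier, and the result should be read alongside other metrics (for example, read-mapping support) rather than on its own.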

Research Organization:
Univ. of California, Oakland, CA (United States); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1625640
Alternate ID(s):
OSTI ID: 1756327
Journal Information:
Journal Name: Genome Research; Journal Volume: 30; Journal Issue: 3; ISSN 1088-9051
Publisher:
Cold Spring Harbor Laboratory Press
Country of Publication:
United States
Language:
English

