DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A genomic catalog of Earth’s microbiomes

Abstract

The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth’s continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.

Authors:
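IMG/M Data Consortium; Nayfach, Stephen; Roux, Simon; Seshadri, Rekha; Udwary, Daniel; Varghese, Neha; Schulz, Frederik; Wu, Dongying; Paez-Espino, David; Chen, I-Min; Huntemann, Marcel; Palaniappan, Krishna; Ladau, Joshua; Mukherjee, Supratim; Reddy, T. B. K.; Nielsen, Torben; Kirton, Edward; Faria, José P.; Edirisinghe, Janaka N.; Henry, Christopher S.; Jungbluth, Sean P.; Chivian, Dylan; Dehal, Paramvir; Wood-Charlson, Elisha M.; Arkin, Adam P.; Tringe, Susannah G.; Visel, Axel; Woyke, Tanja; Mouncey, Nigel J.; Ivanova, Natalia N.; Kyrpides, Nikos C.; Eloe-Fadrosh, Emiley A.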
Publication Date:
November 2020
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Argonne National Lab. (ANL), Argonne, IL (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE Advanced Research Projects Agency - Energy (ARPA-E); USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities Division
Contributing Org.:
IMG/M Data Consortium
OSTI Identifier:
1735390
Alternate Identifier(s):
OSTI ID: 1764548; OSTI ID: 1777128; OSTI ID: 1809934; OSTI ID: 1813713
Report Number(s):
LLNL-JRNL-814365
Journal ID: ISSN 1087-0156; PII: 718
Grant/Contract Number:  
AC02-05CH11231; AC02-06CH11357; AC05-00OR22725; AC02-98CH10886; AC52-07NA27344
Resource Type:
Published Article
Journal Name:
Nature Biotechnology
Additional Journal Information:
Journal Name: Nature Biotechnology; Journal ID: ISSN 1087-0156
Publisher:
Springer Nature
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Computational biology and bioinformatics; Microbiology; Biological and medical sciences, Environmental sciences

Citation Formats

IMG/M Data Consortium, Nayfach, Stephen, Roux, Simon, Seshadri, Rekha, Udwary, Daniel, Varghese, Neha, Schulz, Frederik, Wu, Dongying, Paez-Espino, David, Chen, I-Min, Huntemann, Marcel, Palaniappan, Krishna, Ladau, Joshua, Mukherjee, Supratim, Reddy, T. B. K., Nielsen, Torben, Kirton, Edward, Faria, José P., Edirisinghe, Janaka N., Henry, Christopher S., Jungbluth, Sean P., Chivian, Dylan, Dehal, Paramvir, Wood-Charlson, Elisha M., Arkin, Adam P., Tringe, Susannah G., Visel, Axel, Woyke, Tanja, Mouncey, Nigel J., Ivanova, Natalia N., Kyrpides, Nikos C., and Eloe-Fadrosh, Emiley A. A genomic catalog of Earth’s microbiomes. United States: N. p., 2020. Web. https://doi.org/10.1038/s41587-020-0718-6.
IMG/M Data Consortium, Nayfach, Stephen, Roux, Simon, Seshadri, Rekha, Udwary, Daniel, Varghese, Neha, Schulz, Frederik, Wu, Dongying, Paez-Espino, David, Chen, I-Min, Huntemann, Marcel, Palaniappan, Krishna, Ladau, Joshua, Mukherjee, Supratim, Reddy, T. B. K., Nielsen, Torben, Kirton, Edward, Faria, José P., Edirisinghe, Janaka N., Henry, Christopher S., Jungbluth, Sean P., Chivian, Dylan, Dehal, Paramvir, Wood-Charlson, Elisha M., Arkin, Adam P., Tringe, Susannah G., Visel, Axel, Woyke, Tanja, Mouncey, Nigel J., Ivanova, Natalia N., Kyrpides, Nikos C., & Eloe-Fadrosh, Emiley A. A genomic catalog of Earth’s microbiomes. United States. https://doi.org/10.1038/s41587-020-0718-6
IMG/M Data Consortium, Nayfach, Stephen, Roux, Simon, Seshadri, Rekha, Udwary, Daniel, Varghese, Neha, Schulz, Frederik, Wu, Dongying, Paez-Espino, David, Chen, I-Min, Huntemann, Marcel, Palaniappan, Krishna, Ladau, Joshua, Mukherjee, Supratim, Reddy, T. B. K., Nielsen, Torben, Kirton, Edward, Faria, José P., Edirisinghe, Janaka N., Henry, Christopher S., Jungbluth, Sean P., Chivian, Dylan, Dehal, Paramvir, Wood-Charlson, Elisha M., Arkin, Adam P., Tringe, Susannah G., Visel, Axel, Woyke, Tanja, Mouncey, Nigel J., Ivanova, Natalia N., Kyrpides, Nikos C., and Eloe-Fadrosh, Emiley A. 2020. "A genomic catalog of Earth’s microbiomes". United States. https://doi.org/10.1038/s41587-020-0718-6.
@article{osti_1735390,
title = {A genomic catalog of Earth’s microbiomes},
author = {IMG/M Data Consortium and Nayfach, Stephen and Roux, Simon and Seshadri, Rekha and Udwary, Daniel and Varghese, Neha and Schulz, Frederik and Wu, Dongying and Paez-Espino, David and Chen, I-Min and Huntemann, Marcel and Palaniappan, Krishna and Ladau, Joshua and Mukherjee, Supratim and Reddy, T. B. K. and Nielsen, Torben and Kirton, Edward and Faria, José P. and Edirisinghe, Janaka N. and Henry, Christopher S. and Jungbluth, Sean P. and Chivian, Dylan and Dehal, Paramvir and Wood-Charlson, Elisha M. and Arkin, Adam P. and Tringe, Susannah G. and Visel, Axel and Woyke, Tanja and Mouncey, Nigel J. and Ivanova, Natalia N. and Kyrpides, Nikos C. and Eloe-Fadrosh, Emiley A.},
abstractNote = {The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth’s continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.},
doi = {10.1038/s41587-020-0718-6},
journal = {Nature Biotechnology},
number = {},
volume = {},
place = {United States},
year = {2020},
month = {11}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1038/s41587-020-0718-6
