DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life

Abstract

Challenges in cultivating microorganisms have limited the phylogenetic diversity of currently available microbial genomes. This is being addressed by advances in sequencing throughput and computational techniques that allow for the cultivation-independent recovery of genomes from metagenomes. Here, we report the reconstruction of 7,903 bacterial and archaeal genomes from >1,500 public metagenomes. All genomes are estimated to be ≥50% complete and nearly half are ≥90% complete with ≤5% contamination. These genomes increase the phylogenetic diversity of bacterial and archaeal genome trees by >30% and provide the first representatives of 17 bacterial and three archaeal candidate phyla. We also recovered 245 genomes from the Patescibacteria superphylum (also known as the Candidate Phyla Radiation) and find that the relative diversity of this group varies substantially with different protein marker sets. The scale and quality of this data set demonstrate that recovering genomes from metagenomes provides an expedient path forward to exploring microbial dark matter.

Authors:
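Parks, Donovan H. [1]; Rinke, Christian [1]; Chuvochina, Maria [1]; Chaumeil, Pierre-Alain [1]; Woodcroft, Ben J. [1]; Evans, Paul N. [1]; Hugenholtz, Philip [1]; Tyson, Gene W. [1]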
  1. Univ. of Queensland (Australia)
Publication Date:
Research Org.:
Univ. of Arizona, Tucson, AZ (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
OSTI Identifier:
1500024
Grant/Contract Number:
SC0010580
Resource Type:
Accepted Manuscript
Journal Name:
Nature Microbiology
Additional Journal Information:
Journal Volume: 2; Journal Issue: 11; Journal ID: ISSN 2058-5276
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES

Citation Formats

Parks, Donovan H., Rinke, Christian, Chuvochina, Maria, Chaumeil, Pierre-Alain, Woodcroft, Ben J., Evans, Paul N., Hugenholtz, Philip, and Tyson, Gene W. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. United States: N. p., 2017. Web. doi:10.1038/s41564-017-0012-7.
Parks, Donovan H., Rinke, Christian, Chuvochina, Maria, Chaumeil, Pierre-Alain, Woodcroft, Ben J., Evans, Paul N., Hugenholtz, Philip, & Tyson, Gene W. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. United States. doi:10.1038/s41564-017-0012-7.
Parks, Donovan H., Rinke, Christian, Chuvochina, Maria, Chaumeil, Pierre-Alain, Woodcroft, Ben J., Evans, Paul N., Hugenholtz, Philip, and Tyson, Gene W. 2017. "Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life". United States. doi:10.1038/s41564-017-0012-7. https://www.osti.gov/servlets/purl/1500024.
@article{osti_1500024,
title = {Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life},
author = {Parks, Donovan H. and Rinke, Christian and Chuvochina, Maria and Chaumeil, Pierre-Alain and Woodcroft, Ben J. and Evans, Paul N. and Hugenholtz, Philip and Tyson, Gene W.},
abstractNote = {Challenges in cultivating microorganisms have limited the phylogenetic diversity of currently available microbial genomes. This is being addressed by advances in sequencing throughput and computational techniques that allow for the cultivation-independent recovery of genomes from metagenomes. Here, we report the reconstruction of 7,903 bacterial and archaeal genomes from >1,500 public metagenomes. All genomes are estimated to be ≥50% complete and nearly half are ≥90% complete with ≤5% contamination. These genomes increase the phylogenetic diversity of bacterial and archaeal genome trees by >30% and provide the first representatives of 17 bacterial and three archaeal candidate phyla. We also recovered 245 genomes from the Patescibacteria superphylum (also known as the Candidate Phyla Radiation) and find that the relative diversity of this group varies substantially with different protein marker sets. The scale and quality of this data set demonstrate that recovering genomes from metagenomes provides an expedient path forward to exploring microbial dark matter.},
doi = {10.1038/s41564-017-0012-7},
journal = {Nature Microbiology},
number = {11},
volume = {2},
place = {United States},
year = {2017},
month = {9}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 210 works
Citation information provided by
Web of Science

Figures / Tables:

Fig. 1: Assessment of genome quality. a, Estimated completeness and contamination of 7,903 genomes recovered from public metagenomes. Genome quality was defined as completeness − 5 × contamination, and only genomes with a quality of ≥ 50 were retained. Near-complete genomes (completeness ≥ 90%; contamination ≤ 5%) are shown in red, medium-quality genomes (completeness ≥ 70%; contamination ≤ 10%) in blue, and partial genomes (completeness ≥ 50%; contamination ≤ 4%) in grey. Histograms along the x and y axes show the percentage of genomes at varying levels of completeness and contamination, respectively. Notably, only 171 of the 7,903 (2.2%) UBA genomes have > 5% contamination. b, Number of scaffolds comprising each of the 7,903 genomes with colours indicating genome quality. c, Number of tRNAs for each of the 20 standard amino acids identified within each of the 7,903 genomes with colours indicating genome quality.
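
The quality score and the three tiers described in the Fig. 1 caption can be illustrated with a short Python sketch. This is only an illustration of the stated thresholds, not the authors' pipeline: the function names `genome_quality` and `quality_tier` are hypothetical, the inputs are assumed to be completeness and contamination estimates expressed as percentages, and checking the tiers from strictest to loosest is an assumption about how overlapping thresholds are resolved.

```python
from typing import Optional


def genome_quality(completeness: float, contamination: float) -> float:
    """Quality score from the caption: completeness - 5 x contamination."""
    return completeness - 5.0 * contamination


def quality_tier(completeness: float, contamination: float) -> Optional[str]:
    """Return the tier named in the caption, or None if the genome is not retained.

    Assumption: tiers are tested from strictest to loosest, so a genome that
    meets several thresholds is assigned the highest tier it satisfies.
    """
    # Only genomes with a quality score of at least 50 were retained.
    if genome_quality(completeness, contamination) < 50.0:
        return None
    if completeness >= 90.0 and contamination <= 5.0:
        return "near-complete"
    if completeness >= 70.0 and contamination <= 10.0:
        return "medium-quality"
    if completeness >= 50.0 and contamination <= 4.0:
        return "partial"
    # Fallback for inputs that pass the score filter but match no tier.
    return None


if __name__ == "__main__":
    # Made-up example values, not data from the paper.
    for comp, cont in [(95.0, 2.0), (80.0, 4.0), (60.0, 1.0), (60.0, 3.0)]:
        print(comp, cont, genome_quality(comp, cont), quality_tier(comp, cont))
```

Because contamination is weighted five times as heavily as completeness, a genome that is 60% complete with 3% contamination scores 45 and is dropped, while one that is 60% complete with 1% contamination scores 55 and is kept as a partial genome.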



      Figures/Tables have been extracted from DOE-funded journal article accepted manuscripts.