OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

Journal Article · Nature Biotechnology
DOI: https://doi.org/10.1038/nbt.3886 · OSTI ID: 1379902
  1. Leibniz Inst. of German Collection of Microorganisms and Cell Cultures, Braunschweig (Germany); USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
  2. USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
  3. Leibniz Inst. of German Collection of Microorganisms and Cell Cultures, Braunschweig (Germany)
  4. Univ. of Georgia, Athens, GA (United States). Dept. of Microbiology
  5. Michigan State Univ., East Lansing, MI (United States). Dept. of Microbiology and Molecular Genetics; Namesforlife LLC, East Lansing, MI (United States)
  6. Univ. of California, Davis, CA (United States). Genome Center
  7. Univ. of Queensland, Brisbane (Australia). Australian Centre for Ecogenomics
  8. Newcastle Univ., Newcastle upon Tyne (United Kingdom). School of Biology

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 24%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1379902
Journal Information:
Nature Biotechnology, Vol. 35, Issue 7; ISSN 1087-0156
Publisher:
Springer Nature
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 141 works
Citation information provided by Web of Science
