OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Strategies to improve reference databases for soil microbiomes

Journal Article · The ISME Journal
  1. Iowa State Univ., Ames, IA (United States). Dept. of Agricultural and Biosystems Engineering
  2. Bigelow Lab. for Ocean Sciences, East Boothbay, ME (United States)
  3. Univ. of British Columbia, Vancouver, BC (Canada). Dept. of Microbiology & Immunology
  4. Michigan State Univ., East Lansing, MI (United States). Center for Microbial Ecology
  5. Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Environmental Molecular Sciences Lab. (EMSL); Iowa State Univ., Ames, IA (United States). Dept. of Ecology, Evolution and Organismal Biology

A database of curated genomes is needed to better assess soil microbial communities and the processes associated with differing land management and environmental impacts. Interpreting soil metagenomic datasets with existing sequence databases is challenging because these databases are biased towards medical and biotechnology research, which can result in misleading annotations. We have curated a database, RefSoil, of 928 genomes of soil-associated organisms (888 bacteria, 34 archaea, and 6 fungi). Using this database as a representation of the current state of knowledge of well-characterized soil microbes, we evaluated its composition and compared it to broader microbial databases, specifically NCBI’s RefSeq, as well as to 3,035 publicly available soil amplicon datasets. These comparisons identified phyla and functions that are enriched in soils as well as those that may be underrepresented in RefSoil. For example, RefSoil was observed to overrepresent Firmicutes relative to their low abundance in soil environments, and it lacked representation of Acidobacteria and Verrucomicrobia, which are abundant in soils. Our comparison of RefSoil to soil amplicon datasets allowed us to identify targets that, if cultured or sequenced, would significantly increase the biodiversity represented within RefSoil. To demonstrate the opportunities to access these underrepresented targets, we employed single-cell genomics in a pilot experiment to recover 14 genomes from the "most wanted" list, which improved RefSoil's representation of Earth Microbiome Project (EMP) sequences by 7% by abundance. This effort demonstrates the value of RefSoil in guiding future research efforts and the capability of single-cell genomics as a practical means of filling existing genomic data gaps.
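
The idea of representation "by abundance" and of a "most wanted" list can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the taxon labels, and the counts below are all hypothetical, and the example assumes only that each amplicon-derived taxon has an abundance and is either present in or absent from the reference database.

```python
from typing import Dict, List, Set, Tuple


def represented_fraction(abundance: Dict[str, float], represented: Set[str]) -> float:
    """Fraction of total abundance that belongs to taxa present in the reference database."""
    total = sum(abundance.values())
    if total == 0:
        return 0.0
    covered = sum(count for taxon, count in abundance.items() if taxon in represented)
    return covered / total


def most_wanted(
    abundance: Dict[str, float], represented: Set[str], top_n: int = 3
) -> List[Tuple[str, float]]:
    """Rank taxa missing from the database by abundance, most abundant first."""
    missing = [(taxon, count) for taxon, count in abundance.items() if taxon not in represented]
    return sorted(missing, key=lambda item: item[1], reverse=True)[:top_n]


# Made-up toy data (hypothetical taxa and counts, not values from the study).
toy_abundance = {"otu_a": 50, "otu_b": 30, "otu_c": 15, "otu_d": 5}
toy_database = {"otu_a", "otu_d"}

before = represented_fraction(toy_abundance, toy_database)
targets = most_wanted(toy_abundance, toy_database, top_n=1)

# Simulate adding the top-ranked missing taxon to the database.
after = represented_fraction(toy_abundance, toy_database | {taxon for taxon, _ in targets})
```

In this toy case, adding the single most abundant missing taxon raises the abundance-weighted representation, which mirrors in miniature how recovering genomes from a ranked target list can improve a database's coverage of observed sequences.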

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER); National Science Foundation (NSF)
Grant/Contract Number:
AC05-76RL01830; SC0010775
OSTI ID:
1353315
Report Number(s):
PNNL-SA-122172
Journal Information:
The ISME Journal, Vol. 11, Issue 4; ISSN 1751-7362
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 61 works
Citation information provided by
Web of Science
