DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Strategies to improve reference databases for soil microbiomes

Abstract

A database of curated genomes is needed to better assess soil microbial communities and their processes associated with differing land management and environmental impacts. Interpreting soil metagenomic datasets with existing sequence databases is challenging because these databases are biased towards medical and biotechnology research and can result in misleading annotations. We have curated a database, RefSoil, of 928 genomes of soil-associated organisms (888 bacteria, 34 archaea, and 6 fungi). Using this database as a representation of the current state of knowledge of well-characterized soil microbes, we evaluated its composition and compared it to broader microbial databases, specifically NCBI’s RefSeq, as well as 3,035 publicly available soil amplicon datasets. These comparisons identified phyla and functions that are enriched in soils as well as those that may be underrepresented in RefSoil. For example, RefSoil was observed to have increased representation of Firmicutes despite its low abundance in soil environments and also lacked representation of Acidobacteria and Verrucomicrobia, which are abundant in soils. Our comparison of RefSoil to soil amplicon datasets allowed us to identify targets that, if cultured or sequenced, would significantly increase the biodiversity represented within RefSoil. To demonstrate the opportunities to access these underrepresented targets, we employed single-cell genomics in a pilot experiment to recover 14 genomes from the "most wanted" list, which improved RefSoil's representation of Earth Microbiome Project (EMP) sequences by 7% by abundance. This effort demonstrates the value of RefSoil in guiding future research efforts and the capability of single-cell genomics as a practical means to fill the existing genomic data gaps.

Authors:
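Choi, Jinlyung [1]; Yang, Fan [1]; Stepanauskas, Ramunas [2]; Cardenas, Erick [3]; Garoutte, Aaron [4]; Williams, Ryan [1]; Flater, Jared [1]; Tiedje, James M. [4]; Hofmockel, Kirsten S. [5]; Gelder, Brian [1]; Howe, Adina [1]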
  1. Iowa State Univ., Ames, IA (United States). Dept. of Agricultural and Biosystems Engineering
  2. Bigelow Lab. for Ocean Sciences, East Boothbay, ME (United States)
  3. Univ. of British Columbia, Vancouver, BC (Canada). Dept. of Microbiology & Immunology
  4. Michigan State Univ., East Lansing, MI (United States). Center for Microbial Ecology
  5. Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Environmental Molecular Sciences Lab. (EMSL); Iowa State Univ., Ames, IA (United States). Dept. of Ecology, Evolution and Organismal Biology
Publication Date:
2016-12-09
Research Org.:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER); National Science Foundation (NSF)
OSTI Identifier:
1353315
Report Number(s):
PNNL-SA-122172
Journal ID: ISSN 1751-7362
Grant/Contract Number:  
AC05-76RL01830; SC0010775
Resource Type:
Accepted Manuscript
Journal Name:
The ISME Journal
Additional Journal Information:
Journal Volume: 11; Journal Issue: 4; Journal ID: ISSN 1751-7362
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
Subject:
96 KNOWLEDGE MANAGEMENT AND PRESERVATION; 59 BASIC BIOLOGICAL SCIENCES

Citation Formats

Choi, Jinlyung, Yang, Fan, Stepanauskas, Ramunas, Cardenas, Erick, Garoutte, Aaron, Williams, Ryan, Flater, Jared, Tiedje, James M., Hofmockel, Kirsten S., Gelder, Brian, and Howe, Adina. Strategies to improve reference databases for soil microbiomes. United States: N. p., 2016. Web. doi:10.1038/ismej.2016.168.
Choi, Jinlyung, Yang, Fan, Stepanauskas, Ramunas, Cardenas, Erick, Garoutte, Aaron, Williams, Ryan, Flater, Jared, Tiedje, James M., Hofmockel, Kirsten S., Gelder, Brian, & Howe, Adina. Strategies to improve reference databases for soil microbiomes. United States. https://doi.org/10.1038/ismej.2016.168
Choi, Jinlyung, Yang, Fan, Stepanauskas, Ramunas, Cardenas, Erick, Garoutte, Aaron, Williams, Ryan, Flater, Jared, Tiedje, James M., Hofmockel, Kirsten S., Gelder, Brian, and Howe, Adina. 2016. "Strategies to improve reference databases for soil microbiomes". United States. https://doi.org/10.1038/ismej.2016.168. https://www.osti.gov/servlets/purl/1353315.
@article{osti_1353315,
title = {Strategies to improve reference databases for soil microbiomes},
author = {Choi, Jinlyung and Yang, Fan and Stepanauskas, Ramunas and Cardenas, Erick and Garoutte, Aaron and Williams, Ryan and Flater, Jared and Tiedje, James M. and Hofmockel, Kirsten S. and Gelder, Brian and Howe, Adina},
doi = {10.1038/ismej.2016.168},
journal = {The ISME Journal},
number = 4,
volume = 11,
place = {United States},
year = {2016},
month = {12}
}

Citation Metrics:
Cited by: 61 works
Citation information provided by
Web of Science

Works referenced in this record:

Microbiota-mediated colonization resistance against intestinal pathogens
journal, October 2013

  • Buffie, Charlie G.; Pamer, Eric G.
  • Nature Reviews Immunology, Vol. 13, Issue 11
  • DOI: 10.1038/nri3535

Structure, fluctuation and magnitude of a natural grassland soil metagenome
journal, February 2012

  • Delmont, Tom O.; Prestat, Emmanuel; Keegan, Kevin P.
  • The ISME Journal, Vol. 6, Issue 9
  • DOI: 10.1038/ismej.2011.197

Reconstructing the Microbial Diversity and Function of Pre-Agricultural Tallgrass Prairie Soils in the United States
journal, October 2013


Cross-biome metagenomic analyses of soil microbial communities and their functional attributes
journal, December 2012

  • Fierer, N.; Leff, J. W.; Adams, B. J.
  • Proceedings of the National Academy of Sciences, Vol. 109, Issue 52
  • DOI: 10.1073/pnas.1215210110

Single-cell genome sequencing: current state of the science
journal, January 2016

  • Gawad, Charles; Koh, Winston; Quake, Stephen R.
  • Nature Reviews Genetics, Vol. 17, Issue 3
  • DOI: 10.1038/nrg.2015.16

The Earth Microbiome project: successes and aspirations
journal, August 2014


Multi-omics of permafrost, active layer and thermokarst bog soil microbiomes
journal, March 2015

  • Hultman, Jenni; Waldrop, Mark P.; Mackelprang, Rachel
  • Nature, Vol. 521, Issue 7551
  • DOI: 10.1038/nature14238

Single-Cell Genomics Reveals Hundreds of Coexisting Subpopulations in Wild Prochlorococcus
journal, April 2014


Exploration of hitherto-uncultured bacteria from the rhizosphere
journal, September 2009


Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences
journal, January 2014

  • Rideout, Jai Ram; He, Yan; Navas-Molina, Jose A.
  • PeerJ, Vol. 2
  • DOI: 10.7717/peerj.545

Insights into the phylogeny and coding potential of microbial dark matter
journal, July 2013

  • Rinke, Christian; Schwientek, Patrick; Sczyrba, Alexander
  • Nature, Vol. 499, Issue 7459
  • DOI: 10.1038/nature12352

Clostridium difficile infection: new developments in epidemiology and pathogenesis
journal, July 2009

  • Rupnik, Maja; Wilcox, Mark H.; Gerding, Dale N.
  • Nature Reviews Microbiology, Vol. 7, Issue 7
  • DOI: 10.1038/nrmicro2164

Metagenomic microbial community profiling using unique clade-specific marker genes
journal, June 2012

  • Segata, Nicola; Waldron, Levi; Ballarini, Annalisa
  • Nature Methods, Vol. 9, Issue 8
  • DOI: 10.1038/nmeth.2066

Single cell genomics: an individual look at microbes
journal, October 2012


Potential for Chemolithoautotrophy Among Ubiquitous Bacteria Lineages in the Dark Ocean
journal, September 2011


Cultivation of unculturable soil bacteria
journal, September 2012


Naive Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy
journal, June 2007

  • Wang, Q.; Garrity, G. M.; Tiedje, J. M.
  • Applied and Environmental Microbiology, Vol. 73, Issue 16
  • DOI: 10.1128/AEM.00062-07

A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea
journal, December 2009

  • Wu, Dongying; Hugenholtz, Philip; Mavromatis, Konstantinos
  • Nature, Vol. 462, Issue 7276
  • DOI: 10.1038/nature08656

The human microbiome: there is much left to do
journal, June 2022


Works referencing / citing this record:

Structure and variation of root-associated microbiomes of potato grown in alfisol
journal, November 2019

  • Mardanova, Ayslu; Lutfullin, Marat; Hadieva, Guzel
  • World Journal of Microbiology and Biotechnology, Vol. 35, Issue 12
  • DOI: 10.1007/s11274-019-2761-3

Embracing the unknown: disentangling the complexities of the soil microbiome
journal, August 2017


Improved genome recovery and integrated cell-size analyses of individual uncultured microbial cells and viral particles
journal, July 2017

  • Stepanauskas, Ramunas; Fergusson, Elizabeth A.; Brown, Joseph
  • Nature Communications, Vol. 8, Issue 1
  • DOI: 10.1038/s41467-017-00128-z

Struo: a pipeline for building custom databases for common metagenome profilers
journal, November 2019


The Microbe Directory v2.0: An Expanded Database of Ecological and Phenotypical Features of Microbes
posted_content, December 2019


metagenomeFeatures: An R package for working with 16S rRNA reference databases and marker-gene survey feature data
journal, June 2018

  • Olson, Nathan D.; Shah, Nidhi; Kancherla, Jayaram
  • Bioinformatics
  • DOI: 10.1101/339812

Inference based PICRUSt accuracy varies across sample types and functional categories
posted_content, May 2019

  • Sun, Shan; Jones, Roshonda B.; Fodor, Anthony A.
  • BioRxiv
  • DOI: 10.1101/655746

A robust, cost‐effective method for DNA, RNA and protein co‐extraction from soil, other complex microbiomes and pure cultures
journal, February 2019

  • Thorn, Camilla E.; Bergesch, Christian; Joyce, Aoife
  • Molecular Ecology Resources, Vol. 19, Issue 2
  • DOI: 10.1111/1755-0998.12979

IDTAXA: a novel approach for accurate taxonomic classification of microbiome sequences
journal, August 2018


Genomes from uncultivated prokaryotes: a comparison of metagenome-assembled and single-amplified genomes
journal, September 2018

  • Alneberg, Johannes; Karlsson, Christofer M. G.; Divne, Anna-Maria
  • Microbiome, Vol. 6, Issue 1
  • DOI: 10.1186/s40168-018-0550-0

Inference-based accuracy of metagenome prediction tools varies across sample types and functional categories
journal, April 2020


Actinobacteria and Cyanobacteria Diversity in Terrestrial Antarctic Microenvironments Evaluated by Culture-Dependent and Independent Methods
journal, May 2019


A global survey of arsenic-related genes in soil microbiomes
journal, May 2019


Ecological selection for small microbial genomes along a temperate-to-thermal soil gradient
journal, November 2018

  • Sorensen, Jackson W.; Dunivin, Taylor K.; Tobin, Tammy C.
  • Nature Microbiology, Vol. 4, Issue 1
  • DOI: 10.1038/s41564-018-0276-6

metagenomeFeatures: an R package for working with 16S rRNA reference databases and marker-gene survey feature data
journal, March 2019


gapseq: informed prediction of bacterial metabolic pathways and reconstruction of accurate metabolic models
journal, March 2021

