OSTI.GOV U.S. Department of Energy
Office of Scientific and Technical Information

Title: How Much Do rRNA Gene Surveys Underestimate Extant Bacterial Diversity?

Journal Article · Applied and Environmental Microbiology
DOI: https://doi.org/10.1128/aem.00014-18 · OSTI ID: 1503621
[1]; [1]; [2]; [3]; [3]; [1]; [4]
  1. Georgia Inst. of Technology, Atlanta, GA (United States)
  2. USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
  3. Michigan State Univ., East Lansing, MI (United States)
  4. Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

The most common practice in studying and cataloguing prokaryotic diversity is to group sequences into operational taxonomic units (OTUs) at the 97% 16S rRNA gene sequence identity level, often using partial gene sequences such as PCR-generated amplicons. Because rRNA genes are highly conserved, organisms belonging to closely related yet distinct species may be grouped under the same OTU. However, it remains unclear how much diversity this practice has underestimated. To address this question, we compared the OTUs of genomes defined at the 97% or 98.5% 16S rRNA gene identity level against OTUs of the same genomes defined at the 95% whole-genome average nucleotide identity (ANI) level, which is a much more accurate proxy for species. Our results show that OTUs resulting from a 98.5% 16S rRNA gene identity cutoff are more accurate than those from a 97% cutoff when compared against the 95% ANI OTUs (90.5% versus 89.9% accuracy) but are indistinguishable from those of any other threshold in the 98.29 to 98.78% range. Even with the more stringent thresholds, the 16S rRNA gene-based approach commonly underestimates the number of OTUs by ~12%, on average, compared to the ANI-based approach (~14% underestimation when using the 97% identity threshold). Moreover, the degree of underestimation can reach 50% or more for certain taxa, such as the genera Pseudomonas, Burkholderia, Escherichia, Campylobacter, and Citrobacter. These results provide a quantitative view of the degree to which 16S rRNA gene-defined OTUs underestimate extant prokaryotic diversity and suggest that genomic resolution is often necessary.

IMPORTANCE: Species diversity is one of the most fundamental pieces of information for community ecology and conservation biology. Accurate proxies for a species, or for the unit of diversity, are therefore a cornerstone of a large set of microbial ecology and diversity studies. The most common proxies currently used rely on the clustering of 16S rRNA gene sequences at some threshold of nucleotide identity, typically 97% or 98.5%. Here, we explore how well this strategy reflects the more accurate whole-genome-based proxies and determine the frequency with which the high conservation of 16S rRNA sequences masks substantial species-level diversity.
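
The comparison described in the abstract, in which the same genomes are partitioned once by 16S rRNA gene identity and once by ANI, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the abstract does not say which accuracy measure was used, so the plain Rand index (the fraction of genome pairs on which two partitions agree) is swapped in as one standard choice, and the six genomes and their OTU labels are hypothetical toy data, not results from the study. The function names `rand_index` and `otu_underestimation` are likewise made up for this example.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree.

    A pair "agrees" when both partitions put the two items in the same
    cluster, or both put them in different clusters. Assumed here as a
    stand-in accuracy measure; the abstract does not name its metric.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def otu_underestimation(labels_16s, labels_ani):
    """Relative shortfall in OTU count of the 16S partition versus ANI."""
    n_16s = len(set(labels_16s))
    n_ani = len(set(labels_ani))
    return (n_ani - n_16s) / n_ani


# Hypothetical toy data: six genomes, clustered two ways.
ani_otus = ["A", "A", "B", "C", "C", "D"]    # 4 OTUs at 95% ANI (assumed)
rrna_otus = ["x", "x", "x", "y", "y", "z"]   # 3 OTUs from 16S (assumed)

print(f"Rand index: {rand_index(rrna_otus, ani_otus):.3f}")
print(f"OTU underestimation: {otu_underestimation(rrna_otus, ani_otus):.0%}")
```

In this toy case the 16S partition merges genome 3 (ANI OTU "B") into the first OTU, so it recovers three OTUs where the ANI partition has four. That is the kind of lumping of closely related species that the abstract quantifies across real genomes.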

Research Organization:
Michigan State Univ., East Lansing, MI (United States); University of California, Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities Division
Grant/Contract Number:
FG02-99ER62848
OSTI ID:
1503621
Journal Information:
Applied and Environmental Microbiology, Vol. 84, Issue 6; ISSN 0099-2240
Publisher:
American Society for Microbiology
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 36 works
Citation information provided by
Web of Science

Cited By (4)

Quantifying the changes in genetic diversity within sequence-discrete bacterial populations across a spatial and temporal riverine gradient journal November 2018
Resource heterogeneity structures aquatic bacterial communities journal May 2019
Genomic metrics made easy: what to do and where to go in the new era of bacterial taxonomy journal March 2019
The Microbial Genomes Atlas (MiGA) webserver: taxonomic and gene diversity analysis of Archaea and Bacteria at the whole genome level journal June 2018
