DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Rapid identification of enteric bacteria from whole genome sequences using average nucleotide identity metrics

Journal Article · Frontiers in Microbiology

Identification of enteric bacteria species by whole genome sequence (WGS) analysis requires a rapid and easily standardized approach. We leveraged the principles of average nucleotide identity using MUMmer (ANIm) software, which calculates the percent bases aligned between two bacterial genomes and their corresponding ANI values, to set threshold values for determining species consistent with the conventional identification methods of known species. The performance of species identification was evaluated using two datasets: the Reference Genome Dataset v2 (RGDv2), consisting of 43 enteric genome assemblies representing 32 species, and the Test Genome Dataset (TGDv1), comprising 454 genome assemblies designed to represent all species that need to be queried for identification, as well as rare and closely related species. The RGDv2 contains six Campylobacter spp., three Escherichia/Shigella spp., one Grimontia hollisae, six Listeria spp., one Photobacterium damselae, two Salmonella spp., and thirteen Vibrio spp., while the genomes in the TGDv1 represent 42 different species. The analysis showed that, when a standard minimum of 70% of genome bases aligned, the ANI threshold values determined for these species were ≥95% for Escherichia/Shigella and Vibrio species, ≥93% for Salmonella species, and ≥92% for Campylobacter and Listeria species. Using these metrics, the RGDv2 accurately classified all validation strains in TGDv1 at the species level, which is consistent with the classification based on previous gold standard methods.
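
The decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes ANIm has already been run and that each comparison against a reference genome is available as an ANI percentage and a percent-bases-aligned value. The names `AnimHit`, `passes_thresholds`, and `identify_species` are hypothetical, as is the choice to report the passing reference with the highest ANI. Only the numeric thresholds come from the abstract; genera for which the abstract states no threshold (Grimontia, Photobacterium) are left unassigned here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# ANI thresholds (percent) as stated in the abstract, keyed by genus group.
ANI_THRESHOLDS = {
    "Escherichia/Shigella": 95.0,
    "Vibrio": 95.0,
    "Salmonella": 93.0,
    "Campylobacter": 92.0,
    "Listeria": 92.0,
}

# Minimum percent of genome bases aligned, as stated in the abstract.
MIN_ALIGNED_PERCENT = 70.0


@dataclass
class AnimHit:
    """One query-versus-reference comparison (hypothetical container)."""
    reference_species: str
    genus_group: str
    ani_percent: float
    aligned_percent: float


def passes_thresholds(hit: AnimHit) -> bool:
    """Return True if the hit meets both the alignment and ANI cutoffs."""
    threshold = ANI_THRESHOLDS.get(hit.genus_group)
    if threshold is None:
        # No threshold is given in the abstract for this genus group.
        return False
    return (hit.aligned_percent >= MIN_ALIGNED_PERCENT
            and hit.ani_percent >= threshold)


def identify_species(hits: Iterable[AnimHit]) -> Optional[str]:
    """Pick the passing reference with the highest ANI (an assumption)."""
    passing = [h for h in hits if passes_thresholds(h)]
    if not passing:
        return None
    return max(passing, key=lambda h: h.ani_percent).reference_species
```

With made-up values, a query that aligns over 88% of bases at 98.2% ANI to a Salmonella enterica reference would be reported as that species, whereas a 96% ANI match to a Vibrio reference with only 65% of bases aligned would be left unassigned.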

Research Organization:
Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
SC0014664
OSTI ID:
2471266
Journal Information:
Frontiers in Microbiology, Vol. 14; ISSN 1664-302X
Publisher:
Frontiers Research Foundation
Country of Publication:
United States
Language:
English
