DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella

Journal Article · Journal of Clinical Microbiology
 [1];  [2];  [3];  [2];  [1];  [4];  [3];  [3];  [1]
  1. Univ. of Chicago, Chicago, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States)
  2. Houston Medical Research Inst. and Houston Medical Hospital, Houston, TX (United States); Weill Cornell Medical College, New York, NY (United States)
  3. U.S. Food and Drug Administration, Laurel, MD (United States)
  4. Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, Chicago, IL (United States)

Nontyphoidal Salmonella species are the leading bacterial cause of food-borne disease in the United States. Whole-genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts by public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, was used to generate XGBoost-based machine learning models for predicting minimum inhibitory concentrations (MICs) for 15 antibiotics. The MIC prediction models have an overall average accuracy of 95% within ±1 two-fold dilution step (confidence interval of 95-95%), an average very major error rate of 2.7% (confidence interval of 2.4-3.0%), and an average major error rate of 0.1% (confidence interval of 0.1-0.2%). The models predict MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Furthermore, our strategy for developing whole-genome-sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.
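
The accuracy figure quoted in the abstract counts a prediction as correct when it falls within ±1 two-fold dilution step of the measured MIC. The following is a minimal sketch of how such a metric could be computed, not the authors' implementation: the function names, the tolerance parameter, and the MIC values are all hypothetical and chosen only for illustration. It relies on the fact that one two-fold dilution step equals one unit on the log2 scale.

```python
import math


def within_one_dilution(true_mic, predicted_mic, tol=1.0):
    """Return True if the prediction is within `tol` two-fold dilution steps.

    Hypothetical helper: one two-fold dilution step is one unit in log2 space.
    """
    diff = abs(math.log2(predicted_mic) - math.log2(true_mic))
    # Small epsilon guards against floating-point rounding at the boundary.
    return diff <= tol + 1e-9


def dilution_accuracy(true_mics, predicted_mics):
    """Fraction of predictions within +/-1 two-fold dilution of the measured MIC."""
    pairs = list(zip(true_mics, predicted_mics))
    hits = sum(within_one_dilution(t, p) for t, p in pairs)
    return hits / len(pairs)


# Hypothetical MIC values, for illustration only (not data from the study).
measured = [0.25, 2.0, 8.0, 32.0]
predicted = [0.5, 2.0, 32.0, 16.0]

# 8.0 vs 32.0 is two dilution steps apart, so 3 of 4 predictions count as correct.
print(dilution_accuracy(measured, predicted))  # 0.75
```

Under this definition a prediction of 0.5 for a measured MIC of 0.25 is counted as accurate, while a prediction of 32 for a measured MIC of 8 is not.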

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
National Institutes of Health (NIH), National Institute of Allergy and Infectious Diseases (NIAID); USDOE
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1494667
Journal Information:
Journal of Clinical Microbiology, Vol. 57, Issue 2; ISSN 0095-1137
Publisher:
American Society for Microbiology
Country of Publication:
United States
Language:
English
