DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella

Abstract

Nontyphoidal Salmonella species are the leading bacterial cause of food-borne disease in the United States. Whole genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts from public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, were used to generate XGBoost-based machine learning models for predicting minimum inhibitory concentrations (MICs) for 15 antibiotics. The MIC prediction models have an overall average accuracy of 95% within ± 1 two-fold dilution step (confidence interval of 95-95%), an average very major error rate of 2.7% (confidence interval of 2.4-3.0%) and an average major error rate of 0.1% (confidence interval of 0.1-0.2%). The model predicts MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Furthermore, our strategy for developing whole genome sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.
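
The three metrics quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that "within ± 1 two-fold dilution step" means the predicted and measured MICs differ by at most 1 on a log2 scale, that a very major error is a resistant isolate predicted as susceptible, and that a major error is a susceptible isolate predicted as resistant. The function names, the breakpoints (`susceptible_max`, `resistant_min`), and the example values are hypothetical and chosen only for illustration.

```python
import math


def within_one_dilution(true_mic, pred_mic):
    """Return True if the prediction is within +/- 1 two-fold dilution step."""
    return abs(math.log2(pred_mic) - math.log2(true_mic)) <= 1.0


def call(mic, susceptible_max, resistant_min):
    """Map an MIC to S, I, or R using hypothetical breakpoints (assumption)."""
    if mic <= susceptible_max:
        return "S"
    if mic >= resistant_min:
        return "R"
    return "I"


def summarize(true_mics, pred_mics, susceptible_max, resistant_min):
    """Compute +/- 1 dilution accuracy and very major / major error rates."""
    pairs = list(zip(true_mics, pred_mics))
    accuracy = sum(within_one_dilution(t, p) for t, p in pairs) / len(pairs)

    # Categorical calls for measured and predicted MICs.
    calls = [
        (call(t, susceptible_max, resistant_min),
         call(p, susceptible_max, resistant_min))
        for t, p in pairs
    ]
    n_resistant = sum(1 for t, _ in calls if t == "R")
    n_susceptible = sum(1 for t, _ in calls if t == "S")

    # Very major error: truly resistant, predicted susceptible.
    vme = sum(1 for t, p in calls if t == "R" and p == "S")
    # Major error: truly susceptible, predicted resistant.
    me = sum(1 for t, p in calls if t == "S" and p == "R")

    return {
        "accuracy_within_1_dilution": accuracy,
        "very_major_error_rate": vme / n_resistant if n_resistant else float("nan"),
        "major_error_rate": me / n_susceptible if n_susceptible else float("nan"),
    }


if __name__ == "__main__":
    # Made-up MICs (ug/mL) with made-up breakpoints S <= 8 and R >= 32.
    measured = [1, 2, 4, 32, 64, 0.5]
    predicted = [2, 2, 16, 4, 64, 0.5]
    print(summarize(measured, predicted, susceptible_max=8, resistant_min=32))
```

Note the denominators in this sketch: the very major error rate is taken over resistant isolates and the major error rate over susceptible isolates, so the two rates are not directly comparable to the overall accuracy.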

Authors:
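Nguyen, Marcus [1]; Long, S. Wesley [2]; McDermott, Patrick F. [3]; Olsen, Randall J. [2]; Olson, Robert [1]; Stevens, Rick L. [4]; Tyson, Gregory H. [3]; Zhao, Shaohua [3]; Davis, James J. [1]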
  1. Univ. of Chicago, Chicago, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States)
  2. Houston Methodist Research Inst. and Houston Methodist Hospital, Houston, TX (United States); Weill Cornell Medical College, New York, NY (United States)
  3. U.S. Food and Drug Administration, Laurel, MD (United States)
  4. Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, Chicago, IL (United States)
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
National Institutes of Health (NIH), National Institute of Allergy and Infectious Diseases (NIAID); USDOE
OSTI Identifier:
1494667
Grant/Contract Number:  
AC02-06CH11357
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Clinical Microbiology
Additional Journal Information:
Journal Volume: 57; Journal Issue: 2; Journal ID: ISSN 0095-1137
Publisher:
American Society for Microbiology
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; antimicrobial susceptibility testing; deep learning; diagnostics; genome sequencing; machine learning

Citation Formats

Nguyen, Marcus, Long, S. Wesley, McDermott, Patrick F., Olsen, Randall J., Olson, Robert, Stevens, Rick L., Tyson, Gregory H., Zhao, Shaohua, and Davis, James J. Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella. United States: N. p., 2019. Web. doi:10.1128/JCM.01260-18.
Nguyen, Marcus, Long, S. Wesley, McDermott, Patrick F., Olsen, Randall J., Olson, Robert, Stevens, Rick L., Tyson, Gregory H., Zhao, Shaohua, & Davis, James J. Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella. United States. https://doi.org/10.1128/JCM.01260-18
Nguyen, Marcus, Long, S. Wesley, McDermott, Patrick F., Olsen, Randall J., Olson, Robert, Stevens, Rick L., Tyson, Gregory H., Zhao, Shaohua, and Davis, James J. "Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella". United States. https://doi.org/10.1128/JCM.01260-18. https://www.osti.gov/servlets/purl/1494667.
@article{osti_1494667,
title = {Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella},
author = {Nguyen, Marcus and Long, S. Wesley and McDermott, Patrick F. and Olsen, Randall J. and Olson, Robert and Stevens, Rick L. and Tyson, Gregory H. and Zhao, Shaohua and Davis, James J.},
doi = {10.1128/JCM.01260-18},
journal = {Journal of Clinical Microbiology},
number = 2,
volume = 57,
place = {United States},
year = {2019},
month = {1}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 104 works
Citation information provided by
Web of Science
