Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella
Abstract
Nontyphoidal Salmonella species are the leading bacterial cause of food-borne disease in the United States. Whole genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts from public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, were used to generate XGBoost-based machine learning models for predicting minimum inhibitory concentrations (MICs) for 15 antibiotics. The MIC prediction models have an overall average accuracy of 95% within ± 1 two-fold dilution step (confidence interval of 95-95%), an average very major error rate of 2.7% (confidence interval of 2.4-3.0%) and an average major error rate of 0.1% (confidence interval of 0.1-0.2%). The model predicts MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Furthermore, our strategy for developing whole genome sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.
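The accuracy figure in the abstract counts a prediction as correct when it falls within ± 1 two-fold dilution step of the laboratory MIC. The following is a minimal sketch of how such a metric could be computed, not the authors' evaluation code. The function names and the example MIC values are hypothetical, and it assumes MICs are given as positive concentrations (for example in µg/mL) so that one two-fold dilution step equals one unit on a log2 scale.

```python
import math


def within_dilution_steps(true_mic, predicted_mic, steps=1):
    """Return True if two MICs differ by at most `steps` two-fold dilutions.

    Hypothetical helper: one two-fold dilution step is one unit on a log2 scale.
    """
    if true_mic <= 0 or predicted_mic <= 0:
        raise ValueError("MIC values must be positive")
    difference = abs(math.log2(predicted_mic) - math.log2(true_mic))
    # Small epsilon guards against floating-point rounding at the boundary.
    return difference <= steps + 1e-9


def dilution_accuracy(true_mics, predicted_mics, steps=1):
    """Fraction of predictions within `steps` two-fold dilutions of the truth."""
    if len(true_mics) != len(predicted_mics):
        raise ValueError("Input lists must have the same length")
    if not true_mics:
        raise ValueError("At least one MIC pair is required")
    hits = sum(
        within_dilution_steps(t, p, steps)
        for t, p in zip(true_mics, predicted_mics)
    )
    return hits / len(true_mics)


# Made-up illustrative values, not data from the study.
true_mics = [0.25, 2.0, 8.0, 32.0]
predicted_mics = [0.5, 2.0, 32.0, 16.0]
accuracy = dilution_accuracy(true_mics, predicted_mics)
print(f"Accuracy within 1 two-fold dilution step: {accuracy:.2f}")
```

In this made-up example the third pair (8 versus 32) differs by two dilution steps, so three of the four predictions count as correct.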
- Authors:
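- Nguyen, Marcus; Long, S. Wesley; McDermott, Patrick F.; Olsen, Randall J.; Olson, Robert; Stevens, Rick L.; Tyson, Gregory H.; Zhao, Shaohua; Davis, James J.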
- Univ. of Chicago, Chicago, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States)
- Houston Medical Research Inst. and Houston Medical Hospital, Houston, TX (United States); Weill Cornell Medical College, New York, NY (United States)
- U.S. Food and Drug Administration, Laurel, MD (United States)
- Argonne National Lab. (ANL), Argonne, IL (United States); Univ. of Chicago, Chicago, IL (United States)
- Publication Date:
- Research Org.:
- Argonne National Lab. (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- National Institutes of Health (NIH), National Institute of Allergy and Infectious Diseases (NIAID); USDOE
- OSTI Identifier:
- 1494667
- Grant/Contract Number:
- AC02-06CH11357
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Clinical Microbiology
- Additional Journal Information:
- Journal Volume: 57; Journal Issue: 2; Journal ID: ISSN 0095-1137
- Publisher:
- American Society for Microbiology
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; antimicrobial susceptibility testing; deep learning; diagnostics; genome sequencing; machine learning
Citation Formats
Nguyen, Marcus, Long, S. Wesley, McDermott, Patrick F., Olsen, Randall J., Olson, Robert, Stevens, Rick L., Tyson, Gregory H., Zhao, Shaohua, and Davis, James J. Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella. United States: N. p., 2019.
Web. doi:10.1128/JCM.01260-18.
Nguyen, Marcus, Long, S. Wesley, McDermott, Patrick F., Olsen, Randall J., Olson, Robert, Stevens, Rick L., Tyson, Gregory H., Zhao, Shaohua, & Davis, James J. Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella. United States. https://doi.org/10.1128/JCM.01260-18
Nguyen, Marcus, Long, S. Wesley, McDermott, Patrick F., Olsen, Randall J., Olson, Robert, Stevens, Rick L., Tyson, Gregory H., Zhao, Shaohua, and Davis, James J. Wed .
"Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella". United States. https://doi.org/10.1128/JCM.01260-18. https://www.osti.gov/servlets/purl/1494667.
@article{osti_1494667,
title = {Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella},
author = {Nguyen, Marcus and Long, S. Wesley and McDermott, Patrick F. and Olsen, Randall J. and Olson, Robert and Stevens, Rick L. and Tyson, Gregory H. and Zhao, Shaohua and Davis, James J.},
abstractNote = {Nontyphoidal Salmonella species are the leading bacterial cause of food-borne disease in the United States. Whole genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts from public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, were used to generate XGBoost-based machine learning models for predicting minimum inhibitory concentrations (MICs) for 15 antibiotics. The MIC prediction models have an overall average accuracy of 95% within ± 1 two-fold dilution step (confidence interval of 95-95%), an average very major error rate of 2.7% (confidence interval of 2.4-3.0%) and an average major error rate of 0.1% (confidence interval of 0.1-0.2%). The model predicts MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Furthermore, our strategy for developing whole genome sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.},
doi = {10.1128/JCM.01260-18},
journal = {Journal of Clinical Microbiology},
number = 2,
volume = 57,
place = {United States},
year = {2019},
month = {1}
}