Random forest models accurately classify synthetic opioids using high-dimensionality mass spectrometry datasets
Abstract
Detection of novel threat agents presents several challenges, a principal one being the development of untargeted methods to screen an increasing number of threat chemicals whose exact structures are unknown. With the use of machine learning (ML) tools, we can guide the development of analytical methods for broad-spectrum detection of unbounded threat chemical families in complex mixtures. Toward this goal, we used nominal mass and high-resolution mass spectrometry data for hundreds of synthetic opioids and non-opioid compounds. We tested two ML techniques, logistic regression and random forest (RF), to develop models towards a practical, implementable method for opioid detection. Of the ML methods tested, random forest models gave the highest validation accuracy (95+%) for both nominal mass and high-resolution classification of opioids versus non-opioids, with low false positive and false negative rates. The RF models were then used to successfully predict the classification of 10 compounds (five opioids and five non-opioids) that were not part of the training and validation analysis. This application of ML is a critical step towards the development of field-deployable nominal mass spectrometers with ML-driven analyses for classification of emergent threats.
- Authors:
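- Arasteh, Kourosh; Magana-Zook, Steven; Ponce, Colin V.; Leif, Roald; Vu, Alex; Dreyer, Mark; Mayer, Brian P.; Williams, Audrey M.; Fisher, Carolyn L.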
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Publication Date:
- 2025-05-30
- Other Number(s):
- LLNL-DATA-2009736
- DOE Contract Number:
- AC52-07NA27344
- Research Org.:
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE National Nuclear Security Administration (NNSA)
- OSTI Identifier:
- 2589092
- DOI:
- https://doi.org/10.11579/2589092
Citation Formats
Arasteh, Kourosh, Magana-Zook, Steven, Ponce, Colin V., Leif, Roald, Vu, Alex, Dreyer, Mark, Mayer, Brian P., Williams, Audrey M., and Fisher, Carolyn L. Random forest models accurately classify synthetic opioids using high-dimensionality mass spectrometry datasets. United States: N. p., 2025. Web. doi:10.11579/2589092.
Arasteh, Kourosh, Magana-Zook, Steven, Ponce, Colin V., Leif, Roald, Vu, Alex, Dreyer, Mark, Mayer, Brian P., Williams, Audrey M., & Fisher, Carolyn L. Random forest models accurately classify synthetic opioids using high-dimensionality mass spectrometry datasets. United States. doi:https://doi.org/10.11579/2589092
Arasteh, Kourosh, Magana-Zook, Steven, Ponce, Colin V., Leif, Roald, Vu, Alex, Dreyer, Mark, Mayer, Brian P., Williams, Audrey M., and Fisher, Carolyn L. 2025. "Random forest models accurately classify synthetic opioids using high-dimensionality mass spectrometry datasets". United States. doi:https://doi.org/10.11579/2589092. https://www.osti.gov/servlets/purl/2589092. Pub date: Fri May 30 04:00:00 UTC 2025
@article{osti_2589092,
title = {Random forest models accurately classify synthetic opioids using high-dimensionality mass spectrometry datasets},
author = {Arasteh, Kourosh and Magana-Zook, Steven and Ponce, Colin V. and Leif, Roald and Vu, Alex and Dreyer, Mark and Mayer, Brian P. and Williams, Audrey M. and Fisher, Carolyn L.},
abstractNote = {Detection of novel threat agents presents several challenges, a principal one being the development of untargeted methods to screen an increasing number of threat chemicals whose exact structures are unknown. With the use of machine learning (ML) tools, we can guide the development of analytical methods for broad-spectrum detection of unbounded threat chemical families in complex mixtures. Toward this goal, we used nominal mass and high-resolution mass spectrometry data for hundreds of synthetic opioids and non-opioid compounds. We tested two ML techniques, logistic regression and random forest (RF), to develop models towards a practical, implementable method for opioid detection. Of the ML methods tested, random forest models gave the highest validation accuracy (95+%) for both nominal mass and high-resolution classification of opioids versus non-opioids, with low false positive and false negative rates. The RF models were then used to successfully predict the classification of 10 compounds (five opioids and five non-opioids) that were not part of the training and validation analysis. This application of ML is a critical step towards the development of field-deployable nominal mass spectrometers with ML-driven analyses for classification of emergent threats.},
doi = {10.11579/2589092},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2025},
month = {5}
}
