Machine Learning Correlation of Electron Micrographs and ToF-SIMS for the Analysis of Organic Biomarkers in Mudstone
Journal Article · Journal of the American Society for Mass Spectrometry
- Univ. of Illinois, Chicago, IL (United States)
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
The spatial distribution of organics in geological samples can be used to determine when and how these organics were incorporated into the host rock. Mass spectrometry (MS) imaging can rapidly collect a large amount of data, but the ions produced are mixed without discrimination, resulting in complex mass spectra that can be difficult to interpret. Here, we apply unsupervised and supervised machine learning (ML) to help interpret spectra from time-of-flight secondary ion mass spectrometry (ToF-SIMS) of an organic-carbon-rich mudstone from the Middle Jurassic of England (UK). It was previously shown that the presence of sterane molecular biomarkers in this sample can be detected via ToF-SIMS (Pasterski, M. J. et al., Astrobiology 2023, 23, 936). We use unsupervised ML on scanning electron microscopy–energy dispersive spectroscopy (SEM-EDS) measurements to define compositional categories based on differences in elemental abundances. We then test the ability of four ML algorithms, namely k-nearest neighbors (KNN), recursive partitioning and regression trees (RPART), eXtreme Gradient Boosting (XGBoost), and random forest (RF), to classify the ToF-SIMS spectra using (1) the categories assigned via SEM-EDS, (2) organic and inorganic labels assigned via SEM-EDS, and (3) the presence or absence of detectable steranes in ToF-SIMS spectra. In terms of predictive accuracy and balanced accuracy, KNN was the best-performing model and RPART the worst. The feature importance, that is, the specific features of the ToF-SIMS spectra used by the models to make classifications, cannot be determined for KNN, preventing post hoc model interpretation. Nevertheless, the feature importance extracted from the other models was useful for interpreting spectra. In conclusion, we determined that some of the organic ions used to classify biomarker-containing spectra may be fragment ions derived from kerogen, which is abundant in this mudstone sample.
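The abstract reports both predictive accuracy and balanced accuracy but does not define them. The sketch below assumes the common definitions (accuracy as the fraction of correct predictions, balanced accuracy as the mean of per-class recall) and uses hypothetical labels, not data from the study, to show why the two can diverge when one class, such as spectra with detectable steranes, is rare.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Hypothetical labels (assumption): 8 spectra without detectable
# steranes and 2 with. These are illustrative, not from the paper.
y_true = ["absent"] * 8 + ["present"] * 2
# A trivial model that always predicts the majority class.
y_pred = ["absent"] * 10

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, balanced accuracy={bal:.2f}")
```

Here the majority-class predictor scores 0.80 on plain accuracy but only 0.50 on balanced accuracy, which is one reason a study with uneven class sizes might report both.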
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- National Aeronautics and Space Administration (NASA); USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF)
- Grant/Contract Number:
- AC05-00OR22725
- OSTI ID:
- 2573130
- Journal Information:
- Journal Name: Journal of the American Society for Mass Spectrometry; Journal Volume: 36; Journal Issue: 1; ISSN 1879-1123; ISSN 1044-0305
- Publisher:
- American Society for Mass Spectrometry
- Country of Publication:
- United States
- Language:
- English
Similar Records
Indigenous Organic Molecular Biosignatures are Detectable via ToF-SIMS of a Kerogen-rich Jurassic Clay
Conference · Sun May 01 00:00:00 EDT 2022 · OSTI ID:1883691
The Determination of the Spatial Distribution of Indigenous Lipid Biomarkers in an Immature Jurassic Sediment Using Time-of-Flight–Secondary Ion Mass Spectrometry
Journal Article · Sun Jul 16 20:00:00 EDT 2023 · Astrobiology · OSTI ID:1996649
Multimodal Mass Spectrometry Imaging (MSI) of Archean and Jurassic Geologic Samples
Conference · Mon Nov 30 23:00:00 EST 2020 · OSTI ID:1817403