U.S. Department of Energy
Office of Scientific and Technical Information

Machine Learning Correlation of Electron Micrographs and ToF-SIMS for the Analysis of Organic Biomarkers in Mudstone

Journal Article · Journal of the American Society for Mass Spectrometry
The spatial distribution of organics in geological samples can be used to determine when and how these organics were incorporated into the host rock. Mass spectrometry (MS) imaging can rapidly collect a large amount of data, but the ions produced are mixed without discrimination, resulting in complex mass spectra that can be difficult to interpret. Here, we apply unsupervised and supervised machine learning (ML) to help interpret spectra from time-of-flight secondary ion mass spectrometry (ToF-SIMS) of an organic-carbon-rich mudstone of the Middle Jurassic of England (UK). It was previously shown that the presence of sterane molecular biomarkers in this sample can be detected via ToF-SIMS (Pasterski, M. J. et al., Astrobiology 2023, 23, 936). We use unsupervised ML on scanning electron microscopy–energy-dispersive spectroscopy (SEM-EDS) measurements to define compositional categories based on differences in elemental abundances. We then test the ability of four ML algorithms, namely k-nearest neighbors (KNN), recursive partitioning and regression trees (RPART), eXtreme Gradient Boosting (XGBoost), and random forest (RF), to classify the ToF-SIMS spectra using (1) the categories assigned via SEM-EDS, (2) organic and inorganic labels assigned via SEM-EDS, and (3) the presence or absence of detectable steranes in ToF-SIMS spectra. In terms of predictive accuracy and balanced accuracy, KNN was the best-performing model and RPART the worst. The feature importance, or the specific features of the ToF-SIMS spectra used by the models to make classifications, cannot be determined for KNN, preventing post hoc model interpretation. Nevertheless, the feature importance extracted from the other models was useful for interpreting spectra. In conclusion, we determined that some of the organic ions used to classify biomarker-containing spectra may be fragment ions derived from kerogen, which is abundant in this mudstone sample.
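The abstract compares classifiers by both accuracy and balanced accuracy. The following is a minimal illustrative sketch of why the two metrics can differ, using a small k-nearest-neighbors vote on made-up data. It is not the authors' pipeline: the intensity vectors, the three hypothetical m/z bins, the class sizes, and the choice of k = 3 are all assumptions chosen only to make the arithmetic easy to follow.

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so a rare class weighs as much as a common one."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical normalized intensities at three made-up m/z bins (not real data).
train_x = [
    (0.9, 0.1, 0.0), (0.8, 0.2, 0.1), (0.7, 0.1, 0.2),  # labeled "inorganic"
    (0.1, 0.8, 0.6), (0.2, 0.9, 0.7), (0.1, 0.7, 0.9),  # labeled "organic"
]
train_y = ["inorganic"] * 3 + ["organic"] * 3

# Imbalanced hypothetical test set: three inorganic spectra, two organic.
test_x = [
    (0.85, 0.15, 0.05), (0.75, 0.2, 0.1), (0.8, 0.1, 0.1),
    (0.6, 0.5, 0.4), (0.15, 0.85, 0.8),
]
test_y = ["inorganic", "inorganic", "inorganic", "organic", "organic"]

preds = [knn_predict(train_x, train_y, x, k=3) for x in test_x]
acc = accuracy(test_y, preds)
bal = balanced_accuracy(test_y, preds)
print(preds)
print(f"accuracy={acc:.2f} balanced_accuracy={bal:.2f}")
```

In this toy case one ambiguous "organic" spectrum is assigned to the majority class, so plain accuracy (4 of 5 correct) is higher than balanced accuracy (the mean of recall 1.0 and recall 0.5). The sketch also shows why a KNN model gives no direct feature importance: the prediction is a distance-based vote over stored spectra rather than a set of learned per-feature splits or weights.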
Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
National Aeronautics and Space Administration (NASA); USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF)
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
2573130
Journal Information:
Journal Name: Journal of the American Society for Mass Spectrometry; Journal Volume: 36; Journal Issue: 1; ISSN 1879-1123; ISSN 1044-0305
Publisher:
American Society for Mass Spectrometry
Country of Publication:
United States
Language:
English

References (48)

ToF-SIMS quantification of polystyrene spectra based on principal component analysis (PCA) journal October 1997
Unsupervised machine learning for exploratory data analysis in imaging mass spectrometry journal May 2020
On including nonlinearity in multivariate analysis of imaging SIMS data journal September 2014
Modern Applied Statistics with S book August 2002
Novel Surface Modification of Sulfur by Plasma Polymerization and its Application in Dissimilar Rubber-Rubber Blends journal July 2010
Mars 2020 Mission Overview journal December 2020
Multivariate Analysis of ToF-SIMS Data from Multicomponent Systems: The Why, When, and How journal December 2012
Flash pyrolysis of artificially matured kerogens from the Kimmeridge Clay, U.K. journal January 1988
Analysis, structure and geochemical significance of organically-bound sulphur in the geosphere: State of the art and future research journal January 1990
TOF-SIMS in cosmochemistry journal August 2001
A molecular and carbon isotope biogeochemical study of biomarkers and kerogen pyrolysates of the Kimmeridge Clay Facies: palaeoenvironmental implications journal December 1997
Discrimination between biologically relevant calcium phosphate phases by surface-analytical techniques journal August 2014
Multivariate analysis strategies for processing ToF-SIMS images of biomaterials journal May 2007
Analysis of kerogens and model compounds by time-of-flight secondary ion mass spectrometry (TOF-SIMS) journal February 2021
Millimeter-scale concentration gradients of hydrocarbons in Archean shales: Live-oil escape or fingerprint of contamination? journal June 2011
Analysis of single oil-bearing fluid inclusions in mid-Proterozoic sandstones (Roper Group, Australia) journal December 2013
Methods for full resolution data exploration and visualization for large 2D and 3D mass spectrometry imaging datasets journal April 2014
Elemental sulfur as a versatile low-mass-range calibration standard for laser desorption ionization mass spectrometry journal January 2010
Inclusive sharing of mass spectrometry imaging data requires a converter for all journal August 2012
Kerogen origin, evolution and structure journal May 2007
Occurrence and fate of fatty acyl biomarkers in an ancient whale bone (Oligocene, El Cien Formation, Mexico) journal March 2014
Geochemistry of a thermally immature Eagle Ford Group drill core in central Texas journal May 2019
Evaluation of Time-of-Flight Secondary Ion Mass Spectrometry Spectra of Peptides by Random Forest with Amino Acid Labels: Results from a Versailles Project on Advanced Materials and Standards Interlaboratory Study journal February 2021
Revealing Contamination and Sequence of Overlapping Fingerprints by Unsupervised Treatment of a Hyperspectral Secondary Ion Mass Spectrometry Dataset journal October 2021
Femtosecond Laser Desorption Postionization MS vs ToF-SIMS Imaging for Uncovering Biomarkers Buried in Geological Samples journal November 2021
Development of Machine-Learning Techniques for Time-of-Flight Secondary Ion Mass Spectrometry Spectral Analysis: Application for the Identification of Silane Coupling Agents in Multicomponent Films journal January 2022
Unsupervised Analysis of Big ToF-SIMS Data Sets: a Statistical Pattern Recognition Approach journal February 2018
A Machine Learning-Driven Comparison of Ion Images Obtained by MALDI and MALDI-2 Mass Spectrometry Imaging journal February 2024
Petroleum Formation and Occurrence journal September 1985
The Determination of the Spatial Distribution of Indigenous Lipid Biomarkers in an Immature Jurassic Sediment Using Time-of-Flight–Secondary Ion Mass Spectrometry journal September 2023
Applicability of ToF-SIMS for monitoring compositional changes in bone in a long-term animal model journal September 2013
Analysis of hopanes and steranes in single oil-bearing fluid inclusions using time-of-flight secondary ion mass spectrometry (ToF-SIMS) journal January 2010
TOF-SIMS analysis of polycyclic aromatic hydrocarbons in Allan Hills 84001 journal January 2003
TOF-SIMS analysis of cometary matter in Stardust aerogel tracks journal February 2008
The potential science and engineering value of samples delivered to Earth by Mars sample return: International MSR Objectives and Samples Team (iMOST) journal March 2019
Femtosecond laser desorption ionization mass spectrometry imaging and multivariate analysis of lipids in pancreatic tissue journal April 2018
Applications of multivariate analysis and unsupervised machine learning to ToF-SIMS images of organic, bioorganic, and biological systems journal March 2022
Organic synthesis on Mars by electrochemical reduction of CO2 journal October 2018
Aqueously altered igneous rocks sampled on the floor of Jezero crater, Mars journal September 2022
Isotopic biogeochemistry of the Oxford Clay Formation (Jurassic), UK journal January 1994
Petrographic analyses of organo-mineral relationships: depositional conditions of the Oxford Clay Formation (Jurassic), UK journal January 1994
A lithofacies study of the Peterborough Member, Oxford Clay Formation (Jurassic), UK: an example of sediment bypass in a mudstone succession journal January 1994
Recognition of tectonic events in undeformed regions: contrasting results from the Midland Platform and East Midlands Shelf, Central England journal January 2001
An empirical comparison of supervised learning algorithms conference January 2006
XGBoost: A Scalable Tree Boosting System conference January 2016
Using Time-of-Flight Secondary Ion Mass Spectrometry to Study Biomarkers journal May 2011
FactoMineR: An R Package for Multivariate Analysis journal January 2008
A Mineralogical Context for the Organic Matter in the Paris Meteorite Determined by A Multi-Technique Analysis journal May 2019