U.S. Department of Energy
Office of Scientific and Technical Information

Inferring the shape of data: a probabilistic framework for analysing experiments in the natural sciences

Journal Article · Proceedings of the Royal Society. A. Mathematical, Physical and Engineering Sciences

A critical step in data analysis for many different types of experiments is the identification of features with theoretically defined shapes in N-dimensional datasets; examples of this process include finding peaks in multi-dimensional molecular spectra or emitters in fluorescence microscopy images. Identifying such features involves determining whether the overall shape of the data is consistent with an expected shape; however, it is generally unclear how to make this determination quantitatively. In practice, many analysis methods employ subjective, heuristic approaches, which complicates the validation of any ensuing results, especially as the amount and dimensionality of the data increase. Here, we present a probabilistic solution to this problem by using Bayes’ rule to calculate the probability that the data have any one of several potential shapes. This probabilistic approach may be used to objectively compare how well different theories describe a dataset, identify changes between datasets and detect features within data using a corollary method called Bayesian Inference-based Template Search; several proof-of-principle examples are provided. Altogether, this mathematical framework serves as an automated ‘engine’ capable of computationally executing analysis decisions currently made by visual inspection across the sciences.
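The core idea of using Bayes' rule to weigh several candidate shapes can be illustrated with a minimal sketch. This is not the article's method: it assumes independent Gaussian noise with a known standard deviation, fixed templates with no unknown scale or offset, and equal prior probabilities unless others are supplied. The function names (`log_likelihood`, `shape_posteriors`) and the toy templates are hypothetical and chosen only for illustration.

```python
import math


def log_likelihood(data, template, sigma):
    """Log-likelihood of the data given a template shape.

    Assumption (not from the article): i.i.d. Gaussian noise with known
    standard deviation `sigma`, and no unknown scale or offset.
    """
    n = len(data)
    sq_resid = sum((d - t) ** 2 for d, t in zip(data, template))
    return -0.5 * sq_resid / sigma**2 - n * math.log(sigma * math.sqrt(2 * math.pi))


def shape_posteriors(data, templates, sigma, priors=None):
    """Posterior probability of each candidate shape via Bayes' rule."""
    names = list(templates)
    if priors is None:
        # Equal prior probability for every candidate shape.
        priors = {name: 1.0 / len(names) for name in names}
    log_joint = {
        name: math.log(priors[name]) + log_likelihood(data, templates[name], sigma)
        for name in names
    }
    # Log-sum-exp keeps the normalisation numerically stable.
    m = max(log_joint.values())
    log_evidence = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    return {name: math.exp(v - log_evidence) for name, v in log_joint.items()}


# Toy 1-D example: three hypothetical candidate shapes on an 11-point grid.
x = [i - 5 for i in range(11)]
templates = {
    "gaussian_peak": [math.exp(-0.5 * xi**2) for xi in x],
    "flat": [0.0 for _ in x],
    "ramp": [(xi + 5) / 10 for xi in x],
}

# Fixed, hand-picked perturbations stand in for measurement noise.
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.0, -0.02, 0.03, -0.01, 0.04, -0.05]
data = [p + e for p, e in zip(templates["gaussian_peak"], noise)]

posteriors = shape_posteriors(data, templates, sigma=0.05)
best_shape = max(posteriors, key=posteriors.get)
```

In this toy case the data were built from the peak template, so `best_shape` is `"gaussian_peak"`. A fuller treatment would also have to account for unknown scale, offset and noise level, which this sketch deliberately leaves out.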

Research Organization:
Oak Ridge Associated Universities (ORAU), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-06OR23100
OSTI ID:
2424091
Journal Information:
Proceedings of the Royal Society. A. Mathematical, Physical and Engineering Sciences, Vol. 478, Issue 2266; ISSN 1364-5021
Publisher:
The Royal Society Publishing
Country of Publication:
United States
Language:
English
