OSTI.GOV U.S. Department of Energy
Office of Scientific and Technical Information

Title: SpectraMiner, an Interactive Data Mining and Visualization Software for Single Particle Mass Spectroscopy: A Laboratory Test Case

Journal Article · International Journal of Mass Spectrometry, 258(1-3):58-73

Single particle mass spectrometers are sophisticated instruments designed to measure the sizes and compositions of a wide range of individual particles in situ and in real time. They characterize hundreds of thousands or millions of particles, generating vast amounts of rich and complex data, the proper mining of which requires dedicated state-of-the-art tools. The analysis of individual particle mass spectra is particularly difficult because of their high dimensionality: each data point, representing a single particle, includes 450 mass spectral peak intensities, the particle size, and the time of detection. The first step is to organize the data, a process typically accomplished by grouping particles with similar attributes. Because the common assumption is that the data must be reduced to become manageable, particles are typically classified into a small number of clusters (~10) and represented by their average or representative spectra. Our approach is quite different. We have developed a data mining and visualization software package, called SpectraMiner, that makes it possible to handle hundreds of clusters without loss of information and thus overcome the limits set by traditional statistical data analysis approaches. The data, which often include over 1 million particle spectra, are organized using the K-means clustering algorithm. The clusters are then merged into nodes by sequentially combining similar clusters. The final structure is displayed as a hierarchical dynamical tree, or circular dendrogram. This interactive dendrogram is the visual interface that allows for real-time data mining and exploration. Clicking on any of the clusters or nodes in the dendrogram reveals the mass spectral and other detailed information about the particles that reside at that position. At each step the scientist controls the level of detail and the visualization format, and can switch rapidly between them while running the program on an office PC.
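
The two-stage organization described in the abstract (K-means clustering, followed by sequential merging of similar clusters into the nodes of a tree) can be illustrated with a minimal sketch. This is not SpectraMiner's implementation: the abstract does not state the distance metric, the merging rule, or the number of clusters, so the Euclidean distance, the size-weighted centroid merge, the value of `k`, and the 2-D toy points standing in for 450-dimensional spectra are all assumptions made for illustration.

```python
import random
from math import dist


def nearest(p, centroids):
    """Index of the centroid closest to point p (Euclidean distance, assumed)."""
    return min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style K-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def merge_clusters(centroids, sizes):
    """Repeatedly merge the two closest nodes until one root remains.

    Returns a list of (child_a, child_b, new_node_id, distance) tuples,
    which is the information a dendrogram is drawn from.
    """
    # Empty clusters are skipped; leaves keep their K-means cluster index as id.
    nodes = {i: (c, n) for i, (c, n) in enumerate(zip(centroids, sizes)) if n > 0}
    next_id = len(centroids)
    merges = []
    while len(nodes) > 1:
        ids = sorted(nodes)
        a, b = min(
            ((i, j) for n, i in enumerate(ids) for j in ids[n + 1:]),
            key=lambda pair: dist(nodes[pair[0]][0], nodes[pair[1]][0]),
        )
        (ca, na), (cb, nb) = nodes.pop(a), nodes.pop(b)
        # New node centroid is the size-weighted mean of its two children.
        merged = tuple((x * na + y * nb) / (na + nb) for x, y in zip(ca, cb))
        nodes[next_id] = (merged, na + nb)
        merges.append((a, b, next_id, dist(ca, cb)))
        next_id += 1
    return merges


# Toy data: three blobs of 2-D points (hypothetical stand-in for particle spectra).
rng = random.Random(1)
blob_centers = [(0.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
points = [(rng.gauss(cx, 0.2), rng.gauss(cy, 0.2))
          for cx, cy in blob_centers for _ in range(30)]

k = 6  # deliberately more clusters than blobs; the merge step regroups them
centroids, labels = kmeans(points, k)
sizes = [labels.count(c) for c in range(k)]
merges = merge_clusters(centroids, sizes)

for a, b, new_id, d in merges:
    print(f"node {new_id} = {a} + {b} (distance {d:.2f})")
```

Choosing more clusters than there are obvious groups, then letting the merge step relate them, mirrors the idea of keeping many fine-grained clusters and navigating them through a tree rather than reducing the data to a handful of averages up front.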

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Environmental Molecular Sciences Lab. (EMSL)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
896695
Report Number(s):
PNNL-SA-49822; 3644; KC0302020; TRN: US200703%%731
Journal Information:
International Journal of Mass Spectrometry, 258(1-3):58-73
Country of Publication:
United States
Language:
English