Anomaly detection in scientific data using joint statistical moments
Abstract
We propose an anomaly detection method for multi-variate scientific data based on analysis of high-order joint moments. Using kurtosis as a reliable measure of outliers, we suggest that principal kurtosis vectors, by analogy to principal component analysis (PCA) vectors, signify the principal directions along which outliers appear. The inception of an anomaly, then, manifests as a change in the principal values and vectors of kurtosis. Obtaining the principal kurtosis vectors requires decomposing a fourth order joint cumulant tensor, for which we use a simple, computationally less expensive approach that involves performing a singular value decomposition (SVD) over the matricized tensor. We demonstrate the efficacy of this approach on synthetic data, and develop an algorithm to identify the occurrence of a spatial and/or temporal anomalous event in scientific phenomena. The algorithm decomposes the data into several spatial sub-domains and time steps to identify regions with such events. Feature moment metrics, based on the alignments of the principal kurtosis vectors, are computed at each sub-domain and time step for all features to quantify their relative importance towards the overall kurtosis in the data. Accordingly, spatial and temporal anomaly metrics for each sub-domain are proposed using the Hellinger distance of the feature moment metric distribution from a suitable nominal distribution. Finally, we apply the algorithm to two turbulent auto-ignition combustion cases and demonstrate that the anomaly metrics reliably capture the occurrence of auto-ignition in relevant spatial sub-domains at the right time steps.
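The pipeline outlined in the abstract (fourth-order joint cumulant tensor, SVD of the matricized tensor, feature moment metrics, Hellinger distance) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names are hypothetical, and the feature moment metric is an assumed form (squared components of the principal kurtosis vectors, weighted by normalized singular values), since this record does not give the exact definition.

```python
import numpy as np


def cokurtosis_tensor(X):
    """Fourth-order joint cumulant tensor of X, shape (n_samples, n_features)."""
    Xc = X - X.mean(axis=0)  # center each feature
    n = Xc.shape[0]
    # Fourth-order joint central moment E[x_i x_j x_k x_l].
    M4 = np.einsum("ni,nj,nk,nl->ijkl", Xc, Xc, Xc, Xc) / n
    C = Xc.T @ Xc / n  # covariance matrix
    # Remove the Gaussian contribution so the cumulant is ~0 for normal data.
    return (M4
            - np.einsum("ij,kl->ijkl", C, C)
            - np.einsum("ik,jl->ijkl", C, C)
            - np.einsum("il,jk->ijkl", C, C))


def principal_kurtosis(X):
    """SVD of the matricized (d x d^3) cumulant tensor.

    Returns the singular values and the left singular vectors; the columns
    of U play the role of principal kurtosis vectors.
    """
    d = X.shape[1]
    T = cokurtosis_tensor(X).reshape(d, d ** 3)
    U, s, _ = np.linalg.svd(T, full_matrices=False)
    return s, U


def feature_moment_metric(s, U):
    """Assumed form: per-feature share of the overall kurtosis.

    Columns of U have unit norm, so the returned vector sums to 1 and can
    be treated as a discrete distribution over features.
    """
    w = s / s.sum()
    return (U ** 2) @ w


def hellinger(p, q):
    """Hellinger distance between two discrete distributions p and q."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))


# Synthetic illustration: Gaussian data, then outliers injected in feature 2.
rng = np.random.default_rng(0)
nominal = rng.standard_normal((5000, 3))
anomalous = nominal.copy()
anomalous[:20, 2] += 10.0

fmm_nominal = feature_moment_metric(*principal_kurtosis(nominal))
fmm_anomalous = feature_moment_metric(*principal_kurtosis(anomalous))
anomaly_metric = hellinger(fmm_nominal, fmm_anomalous)
```

In this sketch the metric for the anomalous sample concentrates on the feature carrying the outliers, and its Hellinger distance from the nominal metric serves as the anomaly score. In the setting the abstract describes, the same computation would be repeated per spatial sub-domain and time step.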
- Authors:
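- Konduri, Aditya; Kolla, Hemanth; Kegelmeyer, W. Philip; Shead, Timothy M.; Ling, Julia; Davis, Warren L.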
- Sandia National Lab. (SNL-CA), Livermore, CA (United States)
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Citrine Informatics, Redwood City, CA (United States)
- Publication Date:
- Wed Mar 13 2019
- Research Org.:
- Sandia National Lab. (SNL-CA), Livermore, CA (United States); Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE National Nuclear Security Administration (NNSA)
- OSTI Identifier:
- 1502973
- Alternate Identifier(s):
- OSTI ID: 1502456; OSTI ID: 1636004
- Report Number(s):
- SAND-2019-2948J; SAND-2018-8923J; Journal ID: ISSN 0021-9991; 673503
- Grant/Contract Number:
- NA0003525; AC04-94AL85000; FWP16-019471
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Computational Physics
- Additional Journal Information:
- Journal Volume: 387; Journal ID: ISSN 0021-9991
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; anomaly detection; scientific computing; co-kurtosis; tensor decomposition; Hellinger distance; auto-ignition
Citation Formats
Konduri, Aditya, Kolla, Hemanth, Kegelmeyer, W. Philip, Shead, Timothy M., Ling, Julia, and Davis, Warren L. Anomaly detection in scientific data using joint statistical moments. United States: N. p., 2019. Web. doi:10.1016/j.jcp.2019.03.003.
Konduri, Aditya, Kolla, Hemanth, Kegelmeyer, W. Philip, Shead, Timothy M., Ling, Julia, & Davis, Warren L. Anomaly detection in scientific data using joint statistical moments. United States. https://doi.org/10.1016/j.jcp.2019.03.003
Konduri, Aditya, Kolla, Hemanth, Kegelmeyer, W. Philip, Shead, Timothy M., Ling, Julia, and Davis, Warren L. 2019. "Anomaly detection in scientific data using joint statistical moments". United States. https://doi.org/10.1016/j.jcp.2019.03.003. https://www.osti.gov/servlets/purl/1502973.
@article{osti_1502973,
title = {Anomaly detection in scientific data using joint statistical moments},
author = {Konduri, Aditya and Kolla, Hemanth and Kegelmeyer, W. Philip and Shead, Timothy M. and Ling, Julia and Davis, Warren L.},
abstractNote = {We propose an anomaly detection method for multi-variate scientific data based on analysis of high-order joint moments. Using kurtosis as a reliable measure of outliers, we suggest that principal kurtosis vectors, by analogy to principal component analysis (PCA) vectors, signify the principal directions along which outliers appear. The inception of an anomaly, then, manifests as a change in the principal values and vectors of kurtosis. Obtaining the principal kurtosis vectors requires decomposing a fourth order joint cumulant tensor for which we use a simple, computationally less expensive approach that involves performing a singular value decomposition (SVD) over the matricized tensor. We demonstrate the efficacy of this approach on synthetic data, and develop an algorithm to identify the occurrence of a spatial and/or temporal anomalous event in scientific phenomena. The algorithm decomposes the data into several spatial sub-domains and time steps to identify regions with such events. Feature moment metrics, based on the alignments of the principal kurtosis vectors, are computed at each sub-domain and time step for all features to quantify their relative importance towards the overall kurtosis in the data. Accordingly, spatial and temporal anomaly metrics for each sub-domain are proposed using the Hellinger distance of the feature moment metric distribution from a suitable nominal distribution. Finally, we apply the algorithm to two turbulent auto-ignition combustion cases and demonstrate that the anomaly metrics reliably capture the occurrence of auto-ignition in relevant spatial sub-domains at the right time steps.},
doi = {10.1016/j.jcp.2019.03.003},
journal = {Journal of Computational Physics},
volume = 387,
place = {United States},
year = {2019},
month = {3}
}