A dictionary learning algorithm for compression and reconstruction of streaming data in preset order
Abstract
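Interest in developing and applying dictionary learning (DL) to process massive datasets has grown over the last decade. Many of these efforts, however, focus on employing DL to compress data and extract a set of important features from it, while treating the restoration of the original data from this set as a secondary goal. On the other hand, although several methods are able to process streaming data by updating the dictionary incrementally as new snapshots pass by, most of those algorithms are designed for the setting where the snapshots are randomly drawn from a probability distribution. In this paper, we present a new DL approach to compress and denoise massive datasets in real time, in which the data are streamed through in a preset order (examples are videos and temporal experimental data), so at any time we can observe only a biased sample of the whole data. Our approach incrementally builds up the dictionary in a relatively simple manner: if the new snapshot is adequately explained by the current dictionary, we perform sparse coding to find its sparse representation; otherwise, we add the new snapshot to the dictionary, with a Gram-Schmidt process to maintain orthogonality. To compress and denoise noisy datasets, we apply the denoising to the snapshot directly before sparse coding, which deviates from the traditional dictionary learning approach that achieves denoising via sparse coding. Compared to full-batch matrix decomposition methods, where the whole dataset is kept in memory, and other mini-batch approaches, where unbiased sampling is often assumed, our approach has minimal requirements for data sampling and storage: (i) each snapshot is seen only once and then discarded, and (ii) the snapshots are drawn in a preset order, so they can be highly biased. Through experiments on climate simulations and scanning transmission electron microscopy (STEM) data, we demonstrate that the proposed approach performs competitively with those methods in data reconstruction and denoising.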
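The update rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`stream_dictionary`, `reconstruct`), the relative-residual test with threshold `tol`, the optional `denoise` callable, and the use of simple coefficient truncation (`keep`) as the sparse-coding step are all assumptions made for illustration. The sketch only shows the general pattern: each snapshot is seen once, projected onto the current orthonormal dictionary, and either coded or appended via a Gram-Schmidt step.

```python
import numpy as np

def stream_dictionary(snapshots, tol=1e-2, keep=None, denoise=None):
    """Hypothetical sketch: grow an orthonormal dictionary from snapshots
    that are seen once, in their given order."""
    snapshots = [np.asarray(x, dtype=float) for x in snapshots]
    n = snapshots[0].size
    atoms = []   # orthonormal dictionary columns found so far
    codes = []   # one coefficient vector per snapshot
    for x in snapshots:
        if denoise is not None:
            x = denoise(x)  # assumed: denoise before coding, as the abstract describes
        D = np.column_stack(atoms) if atoms else np.zeros((n, 0))
        c = D.T @ x          # coefficients in the current orthonormal dictionary
        r = x - D @ c        # part of the snapshot the dictionary cannot explain
        r_norm = np.linalg.norm(r)
        if r_norm > tol * np.linalg.norm(x):
            # Not adequately explained: Gram-Schmidt step, the normalized
            # residual becomes a new atom and the code gains one entry.
            atoms.append(r / r_norm)
            c = np.append(c, r_norm)
        elif keep is not None and c.size > keep:
            # Assumed sparse-coding stand-in: zero all but the `keep`
            # largest-magnitude coefficients.
            smallest = np.argsort(np.abs(c))[: c.size - keep]
            c[smallest] = 0.0
        codes.append(c)
    D = np.column_stack(atoms) if atoms else np.zeros((n, 0))
    return D, codes

def reconstruct(D, code):
    """Rebuild a snapshot from its code; a code only uses the atoms that
    existed when the snapshot was processed."""
    return D[:, : code.size] @ code
```

Under these assumptions, a stream drawn from a low-dimensional subspace yields a dictionary with as many atoms as the subspace dimension, and every snapshot can be rebuilt with `reconstruct(D, code)` from the stored codes alone.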
- Authors:
- Archibald, Richard; Tran, Hoang
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Publication Date:
- 2022-04-01
- Research Org.:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR). Scientific Discovery through Advanced Computing (SciDAC); USDOE Laboratory Directed Research and Development (LDRD) Program
- OSTI Identifier:
- 1883981
- Grant/Contract Number:
- AC05-00OR22725; AC02-05CH11231
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Discrete and Continuous Dynamical Systems - Series S
- Additional Journal Information:
- Journal Volume: 15; Journal Issue: 4; Journal ID: ISSN 1937-1632
- Publisher:
- American Institute of Mathematical Sciences (AIMS)
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; dictionary learning; matrix factorization; online algorithm
Citation Formats
Archibald, Richard, and Tran, Hoang. A dictionary learning algorithm for compression and reconstruction of streaming data in preset order. United States: N. p., 2022. Web. doi:10.3934/dcdss.2021102.
Archibald, Richard, & Tran, Hoang. A dictionary learning algorithm for compression and reconstruction of streaming data in preset order. United States. https://doi.org/10.3934/dcdss.2021102
Archibald, Richard, and Tran, Hoang. 2022. "A dictionary learning algorithm for compression and reconstruction of streaming data in preset order". United States. https://doi.org/10.3934/dcdss.2021102. https://www.osti.gov/servlets/purl/1883981.
@article{osti_1883981,
title = {A dictionary learning algorithm for compression and reconstruction of streaming data in preset order},
author = {Archibald, Richard and Tran, Hoang},
abstractNote = {Interest in developing and applying dictionary learning (DL) to process massive datasets has grown over the last decade. Many of these efforts, however, focus on employing DL to compress data and extract a set of important features from it, while treating the restoration of the original data from this set as a secondary goal. On the other hand, although several methods are able to process streaming data by updating the dictionary incrementally as new snapshots pass by, most of those algorithms are designed for the setting where the snapshots are randomly drawn from a probability distribution. In this paper, we present a new DL approach to compress and denoise massive datasets in real time, in which the data are streamed through in a preset order (examples are videos and temporal experimental data), so at any time we can observe only a biased sample of the whole data. Our approach incrementally builds up the dictionary in a relatively simple manner: if the new snapshot is adequately explained by the current dictionary, we perform sparse coding to find its sparse representation; otherwise, we add the new snapshot to the dictionary, with a Gram-Schmidt process to maintain orthogonality. To compress and denoise noisy datasets, we apply the denoising to the snapshot directly before sparse coding, which deviates from the traditional dictionary learning approach that achieves denoising via sparse coding. Compared to full-batch matrix decomposition methods, where the whole dataset is kept in memory, and other mini-batch approaches, where unbiased sampling is often assumed, our approach has minimal requirements for data sampling and storage: (i) each snapshot is seen only once and then discarded, and (ii) the snapshots are drawn in a preset order, so they can be highly biased. Through experiments on climate simulations and scanning transmission electron microscopy (STEM) data, we demonstrate that the proposed approach performs competitively with those methods in data reconstruction and denoising.},
doi = {10.3934/dcdss.2021102},
journal = {Discrete and Continuous Dynamical Systems - Series S},
number = 4,
volume = 15,
place = {United States},
year = {2022},
month = {4}
}