DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A General Framework for Progressive Data Compression and Retrieval

Journal Article · IEEE Transactions on Visualization and Computer Graphics

In scientific simulations, observations, and experiments, the transfer of data to and from disk and across networks has become a major bottleneck for data analysis and visualization. Compression techniques have been employed to tackle this challenge, but traditional lossy methods often demand conservative error tolerances to meet the numerical accuracy requirements of both anticipated and unknown data analysis tasks. Progressive data compression and retrieval has emerged as a promising solution, where each analysis task dictates its own accuracy needs. However, few analysis algorithms inherently support progressive data processing, and adapting compression techniques, file formats, client/server frameworks, and APIs to support progressivity can be challenging. This paper presents a framework that enables progressive-precision data queries for any data compressor or numerical representation. Our strategy hinges on a multi-component representation that successively reduces the error between the original and compressed field, allowing each field in the progressive sequence to be expressed as a partial sum of components. We have implemented this approach with four established scientific data compressors and assessed its effectiveness using real-world data sets from the SDRBench collection. The results show that our framework competes in accuracy with the standalone compressors it is based upon. Additionally, (de)compression time is proportional to the number of components requested by the user. Finally, our framework allows for fully lossless compression using lossy compressors when a sufficient number of components are employed.
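
The partial-sum idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the "lossy compressor" here is an assumed stand-in that simply rounds each 64-bit value to 32-bit precision, and the names `lossy_roundtrip`, `encode_components`, and `reconstruct` are hypothetical. Each component approximates the residual left by the components before it, so summing the first k components gives the k-th field in the progressive sequence.

```python
import math
import struct


def lossy_roundtrip(values):
    """Stand-in lossy compressor (an assumption for illustration):
    round each float64 to float32 and convert back."""
    return [struct.unpack("f", struct.pack("f", v))[0] for v in values]


def encode_components(field, max_components):
    """Split `field` into components. Each component is the lossy
    approximation of the residual left by all previous components."""
    components = []
    residual = list(field)
    for _ in range(max_components):
        component = lossy_roundtrip(residual)
        components.append(component)
        # Update the residual: what the components so far still miss.
        residual = [r - c for r, c in zip(residual, component)]
        if not any(residual):  # nothing left to encode
            break
    return components


def reconstruct(components, k):
    """Return the partial sum of the first k components."""
    # math.fsum avoids extra rounding error when adding the parts.
    return [math.fsum(parts) for parts in zip(*components[:k])]


def max_error(field, approx):
    """Largest absolute pointwise difference between two fields."""
    return max(abs(a - b) for a, b in zip(field, approx))


# Hypothetical 1D test field.
field = [100.0 * math.sin(0.1 * i) for i in range(64)]
components = encode_components(field, max_components=8)
errors = [
    max_error(field, reconstruct(components, k))
    for k in range(1, len(components) + 1)
]

for k, err in enumerate(errors, start=1):
    print(f"components requested: {k}, max abs error: {err:.3e}")
```

Requesting more components costs more work in `reconstruct` and lowers the error. In this toy setting the residual eventually becomes exactly zero, which mirrors the abstract's point that enough components from a lossy compressor can yield a lossless result.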

Research Organization:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
AC52-07NA27344
OSTI ID:
2204923
Report Number(s):
LLNL--JRNL-852372; 1079555
Journal Information:
Journal Name: IEEE Transactions on Visualization and Computer Graphics; Journal Volume: 30; Journal Issue: 1; ISSN 1077-2626
Publisher:
IEEE
Country of Publication:
United States
Language:
English
