DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: OptZConfig: Efficient Parallel Optimization of Lossy Compression Configuration

Journal Article · IEEE Transactions on Parallel and Distributed Systems

Lossless compressors achieve compression ratios too low to meet the needs of today's large-scale scientific applications, which produce vast volumes of data. Error-bounded lossy compression (EBLC) is considered a critical technique for the success of scientific research. Although EBLC lets users set an error bound for compression, users have been unable to specify requirements on compression quality, which limits practical use. Our contributions are: (1) We formulate the problem of configuring EBLC to preserve a user-defined metric as an optimization problem. This allows many classes of new metrics to be preserved, improving on current practice. (2) We present a framework, OptZConfig, that can adapt to improvements in the search algorithm, compressor, and metrics with minimal changes, enabling future advancements in this area. (3) We demonstrate the advantages of our approach against the leading methods for configuring compressors to preserve specific metrics. Our approach improves compression ratios against a specialized compressor by up to 3x and achieves a 56x speedup over FRaZ, a 1000x speedup over MGARD-QOI post tuning, and a 110x speedup over systematic approaches which had not been bounded by compressors before.
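As an illustration of contribution (1), the following is a minimal sketch of treating compressor configuration as a search problem: find the largest error bound whose output still satisfies a user-defined quality metric. It is not the OptZConfig implementation. The names `toy_compress`, `psnr`, and `largest_bound_meeting` are hypothetical, the uniform quantizer only stands in for a real error-bounded compressor, and plain bisection stands in for whatever search algorithm the framework actually uses. It also assumes the metric is roughly non-increasing in the error bound.

```python
import math


def toy_compress(data, error_bound):
    """Hypothetical stand-in for an EBLC round trip.

    Uniform quantization with bin width 2 * error_bound, so each
    reconstructed value differs from the input by at most error_bound.
    """
    step = 2.0 * error_bound
    return [round(x / step) * step for x in data]


def psnr(original, decompressed):
    """Example user-defined metric: peak signal-to-noise ratio in dB."""
    value_range = max(original) - min(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, decompressed)) / len(original)
    if mse == 0.0:
        return math.inf
    return 20.0 * math.log10(value_range) - 10.0 * math.log10(mse)


def largest_bound_meeting(data, metric, threshold, lo=1e-9, hi=1.0, iters=40):
    """Bisect for the largest error bound with metric >= threshold.

    Assumes `lo` satisfies the threshold, `hi` does not, and the metric
    is approximately non-increasing in the error bound.
    """
    for _ in range(iters):
        # Geometric midpoint, because error bounds span orders of magnitude.
        mid = math.sqrt(lo * hi)
        if metric(data, toy_compress(data, mid)) >= threshold:
            lo = mid  # still acceptable: try a looser bound
        else:
            hi = mid  # quality too low: tighten the bound
    return lo


# Synthetic 1-D field used only for demonstration.
data = [math.sin(0.01 * i) for i in range(2000)]
best = largest_bound_meeting(data, psnr, threshold=60.0)
```

A looser error bound generally yields a higher compression ratio, so maximizing the bound subject to the metric constraint is one simple way to state the objective. Swapping `psnr` for another callable changes the preserved metric without touching the search loop, which mirrors the separation of search algorithm, compressor, and metric described in contribution (2).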

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
National Science Foundation (NSF); USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
2377211
Journal Information:
IEEE Transactions on Parallel and Distributed Systems, Vol. 33, Issue 12; ISSN 1045-9219
Publisher:
IEEE
Country of Publication:
United States
Language:
English
