Optimization of Error-Bounded Lossy Compression for Hard-to-Compress HPC Data
- Argonne National Lab. (ANL), Argonne, IL (United States). Mathematics and Computer Science (MCS) Division
Since today’s scientific applications are producing vast amounts of data, compressing them before storage/transmission is critical. Results of existing compressors show two types of HPC data sets: highly compressible and hard to compress. In this work, we carefully design and optimize error-bounded lossy compression for hard-to-compress scientific data. We propose an optimized algorithm that adaptively partitions the HPC data into best-fit consecutive segments, each having mutually close data values, such that the compression condition can be optimized. Another significant contribution is the optimization of the shifting offset such that the XOR-leading-zero length between two consecutive unpredictable data points can be maximized. Finally, we devise an adaptive method to select the best-fit compressor at runtime for maximizing the compression factor. We evaluate our solution using 13 benchmarks based on real-world scientific problems, and we compare it with 9 other state-of-the-art compressors. Experiments show that our compressor always keeps the compression errors within the user-specified error bounds. Most importantly, our optimization improves the compression factor effectively, by up to 49% for hard-to-compress data sets, with similar compression/decompression time cost.
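To make the XOR-leading-zero idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It counts how many leading bits two single-precision values share and shows that adding a common offset to both values can change that count. The function names (`float32_bits`, `xor_leading_zeros`) and the hand-picked offset of 1.0 are illustrative assumptions; the abstract does not say how the paper selects the offset.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def xor_leading_zeros(a: float, b: float) -> int:
    """Count the leading zero bits of the XOR of two float32 bit patterns.

    A larger count means the two values share a longer common prefix,
    so fewer bits are needed to encode the second value given the first.
    """
    diff = float32_bits(a) ^ float32_bits(b)
    return 32 - diff.bit_length()


# Two close values on opposite sides of a power of two have different
# exponent fields, so their bit patterns diverge almost immediately.
prev, curr = 1.99, 2.01
plain = xor_leading_zeros(prev, curr)

# Hypothetical offset chosen by hand for illustration: after shifting,
# both values fall in [2, 4) and share the sign and exponent bits.
offset = 1.0
shifted = xor_leading_zeros(prev + offset, curr + offset)

print(plain, shifted)
```

In this sketch the shifted pair shares a longer bit prefix than the original pair, which is the effect the abstract aims to maximize. A real compressor would also need to record the offset so that decompression can undo the shift.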
- Research Organization:
- Argonne National Lab. (ANL), Argonne, IL (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- AC02-06CH11357
- OSTI ID:
- 1417025
- Journal Information:
- IEEE Transactions on Parallel and Distributed Systems, Vol. 29, Issue 1; ISSN 1045-9219
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English
Similar Records
Performance Optimization for Relative-Error-Bounded Lossy Compression on Scientific Data
waveSZ: A Hardware-Algorithm Co-Design of Efficient Lossy Compression for Scientific Data