waveSZ: A Hardware-Algorithm Co-Design of Efficient Lossy Compression for Scientific Data
Error-bounded lossy compression is critical to the success of extreme-scale scientific research because of the ever-increasing volumes of data produced by today's high-performance computing (HPC) applications. Error-controlled lossy compressors not only significantly reduce the I/O and storage burden but also retain high data fidelity for post-analysis. Existing state-of-the-art lossy compressors, however, generally suffer from relatively low compression and decompression throughput (up to hundreds of megabytes per second on a single CPU core), which considerably restricts the adoption of lossy compression by many HPC applications, especially those with a fairly high data production rate. In this paper, we propose a highly efficient lossy compression approach based on field-programmable gate arrays (FPGAs) under the state-of-the-art lossy compression model SZ. Our contributions are fourfold. (1) We adopt a wavefront memory layout to alleviate the data dependency during prediction for higher-dimensional predictors, such as the Lorenzo predictor. (2) We propose a co-design framework named waveSZ, based on the wavefront memory layout and the characteristics of the SZ algorithm, and carefully implement it by using high-level synthesis. (3) We propose a hardware-algorithm co-optimization method to improve the performance. (4) We evaluate waveSZ on three real-world HPC simulation datasets from the Scientific Data Reduction Benchmarks and compare it with other state-of-the-art methods on both CPUs and FPGAs. Experiments show that waveSZ improves SZ's compression throughput by 6.9x to 8.7x over the production version running on a state-of-the-art CPU, and improves the compression ratio and throughput by 2.1x and 5.8x on average, respectively, compared with the state-of-the-art FPGA design.
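The wavefront idea in contribution (1) can be illustrated with a small software sketch. This is not the paper's FPGA implementation; it is a minimal Python model, assuming a first-order 2D Lorenzo predictor and simple linear quantization of the prediction error. The function names (`wavefront_order`, `lorenzo_predict`, `compress_wavefront`) are hypothetical, and details of the real SZ pipeline such as bounded quantization ranges, outlier handling, and entropy coding are omitted. The point it shows is that every cell on an anti-diagonal `i + j = d` depends only on cells from earlier anti-diagonals, so the cells within one wavefront carry no dependency on each other.

```python
def wavefront_order(rows, cols):
    """Group 2D indices by anti-diagonal d = i + j; each group is one wavefront."""
    fronts = []
    for d in range(rows + cols - 1):
        i_lo = max(0, d - cols + 1)
        i_hi = min(rows - 1, d)
        fronts.append([(i, d - i) for i in range(i_lo, i_hi + 1)])
    return fronts


def lorenzo_predict(recon, i, j):
    """First-order 2D Lorenzo prediction from already reconstructed neighbors.

    Out-of-range neighbors are treated as 0.0 (an assumption of this sketch).
    """
    left = recon[i][j - 1] if j > 0 else 0.0
    up = recon[i - 1][j] if i > 0 else 0.0
    diag = recon[i - 1][j - 1] if i > 0 and j > 0 else 0.0
    return left + up - diag


def compress_wavefront(data, error_bound):
    """Predict and quantize a 2D array, visiting cells wavefront by wavefront."""
    rows, cols = len(data), len(data[0])
    recon = [[0.0] * cols for _ in range(rows)]
    codes = [[0] * cols for _ in range(rows)]
    for front in wavefront_order(rows, cols):
        # Cells in `front` read only from earlier wavefronts (d - 1 and d - 2),
        # so this inner loop has no loop-carried dependency.
        for i, j in front:
            pred = lorenzo_predict(recon, i, j)
            code = round((data[i][j] - pred) / (2 * error_bound))
            codes[i][j] = code
            # Reconstruct exactly as a decompressor would, so later
            # predictions use the same values on both sides.
            recon[i][j] = pred + code * 2 * error_bound
    return codes, recon


if __name__ == "__main__":
    field = [[1.0, 1.5, 2.0], [1.2, 1.9, 2.5], [0.7, 1.1, 3.3]]
    codes, recon = compress_wavefront(field, error_bound=0.1)
    for row in codes:
        print(row)
```

In a row-major traversal, each cell must wait for its left neighbor to be reconstructed; in the wavefront traversal, that neighbor belongs to the previous anti-diagonal and is already available, which is what makes the inner loop a candidate for pipelining in hardware.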
- Research Organization: Argonne National Laboratory (ANL)
- Sponsoring Organization: USDOE Exascale Computing Project; USDOE Office of Science; USDOE National Nuclear Security Administration (NNSA); National Oceanic and Atmospheric Administration (NOAA)
- DOE Contract Number: AC02-06CH11357
- OSTI ID: 1757958
- Country of Publication: United States
- Language: English