DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Performance Optimization for Relative-Error-Bounded Lossy Compression on Scientific Data

Abstract

Scientific simulations in high-performance computing (HPC) environments generate vast volumes of data, which may cause a severe I/O bottleneck at runtime and a huge burden on storage space for postanalysis. Unlike traditional data reduction schemes such as deduplication or lossless compression, error-controlled lossy compression can not only significantly reduce the data size but also holds the promise of satisfying user demand for error control. Pointwise relative error bounds (i.e., compression errors depend on the data values) are widely used by many scientific applications with lossy compression since error control can adapt to the error bound in the dataset automatically. Pointwise relative-error-bounded compression, however, is complicated and time consuming. In this article, we develop efficient precomputation-based mechanisms based on the SZ lossy compression framework. Our mechanisms can avoid costly logarithmic transformation and identify quantization factor values via a fast table lookup, greatly accelerating relative-error-bounded compression while retaining excellent compression ratios. In addition, we reduce traversing operations for Huffman decoding, significantly accelerating the decompression process in SZ. Experiments with eight well-known real-world scientific simulation datasets show that our solution can improve the compression and decompression rates (i.e., the speed) by about 40 and 80 percent, respectively, in most cases, making our designed lossy compression strategy the best-in-class solution in most cases.
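
To make the notion of a pointwise relative error bound concrete, the following minimal sketch shows the logarithm-based baseline that the abstract describes as costly, not the precomputation and table-lookup mechanism developed in the article. It assumes strictly positive input values and omits the prediction and entropy-coding stages of SZ; the function names and the sample data are hypothetical and chosen only for illustration.

```python
import math

def quantize_relative(values, eps):
    """Map positive values to integer codes in the log domain.

    A bin half-width of log(1 + eps) in the log domain keeps each
    reconstructed value within a relative error of eps of the original
    (up to floating-point rounding).
    """
    step = 2.0 * math.log1p(eps)  # full bin width in the log domain
    return [round(math.log(v) / step) for v in values]

def reconstruct(codes, eps):
    """Invert quantize_relative: the bin center, mapped back with exp."""
    step = 2.0 * math.log1p(eps)
    return [math.exp(q * step) for q in codes]

# Hypothetical sample data spanning several orders of magnitude.
data = [0.001, 0.5, 3.0, 1200.0]
eps = 0.01  # 1 percent pointwise relative error bound

codes = quantize_relative(data, eps)
recon = reconstruct(codes, eps)
max_rel_err = max(abs(r - v) / abs(v) for v, r in zip(data, recon))
print(codes, max_rel_err)
```

Each value here costs one `log` call at compression time and one `exp` call at decompression time, which is the kind of per-point overhead the abstract says a table lookup is meant to avoid.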

Authors:
Zou, Xiangyu [1]; Lu, Tao [2]; Xia, Wen [1]; Wang, Xuan [1]; Zhang, Weizhe [1]; Zhang, Haijun [1]; Di, Sheng [3]; Tao, Dingwen [4]; Cappello, Franck [3]
  1. Harbin Inst. of Technology (China)
  2. Marvell Technology Group, Santa Clara, CA (United States)
  3. Argonne National Lab. (ANL), Argonne, IL (United States)
  4. Univ. of Alabama, Tuscaloosa, AL (United States)
Publication Date:
2020-02-10
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
National Science Foundation (NSF); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Contributing Org.:
National Key Research and Development Program of China
OSTI Identifier:
1603491
Grant/Contract Number:  
AC02-06CH11357
Resource Type:
Accepted Manuscript
Journal Name:
IEEE Transactions on Parallel and Distributed Systems
Additional Journal Information:
Journal Volume: 31; Journal Issue: 7; Journal ID: ISSN 1045-9219
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; 96 KNOWLEDGE MANAGEMENT AND PRESERVATION; Lossy compression; compression rate; high-performance computing; scientific data

Citation Formats

Zou, Xiangyu, Lu, Tao, Xia, Wen, Wang, Xuan, Zhang, Weizhe, Zhang, Haijun, Di, Sheng, Tao, Dingwen, and Cappello, Franck. Performance Optimization for Relative-Error-Bounded Lossy Compression on Scientific Data. United States: N. p., 2020. Web. doi:10.1109/TPDS.2020.2972548.
Zou, Xiangyu, Lu, Tao, Xia, Wen, Wang, Xuan, Zhang, Weizhe, Zhang, Haijun, Di, Sheng, Tao, Dingwen, & Cappello, Franck. Performance Optimization for Relative-Error-Bounded Lossy Compression on Scientific Data. United States. https://doi.org/10.1109/TPDS.2020.2972548
Zou, Xiangyu, Lu, Tao, Xia, Wen, Wang, Xuan, Zhang, Weizhe, Zhang, Haijun, Di, Sheng, Tao, Dingwen, and Cappello, Franck. 2020. "Performance Optimization for Relative-Error-Bounded Lossy Compression on Scientific Data". United States. https://doi.org/10.1109/TPDS.2020.2972548. https://www.osti.gov/servlets/purl/1603491.
@article{osti_1603491,
title = {Performance Optimization for Relative-Error-Bounded Lossy Compression on Scientific Data},
author = {Zou, Xiangyu and Lu, Tao and Xia, Wen and Wang, Xuan and Zhang, Weizhe and Zhang, Haijun and Di, Sheng and Tao, Dingwen and Cappello, Franck},
abstractNote = {Scientific simulations in high-performance computing (HPC) environments generate vast volumes of data, which may cause a severe I/O bottleneck at runtime and a huge burden on storage space for postanalysis. Unlike traditional data reduction schemes such as deduplication or lossless compression, error-controlled lossy compression can not only significantly reduce the data size but also holds the promise of satisfying user demand for error control. Pointwise relative error bounds (i.e., compression errors depend on the data values) are widely used by many scientific applications with lossy compression since error control can adapt to the error bound in the dataset automatically. Pointwise relative-error-bounded compression, however, is complicated and time consuming. In this article, we develop efficient precomputation-based mechanisms based on the SZ lossy compression framework. Our mechanisms can avoid costly logarithmic transformation and identify quantization factor values via a fast table lookup, greatly accelerating relative-error-bounded compression while retaining excellent compression ratios. In addition, we reduce traversing operations for Huffman decoding, significantly accelerating the decompression process in SZ. Experiments with eight well-known real-world scientific simulation datasets show that our solution can improve the compression and decompression rates (i.e., the speed) by about 40 and 80 percent, respectively, in most cases, making our designed lossy compression strategy the best-in-class solution in most cases.},
doi = {10.1109/TPDS.2020.2972548},
journal = {IEEE Transactions on Parallel and Distributed Systems},
number = 7,
volume = 31,
place = {United States},
year = {2020},
month = {2}
}

Citation Metrics:
Cited by: 5 works
Citation information provided by
Web of Science