Use cases of lossy compression for floating-point data in scientific data sets
Abstract
Architectural and technological trends of systems used for scientific computing call for a significant reduction of scientific data sets that are composed mainly of floating-point data. This article surveys and presents experimental results of currently identified use cases of generic lossy compression to address the different limitations of scientific computing systems. The article shows from a collection of experiments run on parallel systems of a leadership facility that lossy data compression not only can reduce the footprint of scientific data sets on storage but also can reduce I/O and checkpoint/restart times, accelerate computation, and even allow significantly larger problems to be run than without lossy compression. These results suggest that lossy compression will become an important technology in many aspects of high performance scientific computing. Because the constraints for each use case are different and often conflicting, this collection of results also indicates the need for more specialization of the compression pipelines.
- Authors:
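- Cappello, Franck; Di, Sheng; Li, Sihuan; Liang, Xin; Gok, Ali Murat; Tao, Dingwen; Yoon, Chun Hong; Wu, Xin-Chuan; Alexeev, Yuri; Chong, Frederic T.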
- Argonne National Lab. (ANL), Lemont, IL (United States); Univ. of Illinois, Urbana-Champaign, IL (United States)
- Argonne National Lab. (ANL), Lemont, IL (United States)
- Univ. of California, Riverside, CA (United States)
- Northwestern Univ., Evanston, IL (United States)
- Univ. of Alabama, Tuscaloosa, AL (United States)
- SLAC National Accelerator Lab., Menlo Park, CA (United States)
- Univ. of Chicago, IL (United States)
- Publication Date:
- 2019-07-09
- Research Org.:
- SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE Exascale Computing Project; National Science Foundation (NSF)
- OSTI Identifier:
- 1560791
- Alternate Identifier(s):
- OSTI ID: 1575218
- Grant/Contract Number:
- AC02-76SF00515; AC02-06CH11357
- Resource Type:
- Accepted Manuscript
- Journal Name:
- International Journal of High Performance Computing Applications
- Additional Journal Information:
- Journal Volume: 33; Journal Issue: 6; Journal ID: ISSN 1094-3420
- Publisher:
- SAGE
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; Lossy compression; floating-point data; scientific data set; applications; use cases
Citation Formats
Cappello, Franck, Di, Sheng, Li, Sihuan, Liang, Xin, Gok, Ali Murat, Tao, Dingwen, Yoon, Chun Hong, Wu, Xin-Chuan, Alexeev, Yuri, and Chong, Frederic T. Use cases of lossy compression for floating-point data in scientific data sets. United States: N. p., 2019. Web. doi:10.1177/1094342019853336.
Cappello, Franck, Di, Sheng, Li, Sihuan, Liang, Xin, Gok, Ali Murat, Tao, Dingwen, Yoon, Chun Hong, Wu, Xin-Chuan, Alexeev, Yuri, & Chong, Frederic T. Use cases of lossy compression for floating-point data in scientific data sets. United States. https://doi.org/10.1177/1094342019853336
Cappello, Franck, Di, Sheng, Li, Sihuan, Liang, Xin, Gok, Ali Murat, Tao, Dingwen, Yoon, Chun Hong, Wu, Xin-Chuan, Alexeev, Yuri, and Chong, Frederic T. 2019. "Use cases of lossy compression for floating-point data in scientific data sets". United States. https://doi.org/10.1177/1094342019853336. https://www.osti.gov/servlets/purl/1560791.
@article{osti_1560791,
title = {Use cases of lossy compression for floating-point data in scientific data sets},
author = {Cappello, Franck and Di, Sheng and Li, Sihuan and Liang, Xin and Gok, Ali Murat and Tao, Dingwen and Yoon, Chun Hong and Wu, Xin-Chuan and Alexeev, Yuri and Chong, Frederic T.},
abstractNote = {Architectural and technological trends of systems used for scientific computing call for a significant reduction of scientific data sets that are composed mainly of floating-point data. This article surveys and presents experimental results of currently identified use cases of generic lossy compression to address the different limitations of scientific computing systems. The article shows from a collection of experiments run on parallel systems of a leadership facility that lossy data compression not only can reduce the footprint of scientific data sets on storage but also can reduce I/O and checkpoint/restart times, accelerate computation, and even allow significantly larger problems to be run than without lossy compression. These results suggest that lossy compression will become an important technology in many aspects of high performance scientific computing. Because the constraints for each use case are different and often conflicting, this collection of results also indicates the need for more specialization of the compression pipelines.},
doi = {10.1177/1094342019853336},
journal = {International Journal of High Performance Computing Applications},
number = 6,
volume = 33,
place = {United States},
year = {2019},
month = {7}
}
Works referenced in this record:
Toward a Multi-method Approach: Lossy Data Compression for Climate Simulation Data
book, January 2017
- Baker, Allison H.; Xu, Haiying; Hammerling, Dorit M.
- Lecture Notes in Computer Science
PaSTRI: Error-Bounded Lossy Compression for Two-Electron Integrals in Quantum Chemistry
conference, September 2018
- Gok, Ali Murat; Di, Sheng; Alexeev, Yuri
- 2018 IEEE International Conference on Cluster Computing (CLUSTER)
The Community Earth System Model: A Framework for Collaborative Research
journal, September 2013
- Hurrell, James W.; Holland, M. M.; Gent, P. R.
- Bulletin of the American Meteorological Society, Vol. 94, Issue 9
Selenium single-wavelength anomalous diffraction de novo phasing using an X-ray-free electron laser
journal, November 2016
- Hunter, Mark S.; Yoon, Chun Hong; DeMirci, Hasan
- Nature Communications, Vol. 7, Issue 1
Enabling Near Real-Time Remote Search for Fast Transient Events with Lossy Data Compression
journal, January 2017
- Vohl, Dany; Pritchard, Tyler; Andreoni, Igor
- Publications of the Astronomical Society of Australia, Vol. 34
General atomic and molecular electronic structure system
journal, November 1993
- Schmidt, Michael W.; Baldridge, Kim K.; Boatz, Jerry A.
- Journal of Computational Chemistry, Vol. 14, Issue 11, p. 1347-1363
Evaluating lossy data compression on climate simulation data within a large ensemble
journal, January 2016
- Baker, Allison H.; Hammerling, Dorit M.; Mickelson, Sheri A.
- Geoscientific Model Development, Vol. 9, Issue 12
Significantly Improving Lossy Compression for Scientific Data Sets Based on Multidimensional Prediction and Error-Controlled Quantization
conference, May 2017
- Tao, Dingwen; Di, Sheng; Chen, Zizhong
- 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Data systems for the Linac coherent light source
journal, January 2017
- Thayer, J.; Damiani, D.; Ford, C.
- Advanced Structural and Chemical Imaging, Vol. 3, Issue 1
Spatio-Temporal Just Noticeable Distortion Profile for Grey Scale Image/Video in DCT Domain
journal, March 2009
- Wei, Zhenyu; Ngan, K. N.
- IEEE Transactions on Circuits and Systems for Video Technology, Vol. 19, Issue 3
A first order approximation to the optimum checkpoint interval
journal, September 1974
- Young, John W.
- Communications of the ACM, Vol. 17, Issue 9
Wavefield compression for adjoint methods in full-waveform inversion
journal, November 2016
- Boehm, Christian; Hanzich, Mauricio; de la Puente, Josep
- GEOPHYSICS, Vol. 81, Issue 6
Improving performance of iterative methods by lossy checkponting
conference, January 2018
- Tao, Dingwen; Di, Sheng; Liang, Xin
- Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing - HPDC '18
Exploration of Lossy Compression for Application-Level Checkpoint/Restart
conference, May 2015
- Sasaki, Naoto; Sato, Kento; Endo, Toshio
- 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Data compression in the petascale astronomy era: A GERLUMPH case study
journal, September 2015
- Vohl, D.; Fluke, C. J.; Vernardos, G.
- Astronomy and Computing, Vol. 12
Fast Error-Bounded Lossy HPC Data Compression with SZ
conference, May 2016
- Di, Sheng; Cappello, Franck
- 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Lossless compression of high-volume numerical data from simulations
conference, January 2000
- Engelson, V.; Fritzson, D.; Fritzson, P.
- Proceedings DCC 2000. Data Compression Conference
FTI: high performance fault tolerance interface for hybrid systems
conference, January 2011
- Bautista-Gomez, Leonardo; Tsuboi, Seiji; Komatitsch, Dimitri
- Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '11
Efficient TPC data compression by track and cluster modeling
journal, October 2006
- Röhrich, Dieter; Vestbø, Anders
- Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, Vol. 566, Issue 2
Exploring the feasibility of lossy compression for PDE simulations
journal, November 2017
- Calhoun, Jon; Cappello, Franck; Olson, Luke N.
- The International Journal of High Performance Computing Applications, Vol. 33, Issue 2
Fixed-Rate Compressed Floating-Point Arrays
journal, December 2014
- Lindstrom, Peter
- IEEE Transactions on Visualization and Computer Graphics, Vol. 20, Issue 12
Lossy compression of TPC data and trajectory tracking efficiency for the ALICE experiment
journal, March 2003
- Nicolaucig, A.; Ivanov, M.; Mattavelli, M.
- Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, Vol. 500, Issue 1-3
Improving I/O Forwarding Throughput with Data Compression
conference, September 2011
- Welton, Benjamin; Kimpe, Dries; Cope, Jason
- 2011 IEEE International Conference on Cluster Computing (CLUSTER)
HACC: extreme scaling and performance across diverse architectures
journal, December 2016
- Habib, Salman; Insley, Joe; Daniel, David
- Communications of the ACM, Vol. 60, Issue 1
Optimizing Lossy Compression with Adjacent Snapshots for N-body Simulation Data
conference, December 2018
- Li, Sihuan; Di, Sheng; Liang, Xin
- 2018 IEEE International Conference on Big Data (Big Data)
Error-Controlled Lossy Compression Optimized for High Compression Ratios of Scientific Datasets
conference, December 2018
- Liang, Xin; Di, Sheng; Tao, Dingwen
- 2018 IEEE International Conference on Big Data (Big Data)
Fast and Efficient Compression of Floating-Point Data
journal, September 2006
- Lindstrom, Peter; Isenburg, Martin
- IEEE Transactions on Visualization and Computer Graphics, Vol. 12, Issue 5
McrEngine: A Scalable Checkpointing System Using Data-Aware Aggregation and Compression
journal, January 2013
- Islam, Tanzima Zerin; Mohror, Kathryn; Bagchi, Saurabh
- Scientific Programming, Vol. 21, Issue 3-4
The JPEG still picture compression standard
journal, January 1992
- Wallace, G. K.
- IEEE Transactions on Consumer Electronics, Vol. 38, Issue 1
Scheduling the I/O of HPC Applications Under Congestion
conference, May 2015
- Gainaru, Ana; Aupy, Guillaume; Benoit, Anne
- 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
The impact of JPEG2000 lossy compression on the scientific quality of radio astronomy imagery
journal, October 2014
- Peters, S. M.; Kitaeff, V. V.
- Astronomy and Computing, Vol. 6
Se-SAD serial femtosecond crystallography datasets from selenobiotinyl-streptavidin
journal, April 2017
- Yoon, Chun Hong; DeMirci, Hasan; Sierra, Raymond G.
- Scientific Data, Vol. 4, Issue 1
Multilevel techniques for compression and reduction of scientific data—the univariate case
journal, November 2018
- Ainsworth, Mark; Tugluk, Ozan; Whitney, Ben
- Computing and Visualization in Science, Vol. 19, Issue 5-6
The History of Storage Systems
journal, May 2012
- Goda, K.; Kitsuregawa, M.
- Proceedings of the IEEE, Vol. 100, Issue Special Centennial Issue
In-depth exploration of single-snapshot lossy compression techniques for N-body simulations
conference, December 2017
- Tao, Dingwen; Di, Sheng; Chen, Zizhong
- 2017 IEEE International Conference on Big Data (Big Data)
Data Reduction Techniques for Simulation, Visualization and Data Analysis: Survey on Scientific Data Reduction Techniques
journal, March 2018
- Li, S.; Marsaglia, N.; Garth, C.
- Computer Graphics Forum, Vol. 37, Issue 6
18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios
conference, January 2017
- Fu, Haohuan; Yin, Wanwang; Yang, Guangwen
- Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '17
Compression of interferometric radio-astronomical data
journal, November 2016
- Offringa, A. R.
- Astronomy & Astrophysics, Vol. 595
The Paris Climate Agreement and future sea-level rise from Antarctica
journal, May 2021
- DeConto, Robert M.; Pollard, David; Alley, Richard B.
- Nature, Vol. 593, Issue 7857
The JPEG still picture compression standard
journal, April 1991
- Wallace, Gregory K.
- Communications of the ACM, Vol. 34, Issue 4
The impact of JPEG2000 lossy compression on the scientific quality of radio astronomy imagery
preprint, January 2014
- Peters, Sean M.; Kitaeff, Vyacheslav V.
- arXiv
Data Compression in the Petascale Astronomy Era: a GERLUMPH case study
text, January 2015
- Vohl, Dany; Fluke, Christopher J.; Vernardos, Georgios
- arXiv
Improving Performance of Iterative Methods by Lossy Checkponting
text, January 2018
- Tao, Dingwen; Di, Sheng; Liang, Xin
- arXiv
Works referencing / citing this record:
Significantly improving lossy compression quality based on an optimized hybrid prediction model
conference, November 2019
- Liang, Xin; Di, Sheng; Li, Sihuan
- Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis - SC '19
Compression Challenges in Large Scale Partial Differential Equation Solvers
journal, September 2019
- Götschel, Sebastian; Weiser, Martin
- Algorithms, Vol. 12, Issue 9