OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: DuoModel: Leveraging Reduced Model for Data Reduction and Re-Computation on HPC Storage

Abstract

High-performance computing (HPC) applications generate large amounts of floating-point data that need to be stored and analyzed efficiently to extract insights and advance knowledge discovery. With the growing disparities between compute and I/O, optimizing the storage stack alone may not suffice to cure the I/O problem. There has been a strong push in the HPC communities to perform data reduction before data is transmitted to storage in order to lower the I/O cost. However, as of now, neither lossless nor lossy compressors can achieve the reduction ratio that applications desire. This paper proposes DuoModel, a new approach that leverages the similarity between the full and reduced application models to further improve the data reduction ratio. DuoModel improves the compression ratio of state-of-the-art compressors by compressing the differences (termed the delta) between the data products of the two models. For data analytics, the high-fidelity data can be re-computed by launching the reduced model and applying the compressed delta. Our evaluations confirm that DuoModel can further push the limit of data reduction while maintaining the high fidelity of the data.
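
The delta idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the two model functions, the error bound, and the use of uniform quantization followed by zlib are all assumptions made for illustration, standing in for a real simulation and a real state-of-the-art compressor. It stores only the compressed difference between a hypothetical full model and a hypothetical reduced model, then re-computes the data by re-running the reduced model and adding the decoded delta.

```python
import math
import struct
import zlib

N = 4000    # number of samples (illustrative value)
EB = 1e-4   # absolute error bound for quantization (illustrative value)


def full_model(n):
    """Hypothetical full model: smooth signal plus small fine-scale detail."""
    return [math.sin(2 * math.pi * i / n) + 0.01 * math.sin(40 * math.pi * i / n)
            for i in range(n)]


def reduced_model(n):
    """Hypothetical reduced model: cheaper, captures only the smooth part."""
    return [math.sin(2 * math.pi * i / n) for i in range(n)]


def compress(values, eb):
    """Stand-in compressor: quantize to bins of width 2*eb, then zlib."""
    q = [round(v / (2 * eb)) for v in values]
    return zlib.compress(struct.pack(f"<{len(q)}i", *q), 9)


def decompress(blob, eb):
    """Invert compress(); each value is recovered to within about eb."""
    raw = zlib.decompress(blob)
    q = struct.unpack(f"<{len(raw) // 4}i", raw)
    return [k * 2 * eb for k in q]


# Write path: run both models and store only the compressed delta.
full = full_model(N)
reduced = reduced_model(N)
delta_blob = compress([f - r for f, r in zip(full, reduced)], EB)

# Baseline for comparison: compress the full data directly.
direct_blob = compress(full, EB)

# Read path: re-run the reduced model and apply the decoded delta.
recomputed = [r + d for r, d in zip(reduced_model(N), decompress(delta_blob, EB))]
max_err = max(abs(a - b) for a, b in zip(full, recomputed))

print("direct bytes:", len(direct_blob))
print("delta bytes: ", len(delta_blob))
print("max abs error after re-computation:", max_err)
```

In this toy setup the delta has a much smaller dynamic range than the full data, which is why it compresses better under the same error bound. How large the gain is in practice depends on how closely the reduced model tracks the full one, and the cost of re-running the reduced model is paid at read time.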


Citation Formats

Luo, Huizhang, Liu, Qing, Qiao, Zhenbo, Wang, Jinzhen, Wang, Mengxiao, and Jiang, Hong. DuoModel: Leveraging Reduced Model for Data Reduction and Re-Computation on HPC Storage. United States: N. p., 2018. Web. doi:10.1109/LCOS.2018.2855118.
Luo, Huizhang, Liu, Qing, Qiao, Zhenbo, Wang, Jinzhen, Wang, Mengxiao, & Jiang, Hong. DuoModel: Leveraging Reduced Model for Data Reduction and Re-Computation on HPC Storage. United States. doi:10.1109/LCOS.2018.2855118.
Luo, Huizhang, Liu, Qing, Qiao, Zhenbo, Wang, Jinzhen, Wang, Mengxiao, and Jiang, Hong. 2018. "DuoModel: Leveraging Reduced Model for Data Reduction and Re-Computation on HPC Storage". United States. doi:10.1109/LCOS.2018.2855118.
@article{osti_1567586,
title = {DuoModel: Leveraging Reduced Model for Data Reduction and Re-Computation on HPC Storage},
author = {Luo, Huizhang and Liu, Qing and Qiao, Zhenbo and Wang, Jinzhen and Wang, Mengxiao and Jiang, Hong},
abstractNote = {High-performance computing (HPC) applications generate large amounts of floating-point data that need to be stored and analyzed efficiently to extract insights and advance knowledge discovery. With the growing disparities between compute and I/O, optimizing the storage stack alone may not suffice to cure the I/O problem. There has been a strong push in the HPC communities to perform data reduction before data is transmitted to storage in order to lower the I/O cost. However, as of now, neither lossless nor lossy compressors can achieve the reduction ratio that applications desire. This paper proposes DuoModel, a new approach that leverages the similarity between the full and reduced application models to further improve the data reduction ratio. DuoModel improves the compression ratio of state-of-the-art compressors by compressing the differences (termed the delta) between the data products of the two models. For data analytics, the high-fidelity data can be re-computed by launching the reduced model and applying the compressed delta. Our evaluations confirm that DuoModel can further push the limit of data reduction while maintaining the high fidelity of the data.},
doi = {10.1109/LCOS.2018.2855118},
journal = {IEEE Letters of the Computer Society},
issn = {2573-9689},
number = 1,
volume = 1,
place = {United States},
year = {2018},
month = {1}
}