OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Compressing unstructured mesh data from simulations using machine learning

Journal Article · International journal of data science and analytics

The amount of data output from a computer simulation has grown to terabytes and petabytes as increasingly complex simulations are run on massively parallel systems. As we approach exaflop computing in the next decade, the I/O subsystem is not expected to be able to write out these large volumes of data. In this paper, we explore the use of machine learning to compress the data before it is written out. Despite the computational constraints that limit us to very simple learning algorithms, our results show that machine learning is a viable option for compressing unstructured data. We also demonstrate that simply using a better sampling algorithm to generate the training set yields more accurate results than random sampling, at no extra cost. In addition, by carefully selecting and incorporating points with high prediction error, we can improve reconstruction accuracy without sacrificing the compression rate.
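
The idea summarized in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`idw_predict`, `select_points`, `reconstruct`) are hypothetical, the "very simple learning algorithm" is stood in for by inverse-distance weighting over the k nearest stored points, and the initial training set is drawn by plain random sampling rather than the better sampling scheme the abstract refers to. The sketch stores only a subset of mesh values, predicts the rest from that subset, and spends part of a fixed storage budget on the points with the largest prediction error.

```python
import numpy as np


def idw_predict(train_xy, train_val, query_xy, k=4, eps=1e-12):
    """Predict values at query points by inverse-distance weighting
    of the k nearest training points (a stand-in for a simple learner)."""
    # Squared distances between every query point and every training point.
    d2 = ((query_xy[:, None, :] - train_xy[None, :, :]) ** 2).sum(axis=2)
    k = min(k, train_xy.shape[0])
    # Indices of the k nearest training points for each query point.
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    near = np.take_along_axis(d2, idx, axis=1)
    w = 1.0 / (near + eps)
    return (w * train_val[idx]).sum(axis=1) / w.sum(axis=1)


def select_points(xy, val, n_random, n_extra, k=4, seed=0):
    """Choose which mesh points to store: a random subset plus the
    n_extra points that the random subset predicts worst."""
    rng = np.random.default_rng(seed)
    keep = rng.choice(xy.shape[0], size=n_random, replace=False)
    err = np.abs(idw_predict(xy[keep], val[keep], xy, k=k) - val)
    err[keep] = -1.0  # points already stored are never picked again
    extra = np.argsort(err)[::-1][:n_extra]
    return np.concatenate([keep, extra])


def reconstruct(xy, keep, kept_val, k=4):
    """Rebuild values at all mesh points from the stored subset.
    Assumes the mesh coordinates xy are available at read time."""
    return idw_predict(xy[keep], kept_val, xy, k=k)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    xy = rng.random((2000, 2))                      # synthetic unstructured points
    val = np.sin(4 * xy[:, 0]) * np.cos(3 * xy[:, 1])  # synthetic field
    keep = select_points(xy, val, n_random=150, n_extra=50)
    approx = reconstruct(xy, keep, val[keep])
    print("kept fraction:", keep.size / val.size)
    print("max abs error:", np.abs(approx - val).max())
```

Because the total number of stored points is `n_random + n_extra`, shifting budget from the random subset to the high-error points leaves the compression rate unchanged. Whether it lowers the reconstruction error on a given data set is something to measure, not something this sketch guarantees.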

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
AC52-07NA27344
OSTI ID:
1738887
Report Number(s):
LLNL-JRNL-750460; 935302
Journal Information:
International journal of data science and analytics, Vol. 9, Issue 1; ISSN 2364-415X
Publisher:
Springer
Country of Publication:
United States
Language:
English

References (13)

Fast Error-Bounded Lossy HPC Data Compression with SZ conference May 2016
Fixed-Rate Compressed Floating-Point Arrays journal December 2014
Spectrally optimal sampling for distribution ray tracing conference January 1991
Turbulent Transport Reduction by Zonal Flows: Massively Parallel Simulations journal September 1998
Fast and Efficient Compression of Floating-Point Data journal September 2006
Learning to compress images and videos conference January 2007
ISABELA for effective in situ compression of scientific data
  • Lakshminarasimhan, Sriram; Shah, Neil; Ethier, Stephane
  • Concurrency and Computation: Practice and Experience, Vol. 25, Issue 4 https://doi.org/10.1002/cpe.2887
journal July 2012
Learning to Compress Unstructured Mesh Data from Simulations conference October 2017
Spectrally optimal sampling for distribution ray tracing journal July 1991
A Comparison of Compressed Sensing and Sparse Recovery Algorithms Applied to Simulation Data journal August 2016
NUMARCK: Machine Learning Algorithm for Resiliency and Checkpointing
  • Chen, Zhengzhang; Son, Seung Woo; Hendrix, William
  • SC14: International Conference for High Performance Computing, Networking, Storage and Analysis https://doi.org/10.1109/SC.2014.65
conference November 2014
Fast Poisson disk sampling in arbitrary dimensions conference January 2007
Wavelet-based data compression for flow simulation on block-structured Cartesian mesh journal May 2013