U.S. Department of Energy
Office of Scientific and Technical Information

TuckerMPI: A Parallel C++/MPI Software Package for Large-scale Data Compression via the Tucker Tensor Decomposition

Journal Article · ACM Transactions on Mathematical Software
DOI: https://doi.org/10.1145/3378445 · OSTI ID: 1639093
 [1];  [2];  [2]
  1. Wake Forest University, Winston-Salem, NC (United States)
  2. Sandia National Lab. (SNL-CA), Livermore, CA (United States)

Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte output of a high-fidelity computational simulation. For such data sets, we have developed TuckerMPI, a parallel C++/MPI software package for compressing distributed data. The approach treats the data as a tensor, i.e., a multidimensional array, and computes its truncated Tucker decomposition, a higher-order analogue of the truncated singular value decomposition of a matrix. The result is a low-rank approximation of the original tensor-structured data. Compression efficiency is achieved by detecting latent global structure within the data, in contrast to most compression methods, which focus on local structure. In this work, we describe TuckerMPI, our implementation of the truncated Tucker decomposition, including details of the data distribution and in-memory layouts, the parallel and serial implementations of the key kernels, and analysis of the storage, communication, and computational costs. We test the software on 4.5 and 6.7 terabyte data sets distributed across hundreds of nodes (thousands of MPI processes), achieving compression ratios between 100 and 200,000×, which equates to 99–99.999% compression (depending on the desired accuracy), in substantially less time than it would take to even read the same data set from a parallel file system. Moreover, we show that our method also allows reconstruction of partial or down-sampled data on a single node, without a parallel computer, so long as the reconstructed portion is small enough to fit on a single machine, e.g., when reconstructing or visualizing a single down-sampled time step or computing summary statistics. The code is available at https://gitlab.com/tensors/TuckerMPI.
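
The relationship between a compression ratio and the quoted percentage can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from TuckerMPI: the function names and the example dimensions and ranks are hypothetical. It assumes the standard storage count for a Tucker decomposition (one core tensor whose size is the product of the ranks, plus one factor matrix of size I_n × R_n per mode) and converts a ratio c into a percentage reduction 100(1 − 1/c).

```python
from math import prod


def tucker_storage(dims, ranks):
    """Count stored values: one core tensor plus one factor matrix per mode."""
    if len(dims) != len(ranks):
        raise ValueError("dims and ranks must have the same number of modes")
    core = prod(ranks)                                  # R_1 * R_2 * ... * R_N
    factors = sum(i * r for i, r in zip(dims, ranks))   # sum of I_n * R_n
    return core + factors


def compression_ratio(dims, ranks):
    """Size of the full tensor divided by the size of its Tucker representation."""
    return prod(dims) / tucker_storage(dims, ranks)


def percent_reduction(ratio):
    """Convert a compression ratio c into a storage reduction of 100 * (1 - 1/c) percent."""
    return 100.0 * (1.0 - 1.0 / ratio)


if __name__ == "__main__":
    # Hypothetical 4-way data set (three spatial modes and one time mode);
    # these sizes and ranks are assumptions chosen purely for illustration.
    dims = (500, 500, 500, 100)
    ranks = (30, 30, 30, 10)
    ratio = compression_ratio(dims, ranks)
    print(f"compression ratio: {ratio:.0f}x")
    print(f"storage reduction: {percent_reduction(ratio):.4f}%")

    # The two endpoints quoted in the abstract.
    for c in (100, 200_000):
        print(f"{c}x -> {percent_reduction(c):.4f}% reduction")
```

Under this conversion, a ratio of 100 corresponds to a 99% reduction and a ratio of 200,000 to roughly 99.9995%, consistent with the range stated in the abstract.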

Research Organization:
Sandia National Laboratories (SNL-CA), Livermore, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21); USDOE National Nuclear Security Administration (NNSA); National Science Foundation (NSF)
Grant/Contract Number:
AC04-94AL85000; NA0003525
OSTI ID:
1639093
Report Number(s):
SAND--2020-6977J; 687210
Journal Information:
ACM Transactions on Mathematical Software, Vol. 46, Issue 2; ISSN 0098-3500
Publisher:
Association for Computing Machinery
Country of Publication:
United States
Language:
English
