U.S. Department of Energy
Office of Scientific and Technical Information

Load-balanced sparse MTTKRP on GPUs

Conference ·
 [1];  [2];  [1];  [3];  [1]
  1. Ohio State University
  2. Pacific Northwest National Laboratory
  3. Georgia Institute of Technology

Sparse matricized tensor times Khatri-Rao product (MTTKRP) is one of the most computationally expensive kernels in sparse tensor computations. This work focuses on optimizing MTTKRP for floating-point operations, storage, and scalability. We begin by identifying the performance bottlenecks in directly extending the state-of-the-art CSF (compressed sparse fiber) formats from CPUs to GPUs. Our detailed analysis of recently proposed formats shows that the lower bounds on storage and flop counts can vary significantly depending on the structure of the sparse tensor. To address this, we propose a load-balanced, computation- and storage-efficient scheme, HYB, which combines the best of COO (coordinate), CSF, and CSL (compressed slice). With these enhancements, our GPU framework significantly outperforms the current formats on both CPU and GPU platforms.
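To make the kernel concrete, the following is a minimal NumPy sketch of mode-1 MTTKRP over a 3-way sparse tensor stored in COO format, the simplest of the formats the abstract mentions. The function name `mttkrp_coo` and its signature are illustrative, not the paper's implementation; the optimized HYB scheme partitions nonzeros across COO, CSF, and CSL regions rather than iterating a flat list as shown here.

```python
import numpy as np

def mttkrp_coo(inds, vals, B, C, num_rows):
    """Mode-1 MTTKRP for a 3-way sparse tensor in COO form (illustrative sketch).

    inds:     (nnz, 3) integer array of (i, j, k) coordinates
    vals:     (nnz,) nonzero values
    B, C:     dense factor matrices of shape (J, R) and (K, R)
    num_rows: I, the size of the tensor's first mode

    Returns M of shape (I, R), where each nonzero X[i, j, k] contributes
    vals[n] * (B[j, :] * C[k, :]) to row i (Hadamard product of factor rows).
    """
    R = B.shape[1]
    M = np.zeros((num_rows, R))
    for (i, j, k), v in zip(inds, vals):
        # Scatter-accumulate into the output row; on a GPU these row
        # updates conflict across threads, which is one source of the
        # load-imbalance and atomics overhead the paper targets.
        M[i, :] += v * (B[j, :] * C[k, :])
    return M
```

In COO every nonzero stores all three coordinates, so the per-nonzero work is uniform and easy to balance, but the format pays maximal index storage and repeats the row index for every nonzero in a fiber; CSF compresses those repeats at the cost of irregular fiber lengths, which is the trade-off HYB navigates.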

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1862916
Report Number(s):
PNNL-SA-138752
Country of Publication:
United States
Language:
English

Similar Records

Accelerated Constrained Sparse Tensor Factorization on Massively Parallel Architectures
Conference · August 2024 · OSTI ID:2438687

Efficient and Effective Sparse Tensor Reordering
Conference · June 2019 · OSTI ID:1574893

True Load Balancing for Matricized Tensor Times Khatri-Rao Product
Journal Article · January 2021 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1765777
