
HPC formulations of optimization algorithms for tensor completion

Journal Article · Parallel Computing
Authors: [1]; [1]; [1]
  1. Univ. of Minnesota, Minneapolis, MN (United States)

Tensor completion is a powerful tool used to estimate or recover missing values in multi-way data. It has seen great success in domains such as product recommendation and healthcare. Tensor completion is most often accomplished via low-rank sparse tensor factorization, a computationally expensive non-convex optimization problem which has only recently been studied in the context of parallel computing. In this work, we study three optimization algorithms that have been successfully applied to tensor completion: alternating least squares (ALS), stochastic gradient descent (SGD), and coordinate descent (CCD++). We explore opportunities for parallelism on shared- and distributed-memory systems and address challenges such as memory- and operation-efficiency, load balance, cache locality, and communication. Among our advancements are a communication-efficient CCD++ algorithm, an ALS algorithm rich in level-3 BLAS routines, and an SGD algorithm which combines stratification with asynchronous communication. Furthermore, we show that introducing randomization during ALS and CCD++ can accelerate convergence. We evaluate our parallel formulations on a variety of real datasets on a modern supercomputer and demonstrate speedups through 16384 cores. Further, these improvements reduce time-to-solution from hours to seconds on real-world datasets. We show that after our optimizations, ALS is advantageous on parallel systems of small-to-moderate scale, while both ALS and CCD++ provide the lowest time-to-solution on large-scale distributed systems.
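As a concrete point of reference for the optimization problem described above, the following is a minimal serial sketch of the per-entry SGD update for rank-R CP completion of a 3-way tensor. It is not the parallel formulation studied in the paper (no stratification or asynchronous communication), and the function name, entry format, and hyperparameters are illustrative assumptions.

# Minimal serial sketch (not the paper's parallel formulation): SGD updates
# for rank-R CP completion of a 3-way tensor whose observed entries are
# stored as rows (i, j, k, value). Names and hyperparameters are assumptions.
import numpy as np

def cp_completion_sgd(entries, shape, rank=10, lr=0.01, reg=0.02, epochs=20, seed=0):
    """entries: (nnz, 4) array of observed entries; shape: (I, J, K)."""
    rng = np.random.default_rng(seed)
    # One factor matrix per mode, small random initialization.
    A, B, C = (0.1 * rng.standard_normal((n, rank)) for n in shape)
    for _ in range(epochs):
        rng.shuffle(entries)                  # visit nonzeros in random order
        for row in entries:
            i, j, k = (int(x) for x in row[:3])
            val = row[3]
            pred = np.dot(A[i] * B[j], C[k])  # sum_r A[i,r] * B[j,r] * C[k,r]
            err = val - pred
            # Each observed entry updates only the three factor rows it touches.
            gA = err * (B[j] * C[k]) - reg * A[i]
            gB = err * (A[i] * C[k]) - reg * B[j]
            gC = err * (A[i] * B[j]) - reg * C[k]
            A[i] += lr * gA
            B[j] += lr * gB
            C[k] += lr * gC
    return A, B, C

The same per-nonzero update is the building block that the paper's stratified, asynchronous SGD distributes across processes; the sketch is only meant to make the objective and update structure concrete.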

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Organization:
USDOE Office of Science (SC); National Science Foundation (NSF); US Army Research Office (ARO)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1478749
Alternate ID(s):
OSTI ID: 1549506
Journal Information:
Parallel Computing, Vol. 74, Issue C; ISSN 0167-8191
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

