OSTI.GOV — U.S. Department of Energy
Office of Scientific and Technical Information

Title: PLANC: Parallel Low-rank Approximation with Nonnegativity Constraints

Abstract

In this work, we consider the problem of low-rank approximation of massive dense nonnegative tensor data, for example, to discover latent patterns in video and imaging applications. As the size of data sets grows, single workstations are hitting bottlenecks in both computation time and available memory. We propose a distributed-memory parallel computing solution to handle massive data sets, loading the input data across the memories of multiple nodes, and performing efficient and scalable parallel algorithms to compute the low-rank approximation. We present a software package called Parallel Low-rank Approximation with Nonnegativity Constraints, which implements our solution and allows for extension in terms of data (dense or sparse, matrices or tensors of any order), algorithm (e.g., from multiplicative updating techniques to alternating direction method of multipliers), and architecture (we exploit GPUs to accelerate the computation in this work). We describe our parallel distributions and algorithms, which are careful to avoid unnecessary communication and computation, show how to extend the software to include new algorithms and/or constraints, and report efficiency and scalability results for both synthetic and real-world data sets.
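The abstract mentions multiplicative updating as one family of algorithms the package supports for nonnegative factorization. As a minimal single-node illustration (not PLANC's distributed implementation), the classic Lee–Seung multiplicative updates for nonnegative matrix factorization minimize the Frobenius-norm error while keeping both factors elementwise nonnegative:

```python
import numpy as np

def nmf_mu(X, r, iters=200, eps=1e-9, seed=0):
    """Multiplicative-update NMF: approximate a nonnegative X (m x n)
    as W @ H with W (m x r) >= 0 and H (r x n) >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(iters):
        # Lee-Seung updates for the Frobenius-norm objective;
        # eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Demo on a small synthetic matrix of exact rank 4
rng = np.random.default_rng(1)
X = rng.random((20, 4)) @ rng.random((4, 30))
W, H = nmf_mu(X, r=4)
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative error: {err:.3f}")
```

Because the updates are elementwise multiplications of nonnegative quantities, nonnegativity of W and H is preserved automatically at every iteration; the distributed setting described in the paper additionally partitions X and the factor matrices across node memories.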

Authors:
 Eswar, Srinivas [1]; Hayashi, Koby [1]; Ballard, Grey [2]; Kannan, Ramakrishnan [3]; Matheson, Michael A. [3]; Park, Haesun [1]
  1. Georgia Institute of Technology, Atlanta, GA (United States)
  2. Wake Forest Univ., Winston-Salem, NC (United States)
  3. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
June 2021
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC); National Science Foundation (NSF)
OSTI Identifier:
1820808
Grant/Contract Number:  
AC05-00OR22725; OAC-1642385; OAC-1642410; AC02-05CH11231; SC0020347
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
ACM Transactions on Mathematical Software
Additional Journal Information:
Journal Volume: 47; Journal Issue: 3; Journal ID: ISSN 0098-3500
Publisher:
Association for Computing Machinery
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Eswar, Srinivas, Hayashi, Koby, Ballard, Grey, Kannan, Ramakrishnan, Matheson, Michael A., and Park, Haesun. PLANC: Parallel Low-rank Approximation with Nonnegativity Constraints. United States: N. p., 2021. Web. doi:10.1145/3432185.
Eswar, Srinivas, Hayashi, Koby, Ballard, Grey, Kannan, Ramakrishnan, Matheson, Michael A., & Park, Haesun. PLANC: Parallel Low-rank Approximation with Nonnegativity Constraints. United States. https://doi.org/10.1145/3432185
Eswar, Srinivas, Hayashi, Koby, Ballard, Grey, Kannan, Ramakrishnan, Matheson, Michael A., and Park, Haesun. 2021. "PLANC: Parallel Low-rank Approximation with Nonnegativity Constraints". United States. https://doi.org/10.1145/3432185. https://www.osti.gov/servlets/purl/1820808.
@article{osti_1820808,
title = {PLANC: Parallel Low-rank Approximation with Nonnegativity Constraints},
author = {Eswar, Srinivas and Hayashi, Koby and Ballard, Grey and Kannan, Ramakrishnan and Matheson, Michael A. and Park, Haesun},
abstractNote = {In this work, we consider the problem of low-rank approximation of massive dense nonnegative tensor data, for example, to discover latent patterns in video and imaging applications. As the size of data sets grows, single workstations are hitting bottlenecks in both computation time and available memory. We propose a distributed-memory parallel computing solution to handle massive data sets, loading the input data across the memories of multiple nodes, and performing efficient and scalable parallel algorithms to compute the low-rank approximation. We present a software package called Parallel Low-rank Approximation with Nonnegativity Constraints, which implements our solution and allows for extension in terms of data (dense or sparse, matrices or tensors of any order), algorithm (e.g., from multiplicative updating techniques to alternating direction method of multipliers), and architecture (we exploit GPUs to accelerate the computation in this work). We describe our parallel distributions and algorithms, which are careful to avoid unnecessary communication and computation, show how to extend the software to include new algorithms and/or constraints, and report efficiency and scalability results for both synthetic and real-world data sets.},
doi = {10.1145/3432185},
url = {https://www.osti.gov/biblio/1820808},
journal = {ACM Transactions on Mathematical Software},
issn = {0098-3500},
number = {3},
volume = {47},
place = {United States},
year = {2021},
month = {6}
}
