U.S. Department of Energy
Office of Scientific and Technical Information

Partitioning and Communication Strategies for Sparse Non-negative Matrix Factorization

Conference
Non-negative matrix factorization (NMF), the problem of finding two non-negative low-rank factors whose product approximates an input matrix, is a useful tool for many data mining and scientific applications, such as topic modeling in text mining and unmixing in microscopy. In this paper, we focus on scaling NMF algorithms to very large sparse datasets and massively parallel machines by employing effective algorithms, communication patterns, and partitioning schemes that leverage the sparsity of the input matrix. We consider two previous works developed for related problems: one that pairs a fine-grained partitioning strategy with a point-to-point communication pattern, and one that pairs a Cartesian, or checkerboard, partitioning strategy with a collective-based communication pattern. We show that a combination of the two approaches balances the demands of the various computations within NMF algorithms and achieves high efficiency and scalability. In our experiments, the proposed strategy runs up to 10x faster than the state of the art on real-world datasets.
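To make the problem statement in the abstract concrete, here is a minimal single-process sketch of NMF on a sparse input, written in plain Python. It is not the paper's parallel method: it assumes the classical multiplicative-update rule, chosen here only for brevity, and the names `nmf_sparse` and `squared_error`, the dictionary storage of nonzeros, and all parameters are hypothetical. What it does show is how the two products that involve the input matrix can be computed by visiting only its nonzero entries, which is the sparsity the abstract refers to.

```python
import random


def nmf_sparse(nonzeros, m, n, k, iters=200, seed=0, eps=1e-9):
    """Approximate an m x n non-negative matrix A by W (m x k) times H (k x n).

    `nonzeros` maps (row, col) -> value and holds only the nonzero entries of A.
    Uses multiplicative updates (an assumption, not the paper's algorithm).
    """
    rng = random.Random(seed)
    # Strictly positive random start keeps every later update non-negative.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]

    for _ in range(iters):
        # W^T A: touches only the nonzeros of A.
        WtA = [[0.0] * n for _ in range(k)]
        for (i, j), v in nonzeros.items():
            for r in range(k):
                WtA[r][j] += W[i][r] * v
        # W^T W: small dense k x k Gram matrix.
        WtW = [[sum(W[i][a] * W[i][b] for i in range(m)) for b in range(k)]
               for a in range(k)]
        # H <- H * (W^T A) / (W^T W H), elementwise, built from the old H.
        H = [[H[r][j] * WtA[r][j]
              / (sum(WtW[r][s] * H[s][j] for s in range(k)) + eps)
              for j in range(n)] for r in range(k)]

        # A H^T: again touches only the nonzeros of A.
        AHt = [[0.0] * k for _ in range(m)]
        for (i, j), v in nonzeros.items():
            for r in range(k):
                AHt[i][r] += v * H[r][j]
        # H H^T: small dense k x k Gram matrix.
        HHt = [[sum(H[a][j] * H[b][j] for j in range(n)) for b in range(k)]
               for a in range(k)]
        # W <- W * (A H^T) / (W H H^T), elementwise, built from the old W.
        W = [[W[i][r] * AHt[i][r]
              / (sum(W[i][s] * HHt[s][r] for s in range(k)) + eps)
              for r in range(k)] for i in range(m)]
    return W, H


def squared_error(nonzeros, W, H, m, n):
    """Squared Frobenius norm of A - W H (dense loop; for small checks only)."""
    k = len(H)
    total = 0.0
    for i in range(m):
        for j in range(n):
            approx = sum(W[i][r] * H[r][j] for r in range(k))
            total += (nonzeros.get((i, j), 0.0) - approx) ** 2
    return total


if __name__ == "__main__":
    # Toy 3 x 4 sparse matrix with six nonzeros (illustrative data only).
    A = {(0, 0): 1.0, (0, 2): 2.0, (1, 1): 3.0,
         (1, 3): 1.0, (2, 0): 2.0, (2, 2): 4.0}
    W, H = nmf_sparse(A, m=3, n=4, k=2)
    print("squared error:", squared_error(A, W, H, 3, 4))
```

In a distributed setting, the loops over `nonzeros` are the part whose cost depends on how the entries of the input matrix are partitioned across processors, while the small Gram matrices are the kind of quantity a collective reduction would combine. The sketch leaves both of those concerns out.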
Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1470857
Country of Publication:
United States
Language:
English

