HPC-NMF: A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is a useful tool for many applications in different domains, such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, efficient distributed algorithms for solving the problem on big data sets are lacking. We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating nonnegative least squares (NLS) subproblems for $$\mathbf{W}$$ and $$\mathbf{H}$$. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). Unlike previous implementations, our algorithm is also flexible: it performs well for both dense and sparse matrices, and it allows the user to choose any one of several algorithms for updating the low-rank factors $$\mathbf{W}$$ and $$\mathbf{H}$$ within the alternating iterations.
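The alternating structure described above can be illustrated with a minimal serial sketch. It is not the HPC-NMF code: it uses no MPI and does no data distribution, and in place of an exact NLS solver it uses multiplicative updates, a common NMF update rule chosen here only for brevity. The `Matrix` type and the functions `multiply`, `transpose`, `residual`, `scale`, and `mu_sweep` are hypothetical names introduced for this example. It assumes a nonnegative input $$\mathbf{A}$$ and strictly positive initial factors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense row-major matrix (hypothetical helper type for this sketch only).
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;
    Matrix(std::size_t r, std::size_t c, double v = 0.0)
        : rows(r), cols(c), data(r * c, v) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Plain triple-loop matrix product Z = X * Y.
Matrix multiply(const Matrix& X, const Matrix& Y) {
    assert(X.cols == Y.rows);
    Matrix Z(X.rows, Y.cols);
    for (std::size_t i = 0; i < X.rows; ++i)
        for (std::size_t k = 0; k < X.cols; ++k)
            for (std::size_t j = 0; j < Y.cols; ++j)
                Z(i, j) += X(i, k) * Y(k, j);
    return Z;
}

Matrix transpose(const Matrix& X) {
    Matrix T(X.cols, X.rows);
    for (std::size_t i = 0; i < X.rows; ++i)
        for (std::size_t j = 0; j < X.cols; ++j)
            T(j, i) = X(i, j);
    return T;
}

// Squared Frobenius norm of A - W * H, the quantity NMF tries to minimize.
double residual(const Matrix& A, const Matrix& W, const Matrix& H) {
    Matrix WH = multiply(W, H);
    assert(WH.rows == A.rows && WH.cols == A.cols);
    double sum = 0.0;
    for (std::size_t i = 0; i < A.data.size(); ++i) {
        const double d = A.data[i] - WH.data[i];
        sum += d * d;
    }
    return sum;
}

// Elementwise X <- X .* num ./ (den + eps); eps guards against division by zero.
void scale(Matrix& X, const Matrix& num, const Matrix& den) {
    assert(X.data.size() == num.data.size() && X.data.size() == den.data.size());
    const double eps = 1e-12;
    for (std::size_t i = 0; i < X.data.size(); ++i)
        X.data[i] *= num.data[i] / (den.data[i] + eps);
}

// One alternating sweep: update H with W fixed, then W with H fixed.
void mu_sweep(const Matrix& A, Matrix& W, Matrix& H) {
    Matrix Wt = transpose(W);
    // H <- H .* (W^T A) ./ (W^T W H)
    scale(H, multiply(Wt, A), multiply(multiply(Wt, W), H));
    Matrix Ht = transpose(H);
    // W <- W .* (A H^T) ./ (W H H^T)
    scale(W, multiply(A, Ht), multiply(W, multiply(H, Ht)));
}
```

To use the sketch, build a nonnegative `Matrix A` of size m x n, initialize `W` (m x k) and `H` (k x n) with positive entries, and call `mu_sweep(A, W, H)` repeatedly while monitoring `residual(A, W, H)`. In a distributed-memory setting, the products involving `A` are the steps that would need interprocessor communication. How HPC-NMF organizes that communication is not shown here.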
- Short Name / Acronym:
- HPC-NMF
- Project Type:
- Open Source, Publicly Available Repository
- Site Accession Number:
- 7320
- Software Type:
- Scientific
- License(s):
- Other
- Programming Language(s):
- C++
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE (Primary Award/Contract Number: AC05-00OR22725)
- DOE Contract Number:
- AC05-00OR22725
- Code ID:
- 73024
- OSTI ID:
- 1339615
- Country of Origin:
- United States
Similar Records
MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization
PLANC: Parallel Low-rank Approximation with Nonnegativity Constraints