Distributed-Memory Parallel JointNMF
- Argonne National Laboratory
- Georgia Institute of Technology, Atlanta
- Oak Ridge National Laboratory (ORNL)
- Wake Forest University, Winston-Salem
Joint Nonnegative Matrix Factorization (JointNMF) is a hybrid method for mining information from datasets that contain both feature and connection information. We propose distributed-memory parallelizations of three algorithms for solving the JointNMF problem, based on Alternating Nonnegative Least Squares (ANLS), Projected Gradient Descent (PGD), and Projected Gauss-Newton. We extend well-known communication-avoiding algorithms, originally designed for a single processor grid, to our coupled case on two processor grids. We demonstrate the scalability of the algorithms on up to 960 cores (40 nodes) with 60% parallel efficiency. The more sophisticated ANLS and Gauss-Newton variants outperform the first-order gradient descent method in reducing the objective on large-scale problems. Finally, we perform a topic modeling task on a large corpus of academic papers consisting of over 37 million paper abstracts and nearly a billion citation relationships, demonstrating the utility and scalability of the methods.
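The record itself does not state the optimization problem. A common JointNMF formulation from the literature, which matches the feature-plus-connection setting described in the abstract, couples the two data matrices through a shared factor H (the matrix names, orientations, and the Frobenius-norm objective below are assumptions, not taken from the paper):

```latex
\min_{W \ge 0,\; H \ge 0}
  \left\lVert X - W H \right\rVert_F^2
  + \alpha \left\lVert S - H^{\top} H \right\rVert_F^2
```

Here X (m-by-n) holds the features (e.g., a word-by-document matrix), S (n-by-n) is a symmetric connection matrix (e.g., citation links), W (m-by-k) and H (k-by-n) are the nonnegative low-rank factors, and alpha > 0 balances the two terms.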
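As a concrete illustration of the Projected Gradient Descent variant named in the abstract, the sketch below minimizes the objective above with plain NumPy. It is a minimal single-node sketch under the assumed formulation, with a fixed step size; it does not reproduce the paper's distributed-memory algorithm or its communication-avoiding structure on two processor grids, and the function name jointnmf_pgd is hypothetical.

```python
import numpy as np

def jointnmf_pgd(X, S, k, alpha=1.0, step=1e-3, iters=200, seed=0):
    """Projected gradient descent sketch for the assumed JointNMF objective
    min ||X - WH||_F^2 + alpha * ||S - H^T H||_F^2  s.t.  W, H >= 0.
    Serial NumPy illustration only; a fixed step size is used for simplicity."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        R = W @ H - X                                  # residual of the feature term
        grad_W = 2.0 * R @ H.T                         # gradient of ||X - WH||_F^2 in W
        grad_H = 2.0 * W.T @ R + 4.0 * alpha * H @ (H.T @ H - S)  # assumes S symmetric
        W = np.maximum(W - step * grad_W, 0.0)         # project onto nonnegative orthant
        H = np.maximum(H - step * grad_H, 0.0)
    return W, H

if __name__ == "__main__":
    # Tiny synthetic example: random features plus a symmetric connection matrix.
    rng = np.random.default_rng(1)
    X = rng.random((100, 50))
    A = rng.random((50, 50))
    S = (A + A.T) / 2.0
    W, H = jointnmf_pgd(X, S, k=10)
```

The gradient step updates both factors before projecting each back onto the nonnegative orthant; the more sophisticated ANLS and Gauss-Newton variants the abstract reports as stronger would replace these first-order steps with nonnegative least-squares solves or Newton-type updates.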
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1997627
- Resource Relation:
- Conference: International Conference on Supercomputing (ICS), Orlando, Florida, United States of America, June 21-23, 2023
- Country of Publication:
- United States
- Language:
- English