Title: FatMan vs. LittleBoy: Scaling up Linear Algebraic Operations in Scale-out Data Platforms

Conference · OSTI ID: 1339386

Linear algebraic operations such as matrix manipulations form the kernel of many machine learning and other crucial algorithms. Scaling such algorithms both up and out is highly desirable to enable efficient processing over millions of data points. To this end, we present a matrix manipulation approach that effectively scales up each node in a scale-out data-parallel platform such as Apache Spark. Specifically, we enable hardware acceleration for matrix multiplications in a distributed Spark setup without user intervention. Our approach supports both dense and sparse distributed matrices and provides flexible control of acceleration based on matrix density. We demonstrate the benefit of our approach for generalized matrix multiplication operations over large matrices with up to four billion elements. To connect the effectiveness of our approach with machine learning applications, we performed Gramian matrix computation via generalized matrix multiplications. Our experiments show that our approach achieves more than 2x performance speed-up, and up to 96.1% computation improvement, compared to the state-of-the-art Spark MLlib for dense matrices.
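
The abstract's two central ideas, computing a Gramian matrix A^T A through matrix multiplications over a row-partitioned matrix and choosing a kernel by matrix density, can be illustrated with a minimal single-process sketch. This is not the paper's implementation and involves neither Spark nor hardware acceleration: the function names, the row-block layout, and the 0.5 density threshold are all assumptions made for illustration. The sketch relies on the identity that A^T A equals the sum of the per-block products A_i^T A_i, so each block can be processed independently and the partial results added.

```python
from typing import List

Matrix = List[List[float]]


def density(block: Matrix) -> float:
    """Fraction of nonzero entries in a row block."""
    total = sum(len(row) for row in block)
    nonzero = sum(1 for row in block for x in row if x != 0)
    return nonzero / total if total else 0.0


def gram_dense(block: Matrix) -> Matrix:
    """Dense A_i^T A_i; a stand-in for where an accelerated GEMM could be called."""
    n = len(block[0])
    out = [[0.0] * n for _ in range(n)]
    for row in block:
        for i in range(n):
            for j in range(n):
                out[i][j] += row[i] * row[j]
    return out


def gram_sparse(block: Matrix) -> Matrix:
    """Same product, but visiting only the nonzero entries of each row."""
    n = len(block[0])
    out = [[0.0] * n for _ in range(n)]
    for row in block:
        nonzeros = [(i, x) for i, x in enumerate(row) if x != 0]
        for i, xi in nonzeros:
            for j, xj in nonzeros:
                out[i][j] += xi * xj
    return out


def gramian(blocks: List[Matrix], density_threshold: float = 0.5) -> Matrix:
    """Sum per-block Gramians, picking a kernel per block by its density.

    The threshold value is hypothetical; assumes non-empty blocks of equal width.
    """
    n = len(blocks[0][0])
    total = [[0.0] * n for _ in range(n)]
    for block in blocks:
        kernel = gram_dense if density(block) >= density_threshold else gram_sparse
        partial = kernel(block)
        for i in range(n):
            for j in range(n):
                total[i][j] += partial[i][j]
    return total
```

In a distributed setting the loop over blocks would correspond to per-partition work followed by a reduction, and the dense kernel is the natural place to hand off to an optimized library; both of those mappings are assumptions here rather than details confirmed by the abstract.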

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program
DOE Contract Number:
AC05-00OR22725
Resource Relation:
Conference: 1st Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems, Salt Lake City, UT, USA, November 14, 2016
Country of Publication:
United States
Language:
English