Scaling Up Data-Parallel Analytics Platforms: Linear Algebraic Operation Cases

Xu, Luna; Lim, Seung-Hwan; Li, Min; Butt, Ali R.; Kannan, Ramakrishnan {ramki}

doi:10.1109/BigData.2017.8257935

Conference · Thu Nov 30 23:00:00 EST 2017

DOI: https://doi.org/10.1109/BigData.2017.8257935 · OSTI ID: 1422792

Xu, Luna ^[1]; Lim, Seung-Hwan ^[2]; Li, Min ^[3]; Butt, Ali R. ^[1]; Kannan, Ramakrishnan ^[2]

[1] Virginia Tech, Blacksburg, VA
[2] ORNL
[3] IBM Almaden Research

Linear algebraic operations such as matrix manipulations form the kernel of many machine learning and other crucial algorithms. Scaling up as well as scaling out such algorithms is key to supporting large-scale data analysis that requires efficient processing over millions of data samples. To this end, we present ARION, a hardware-acceleration-based approach for scaling up individual tasks of Spark, a popular data-parallel analytics platform. We support linear algebraic operations both between two dense matrices and between sparse and dense matrices in distributed environments. ARION provides flexible control of acceleration according to matrix density, along with efficient scheduling based on runtime resource utilization. We demonstrate the benefit of our approach for general matrix multiplication operations over large matrices with up to four billion elements by using Gramian matrix computation, which is commonly used in machine learning. Experiments show that our approach achieves more than 2× and 1.5× end-to-end performance speedups for dense and sparse matrices, respectively, and up to 57.04× faster computation compared to MLlib, a state-of-the-art Spark-based implementation.

This work is sponsored in part by the NSF under the grants CNS-1565314, CNS-1405697, and CNS-1615411. The manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
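
The Gramian computation named in the abstract can be illustrated with a small sketch. For a matrix A whose rows are data samples, the Gramian is G = AᵀA, and it can be formed by summing partial Gramians computed independently over row partitions, which is what makes it a natural fit for a data-parallel platform. The code below is only an illustration under stated assumptions: it is plain Python rather than Spark, the function names (`partial_gramian`, `add_matrices`, `gramian`) are hypothetical, the list of partitions merely stands in for distributed tasks, and it does not reproduce ARION's hardware acceleration or scheduling.

```python
from functools import reduce


def partial_gramian(rows):
    """Return the n x n partial Gramian (sum of r^T r) over the given rows."""
    n = len(rows[0])
    g = [[0.0] * n for _ in range(n)]
    for r in rows:
        for i in range(n):
            if r[i] == 0:
                continue  # a zero entry contributes nothing to row i of the sum
            for j in range(n):
                g[i][j] += r[i] * r[j]
    return g


def add_matrices(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def gramian(partitions):
    """Combine per-partition partial Gramians into G = A^T A."""
    return reduce(add_matrices, (partial_gramian(p) for p in partitions))


# Hypothetical 4 x 2 sample matrix, split into two row partitions.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 1.0]]
partitions = [A[:2], A[2:]]
G = gramian(partitions)
print(G)
```

Each call to `partial_gramian` is an independent local matrix product, so in a scaled-up setting that per-partition step is the part one would hand to an optimized or accelerated kernel, while the final reduction stays cheap because every partial result is only n × n.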

Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-00OR22725

OSTI ID: 1422792

Country of Publication: United States

Language: English
