Exploring the Performance of Spark for a Scientific Use Case

Sehrish, Saba; Kowalkowski, Jim; Paterno, Marc


Conference · Fri Jan 01 00:00:00 EST 2016

OSTI ID:1250827

Sehrish, Saba [1]; Kowalkowski, Jim [1]; Paterno, Marc [1]

[1] Fermilab

We present an evaluation of the performance of a Spark implementation of a classification algorithm in the domain of High Energy Physics (HEP). Spark is a general engine for in-memory, large-scale data processing, designed for applications in which similar analyses are performed repeatedly on the same large data sets. Classification problems are among the most common and critical data processing tasks across many domains. Many of these tasks are both computation- and data-intensive, involving complex numerical computations on extremely large data sets. We evaluated the performance of the Spark implementation on Cori, a NERSC resource, and compared the results to an untuned MPI implementation of the same algorithm. While the Spark implementation scaled well, it was not competitive in speed with our MPI implementation, even when using significantly greater computational resources.


Research Organization: Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)

Sponsoring Organization: USDOE Office of Science (SC), High Energy Physics (HEP)

DOE Contract Number: AC02-07CH11359

OSTI ID: 1250827

Report Number(s): FERMILAB-CONF-16-072-CD; 1442301

Country of Publication: United States

Language: English

Similar Records

Data-parallel Python for High Energy Physics Analyses

Conference · Fri Oct 26 00:00:00 EDT 2018 · OSTI ID:1250827

Paterno, Marc; Green, C.; Kowalkowski, J.; +1 more

Exploring MPI Communication Models for Graph Applications Using Graph Matching as a Case Study

Conference · Mon Sep 02 00:00:00 EDT 2019 · OSTI ID:1250827

Ghosh, Sayan; Halappanavar, Mahantesh; Kalyanaraman, Anantharaman; +2 more

Roofline Analysis in the Intel® Advisor to Deliver Optimized Performance for applications on Intel® Xeon Phi™ Processor

Conference · Tue May 23 00:00:00 EDT 2017 · OSTI ID:1250827

Koskela, Tuomas S.; Lobet, Mathieu; Deslippe, Jack; +1 more
