Data-parallel Python for High Energy Physics Analyses

Paterno, Marc; Green, C.; Kowalkowski, J.; Sehrish, S.

Conference · October 26, 2018

OSTI ID: 1490837

Paterno, Marc ^[1]; Green, C. ^[1]; Kowalkowski, J. ^[1]; Sehrish, S. ^[1]

[1] Fermilab

In this paper, we explore features available in Python that are useful for data reduction tasks in High Energy Physics (HEP). High-level abstractions in Python are convenient for implementing data reduction tasks. However, for such abstractions to be practical, they must also perform efficiently. Because the data sets we process are typically large, we care about both I/O performance and in-memory processing speed. In particular, we evaluate the use of data-parallel programming, using MPI and numpy, to process a large experimental data set (42 TiB) stored in an HDF5 file. We measure the speed of processing the data, distinguishing between the time spent reading data and the time spent processing the data in memory, and demonstrate the scalability of both, using up to 1200 KNL nodes (76800 cores) on Cori at NERSC.
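The data-parallel pattern described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes an even block decomposition of rows across ranks, uses a synthetic in-memory numpy array in place of the HDF5 file, and uses a plain Python loop in place of MPI ranks, so that it runs without an MPI installation. The names `block_range`, `process_slice`, `energy`, and the threshold value are hypothetical, chosen only for illustration. Each simulated rank times its "read" and its vectorized numpy computation separately, mirroring the distinction the abstract draws between I/O time and in-memory processing time.

```python
import time
import numpy as np


def block_range(n_rows, n_ranks, rank):
    """Return the [start, stop) row range owned by `rank` under an even block split."""
    base, extra = divmod(n_rows, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def process_slice(column, start, stop, threshold):
    """Copy one rank's slice, then count the entries above `threshold`.

    Returns (count, read_seconds, compute_seconds).
    """
    t0 = time.perf_counter()
    local = np.array(column[start:stop])  # assumption: stands in for an HDF5 slice read
    t1 = time.perf_counter()
    count = int(np.count_nonzero(local > threshold))  # vectorized, no Python-level loop
    t2 = time.perf_counter()
    return count, t1 - t0, t2 - t1


# Hypothetical synthetic column; a real analysis would read from an HDF5 dataset.
rng = np.random.default_rng(seed=0)
energy = rng.exponential(scale=10.0, size=100_003)

n_ranks = 8  # assumption: the loop below stands in for 8 MPI ranks
results = [
    process_slice(energy, *block_range(energy.size, n_ranks, rank), 25.0)
    for rank in range(n_ranks)
]

# The sum over ranks plays the role of an MPI sum-reduction.
total_selected = sum(r[0] for r in results)
# The slowest rank determines the wall-clock time of each phase.
read_time = max(r[1] for r in results)
compute_time = max(r[2] for r in results)
print(total_selected, read_time, compute_time)
```

In an actual MPI program, each rank would compute only its own range from its rank number and the communicator size, read that slice from the file, and combine the per-rank counts with a collective reduction. The decomposition and timing logic would remain the same.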

Research Organization: Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)

Sponsoring Organization: USDOE Office of Science (SC), High Energy Physics (HEP)

DOE Contract Number: AC02-07CH11359

Report Number(s): FERMILAB-CONF-18-577-CD; 1712348

Country of Publication: United States

Language: English

Similar Records

Python and HPC for High Energy Physics Data Analyses

Journal Article · January 1, 2017 · OSTI ID:1490837

Sehrish, S.; Kowalkowski, J.; Paterno, M.; +1 more

Roofline Analysis in the Intel® Advisor to Deliver Optimized Performance for applications on Intel® Xeon Phi™ Processor

Conference · May 23, 2017 · OSTI ID:1490837

Koskela, Tuomas S.; Lobet, Mathieu; Deslippe, Jack; +1 more

Spark and HPC for High Energy Physics Data Analyses

Journal Article · May 1, 2017 · OSTI ID:1490837

Sehrish, Saba; Kowalkowski, Jim; Paterno, Marc
