
Title: Python and HPC for High Energy Physics Data Analyses

Journal Article ·
 [1];  [1];  [1];  [1]
  1. Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)

High level abstractions in Python that can utilize computing hardware well seem to be an attractive option for writing data reduction and analysis tasks. In this paper, we explore the features available in Python which are useful and efficient for end user analysis in High Energy Physics (HEP). A typical vertical slice of an HEP data analysis is somewhat fragmented: the state of the reduction/analysis process must be saved at certain stages to allow for selective reprocessing of only parts of a generally time-consuming workflow. Also, algorithms tend to be modular because of the heterogeneous nature of most detectors and the need to analyze different parts of the detector separately before combining the information. This fragmentation causes difficulties for interactive data analysis, and as data sets increase in size and complexity (O(10) TiB for a “small” neutrino experiment to the O(10) PiB currently held by the CMS experiment at the LHC), data analysis methods traditional to the field must evolve to make optimum use of emerging HPC technologies and platforms. Mainstream big data tools, while suggesting a direction in terms of what can be done if an entire data set can be available across a system and analyzed with high-level programming abstractions, are not designed with either scientific computing generally, or modern HPC platform features in particular, such as data caching levels, in mind. Our example HPC use case is a search for a new elementary particle which might explain the phenomenon known as “Dark Matter”. Here, using data from the CMS detector, we will use HDF5 as our input data format, and MPI with Python to implement our use case.
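
As a rough illustration of the kind of workflow the abstract describes, the sketch below splits an event sample into contiguous blocks, applies a simple selection to each block, and sums the per-block counts. It is not the authors' implementation. The function names, the `met` values, the threshold, and the rank count are all hypothetical, and the ranks are simulated in a loop so the example runs without an MPI installation. In an actual job, the rank and size would typically come from mpi4py (`comm.Get_rank()`, `comm.Get_size()`), each rank would read only its slice of an HDF5 dataset, and the final sum would be an MPI reduction.

```python
# Sketch: block decomposition of an event sample across parallel ranks.
# Assumption: ranks are simulated in a loop here; with mpi4py each process
# would obtain its own rank and read only its slice of the input file.

def block_range(n_events, n_ranks, rank):
    """Return the half-open [start, stop) range of events owned by `rank`."""
    base, extra = divmod(n_events, n_ranks)
    # The first `extra` ranks each take one additional event.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

def count_selected(values, threshold):
    """Count events whose value exceeds the selection threshold."""
    return sum(1 for v in values if v > threshold)

# Hypothetical per-event quantity (illustrative numbers, not real data).
met = [12.0, 250.5, 87.3, 310.2, 45.1, 199.9, 205.0, 5.4, 401.7, 150.0]
threshold = 200.0
n_ranks = 4

ranges = [block_range(len(met), n_ranks, r) for r in range(n_ranks)]
partial_counts = [count_selected(met[start:stop], threshold)
                  for start, stop in ranges]

# Stands in for a sum-reduction across ranks.
total_selected = sum(partial_counts)
print(ranges, partial_counts, total_selected)
```

Assigning contiguous blocks keeps each rank's read a single slice of the input, which suits array-style storage; other decompositions (for example round-robin) are equally possible.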

Research Organization:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), High Energy Physics (HEP)
Grant/Contract Number:
AC02-07CH11359
OSTI ID:
1413085
Report Number(s):
FERMILAB-CONF-17-437-CD; 1642376
Country of Publication:
United States
Language:
English
