
Title: Expressing Parallelism with ROOT

Abstract

The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.
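The abstract compares ROOT's multi-process framework with Python's multiprocessing module. As a rough illustration of that map-and-merge style only, the following sketch uses the Python standard library, not ROOT; the function names `fill_histogram`, `merge` and `run`, the bin count and the event count are hypothetical choices made for this example and do not come from the paper.

```python
import multiprocessing as mp
import random
from collections import Counter


def fill_histogram(seed, n_events=10_000, n_bins=10):
    """Hypothetical worker task: histogram n_events pseudo-random values in [0, 1)."""
    rng = random.Random(seed)  # one independent, reproducible generator per task
    counts = Counter()
    for _ in range(n_events):
        bin_index = min(int(rng.random() * n_bins), n_bins - 1)
        counts[bin_index] += 1
    return counts


def merge(partials):
    """Reduce step: add the per-worker histograms bin by bin."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run(seeds, n_workers=4):
    """Map fill_histogram over the seeds in separate processes, then merge."""
    with mp.Pool(processes=n_workers) as pool:
        partials = pool.map(fill_histogram, seeds)
    return merge(partials)


if __name__ == "__main__":
    hist = run(range(8))
    print(sum(hist.values()))  # 8 tasks x 10000 events -> 80000
```

Each worker returns an independent partial result and the parent process merges them, so no state is shared between processes. This is a sketch of the general pattern under the stated assumptions, not a description of how the MultiProc framework is implemented.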

Authors:
Piparo, D. [1]; Tejedor, E. [1]; Guiraud, E. [1]; Ganis, G. [1]; Mato, P. [1]; Moneta, L. [1]; Valls Pla, X. [1]; Canal, P. [2]
  1. European Organization for Nuclear Research (CERN), Geneva (Switzerland)
  2. Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Publication Date:
2017-11-22
Research Org.:
Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), High Energy Physics (HEP)
OSTI Identifier:
1415642
Report Number(s):
FERMILAB-CONF-16-738-CD
Journal ID: ISSN 1742-6588; 1638554
Grant/Contract Number:  
AC02-07CH11359
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Journal of Physics. Conference Series
Additional Journal Information:
Journal Volume: 898; Journal Issue: 7; Conference: 22nd International Conference on Computing in High Energy and Nuclear Physics, San Francisco, CA, 10/10-10/14/2016; Journal ID: ISSN 1742-6588
Publisher:
IOP Publishing
Country of Publication:
United States
Language:
English
Subject:
72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS; 97 MATHEMATICS AND COMPUTING

Citation Formats

Piparo, D., Tejedor, E., Guiraud, E., Ganis, G., Mato, P., Moneta, L., Valls Pla, X., and Canal, P. Expressing Parallelism with ROOT. United States: N. p., 2017. Web. doi:10.1088/1742-6596/898/7/072022.
Piparo, D., Tejedor, E., Guiraud, E., Ganis, G., Mato, P., Moneta, L., Valls Pla, X., & Canal, P. Expressing Parallelism with ROOT. United States. https://doi.org/10.1088/1742-6596/898/7/072022
Piparo, D., Tejedor, E., Guiraud, E., Ganis, G., Mato, P., Moneta, L., Valls Pla, X., and Canal, P. 2017. "Expressing Parallelism with ROOT". United States. https://doi.org/10.1088/1742-6596/898/7/072022. https://www.osti.gov/servlets/purl/1415642.
@article{osti_1415642,
title = {Expressing Parallelism with ROOT},
author = {Piparo, D. and Tejedor, E. and Guiraud, E. and Ganis, G. and Mato, P. and Moneta, L. and Valls Pla, X. and Canal, P.},
abstractNote = {The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.},
doi = {10.1088/1742-6596/898/7/072022},
url = {https://www.osti.gov/biblio/1415642},
journal = {Journal of Physics. Conference Series},
issn = {1742-6588},
number = 7,
volume = 898,
place = {United States},
year = {2017},
month = {11}
}

Figures / Tables:

Figure 1: Reading, decompressing and deserializing a dataset. The CMS dataset features about 70 columns holding information about event simulation performed with Geant4 (GenSim data tier). The ATLAS input is a dataset meant for final analysis (xAOD format) and features about 200 columns. The runtime is reduced by a factor of 3 to 3.5 on a machine with four physical cores. No change to the user code was necessary, just a call to the global function ROOT::EnableImplicitMT().
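The caption describes column-wise reading, decompression and deserialization spread over several cores. The sketch below imitates only the shape of that work with the Python standard library: each compressed column is handed to a worker thread as an independent task. It does not use ROOT and is not how ROOT implements implicit multi-threading; the helpers `make_column`, `read_column` and `read_columns` and the column layout (little-endian doubles compressed with zlib) are assumptions made for this example.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_column(values):
    """Serialize a list of floats and compress it; stands in for one stored column."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return zlib.compress(raw)


def read_column(blob):
    """Decompress one column and deserialize it back into a tuple of floats."""
    raw = zlib.decompress(blob)
    return struct.unpack(f"<{len(raw) // 8}d", raw)


def read_columns(blobs, n_threads=4):
    """Read every column in blobs (name -> compressed bytes), one task per column."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # pool.map preserves input order, so results line up with the dict keys.
        return dict(zip(blobs, pool.map(read_column, blobs.values())))


if __name__ == "__main__":
    columns = {f"col{i}": [float(i + j) for j in range(1000)] for i in range(8)}
    blobs = {name: make_column(vals) for name, vals in columns.items()}
    restored = read_columns(blobs)
    print(all(list(restored[name]) == columns[name] for name in columns))  # True
```

The calling code stays a single function call regardless of the number of threads, which mirrors the caption's point that user code did not have to change. The sketch makes no claim about achievable speed-up.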

Figures/Tables have been extracted from DOE-funded journal article accepted manuscripts.