DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: CMS readiness for multi-core workload scheduling

Abstract

In the present run of the LHC, CMS data reconstruction and simulation algorithms benefit greatly from being executed as multiple threads running on several processor cores. The complexity of the Run 2 events requires parallelization of the code to reduce the memory-per-core footprint constraining serial execution programs, thus optimizing the exploitation of present multi-core processor architectures. The allocation of computing resources for multi-core tasks, however, becomes a complex problem in itself. The CMS workload submission infrastructure employs multi-slot partitionable pilots, built on HTCondor and GlideinWMS native features, to enable scheduling of single and multi-core jobs simultaneously. This provides a solution for the scheduling problem in a uniform way across grid sites running a diversity of gateways to compute resources and batch system technologies. This paper presents this strategy and the tools on which it has been implemented. The experience of managing multi-core resources at the Tier-0 and Tier-1 sites during 2015, along with the deployment phase to Tier-2 sites during early 2016, is reported. The process of performance monitoring and optimization to achieve efficient and flexible use of the resources is also described.

Authors:
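 Pérez-Calero Yzquierdo, A. [1]; Balcas, J. [2]; Hernandez, J. [3]; Aftab Khan, F. [4]; Letts, J. [5]; Mason, D. [6]; Verguilov, V. [7]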
  1. Port d'Informació Científica (PIC), Barcelona (Spain); Research Centre for Energy, Environment and Technology (CIEMAT), Madrid (Spain)
  2. California Inst. of Technology (CalTech), Pasadena, CA (United States)
  3. Research Centre for Energy, Environment and Technology (CIEMAT), Madrid (Spain)
  4. Quaid-I-Azam Univ., Islamabad (Pakistan)
  5. Univ. of California, San Diego, CA (United States)
  6. Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
  7. Bulgarian Academy of Sciences, Sofia (Bulgaria)
Publication Date:
2017-11-22
Research Org.:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), High Energy Physics (HEP)
OSTI Identifier:
1420916
Report Number(s):
FERMILAB-CONF-16-755-CD
Journal ID: ISSN 1742-6588; 1638487
Grant/Contract Number:  
AC02-07CH11359
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Physics. Conference Series
Additional Journal Information:
Journal Volume: 898; Journal Issue: 5; Conference: 22nd International Conference on Computing in High Energy and Nuclear Physics, San Francisco, CA, 10/10-10/14/2016; Journal ID: ISSN 1742-6588
Publisher:
IOP Publishing
Country of Publication:
United States
Language:
English
Subject:
72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS; 97 MATHEMATICS AND COMPUTING

Citation Formats

Yzquierdo, A. Perez-Calero, Balcas, J., Hernandez, J., Aftab Khan, F., Letts, J., Mason, D., and Verguilov, V. CMS readiness for multi-core workload scheduling. United States: N. p., 2017. Web. doi:10.1088/1742-6596/898/5/052030.
Yzquierdo, A. Perez-Calero, Balcas, J., Hernandez, J., Aftab Khan, F., Letts, J., Mason, D., & Verguilov, V. CMS readiness for multi-core workload scheduling. United States. https://doi.org/10.1088/1742-6596/898/5/052030
Yzquierdo, A. Perez-Calero, Balcas, J., Hernandez, J., Aftab Khan, F., Letts, J., Mason, D., and Verguilov, V. 2017. "CMS readiness for multi-core workload scheduling". United States. https://doi.org/10.1088/1742-6596/898/5/052030. https://www.osti.gov/servlets/purl/1420916.
@article{osti_1420916,
title = {CMS readiness for multi-core workload scheduling},
author = {Yzquierdo, A. Perez-Calero and Balcas, J. and Hernandez, J. and Aftab Khan, F. and Letts, J. and Mason, D. and Verguilov, V.},
abstractNote = {In the present run of the LHC, CMS data reconstruction and simulation algorithms benefit greatly from being executed as multiple threads running on several processor cores. The complexity of the Run 2 events requires parallelization of the code to reduce the memory-per-core footprint constraining serial execution programs, thus optimizing the exploitation of present multi-core processor architectures. The allocation of computing resources for multi-core tasks, however, becomes a complex problem in itself. The CMS workload submission infrastructure employs multi-slot partitionable pilots, built on HTCondor and GlideinWMS native features, to enable scheduling of single and multi-core jobs simultaneously. This provides a solution for the scheduling problem in a uniform way across grid sites running a diversity of gateways to compute resources and batch system technologies. This paper presents this strategy and the tools on which it has been implemented. The experience of managing multi-core resources at the Tier-0 and Tier-1 sites during 2015, along with the deployment phase to Tier-2 sites during early 2016, is reported. The process of performance monitoring and optimization to achieve efficient and flexible use of the resources is also described.},
doi = {10.1088/1742-6596/898/5/052030},
journal = {Journal of Physics. Conference Series},
number = 5,
volume = 898,
place = {United States},
year = {2017},
month = {11}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Figures / Tables:

Figure 1: Schematic view of the CMS global pool including its main components.

Works referenced in this record:

Using the CMS High Level Trigger as a Cloud Resource
journal, June 2014


Works referencing / citing this record:

Evolution of the CMS Global Submission Infrastructure for the HL-LHC Era
journal, January 2020

  • Pérez-Calero Yzquierdo, Antonio; Acosta Flechas, Maria; Davila Foyo, Diego
  • EPJ Web of Conferences, Vol. 245
  • DOI: 10.1051/epjconf/202024503016
