Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

Klimentov, A.; De, K.; Jha, S.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Wells, J.; Wenaus, T.

doi:10.1088/1742-6596/762/1/012021

Journal Article · October 1, 2016 · Journal of Physics. Conference Series

DOI: https://doi.org/10.1088/1742-6596/762/1/012021 · OSTI ID: 1567418

Klimentov, A. [1]; De, K. [2]; Jha, S. [3]; Maeno, T. [1]; Nilsson, P. [1]; Oleynik, D. [4]; Panitkin, S. [1]; Wells, J. [5]; Wenaus, T. [1]

[1] Brookhaven National Lab. (BNL), Upton, NY (United States). Dept. of Physics
[2] Univ. of Texas, Arlington, TX (United States). Dept. of Physics
[3] Rutgers Univ., Piscataway, NJ (United States). Dept. of Electrical and Computer Engineering
[4] Univ. of Texas, Arlington, TX (United States). Dept. of Physics; Joint Inst. for Nuclear Research (JINR), Dubna (Russia)
[5] Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)

The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. ATLAS uses the PanDA (Production and Data Analysis) Workload Management System (WMS) to manage the workflow for all data processing at over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data-taking runs require more resources than the grid can provide. To address this shortfall, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources, such as the opportunistic use of supercomputers. We describe a project aimed at integrating the PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). The current approach uses a modified PanDA pilot framework for job submission to the supercomputers' batch queues and for local data management, with lightweight MPI wrappers to run single-threaded workloads in parallel on the multi-core worker nodes of Leadership Computing Facilities (LCFs). This implementation was tested with a variety of Monte Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and it has been in full production for ATLAS since September 2015.
We present our current accomplishments in running PanDA on supercomputers and demonstrate our ability to use PanDA as a portal, independent of the computing facilities' infrastructure, for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
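
The idea of a lightweight MPI wrapper around single-threaded workloads can be illustrated with a minimal Python sketch. This is not the PanDA pilot code: it assumes the mpi4py package is available on the worker nodes, and the names payloads_for_rank, run_payload, and main, the round-robin assignment, and the rank_NNNNN working-directory layout are all hypothetical choices made for illustration.

```python
import os
import subprocess


def payloads_for_rank(rank, size, payloads):
    """Round-robin split: rank r takes payloads r, r + size, r + 2*size, ..."""
    return payloads[rank::size]


def run_payload(command, workdir):
    """Run one single-threaded payload in its own working directory.

    Returns the payload's exit code.
    """
    os.makedirs(workdir, exist_ok=True)
    result = subprocess.run(command, cwd=workdir, capture_output=True, text=True)
    return result.returncode


def main(payloads):
    """Entry point executed by every MPI rank of one batch job (hypothetical)."""
    # Imported here so the helpers above remain usable without MPI installed.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Each rank runs its share of independent payloads, one after another.
    codes = []
    for i, command in enumerate(payloads_for_rank(rank, size, payloads)):
        codes.append(run_payload(command, f"rank_{rank:05d}/job_{i}"))

    # Rank 0 collects all exit codes so the batch job reports a single status.
    all_codes = comm.gather(codes, root=0)
    if rank == 0:
        failed = sum(code != 0 for per_rank in all_codes for code in per_rank)
        print(f"{failed} payload(s) failed")
        return 1 if failed else 0
    return 0
```

In this sketch a single batch submission would start one rank per core through the site's MPI launcher, and each rank would call main with the same list of payload commands. The ranks never exchange data while the payloads run, which is why the wrapper can stay thin: MPI serves only to place the processes and to collect their exit codes.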

Research Organization: Brookhaven National Lab. (BNL), Upton, NY (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)

Sponsoring Organization: USDOE Office of Science (SC)

Grant/Contract Number: AC02-98CH10886; AC05-00OR22725

OSTI ID: 1567418

Journal Information: Journal of Physics. Conference Series, Vol. 762; Conference: 17. International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2016), Valparaiso (Chile), 18-22 Jan 2016; ISSN 1742-6588

Publisher: IOP Publishing

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
Computer Science
Physics
