OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing

Journal Article · Journal of Physics. Conference Series
 [1];  [2];  [3];  [4];  [1];  [5];  [1];  [3];  [1];  [3];  [6];  [7];  [8];  [7];  [1]
  1. Brookhaven National Lab. (BNL), Upton, NY (United States)
  2. European Organization for Nuclear Research (CERN), Geneva (Switzerland)
  3. Univ. of Texas, Arlington, TX (United States)
  4. Rutgers Univ., Piscataway, NJ (United States)
  5. SLAC National Accelerator Lab., Menlo Park, CA (United States)
  6. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  7. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  8. Argonne National Lab. (ANL), Argonne, IL (United States)

The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited with the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) to manage the workflow for all data processing at hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, and O(10^3) users, and the ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project, titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA), is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system.
Finally, we will present our current accomplishments in running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities' infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization:
USDOE Office of Science (SC), High Energy Physics (HEP)
Contributing Organization:
Brookhaven National Lab. (BNL), Upton, NY (United States); Univ. of Texas, Arlington, TX (United States)
Grant/Contract Number:
AC05-00OR22725; AC02-98CH10886; AC02-06CH11357
OSTI ID:
1265526
Journal Information:
Journal of Physics. Conference Series, Vol. 608, Issue 1; ISSN 1742-6588
Publisher:
IOP Publishing
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 16 works
Citation information provided by Web of Science
