OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Next Generation Workload Management and Analysis System for Big Data

Abstract

We report on the activities and accomplishments of a four-year project (a three-year grant followed by a one-year no-cost extension) to develop a next-generation workload management system for Big Data. The new system is based on the highly successful PanDA software, developed for High Energy Physics (HEP) in 2005. PanDA is used by the ATLAS experiment at the Large Hadron Collider (LHC) and by the AMS experiment on the International Space Station. The program of work described here was carried out by two teams of developers working collaboratively at Brookhaven National Laboratory (BNL) and the University of Texas at Arlington (UTA). These teams worked closely with the original PanDA team; for the sake of clarity, the work of the next-generation team is referred to as the BigPanDA project. Their work has led to the adoption of BigPanDA by the COMPASS experiment at CERN and by many other experiments and science projects worldwide.

Authors:
De, Kaushik [1]
  1. Univ. of Texas, Arlington, TX (United States)
Publication Date:
2017-04-24
Research Org.:
Univ. of Texas, Arlington, TX (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1352908
Report Number(s):
DOE-UTA-8635-1
DOE Contract Number:
SC0008635
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; PanDA

Citation Formats

De, Kaushik. Next Generation Workload Management and Analysis System for Big Data. United States: N. p., 2017. Web. doi:10.2172/1352908.
De, Kaushik. (2017). Next Generation Workload Management and Analysis System for Big Data. United States. doi:10.2172/1352908.
De, Kaushik. 2017. "Next Generation Workload Management and Analysis System for Big Data". United States. doi:10.2172/1352908. https://www.osti.gov/servlets/purl/1352908.
@article{osti_1352908,
title = {Next Generation Workload Management and Analysis System for Big Data},
author = {De, Kaushik},
abstractNote = {We report on the activities and accomplishments of a four-year project (a three-year grant followed by a one-year no-cost extension) to develop a next-generation workload management system for Big Data. The new system is based on the highly successful PanDA software, developed for High Energy Physics (HEP) in 2005. PanDA is used by the ATLAS experiment at the Large Hadron Collider (LHC) and by the AMS experiment on the International Space Station. The program of work described here was carried out by two teams of developers working collaboratively at Brookhaven National Laboratory (BNL) and the University of Texas at Arlington (UTA). These teams worked closely with the original PanDA team; for the sake of clarity, the work of the next-generation team is referred to as the BigPanDA project. Their work has led to the adoption of BigPanDA by the COMPASS experiment at CERN and by many other experiments and science projects worldwide.},
doi = {10.2172/1352908},
place = {United States},
year = {2017},
month = {4}
}

  • The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited with the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) to manage the workflow for all data processing at hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, and O(10^3) users, and the ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project, titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA), is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute", together with ALICE distributed computing and ORNL computing professionals.
Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. Finally, we will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.
  • This summary report describes data management and visualization activities in the Advanced Simulation and Computing (ASC) program at Lawrence Livermore National Laboratory (LLNL). The report covers the period from approximately October 2003 to June 2004 and describes activities within the Visual Interactive Environment for Weapons Simulation (VIEWS) ASC program element. This report and the references herein are intended to document the completion of the following Level 2 Milestone from the ASC FY04-05 Implementation Plan, due at the end of Quarter 3 in FY04:
  • BPA proposes to clear unwanted vegetation from the rights of way and access roads for BPA's Big Eddy-Ostrander Transmission Line, beginning in April and ending in May 2001. A Checklist was completed for this project in accordance with the requirements identified in the Bonneville Power Administration's Transmission System Vegetation Management Program FEIS (DOE/EIS-0285). The Checklist evaluated the following areas: (1) Description of right-of-way and vegetation management needed; (2) Vegetation to be controlled; (3) Surrounding land use and landowner; (4) Natural Resource; (5) Vegetation control methods; (6) Debris disposal; (7) Monitoring; and (8) Appropriate environmental documentation. In preparation of this Supplement Analysis, the Checklist was reviewed. Specific information regarding the areas identified above is described in the attached checklist. This Supplement Analysis finds that: (1) the proposed actions are substantially consistent with the Transmission System Vegetation Management Program FEIS (DOE/EIS-0285) and ROD; and (2) there are no new circumstances or information relevant to environmental concerns and bearing on the proposed actions or their impacts. Therefore, no further NEPA documentation is required.
  • The purpose is to perform remedial vegetation management to keep vegetation a safe distance away from electric power facilities and to control noxious weeds within a section of BPA's Big Eddy-Ostrander Transmission Corridor. During a site review conducted in late fall of 2001, the inspector observed various species of hardwood trees resprouting from stumps. The new vegetative growth encroached on the required “Minimum Safe Distance” between the top of vegetation and the conductor cables. The management action is necessary to reduce the current and potential future hazards that tall-growing vegetation poses to transmission conductors. In addition, BPA will include weed control as part of its remedial vegetation management action. Noxious weeds occur within the corridor. Under a 1999 Executive Order, all federal agencies are required to detect and control noxious weeds. In addition, BPA is required under the 1990 amendment to the Noxious Weed Act (7 USC 2801-2814) to manage undesirable plants on federal land. Also, the Bonneville Power Administration (BPA) has responsibility to manage noxious weeds under the Transmission System Vegetation Management Program Final Environmental Impact Statement (FEIS). State statutes and regulations also mandate action by BPA and the USFS to control noxious weeds. The Oregon Department of Agriculture (ODA) has requested that agencies aggressively control these weeds before additional spread occurs.