PanDA: Production and Distributed Analysis System
Journal Article · Computing and Software for Big Science
- Brookhaven National Laboratory (BNL), Upton, NY (United States)
- University of Texas at Arlington, TX (United States)
- University of Pittsburgh, PA (United States)
The Production and Distributed Analysis (PanDA) system is a data-driven workload management system engineered to operate at the LHC data processing scale. It lets scientific experiments fully leverage their distributed, heterogeneous resources while remaining scalable, usable, flexible, and robust. The system has proven itself through nearly two decades of steady operation in the ATLAS experiment, meeting demanding requirements: diverse resources distributed across about 200 sites worldwide, thousands of scientists analyzing data remotely, a processed data volume beyond the exabyte scale, dozens of scientific applications to support, and several billion hours of computing usage per year. PanDA's flexibility and scalability make it suitable for the High Energy Physics community and for wider science domains at the Exascale. Its relevance to other big data sciences is evidenced by its adoption in the Vera C. Rubin Observatory and the sPHENIX experiment. As advanced workflows have grown in significance, PanDA has developed into a comprehensive ecosystem that addresses the challenges of emerging workflows and evolving computing technologies. The paper discusses PanDA's role in the scientific landscape, detailing its architecture, functionality, deployment strategies, project management approaches, results, and evolution into an ecosystem.
- Research Organization: Brookhaven National Laboratory (BNL), Upton, NY (United States)
- Sponsoring Organization: USDOE Office of Science (SC), High Energy Physics (HEP)
- Grant/Contract Number: SC0012704
- OSTI ID: 2283314
- Report Number(s): BNL--225244-2024-JAAM
- Journal Information: Computing and Software for Big Science, Vol. 8, Issue 1; ISSN 2510-2036
- Publisher: Springer
- Country of Publication: United States
- Language: English
Similar Records
Integrating the PanDA Workload Management System with the Vera C. Rubin Observatory
Conference · Sun Dec 31 23:00:00 EST 2023 · EPJ Web Conf. · OSTI ID: 2468771
Integrating the PanDA Workload Management System with the Vera C. Rubin Observatory
Journal Article · Sun May 05 20:00:00 EDT 2024 · EPJ Web of Conferences (Online) · OSTI ID: 2281342
Preparation of the Multi-Site Data Processing at the Vera C. Rubin Observatory
Conference · Wed Oct 01 00:00:00 EDT 2025 · EPJ Web Conf. · OSTI ID: 3003660