Performance Characterization and Provenance of Distributed Task-based Workflows on HPC Platforms
Understanding the performance and provenance of task-based workflows is challenging, particularly in distributed configurations where multiple applications share resources. Task-based workflow management systems further complicate performance predictability because their dynamic behavior subtly alters task execution order from run to run. In this paper, we propose a layered framework for characterizing the performance and task provenance of Dask.distributed workflows running on high-performance computing (HPC) platforms. The framework collects data from jobs, the workflow management system, and the operating system to aid in understanding the performance of these workflows. Our approach makes three main contributions: first, an extension of Dask.distributed that captures high-fidelity task provenance using Mochi data services; second, an adaptation of the established HPC I/O characterization tool Darshan that gathers high-fidelity I/O data, increasing the granularity of our analysis; and third, a framework that combines and processes the collected data to provide insights into performance characterization and reproducibility, alongside our lessons learned.
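As a rough illustration of the third contribution (combining data collected at different layers), the sketch below attributes I/O events to tasks by matching worker identity and time interval. It is only a minimal sketch under stated assumptions: the record types `TaskRecord` and `IORecord`, their fields, and the function `attribute_io` are hypothetical names introduced here, and they do not reflect the paper's actual schema or the Dask.distributed, Mochi, or Darshan APIs.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Hypothetical provenance record for one executed task."""
    key: str       # task identifier
    worker: str    # worker that ran the task
    start: float   # start time in seconds
    stop: float    # stop time in seconds


@dataclass
class IORecord:
    """Hypothetical I/O event as an I/O profiler might report it."""
    worker: str
    timestamp: float
    bytes_moved: int


def attribute_io(tasks, io_records):
    """Sum bytes per task key.

    An I/O record is attributed to the first task that ran on the same
    worker and whose [start, stop) interval contains the record's
    timestamp. Records that match no task are ignored.
    """
    totals = {task.key: 0 for task in tasks}
    for rec in io_records:
        for task in tasks:
            same_worker = task.worker == rec.worker
            in_interval = task.start <= rec.timestamp < task.stop
            if same_worker and in_interval:
                totals[task.key] += rec.bytes_moved
                break
    return totals
```

Matching on time intervals is an assumption made for brevity; it presumes the clocks of the different layers are comparable and that a worker runs one task at a time.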
- Research Organization: Argonne National Laboratory (ANL)
- Sponsoring Organization: US Department of Energy; USDOE Office of Science; USDOE Office of Science - Office of Advanced Scientific Computing Research (ASCR)
- DOE Contract Number: AC02-06CH11357
- OSTI ID: 2588773
- Country of Publication: United States
- Language: English
Similar Records
Enabling HPC Scientific Workflows for Serverless
Capturing provenance as a diagnostic tool for workflow performance evaluation and optimization
Capturing provenance as a diagnostic tool for workflow performance evaluation and optimization · Conference · Fri Nov 01 2024 · OSTI ID: 2538241
Capturing provenance as a diagnostic tool for workflow performance evaluation and optimization · Conference · Sun Aug 06 2017 · OSTI ID: 1619260
Capturing provenance as a diagnostic tool for workflow performance evaluation and optimization · Conference · Wed Aug 30 2017 · Proceedings · OSTI ID: 1556918