I/O Bottleneck Detection and Tuning: Connecting the Dots using Interactive Log Analysis
Using parallel file systems efficiently is a tricky problem due to inter-dependencies among multiple layers of I/O software, including high-level I/O libraries (HDF5, netCDF, etc.), MPI-IO, POSIX, and file systems (GPFS, Lustre, etc.). Profiling tools such as Darshan collect traces to help understand the I/O performance behavior. However, there are significant gaps in analyzing the collected traces and then applying tuning options offered by various layers of I/O software. Seeking to connect the dots between I/O bottleneck detection and tuning, we propose DXT Explorer, an interactive log analysis tool. In this paper, we present a case study using our interactive log analysis tool to identify and apply various I/O optimizations. We report an evaluation of performance improvement achieved for four I/O kernels extracted from science applications.
- Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number: AC02-05CH11231
- OSTI ID: 1959023
- Journal Information: 2021 IEEE/ACM Sixth International Parallel Data Systems Workshop (PDSW)
- Country of Publication: United States
- Language: English