
I/O Bottleneck Detection and Tuning: Connecting the Dots using Interactive Log Analysis

Conference

Using parallel file systems efficiently is difficult because of interdependencies among the multiple layers of I/O software, including high-level I/O libraries (HDF5, netCDF, etc.), MPI-IO, POSIX, and file systems (GPFS, Lustre, etc.). Profiling tools such as Darshan collect traces that help explain I/O performance behavior. However, significant gaps remain between analyzing the collected traces and applying the tuning options offered by the various layers of I/O software. Seeking to connect the dots between I/O bottleneck detection and tuning, we propose DXT Explorer, an interactive log analysis tool. In this paper, we present a case study that uses the tool to identify and apply various I/O optimizations. We report the performance improvement achieved for four I/O kernels extracted from science applications.
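
As a rough illustration of how trace analysis can point toward a tuning decision, the sketch below summarizes per-rank request sizes from a handful of trace records and flags ranks dominated by small requests. The record layout, the 1 MiB threshold, the 50% ratio, and all function names are assumptions made for this sketch; they are not taken from Darshan, DXT Explorer, or the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class IORecord(NamedTuple):
    """One traced I/O request (hypothetical layout, not Darshan's format)."""
    rank: int
    op: str        # "read" or "write"
    offset: int    # byte offset within the file
    size: int      # request size in bytes
    start: float   # seconds since job start
    end: float     # seconds since job start


def summarize(records, small_threshold=1024 * 1024):
    """Return per-rank request count, small-request count, bytes, and I/O time."""
    stats = defaultdict(lambda: {"requests": 0, "small": 0, "bytes": 0, "time": 0.0})
    for r in records:
        s = stats[r.rank]
        s["requests"] += 1
        s["small"] += int(r.size < small_threshold)
        s["bytes"] += r.size
        s["time"] += r.end - r.start
    return dict(stats)


def flag_small_request_ranks(stats, ratio=0.5):
    """Return the ranks where more than `ratio` of the requests are small."""
    return sorted(rank for rank, s in stats.items()
                  if s["small"] / s["requests"] > ratio)


# Made-up sample data: rank 0 issues two 4 KiB writes, rank 1 one 8 MiB write.
trace = [
    IORecord(0, "write", 0, 4096, 0.00, 0.01),
    IORecord(0, "write", 4096, 4096, 0.01, 0.02),
    IORecord(1, "write", 8192, 8 * 1024 * 1024, 0.00, 0.05),
]
stats = summarize(trace)
flagged = flag_small_request_ranks(stats)
print(flagged)
```

In a real workflow, a rank flagged this way might be a candidate for request aggregation at a higher layer, such as collective I/O in MPI-IO, though the appropriate tuning option depends on the application and the file system.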

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1871120
Resource Relation:
Conference: 6th International Parallel Data Systems Workshop (PDSW) - Virtual Meeting, Tennessee, United States of America - 11/14/2021 8:00:00 PM-11/20/2021 10:00:00 AM
Country of Publication:
United States
Language:
English
