OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Reducing I/O variability using dynamic I/O path characterization in petascale storage systems

Journal Article · Journal of Supercomputing
 [1];  [2];  [3];  [4];  [3]
  1. Univ. of Massachusetts, Lowell, MA (United States)
  2. Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
  3. Northwestern Univ., Evanston, IL (United States)
  4. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

In petascale systems with a million CPU cores, scalable and consistent I/O performance is becoming increasingly difficult to sustain, mainly because of I/O variability. This variability is caused by concurrently running processes or jobs competing for I/O, or by a RAID rebuild after a disk drive fails. We present a mechanism that, at runtime, stripes data across a selected subset of I/O nodes with the lightest workload in order to achieve the highest I/O bandwidth available in the system. Specifically, we propose a probing mechanism that enables application-level dynamic file striping to mitigate I/O variability. We implement the proposed mechanism in a high-level I/O library that enables memory-to-file data layout transformation and allows transparent file partitioning using subfiling. Subfiling is a technique that partitions data into a set of smaller files and manages access to them, so that users can treat the data as a single, ordinary file. We demonstrate that our bandwidth probing mechanism can successfully identify temporarily slower I/O nodes without noticeable runtime overhead. Experimental results on NERSC’s systems also show that our approach effectively isolates I/O variability on shared systems and improves overall collective I/O performance with less variation.
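
The idea of probing I/O nodes and then striping only across the lightly loaded ones can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the median-based cutoff, the `slow_factor` parameter, the round-robin mapping, and the probe values are all assumptions made for illustration.

```python
from statistics import median


def select_fast_targets(probe_mbps, slow_factor=0.5):
    """Return ids of I/O targets whose probed bandwidth is at least
    slow_factor times the median probed bandwidth.

    The median-based cutoff is an assumed rule, not taken from the paper.
    """
    if not probe_mbps:
        return []
    cutoff = slow_factor * median(probe_mbps.values())
    return sorted(t for t, bw in probe_mbps.items() if bw >= cutoff)


def assign_subfiles(num_procs, targets):
    """Map each process rank to one selected target, round-robin.

    Each target stands for one subfile, so slow targets receive no data.
    """
    if not targets:
        raise ValueError("no I/O targets selected")
    return {rank: targets[rank % len(targets)] for rank in range(num_procs)}


# Made-up probe results in MB/s; target 2 is temporarily slow.
probes = {0: 410.0, 1: 395.0, 2: 60.0, 3: 420.0}
fast = select_fast_targets(probes)   # target 2 falls below the cutoff
layout = assign_subfiles(6, fast)    # six ranks spread over the fast targets
```

In this sketch the slow target is simply excluded from the layout for the current write phase; a real system would re-probe periodically, since the slowdown is transient.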

Research Organization:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Sandia National Lab. (SNL-CA), Livermore, CA (United States); Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); National Science Foundation (NSF); USDOE Office of Science (SC), High Energy Physics (HEP)
Grant/Contract Number:
AC04-94AL85000; FG02-08ER25848; SC0001283; SC0005309; SC0005340; SC0007456; AC02-05CH11231; AC02-07CH11359
OSTI ID:
1356839
Alternate ID(s):
OSTI ID: 1469016
Report Number(s):
SAND-2017-3907J; FERMILAB-PUB-17-292-CD; PII: 1904
Journal Information:
Journal of Supercomputing, Vol. 73, Issue 5; ISSN 0920-8542
Publisher:
Springer
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 8 works
Citation information provided by
Web of Science

