Understanding I/O workload characteristics of a Peta-scale storage system
Abstract
Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and for architecting new storage systems based on observed workload patterns. In this paper, we characterize the I/O workloads of scientific applications on one of the world's fastest high performance computing (HPC) storage clusters, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). OLCF's flagship petascale simulation platform, Titan, and other large HPC clusters, totaling over 250 thousand compute cores, depend on Spider for their I/O needs. We characterize the system utilization, the demands of reads and writes, idle time, storage space utilization, and the distribution of read requests to write requests for the Peta-scale storage system. From this study, we develop synthesized workloads, and we show that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution. We also study I/O load imbalance problems using I/O performance data collected from the Spider storage system.
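The abstract states that request inter-arrival times can be modeled as a Pareto distribution. As a rough illustration of what such modeling could involve, the sketch below fits a Pareto (Type I) distribution by maximum likelihood and draws synthetic inter-arrival times from it. The function names `fit_pareto` and `synthesize_interarrivals` and the parameter values are hypothetical, chosen only for this example; they are not the authors' method or values reported in the paper.

```python
import math
import random


def fit_pareto(samples):
    """Maximum-likelihood fit of a Pareto (Type I) distribution.

    Returns (x_m, alpha): the scale estimate is the sample minimum and
    the shape estimate is n / sum(ln(x_i / x_m)).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    x_m = min(samples)
    if x_m <= 0:
        raise ValueError("Pareto samples must be positive")
    log_sum = sum(math.log(x / x_m) for x in samples)
    if log_sum == 0:
        raise ValueError("samples are all identical")
    return x_m, len(samples) / log_sum


def synthesize_interarrivals(n, x_m, alpha, seed=None):
    """Draw n synthetic inter-arrival times from Pareto(x_m, alpha)."""
    rng = random.Random(seed)
    # random.paretovariate(alpha) samples with scale 1, so rescale by x_m.
    return [x_m * rng.paretovariate(alpha) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical parameters (seconds), not values from the paper.
    observed = synthesize_interarrivals(50_000, x_m=0.002, alpha=1.5, seed=7)
    x_m_hat, alpha_hat = fit_pareto(observed)
    print(f"scale ~ {x_m_hat:.4f} s, shape ~ {alpha_hat:.3f}")
```

In practice the `observed` list would come from measured request timestamps rather than from the generator itself; the round trip here only checks that the fit recovers the parameters used to produce the data.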
- Authors:
- Kim, Youngjae; Gunasekaran, Raghul
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences
- Publication Date:
- 2014-11-11
- Research Org.:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1185800
- Grant/Contract Number:
- AC05-00OR22725
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Supercomputing
- Additional Journal Information:
- Journal Volume: 71; Journal Issue: 3; Journal ID: ISSN 0920-8542
- Publisher:
- Springer
- Country of Publication:
- United States
- Language:
- English
- Subject:
- storage systems; I/O; load imbalance; read request; request size; workload characterization
Citation Formats
Kim, Youngjae, and Gunasekaran, Raghul. Understanding I/O workload characteristics of a Peta-scale storage system. United States: N. p., 2014. Web. doi:10.1007/s11227-014-1321-8.
Kim, Youngjae, & Gunasekaran, Raghul. Understanding I/O workload characteristics of a Peta-scale storage system. United States. https://doi.org/10.1007/s11227-014-1321-8
Kim, Youngjae, and Gunasekaran, Raghul. 2014. "Understanding I/O workload characteristics of a Peta-scale storage system". United States. https://doi.org/10.1007/s11227-014-1321-8. https://www.osti.gov/servlets/purl/1185800.
@article{osti_1185800,
title = {Understanding I/O workload characteristics of a Peta-scale storage system},
author = {Kim, Youngjae and Gunasekaran, Raghul},
doi = {10.1007/s11227-014-1321-8},
journal = {Journal of Supercomputing},
number = 3,
volume = 71,
place = {United States},
year = {2014},
month = {11}
}