DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Understanding I/O workload characteristics of a Peta-scale storage system

Abstract

Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and for architecting new storage systems based on observed workload patterns. In this paper, we characterize the I/O workloads of scientific applications on one of the world's fastest high-performance computing (HPC) storage clusters, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). OLCF's flagship petascale simulation platform, Titan, and other large HPC clusters, totaling over 250 thousand compute cores, depend on Spider for their I/O needs. We characterize system utilization, read and write demands, idle time, storage space utilization, and the distribution of read requests to write requests for this petascale storage system. From this study, we develop synthesized workloads, and we show that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution. We also study I/O load imbalance problems using I/O performance data collected from the Spider storage system.

Authors:
 Kim, Youngjae [1]; Gunasekaran, Raghul [1]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences
Publication Date:
2014-11-11
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1185800
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Supercomputing
Additional Journal Information:
Journal Volume: 71; Journal Issue: 3; Journal ID: ISSN 0920-8542
Publisher:
Springer
Country of Publication:
United States
Language:
English
Subject:
storage systems; I/O; load imbalance; read request; request size; workload characterization

Citation Formats

Kim, Youngjae, and Gunasekaran, Raghul. Understanding I/O workload characteristics of a Peta-scale storage system. United States: N. p., 2014. Web. doi:10.1007/s11227-014-1321-8.
Kim, Youngjae, & Gunasekaran, Raghul. (2014). Understanding I/O workload characteristics of a Peta-scale storage system. United States. https://doi.org/10.1007/s11227-014-1321-8
Kim, Youngjae, and Gunasekaran, Raghul. 2014. "Understanding I/O workload characteristics of a Peta-scale storage system". United States. https://doi.org/10.1007/s11227-014-1321-8. https://www.osti.gov/servlets/purl/1185800.
@article{osti_1185800,
title = {Understanding I/O workload characteristics of a Peta-scale storage system},
author = {Kim, Youngjae and Gunasekaran, Raghul},
abstractNote = {Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and for architecting new storage systems based on observed workload patterns. In this paper, we characterize the I/O workloads of scientific applications on one of the world's fastest high-performance computing (HPC) storage clusters, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). OLCF's flagship petascale simulation platform, Titan, and other large HPC clusters, totaling over 250 thousand compute cores, depend on Spider for their I/O needs. We characterize system utilization, read and write demands, idle time, storage space utilization, and the distribution of read requests to write requests for this petascale storage system. From this study, we develop synthesized workloads, and we show that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution. We also study I/O load imbalance problems using I/O performance data collected from the Spider storage system.},
doi = {10.1007/s11227-014-1321-8},
journal = {Journal of Supercomputing},
number = 3,
volume = 71,
place = {United States},
year = {2014},
month = {11}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 12 works
Citation information provided by
Web of Science

Works referenced in this record:

Workload Characterization and Performance Implications of Large-Scale Blog Servers
journal, November 2012

  • Jeon, Myeongjae; Kim, Youngjae; Hwang, Jeaho
  • ACM Transactions on the Web, Vol. 6, Issue 4
  • DOI: 10.1145/2382616.2382619

Characterization of storage workload traces from production Windows Servers
conference, October 2008

  • Kavalanekar, Swaroop; Worthington, Bruce
  • 2008 IEEE International Symposium on Workload Characterization (IISWC)
  • DOI: 10.1109/IISWC.2008.4636097

Evaluation of disk-level workloads at different time scales
journal, October 2009

  • Riska, Alma; Riedel, Erik
  • ACM SIGMETRICS Performance Evaluation Review, Vol. 37, Issue 2
  • DOI: 10.1145/1639562.1639589

Efficient management of idleness in storage systems
journal, June 2009


Internet Web servers: workload characterization and performance implications
journal, January 1997

  • Arlitt, M. F.; Williamson, C. L.
  • IEEE/ACM Transactions on Networking, Vol. 5, Issue 5
  • DOI: 10.1109/90.649565

ScalaTrace: Scalable compression and replay of communication traces for high-performance computing
journal, August 2009

  • Noeth, Michael; Ratn, Prasun; Mueller, Frank
  • Journal of Parallel and Distributed Computing, Vol. 69, Issue 8
  • DOI: 10.1016/j.jpdc.2008.09.001

Probabilistic Communication and I/O Tracing with Deterministic Replay at Scale
conference, September 2011

  • Wu, Xing; Vijayakumar, Karthik; Mueller, Frank
  • 2011 International Conference on Parallel Processing (ICPP)
  • DOI: 10.1109/ICPP.2011.50

Understanding and improving computational science storage access through continuous characterization
conference, May 2011

  • Carns, Philip; Harms, Kevin; Allcock, William
  • 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST)
  • DOI: 10.1109/MSST.2011.5937212

Works referencing / citing this record:

Fair bandwidth allocating and strip-aware prefetching for concurrent read streams and striped RAIDs in distributed file systems
journal, May 2018