U.S. Department of Energy
Office of Scientific and Technical Information

Rethinking key–value store for parallel I/O optimization

Journal Article · International Journal of High Performance Computing Applications
Key-value stores are widely used as the storage layer for large-scale internet services and cloud storage systems. However, they are rarely used in HPC systems, where parallel file systems are the dominant storage solution. In this study, we examine the architectural differences and performance characteristics of parallel file systems and key-value stores. We propose using key-value stores to optimize overall Input/Output (I/O) performance, especially for workloads that parallel file systems handle poorly, such as those with intensive data synchronization or heavy metadata operations. We conducted experiments with several synthetic benchmarks, an I/O benchmark, and a real application. We modeled the performance of the two systems using data collected from our experiments, and we provide a predictive method to identify which system offers better I/O performance for a given workload. The results show that key-value stores can be used to optimize I/O performance in HPC systems.
Research Organization:
Argonne National Laboratory (ANL)
Sponsoring Organization:
National Science Foundation (NSF)
DOE Contract Number:
AC02-06CH11357
OSTI ID:
1390205
Journal Information:
International Journal of High Performance Computing Applications, Vol. 31, Issue 4; ISSN 1094-3420
Publisher:
SAGE
Country of Publication:
United States
Language:
English
