U.S. Department of Energy
Office of Scientific and Technical Information

Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems

Journal Article · IEEE Transactions on Parallel and Distributed Systems
The semantics of HPC storage systems are defined by the consistency models by which they abide. Storage consistency models have been studied less than their counterparts in memory systems, with the exception of the POSIX standard and its strict consistency model. POSIX consistency imposes a performance penalty that becomes more significant as the scale of parallel file systems increases and as the access time to storage devices, such as node-local solid-state storage devices, decreases. While some efforts have been made to adopt relaxed storage consistency models, these models are often defined informally and ambiguously as by-products of a particular implementation. In this work, we establish a connection between memory consistency models and storage consistency models and revisit the key design choices of storage consistency models from a high-level perspective. Further, we propose a formal and unified framework for defining storage consistency models, along with a layered implementation that can be used to evaluate their relative performance for different I/O workloads. Finally, we conduct a comprehensive performance comparison of two relaxed consistency models on a range of common parallel I/O workloads, such as checkpoint/restart of scientific applications and random reads of deep learning applications. We demonstrate that for certain I/O scenarios, a weaker consistency model can significantly improve I/O performance. For instance, in the small random reads typically found in deep learning applications, session consistency achieved a 5x improvement in I/O bandwidth compared to commit consistency, even at small scales.
Research Organization:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Organization:
National Science Foundation (NSF); USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
AC52-07NA27344
OSTI ID:
2370617
Report Number(s):
LLNL--JRNL-849174; 1074740
Journal Information:
Journal Name: IEEE Transactions on Parallel and Distributed Systems; Journal Volume: 35; Journal Issue: 6; ISSN 1045-9219
Publisher:
IEEE
Country of Publication:
United States
Language:
English
