Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems
Journal Article · IEEE Transactions on Parallel and Distributed Systems
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- University of Illinois at Urbana-Champaign, IL (United States)
The semantics of HPC storage systems are defined by the consistency models to which they adhere. Storage consistency models have been studied far less than their memory-system counterparts, with the exception of the POSIX standard and its strict consistency model. POSIX consistency imposes a performance penalty that grows more significant as parallel file systems scale up and as the access time of storage devices, such as node-local solid-state drives, decreases. While some efforts have been made to adopt relaxed storage consistency models, these models are often defined informally and ambiguously, as by-products of a particular implementation. In this work, we establish a connection between memory consistency models and storage consistency models and revisit the key design choices of storage consistency models from a high-level perspective. We then propose a formal and unified framework for defining storage consistency models, along with a layered implementation that can be used to easily evaluate their relative performance under different I/O workloads. Finally, we conduct a comprehensive performance comparison of two relaxed consistency models on a range of common parallel I/O workloads, such as checkpoint/restart in scientific applications and random reads in deep learning applications. We demonstrate that for certain I/O scenarios, a weaker consistency model can significantly improve I/O performance. For instance, for the small random reads typical of deep learning applications, session consistency achieved a 5x improvement in I/O bandwidth over commit consistency, even at small scales.
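The contrast the abstract draws between commit consistency and session consistency can be made concrete with a small producer/consumer sketch. In the sketch below, fsync() stands in for the commit point of a commit-style model and close()/open() delimit a session, as in close-to-open consistency; the file path, the use of MPI for process synchronization, and the mapping of these POSIX calls onto the two models are illustrative assumptions, not the paper's framework or API.

```c
/* Sketch: a writer/reader pair under two relaxed storage consistency models.
 * Assumptions (not from the paper): fsync() marks the commit point of a
 * commit-consistency model; close()/open() delimit a session in a
 * session-consistency model. Run with at least 2 MPI ranks. */
#include <fcntl.h>
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    const char *path = "/mnt/shared/ckpt.dat";    /* hypothetical shared file */

    if (rank == 0) {                              /* writer */
        int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        const char buf[] = "checkpoint block";
        write(fd, buf, sizeof buf);
        fsync(fd);   /* commit consistency: writes become visible here */
        close(fd);   /* session consistency: writes become visible at close */
    }

    MPI_Barrier(MPI_COMM_WORLD);  /* order the writer's commit before reads */

    if (rank == 1) {                              /* reader */
        char buf[32] = {0};
        int fd = open(path, O_RDONLY);  /* session: open begins a new session */
        read(fd, buf, sizeof buf - 1);  /* fresh data under either model */
        close(fd);
        printf("rank 1 read: %s\n", buf);
    }

    MPI_Finalize();
    return 0;
}
```

Under strict POSIX consistency the read would be guaranteed fresh as soon as write() returns, with no fsync() or close() required; relaxing that guarantee is what allows a file system to defer the data and metadata synchronization that dominates the cost of small I/O.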
- Research Organization:
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Sponsoring Organization:
- National Science Foundation (NSF); USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- Grant/Contract Number:
- AC52-07NA27344
- OSTI ID:
- 2370617
- Report Number(s):
- LLNL-JRNL-849174; 1074740
- Journal Information:
- IEEE Transactions on Parallel and Distributed Systems, Vol. 35, Issue 6; ISSN 1045-9219
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English
Similar Records
- Lightweight storage and overlay networks for fault tolerance. · Technical Report · 2009 · OSTI ID: 989384
- ...And Eat it Too: High Read Performance in Write-Optimized HPC I/O Middleware File Formats · Conference · 2008 · OSTI ID: 982187
- Characterizing Machine Learning I/O Workloads on Leadership Scale HPC Systems · Conference · November 2021 · OSTI ID: 1885376