A Quantitative Approach to Architecting All-Flash Lustre File Systems
New experimental and AI-driven workloads are moving into the realm of extreme-scale HPC systems at the same time that high-performance flash is becoming cost-effective to deploy at scale. This confluence poses a number of new technical and economic challenges and opportunities in designing the next generation of HPC storage and I/O subsystems to achieve the right balance of bandwidth, latency, endurance, and cost. In this work, we present quantitative models that use workload data from existing, disk-based file systems to project the architectural requirements of all-flash Lustre file systems. Using data from NERSC’s Cori I/O subsystem, we then demonstrate the minimum required capacity for data, capacity for metadata and data-on-MDT, and SSD endurance for a future all-flash Lustre file system.
- Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number: AC02-05CH11231
- OSTI ID: 1827652
- Country of Publication: United States
- Language: English
Similar Records
Architecture and performance of Perlmutter's 35 PB ClusterStor E1000 all-flash file system
Conference · Thu Dec 31 23:00:00 EST 2020 · OSTI ID:1798757
Architecture and performance of Perlmutter's 35 PB ClusterStor E1000 all-flash file system
Journal Article · Tue Jul 23 00:00:00 EDT 2024 · Concurrency and Computation. Practice and Experience · OSTI ID:2440410