End-to-end I/O portfolio for the Summit supercomputing ecosystem
- ORNL
The I/O subsystem for the Summit supercomputer, No. 1 on the Top500 list, and its ecosystem of analysis platforms is composed of two distinct layers: the in-system layer and the center-wide parallel file system (PFS) layer, Spider 3. The in-system layer uses node-local SSDs and provides 26.7 TB/s for reads, 9.7 TB/s for writes, and 4.6 billion IOPS to Summit. The Spider 3 PFS layer uses IBM's Spectrum Scale™ and provides 2.5 TB/s and 2.6 million IOPS to Summit and other systems. While deploying them as two distinct layers was operationally efficient, it also presented usability challenges: multiple mount points and a lack of transparency in data movement. To address these challenges, we have developed novel end-to-end I/O solutions for the concerted use of the two storage layers. We present the I/O subsystem architecture, the end-to-end I/O solution space, the design considerations, and our deployment experience.
- Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization: USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number: AC05-00OR22725
- OSTI ID: 1619016
- Country of Publication: United States
- Language: English