An empirical study of I/O separation for burst buffers in HPC systems
Journal Article · Journal of Parallel and Distributed Computing
- Seoul National Univ. (Korea, Republic of)
- Korea Aerospace Univ., Goyang (Korea, Republic of)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Korea Inst. of Science and Technology Information, Daejeon (Korea, Republic of)
To meet the exascale I/O requirements of High-Performance Computing (HPC), a new I/O subsystem, the burst buffer, based on solid-state drives (SSDs), has been developed. However, diverse HPC workloads and bursty I/O patterns cause severe data fragmentation, which requires costly garbage collection (GC) and increases the number of bytes written to the SSD. To address this data fragmentation challenge, a new multi-stream feature has been developed for SSDs. In this work, we develop an I/O separation scheme called BIOS that leverages this multi-stream feature to group I/O streams by user ID. We propose a stream-aware scheduling policy based on burst buffer pools in the workload manager, and integrate BIOS with the workload manager to optimize the I/O separation scheme in the burst buffer. We evaluate the proposed framework with burst buffer I/O traces from the Cori supercomputer, covering a diverse set of applications. Experimental results show that BIOS improves performance by 1.44x on average and reduces the Write Amplification Factor (WAF) by up to 1.20x. These results demonstrate the potential benefits of the I/O separation scheme for solid-state storage systems.
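As a rough illustration of the two ideas the abstract relies on, the sketch below maps user IDs to SSD stream IDs and computes a write amplification factor. It is not the BIOS implementation: the modulo mapping, the stream count, and all function names are assumptions made for this example. The WAF formula is the standard definition (bytes physically written to flash divided by bytes the host asked to write).

```python
from collections import defaultdict

NUM_STREAMS = 8  # assumption: number of streams the device exposes


def stream_for_user(uid, num_streams=NUM_STREAMS):
    """Map a user ID to a stream ID (hypothetical modulo policy)."""
    return uid % num_streams


def group_writes_by_stream(writes, num_streams=NUM_STREAMS):
    """Sum (uid, nbytes) write records into per-stream byte totals."""
    totals = defaultdict(int)
    for uid, nbytes in writes:
        totals[stream_for_user(uid, num_streams)] += nbytes
    return dict(totals)


def write_amplification_factor(nand_bytes_written, host_bytes_written):
    """WAF = bytes written to flash / bytes written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written
```

Writes from the same user land in the same stream, so data with a similar lifetime is more likely to share flash blocks; a WAF closer to 1.0 means less extra writing caused by garbage collection.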
- Research Organization:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
- Sponsoring Organization:
- Korea Inst. of Science and Technology Information; National Research Foundation of Korea (NRF); USDOE; USDOE Office of Science (SC)
- Grant/Contract Number:
- AC02-05CH11231
- OSTI ID:
- 1766544
- Alternate ID(s):
- OSTI ID: 1783404
- Journal Information:
- Journal Name: Journal of Parallel and Distributed Computing; Vol. 148; ISSN 0743-7315
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Similar Records
Toward Managing HPC Burst Buffers Effectively: Draining Strategy to Regulate Bursty I/O Behavior
Conference · Fri Sep 01 00:00:00 EDT 2017 · OSTI ID: 1567463
Final Report for File System Support for Burst Buffers on HPC Systems
Technical Report · Sun Nov 26 23:00:00 EST 2017 · OSTI ID: 1411671