U.S. Department of Energy
Office of Scientific and Technical Information

An empirical study of I/O separation for burst buffers in HPC systems

Journal Article · Journal of Parallel and Distributed Computing
 [1];  [2];  [3];  [4];  [4];  [3];  [4];  [3];  [3];  [1]
  1. Seoul National Univ. (Korea, Republic of)
  2. Korea Aerospace Univ., Goyang (Korea, Republic of)
  3. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  4. Korea Inst. of Science and Technology Information, Daejeon (Korea, Republic of)
To meet the exascale I/O requirements of High-Performance Computing (HPC), a new I/O subsystem, the Burst Buffer, based on solid-state drives (SSDs), has been developed. However, diverse HPC workloads and bursty I/O patterns cause severe data fragmentation, which requires costly garbage collection (GC) and increases the number of bytes written to the SSD. To address this data fragmentation challenge, a multi-stream feature has been developed for SSDs. In this work, we develop an I/O separation scheme called BIOS that leverages this multi-stream feature to group I/O streams by user ID. We propose a stream-aware scheduling policy based on burst buffer pools in the workload manager, and integrate BIOS with the workload manager to optimize the I/O separation scheme in the burst buffer. We evaluate the proposed framework with burst buffer I/O traces from the Cori supercomputer, covering a diverse set of applications. Experimental results show that BIOS improves performance by 1.44x on average and reduces the Write Amplification Factor (WAF) by up to 1.20x. These results demonstrate the potential benefits of the I/O separation scheme for solid-state storage systems.
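The idea of grouping writes by user ID can be illustrated with a minimal Python sketch. It is not the BIOS implementation: the function names, the round-robin sharing of streams once users outnumber streams, and the sample numbers are all assumptions made for illustration. The WAF helper uses the conventional definition, bytes physically written to flash divided by bytes written by the host.

```python
def assign_stream(user_id, table, num_streams):
    """Return the stream ID for a user, allocating one on first use.

    Hypothetical policy: each new user gets the next stream ID, and
    users share streams round-robin once num_streams is exhausted.
    """
    if user_id not in table:
        table[user_id] = len(table) % num_streams
    return table[user_id]


def write_amplification_factor(host_bytes, flash_bytes):
    """WAF = bytes written to flash / bytes written by the host."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return flash_bytes / host_bytes


if __name__ == "__main__":
    table = {}
    # Three illustrative user IDs sharing a device with two streams.
    for uid in (1001, 1002, 1001, 1003):
        print(uid, "-> stream", assign_stream(uid, table, num_streams=2))
    # Illustrative numbers only, not results from the article.
    print("WAF:", write_amplification_factor(100, 150))
```

Keeping the mapping in a plain dictionary means a given user always lands on the same stream, so data written by one user tends to be stored, and later invalidated, together.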
Research Organization:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Organization:
Korea Inst. of Science and Technology Information; National Research Foundation of Korea (NRF); USDOE; USDOE Office of Science (SC)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1766544
Alternate ID(s):
OSTI ID: 1783404
Journal Information:
Journal of Parallel and Distributed Computing, Vol. 148; ISSN 0743-7315
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
