DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Improving the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems with Eley

Abstract

Burst buffers (BBs) are an effective solution for reducing data transfer time and I/O interference in HPC systems. Extending BBs to handle Big Data applications is challenging because BBs must account for both the large data inputs of Big Data applications and the Quality-of-Service (QoS) of HPC applications, which are considered first-class citizens in HPC systems. Existing BBs focus only on the intermediate data of Big Data applications and incur high performance degradation for both Big Data and HPC applications. We present Eley, a burst buffer solution that accelerates Big Data applications while guaranteeing the QoS of HPC applications. To achieve this goal, Eley employs an interference-aware prefetching technique that makes reading input data faster while introducing low interference for HPC applications. Evaluations using a wide range of Big Data and HPC applications demonstrate that Eley improves the performance of Big Data applications by up to 30% compared to existing BBs while maintaining the QoS of HPC applications.

Authors:
 Yildiz, Orcun [1]; Zhou, Amelie Chi [2]; Ibrahim, Shadi [3]
  1. Univ. Rennes, Rennes (France); Argonne National Lab. (ANL), Lemont, IL (United States)
  2. Shenzhen Univ., Shenzhen (China)
  3. Inria, Nantes (France)
Publication Date:
2018-03-22
Research Org.:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Org.:
National Science Foundation (NSF); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); Shenzhen University
OSTI Identifier:
1467628
Alternate Identifier(s):
OSTI ID: 1583001
Grant/Contract Number:  
AC02-06CH11357
Resource Type:
Accepted Manuscript
Journal Name:
Future Generation Computer Systems
Additional Journal Information:
Journal Volume: 86; Journal Issue: C; Journal ID: ISSN 0167-739X
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Big Data; Burst Buffers; HPC; Interference; MapReduce; Parallel File Systems; Prefetch

Citation Formats

Yildiz, Orcun, Zhou, Amelie Chi, and Ibrahim, Shadi. Improving the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems with Eley. United States: N. p., 2018. Web. doi:10.1016/j.future.2018.03.029.
Yildiz, Orcun, Zhou, Amelie Chi, & Ibrahim, Shadi. Improving the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems with Eley. United States. https://doi.org/10.1016/j.future.2018.03.029
Yildiz, Orcun, Zhou, Amelie Chi, and Ibrahim, Shadi. 2018. "Improving the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems with Eley". United States. https://doi.org/10.1016/j.future.2018.03.029. https://www.osti.gov/servlets/purl/1467628.
@article{osti_1467628,
title = {Improving the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems with Eley},
author = {Yildiz, Orcun and Zhou, Amelie Chi and Ibrahim, Shadi},
abstractNote = {Burst buffers (BBs) are an effective solution for reducing data transfer time and I/O interference in HPC systems. Extending BBs to handle Big Data applications is challenging because BBs must account for both the large data inputs of Big Data applications and the Quality-of-Service (QoS) of HPC applications, which are considered first-class citizens in HPC systems. Existing BBs focus only on the intermediate data of Big Data applications and incur high performance degradation for both Big Data and HPC applications. We present Eley, a burst buffer solution that accelerates Big Data applications while guaranteeing the QoS of HPC applications. To achieve this goal, Eley employs an interference-aware prefetching technique that makes reading input data faster while introducing low interference for HPC applications. Evaluations using a wide range of Big Data and HPC applications demonstrate that Eley improves the performance of Big Data applications by up to 30% compared to existing BBs while maintaining the QoS of HPC applications.},
doi = {10.1016/j.future.2018.03.029},
journal = {Future Generation Computer Systems},
number = {C},
volume = {86},
place = {United States},
year = {2018},
month = {3}
}

Citation Metrics:
Cited by: 4 works
Citation information provided by
Web of Science

Works referenced in this record:

MapReduce: simplified data processing on large clusters
journal, January 2008

  • Dean, Jeffrey; Ghemawat, Sanjay
  • Communications of the ACM, Vol. 51, Issue 1
  • DOI: 10.1145/1327452.1327492

On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems
conference, May 2016

  • Yildiz, Orcun; Dorier, Matthieu; Ibrahim, Shadi
  • 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
  • DOI: 10.1109/IPDPS.2016.50

CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination
conference, May 2014

  • Dorier, Matthieu; Antoniu, Gabriel; Ross, Rob
  • 2014 IEEE 28th International Parallel and Distributed Processing Symposium (IPDPS)
  • DOI: 10.1109/IPDPS.2014.27

I/O-Aware Batch Scheduling for Petascale Computing Systems
conference, September 2015

  • Zhou, Zhou; Yang, Xu; Zhao, Dongfang
  • 2015 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2015.45

Damaris: Addressing Performance Variability in Data Management for Post-Petascale Simulations
journal, October 2016

  • Dorier, Matthieu; Antoniu, Gabriel; Cappello, Franck
  • ACM Transactions on Parallel Computing, Vol. 3, Issue 3
  • DOI: 10.1145/2987371

Performance Modelling and Analysis of Software-Defined Networking under Bursty Multimedia Traffic
journal, December 2016

  • Miao, Wang; Min, Geyong; Wu, Yulei
  • ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 12, Issue 5s
  • DOI: 10.1145/2983637

Enabling fast failure recovery in shared Hadoop clusters: Towards failure-aware scheduling
journal, September 2017


Works referencing / citing this record:

Approaches of enhancing interoperations among high performance computing and big data analytics via augmentation
journal, August 2019