U.S. Department of Energy
Office of Scientific and Technical Information

Improving I/O Performance for Exascale Applications through Online Data Layout Reorganization

Journal Article · IEEE Transactions on Parallel and Distributed Systems
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  3. Center for Advanced Systems Understanding (CASUS), Görlitz (Germany)
  4. Missouri Univ. of Science and Technology, Rolla, MO (United States)
  5. Argonne National Lab. (ANL), Lemont, IL (United States)
The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent Exascale computers will generate scientific results with unprecedented fidelity and record turn-around time. Many of these codes are based on particle-mesh methods and use advanced algorithms, especially dynamic load-balancing and mesh-refinement, to achieve high performance on Exascale machines. Yet, as such algorithms improve parallel application efficiency, they raise new challenges for I/O logic due to their irregular and dynamic data distributions. Thus, while the enormous data rates of Exascale simulations already challenge existing file system write strategies, the need to read and process the generated data efficiently introduces additional constraints on the data layout strategies that can be used when writing data to secondary storage. We review these I/O challenges and introduce two online data layout reorganization approaches for achieving good tradeoffs between read and write performance. We demonstrate the benefits of using these two approaches for the ECP particle-in-cell simulation WarpX, which serves as a motif for a large class of important Exascale applications. Here, we show that by understanding application I/O patterns and carefully designing data layouts, we can increase read performance by more than 80 percent.
Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1855220
Journal Information:
IEEE Transactions on Parallel and Distributed Systems, Vol. 33, Issue 4; ISSN 1045-9219
Publisher:
IEEE
Country of Publication:
United States
Language:
English

