Taming parallel I/O complexity with auto-tuning
- Univ. of Illinois, Urbana-Champaign, IL (United States)
- Rice Univ., Houston, TX (United States)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- The HDF Group, Champaign, IL (United States)
We present an auto-tuning system for optimizing I/O performance of HDF5 applications and demonstrate its value across platforms, applications, and scales. The system uses a genetic algorithm to search a large space of tunable parameters and to identify effective settings at all layers of the parallel I/O stack. The auto-tuning system applies the parameter settings transparently by dynamically intercepting HDF5 calls. To validate the system, we applied it to three I/O benchmarks (VPIC, VORPAL, and GCRM) that replicate the I/O activity of their respective applications. We tested the system with different weak-scaling configurations (128, 2048, and 4096 CPU cores) that generate 30 GB to 1 TB of data, and executed these configurations on diverse HPC platforms (Cray XE6, IBM BG/P, and a Dell cluster). In all cases, the auto-tuning framework identified parameter settings that substantially improved write performance over default system settings. Across the test configurations, we consistently demonstrate I/O write speedups between 2x and 100x.
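The genetic-algorithm search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the parameter names and value ranges in `SPACE` are illustrative assumptions, and `synthetic_bandwidth` is a made-up scoring function that stands in for running an I/O benchmark and measuring write bandwidth.

```python
import random

# Hypothetical search space: names and values are illustrative assumptions,
# not the parameters or ranges used in the paper.
SPACE = {
    "stripe_count": [4, 8, 16, 32, 64, 128],
    "stripe_size_mb": [1, 2, 4, 8, 16, 32],
    "cb_nodes": [1, 2, 4, 8, 16, 32],
    "alignment_kb": [64, 256, 1024, 4096],
}


def synthetic_bandwidth(cfg):
    """Made-up score standing in for a measured write bandwidth.

    A real tuner would run the I/O kernel with cfg and time it.
    The score peaks (at 0.0) for one arbitrary configuration.
    """
    score = 0.0
    score -= abs(cfg["stripe_count"] - 64) / 64
    score -= abs(cfg["stripe_size_mb"] - 8) / 8
    score -= abs(cfg["cb_nodes"] - 16) / 16
    score -= abs(cfg["alignment_kb"] - 1024) / 1024
    return score


def random_config(rng):
    """Draw one value per parameter uniformly at random."""
    return {name: rng.choice(values) for name, values in SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each parameter comes from either parent."""
    return {name: (a[name] if rng.random() < 0.5 else b[name]) for name in SPACE}


def mutate(cfg, rng, rate=0.2):
    """Resample each parameter with probability `rate`."""
    return {
        name: (rng.choice(SPACE[name]) if rng.random() < rate else value)
        for name, value in cfg.items()
    }


def tune(fitness, generations=20, pop_size=12, elite=2, seed=0):
    """Evolve a population of configurations and return the fittest one."""
    rng = random.Random(seed)
    pop = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # the fitter half breeds
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        pop = pop[:elite] + children  # elitism: best configs survive unchanged
    return max(pop, key=fitness)


best = tune(synthetic_bandwidth)
print(best, synthetic_bandwidth(best))
```

Because the top `elite` configurations are carried over unchanged, the best score never decreases from one generation to the next. In a real setting each fitness evaluation is a full benchmark run, so the population size and generation count bound the total tuning cost.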
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- Computational Research Division; USDOE
- Grant/Contract Number:
- AC02-06CH11357
- OSTI ID:
- 1311633
- Report Number(s):
- LBNL-1005953; ir:1005953
- Journal Information:
- Proceedings of the ACM/IEEE Supercomputing Conference, Vol. 2013; Conference: SC13-International Conference for High Performance Computing, Networking, Storage and Analysis, Denver, CO (United States), 17-22 Nov 2013; ISSN 1063-9635
- Publisher:
- ACM/IEEE
- Country of Publication:
- United States
- Language:
- English
Similar Records
Silo & HDF5 I/O Scaling Improvements on BG/P Systems
Tuning HDF5 subfiling performance on parallel file systems