Collective I/O Tuning Using Analytical and Machine-Learning Models
The ever-growing demand of scientific applications for computation and data is driving a continuous increase in the scale of parallel computers. The inherent complexity of scaling up a computing system, in terms of both hardware and software stack, exposes an increasing number of factors that impact performance and complicate optimization. In particular, the optimization of parallel I/O has become increasingly challenging because of deepening storage hierarchies and the well-known performance variability of shared storage systems. This paper focuses on model-based autotuning of the two-phase collective I/O algorithm from a popular MPI distribution on the Blue Gene/Q architecture. We propose a novel hybrid model, constructed as a composition of analytical models for communication and storage operations and black-box models for the performance of the individual operations. We perform an in-depth study of the complexity involved in performance modeling, including architecture, software stack, and noise. In particular, we address the challenges of modeling the performance of shared storage systems by building a benchmark that helps synthesize factors such as topology, file caching, and noise. The experimental results show that the hybrid approach produces significantly better results than state-of-the-art machine-learning approaches and is more robust to noise, at the cost of higher modeling complexity.
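As a rough illustration of what composing analytical and black-box models can look like, the sketch below predicts the time of a two-phase collective write as a number of rounds, each made of one shuffle (communication) step and one storage step. It is not the model from the paper: the class `HybridModel`, the helper `tune`, the round-count formula, and the two tunables (aggregator count and collective buffer size) are assumptions chosen for illustration, and the per-operation predictors stand in for whatever black-box models would be fitted to benchmark data.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

# Hypothetical black-box predictor: bytes moved by one operation -> seconds.
OpModel = Callable[[int], float]


@dataclass
class HybridModel:
    comm_time: OpModel   # assumed black-box model of one shuffle exchange
    write_time: OpModel  # assumed black-box model of one aggregator write

    def predict(self, total_bytes: int, n_aggregators: int, buffer_bytes: int) -> float:
        """Analytical composition (illustrative): rounds * (shuffle + write)."""
        per_aggregator = math.ceil(total_bytes / n_aggregators)
        rounds = math.ceil(per_aggregator / buffer_bytes)
        # Simplification: every round is charged for a full chunk.
        chunk = min(buffer_bytes, per_aggregator)
        return rounds * (self.comm_time(chunk) + self.write_time(chunk))


def tune(model: HybridModel, total_bytes: int,
         aggregator_counts: Iterable[int],
         buffer_sizes: Iterable[int]) -> Tuple[int, int]:
    """Return the (aggregators, buffer size) pair with the lowest predicted time."""
    buffer_sizes = list(buffer_sizes)
    candidates = [(a, b) for a in aggregator_counts for b in buffer_sizes]
    return min(candidates, key=lambda c: model.predict(total_bytes, c[0], c[1]))


if __name__ == "__main__":
    # Toy predictors with made-up latency/bandwidth values, for demonstration only.
    model = HybridModel(
        comm_time=lambda n: 1e-4 + n / 1e9,
        write_time=lambda n: 1e-3 + n / 5e8,
    )
    print(tune(model, 1 << 30, [16, 64, 256], [4 << 20, 16 << 20, 64 << 20]))
```

In a setup like this, the analytical part fixes how the operations combine, while the black-box parts can be refitted when the storage system or its noise profile changes.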
- Research Organization:
- Argonne National Lab. (ANL), Argonne, IL (United States)
- Sponsoring Organization:
- USDOE Office of Science - Office of Advanced Scientific Computing Research
- DOE Contract Number:
- AC02-06CH11357
- OSTI ID:
- 1351298
- Resource Relation:
- Conference: 2015 IEEE Cluster, 09/08/15 - 09/11/15, Chicago, IL, US
- Country of Publication:
- United States
- Language:
- English