A Case for Application Oblivious Energy-Efficient MPI Runtime
Power has become the major impediment in designing large scale high-end systems. Message Passing Interface (MPI) is the {\em de facto} communication interface used as the back-end for designing applications, programming models and runtime for these systems. Slack --- the time spent by an MPI process in a single MPI call --- provides a potential for energy and power savings, if an appropriate power reduction technique such as core-idling/Dynamic Voltage and Frequency Scaling (DVFS) can be applied without perturbing application's execution time. Existing techniques that exploit slack for power savings assume that application behavior repeats across iterations/executions. However, an increasing use of adaptive, data-dependent workloads combined with system factors (OS noise, congestion) makes this assumption invalid. This paper proposes and implements Energy Aware MPI (EAM) --- an application-oblivious energy-efficient MPI runtime. EAM uses a combination of communication models of common MPI primitives (point-to-point, collective, progress, blocking/non-blocking) and an online observation of slack for maximizing energy efficiency. Each power lever incurs time overhead, which must be amortized over slack to minimize degradation. When predicted communication time exceeds a lever overhead, the lever is used {\em as soon as possible} --- to maximize energy efficiency. When mis-prediction occurs, the lever(s) are used automatically at specific intervals for amortization. We implement EAM using MVAPICH2 and evaluate it on ten applications using up to 4096 processes. Our performance evaluation on an InfiniBand cluster indicates that EAM can reduce energy consumption by 5--41\% in comparison to the default approach, with negligible (less than 4\% in all cases) performance loss.
- Research Organization:
- Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC05-76RL01830
- OSTI ID:
- 1236332
- Report Number(s):
- PNNL-SA-113351
- Resource Relation:
- Conference: SC15 Proceedings: International Conference on High Performance Computing, Networking, Storage and Analysis, November 15-20, 2015, Austin, Texas, Paper No. 29
- Country of Publication:
- United States
- Language:
- English
Similar Records
Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models
Designing Energy Efficient Communication Runtime Systems: A View from PGAS Models