Title: Cross-Layer Self-Adaptive/Self-Aware System Software for Exascale Systems

The extreme level of parallelism coupled with the limited available power budget expected in the exascale era brings unprecedented challenges that demand optimization of performance, power and resiliency in unison. Scalability on such systems is of paramount importance, while power and reliability issues may change the execution environment in which a parallel application runs. To solve these challenges, exascale systems will require introspective system software that combines system and application observations across all system stack layers with online feedback and adaptation mechanisms. In this paper we propose the design of a novel self-aware, self-adaptive system software in which a kernel-level Monitor, which continuously inspects the evolution of the target system through observation of Sensors, is combined with a user-level Controller, which reacts to changes in the execution environment, explores opportunities to increase performance and save power, and adapts applications to new execution scenarios. We show that the monitoring system accurately monitors the evolution of parallel applications with a runtime overhead below 1-2%. As a test case, we design and implement a user-level runtime system that aims at optimizing application performance and system power consumption on complex hierarchical architectures. Our results show that our adaptive system reaches 98% of the performance efficiency of manually-tuned applications.
Authors:
; ; ;
Publication Date:
OSTI Identifier:
1236937
Report Number(s):
PNNL-SA-104751
KJ0402000
DOE Contract Number:
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Conference: IEEE 26th International Symposium on Computer Architecture and High Performance Computing (SBAC PAD 2014), October 22-24, 2014, Paris, France, 326-333
Publisher:
IEEE, Piscataway, NJ, United States (US).
Research Org:
Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
Sponsoring Org:
USDOE
Country of Publication:
United States
Language:
English
Subject:
High-performance computing, self-aware/self-adaptive systems.