Achieving balanced execution through runtime detection of performance variation
Abstract
Systems, apparatuses, and methods for achieving balanced execution in a multi-node cluster through runtime detection of performance variation are described. During a training phase, performance counters and the amount of time spent waiting for synchronization are monitored for a plurality of tasks on each node of the multi-node cluster. These values are used to generate a model that correlates the values of the performance counters with the amount of time spent waiting for synchronization. Once the model is built, the values of the performance counters are monitored for a period of time at the start of each task, and these values are input into the model. The model generates a prediction of whether a given node is on the critical path. If the given node is predicted to be on the critical path, the power allocation of the given node is increased.
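The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The record does not specify the model type, which counters are used, or how power is redistributed, so the k-nearest-neighbour regressor, the two example counters (instructions per cycle and cache misses per kilo-instruction), the node names, the 5 W step, and the choice to take the extra power evenly from the other nodes are all assumptions made for illustration. The sketch also assumes that the node with the smallest predicted synchronization wait is the one on the critical path, since the other nodes wait for it.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Sample:
    counters: tuple[float, ...]  # counter values sampled at the start of a task
    wait_time: float             # seconds the node later spent waiting at synchronization


class WaitTimeModel:
    """Hypothetical k-nearest-neighbour model from counter values to wait time."""

    def __init__(self, samples: list[Sample], k: int = 3):
        if not samples:
            raise ValueError("training phase produced no samples")
        self.samples = samples
        self.k = min(k, len(samples))

    def predict_wait(self, counters: tuple[float, ...]) -> float:
        # Average the wait times of the k training samples closest in counter space.
        nearest = sorted(self.samples, key=lambda s: dist(s.counters, counters))[: self.k]
        return sum(s.wait_time for s in nearest) / len(nearest)


def predict_critical_node(model: WaitTimeModel, observed: dict[str, tuple[float, ...]]) -> str:
    # Assumption: the node expected to wait least is the one the others wait for.
    return min(observed, key=lambda node: model.predict_wait(observed[node]))


def rebalance_power(power: dict[str, float], critical: str, step: float = 5.0) -> dict[str, float]:
    # Assumption: the cluster budget is fixed, so the increase is taken evenly from the other nodes.
    others = [n for n in power if n != critical]
    if not others:
        return dict(power)
    updated = dict(power)
    updated[critical] += step
    for n in others:
        updated[n] -= step / len(others)
    return updated


# Training phase: (IPC, cache misses per kilo-instruction) paired with observed wait time.
training = [
    Sample((0.6, 40.0), 0.1),
    Sample((0.7, 38.0), 0.2),
    Sample((1.8, 5.0), 2.0),
    Sample((1.9, 4.0), 2.2),
]
model = WaitTimeModel(training, k=2)

# Runtime: counters observed at the start of the current task on each node.
observed = {"node0": (1.85, 4.5), "node1": (0.65, 39.0)}
critical = predict_critical_node(model, observed)
new_power = rebalance_power({"node0": 100.0, "node1": 100.0}, critical)
```

In this made-up data set, `node1` shows the low-IPC, high-miss profile that preceded short waits during training, so it is predicted to be on the critical path and receives the additional power.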
- Inventors:
- Kocoloski, Brian J.; Piga, Leonardo; Huang, Wei; Paul, Indrani
- Issue Date:
- 04/07/2020
- Research Org.:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1650744
- Patent Number(s):
- 10613957
- Application Number:
- 15/192,764
- Assignee:
- Advanced Micro Devices, Inc. (Santa Clara, CA)
- Patent Classifications (CPCs):
- G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
- Y - NEW / CROSS SECTIONAL TECHNOLOGIES Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- DOE Contract Number:
- AC02-05CH11231
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 06/24/2016
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Kocoloski, Brian J., Piga, Leonardo, Huang, Wei, and Paul, Indrani. Achieving balanced execution through runtime detection of performance variation. United States: N. p., 2020. Web.
Kocoloski, Brian J., Piga, Leonardo, Huang, Wei, & Paul, Indrani. Achieving balanced execution through runtime detection of performance variation. United States.
Kocoloski, Brian J., Piga, Leonardo, Huang, Wei, and Paul, Indrani. 2020. "Achieving balanced execution through runtime detection of performance variation". United States. https://www.osti.gov/servlets/purl/1650744.
@article{osti_1650744,
title = {Achieving balanced execution through runtime detection of performance variation},
author = {Kocoloski, Brian J. and Piga, Leonardo and Huang, Wei and Paul, Indrani},
abstractNote = {Systems, apparatuses, and methods for achieving balanced execution in a multi-node cluster through runtime detection of performance variation are described. During a training phase, performance counters and the amount of time spent waiting for synchronization are monitored for a plurality of tasks on each node of the multi-node cluster. These values are used to generate a model that correlates the values of the performance counters with the amount of time spent waiting for synchronization. Once the model is built, the values of the performance counters are monitored for a period of time at the start of each task, and these values are input into the model. The model generates a prediction of whether a given node is on the critical path. If the given node is predicted to be on the critical path, the power allocation of the given node is increased.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2020},
month = apr
}
Works referenced in this record:
Semi-Static Power and Performance Optimization of Data Centers
patent-application, April 2014
- Breternitz, Mauricio; Piga, Leonardo
- US Patent Application 13/651904; 20140108828
Time Slack Application Pipeline Balancing for Multi/Many-Core PLCs
patent-application, April 2015
- Martinez Canedo, Arquimedes; Feichtinger, Thomas; Al Faruque, Mohammad Abdullah
- US Patent Application 14/394395; 20150121396
Combined Dynamic and Static Power and Performance Optimization on Data Centers
patent-application, December 2014
- Breternitz, Mauricio; Piga, Leonardo; Kaminski, Patryk
- US Patent Application 13/917417; 20140372782