DOE Patents
U.S. Department of Energy
Office of Scientific and Technical Information

Title: Managing variations among nodes in parallel system frameworks

Abstract

Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.
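The abstract does not say how the variability metric is computed or how the mapper places tasks. The following is only a minimal sketch of the first embodiment under stated assumptions: the metric is taken to be a node's mean absolute z-score across its readings relative to the cluster, and the mapper places critical tasks first on the nodes with the lowest metric. The function names, measurement names, task names, and sample values are all hypothetical and do not come from the patent.

```python
from statistics import mean, pstdev


def variability_metrics(samples):
    """Return {node: metric} from {node: {measurement: value}}.

    Assumption (not from the patent): a node's metric is the mean
    absolute z-score of its readings relative to the whole cluster.
    Every node is assumed to report the same set of measurements.
    """
    names = sorted(next(iter(samples.values())))
    stats = {}
    for name in names:
        values = [readings[name] for readings in samples.values()]
        stats[name] = (mean(values), pstdev(values))

    metrics = {}
    for node, readings in samples.items():
        scores = []
        for name in names:
            mu, sigma = stats[name]
            # A measurement that is identical on all nodes adds no variability.
            scores.append(abs(readings[name] - mu) / sigma if sigma > 0 else 0.0)
        metrics[node] = mean(scores)
    return metrics


def map_tasks(tasks, metrics):
    """Map (task_name, is_critical) pairs to nodes.

    Critical tasks are placed first, on the nodes with the lowest
    variability metric; remaining tasks fill the other nodes in order,
    wrapping around if there are more tasks than nodes.
    """
    nodes = sorted(metrics, key=metrics.get)          # lowest metric first
    ordered = sorted(tasks, key=lambda t: not t[1])   # critical first, stable
    return {name: nodes[i % len(nodes)] for i, (name, _) in enumerate(ordered)}


if __name__ == "__main__":
    # Hypothetical sensor and performance data for a three-node cluster.
    samples = {
        "node0": {"temp_c": 60.0, "power_w": 100.0, "runtime_s": 10.0},
        "node1": {"temp_c": 61.0, "power_w": 101.0, "runtime_s": 10.1},
        "node2": {"temp_c": 75.0, "power_w": 120.0, "runtime_s": 13.0},
    }
    tasks = [("halo_exchange", False), ("reduce", True), ("io", False)]
    metrics = variability_metrics(samples)
    print(map_tasks(tasks, metrics))
```

In this sketch the critical task lands on whichever node sits closest to the cluster average across all measurements. A real mapper would likely also weigh node capacity, data locality, and how the metric changes over time, none of which is modeled here.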

Inventors:
Wasmundt, Samuel Lawrence; Piga, Leonardo; Paul, Indrani; Huang, Wei; Arora, Manish
Issue Date:
07/16/2019
Research Org.:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1568617
Patent Number(s):
10355966
Application Number:
15/081,558
Assignee:
Advanced Micro Devices, Inc. (Santa Clara, CA)
Patent Classifications (CPCs):
H - ELECTRICITY H04 - ELECTRIC COMMUNICATION TECHNIQUE H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
DOE Contract Number:  
AC52-07NA27344
Resource Type:
Patent
Resource Relation:
Patent File Date: 03/25/2016
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Wasmundt, Samuel Lawrence, Piga, Leonardo, Paul, Indrani, Huang, Wei, and Arora, Manish. Managing variations among nodes in parallel system frameworks. United States: N. p., 2019. Web.
Wasmundt, Samuel Lawrence, Piga, Leonardo, Paul, Indrani, Huang, Wei, & Arora, Manish. Managing variations among nodes in parallel system frameworks. United States.
Wasmundt, Samuel Lawrence, Piga, Leonardo, Paul, Indrani, Huang, Wei, and Arora, Manish. 2019. "Managing variations among nodes in parallel system frameworks". United States. https://www.osti.gov/servlets/purl/1568617.
@article{osti_1568617,
title = {Managing variations among nodes in parallel system frameworks},
author = {Wasmundt, Samuel Lawrence and Piga, Leonardo and Paul, Indrani and Huang, Wei and Arora, Manish},
abstractNote = {Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.},
place = {United States},
year = {2019},
month = {07}
}

Works referenced in this record:

Method, System, and Device for Dynamic Energy Efficient Job Scheduling in a Cloud Computing Environment
patent-application, January 2014

Cloud Compute Scheduling Using a Heuristic Contention Model
patent-application, August 2015

Cross-Tenant Analysis of Similar Storage Environments to Recommend Storage Policy Changes
patent-application, March 2017

Adaptive Task Scheduling of Hadoop in a Virtualized Environment
patent-application, August 2014

Dynamic Hierarchical Performance Balancing of Computational Resources
patent-application, June 2016

Computing Resources Workload Scheduling
patent-application, April 2017

Data Analytics and Management of Computing Infrastructures
patent-application, February 2017