Managing variations among nodes in parallel system frameworks
Abstract
Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks are disclosed. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with that node. A mapper may then use the variability metrics to map tasks of a parallel application efficiently to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured to reduce node-to-node variability.
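The mapping idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the variability metric is computed, so the formula below (a sum of absolute z-scores across readings) is an assumption chosen only for illustration, as are the function names `variability_metrics` and `map_tasks`, the reading names `temp_c` and `runtime_s`, and the sample values. The sketch also assumes every node reports the same set of readings.

```python
from statistics import mean, pstdev


def variability_metrics(samples):
    """Score each node by how far its readings sit from the cluster average.

    samples maps node name -> {reading name: value}. The score is the sum of
    absolute z-scores over all readings; this formula is an illustrative
    assumption, not the metric defined in the patent.
    """
    scores = {node: 0.0 for node in samples}
    reading_names = next(iter(samples.values())).keys()
    for name in reading_names:
        values = [readings[name] for readings in samples.values()]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # all nodes agree on this reading, so it adds no variability
        for node, readings in samples.items():
            scores[node] += abs(readings[name] - mu) / sigma
    return scores


def map_tasks(tasks, scores):
    """Assign critical tasks first, to the nodes with the lowest scores.

    tasks is a list of (task name, is_critical) pairs. Nodes are reused
    round-robin if there are more tasks than nodes.
    """
    nodes = sorted(scores, key=lambda n: (scores[n], n))
    ordered = sorted(tasks, key=lambda t: not t[1])  # stable: critical tasks first
    return {name: nodes[i % len(nodes)] for i, (name, _) in enumerate(ordered)}


# Hypothetical sensor and performance data for a three-node cluster.
samples = {
    "node0": {"temp_c": 60.0, "runtime_s": 10.0},
    "node1": {"temp_c": 75.0, "runtime_s": 13.0},
    "node2": {"temp_c": 62.0, "runtime_s": 10.5},
}
tasks = [("io", False), ("solver", True), ("reduce", True)]

scores = variability_metrics(samples)
mapping = map_tasks(tasks, scores)
print(mapping)
```

With these sample values, `node1` deviates most from the cluster average on both readings, so the non-critical `io` task lands there while the two critical tasks go to the steadier nodes.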
- Inventors:
- Wasmundt, Samuel Lawrence; Piga, Leonardo; Paul, Indrani; Huang, Wei; Arora, Manish
- Issue Date:
- Research Org.:
- Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1568617
- Patent Number(s):
- 10355966
- Application Number:
- 15/081,558
- Assignee:
- Advanced Micro Devices, Inc. (Santa Clara, CA)
- Patent Classifications (CPCs):
- H - ELECTRICITY; H04 - ELECTRIC COMMUNICATION TECHNIQUE; H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- DOE Contract Number:
- AC52-07NA27344
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 03/25/2016
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Wasmundt, Samuel Lawrence, Piga, Leonardo, Paul, Indrani, Huang, Wei, and Arora, Manish. Managing variations among nodes in parallel system frameworks. United States: N. p., 2019. Web.
Wasmundt, Samuel Lawrence, Piga, Leonardo, Paul, Indrani, Huang, Wei, & Arora, Manish. Managing variations among nodes in parallel system frameworks. United States.
Wasmundt, Samuel Lawrence, Piga, Leonardo, Paul, Indrani, Huang, Wei, and Arora, Manish. 2019. "Managing variations among nodes in parallel system frameworks". United States. https://www.osti.gov/servlets/purl/1568617.
@article{osti_1568617,
title = {Managing variations among nodes in parallel system frameworks},
author = {Wasmundt, Samuel Lawrence and Piga, Leonardo and Paul, Indrani and Huang, Wei and Arora, Manish},
abstractNote = {Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.},
place = {United States},
year = {2019},
month = {7}
}
Works referenced in this record:
Method, System, and Device for Dynamic Energy Efficient Job Scheduling in a Cloud Computing Environment
patent-application, January 2014
- Jain, Nilesh K.; Willke, Theodore L.; Datta, Kushal
- US Patent Application 13/534324; 20140006534
Cloud Compute Scheduling Using a Heuristic Contention Model
patent-application, August 2015
- Sesha, Subramony; Patni, Archana; Narayanan, Anantha
- US Patent Application 14/368349; 20150236971
Cross-Tenant Analysis of Similar Storage Environments to Recommend Storage Policy Changes
patent-application, March 2017
- Acuna, Jorge D.; Bavishi, Pankaj S.; Huang, Dachuan
- US Patent Application 14/838302; 20170063654
Adaptive Task Scheduling of Hadoop in a Virtualized Environment
patent-application, August 2014
- Zhou, Li; Uttamchandani, Sandeep; Chen, Yizheng
- US Patent Application 13/778441; 20140245298
Dynamic Hierarchical Performance Balancing of Computational Resources
patent-application, June 2016
- Eastep, Jonathan M.; Sharapov, Ilya; Greco, Richard J.
- US Patent Application 14/583237; 20160187944
Computing Resources Workload Scheduling
patent-application, April 2017
- Ahuja, Nishi; Khanna, Rahul; Daniel, Abishai
- US Patent Application 15/089481; 20170109205
Data Analytics and Management of Computing Infrastructures
patent-application, February 2017
- Carroll, Theodore A.; Twito, Bruce; Scumniotales, John
- US Patent Application 15/222881; 20170034016