OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping

Abstract

Task mapping is an important problem in parallel and distributed computing. The goal of task mapping is to find an optimal layout of the processes of an application (or a task) onto a given network topology. We target this problem in the context of staging applications. A staging application consists of two or more parallel applications (also referred to as staging tasks) that run concurrently and exchange data over the course of the computation. Task mapping is more challenging for staging applications because data is exchanged not only between the staging tasks but also among the processes within each staging task. We propose a novel method, called Task Graph Embedding (TGE), that harnesses the observable graph structures of parallel applications and network topologies. TGE employs a machine-learning-based algorithm to find the best representation of a graph, called an embedding, in a space in which the task-to-processor mapping problem can be solved. We evaluate and demonstrate the effectiveness of TGE experimentally using communication patterns extracted from runs of XGC, a large-scale fusion simulation code, on Titan.
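To make the task-mapping problem in the abstract concrete, here is a minimal Python sketch of a toy instance. It is not the TGE algorithm, which this record does not describe in detail. It assumes a "data volume times hop count" cost, a commonly used objective that the abstract does not name, and finds the best layout by exhaustive search, which is only feasible at toy sizes. All names (`hop_distances`, `mapping_cost`, `best_mapping`, the process labels, and the traffic volumes) are hypothetical.

```python
from collections import deque
from itertools import permutations


def hop_distances(adjacency):
    """All-pairs hop counts, computed with one BFS per node."""
    dist = {}
    for src in adjacency:
        seen = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen[nxt] = seen[node] + 1
                    queue.append(nxt)
        dist[src] = seen
    return dist


def mapping_cost(traffic, mapping, dist):
    """Assumed objective: sum of (data volume x hop count) over process pairs."""
    return sum(vol * dist[mapping[a]][mapping[b]]
               for (a, b), vol in traffic.items())


def best_mapping(traffic, processes, adjacency):
    """Exhaustive search over all placements; a stand-in for a real mapper."""
    dist = hop_distances(adjacency)
    best = None
    for perm in permutations(list(adjacency), len(processes)):
        mapping = dict(zip(processes, perm))
        cost = mapping_cost(traffic, mapping, dist)
        if best is None or cost < best[0]:
            best = (cost, mapping)
    return best


# Toy topology: four nodes in a line, 0 - 1 - 2 - 3.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Two hypothetical staging tasks: "s" (s0, s1) and "a" (a0, a1).
procs = ["s0", "s1", "a0", "a1"]
traffic = {
    ("s0", "s1"): 10,  # exchange within staging task "s"
    ("a0", "a1"): 10,  # exchange within staging task "a"
    ("s1", "a0"): 5,   # exchange between the two staging tasks
}

cost, mapping = best_mapping(traffic, procs, line)
print(cost, mapping)
```

In this toy case, an interleaved placement (s0, a0, s1, a1 along the line) costs 45, while keeping each task's processes adjacent costs 25. The example shows why both kinds of exchange, within a task and between tasks, have to be considered together.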

Authors:
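Choi, Jong Youl [1]; Logan, Jeremy S. [1]; Wolf, Matthew D. [1]; Ostrouchov, George [1]; Kurc, Tahsin M. [1]; Liu, Qing Gary [1]; Podhorszki, Norbert [1]; Klasky, Scott A. [1]; Romanus, Melissa [2]; Sun, Qian [2]; Parashar, Manish [2]; Churchill, Michael [3]; Chang, C.S. [3]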
  1. ORNL
  2. Rutgers University
  3. Princeton Plasma Physics Laboratory (PPPL)
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1474472
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: 2017 IEEE International Conference on Cluster Computing (CLUSTER), Honolulu, Hawaii, United States of America, 9/5/2017 to 9/8/2017
Country of Publication:
United States
Language:
English

Citation Formats

Choi, Jong Youl, Logan, Jeremy S., Wolf, Matthew D., Ostrouchov, George, Kurc, Tahsin M., Liu, Qing Gary, Podhorszki, Norbert, Klasky, Scott A., Romanus, Melissa, Sun, Qian, Parashar, Manish, Churchill, Michael, and Chang, C.S. TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping. United States: N. p., 2017. Web. doi:10.1109/CLUSTER.2017.67.
Choi, Jong Youl, Logan, Jeremy S., Wolf, Matthew D., Ostrouchov, George, Kurc, Tahsin M., Liu, Qing Gary, Podhorszki, Norbert, Klasky, Scott A., Romanus, Melissa, Sun, Qian, Parashar, Manish, Churchill, Michael, & Chang, C.S. TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping. United States. doi:10.1109/CLUSTER.2017.67.
Choi, Jong Youl, Logan, Jeremy S., Wolf, Matthew D., Ostrouchov, George, Kurc, Tahsin M., Liu, Qing Gary, Podhorszki, Norbert, Klasky, Scott A., Romanus, Melissa, Sun, Qian, Parashar, Manish, Churchill, Michael, and Chang, C.S. Fri . "TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping". United States. doi:10.1109/CLUSTER.2017.67. https://www.osti.gov/servlets/purl/1474472.
@article{osti_1474472,
title = {TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping},
author = {Choi, Jong Youl and Logan, Jeremy S. and Wolf, Matthew D. and Ostrouchov, George and Kurc, Tahsin M. and Liu, Qing Gary and Podhorszki, Norbert and Klasky, Scott A. and Romanus, Melissa and Sun, Qian and Parashar, Manish and Churchill, Michael and Chang, C.S.},
abstractNote = {Task mapping is an important problem in parallel and distributed computing. The goal of task mapping is to find an optimal layout of the processes of an application (or a task) onto a given network topology. We target this problem in the context of staging applications. A staging application consists of two or more parallel applications (also referred to as staging tasks) that run concurrently and exchange data over the course of the computation. Task mapping is more challenging for staging applications because data is exchanged not only between the staging tasks but also among the processes within each staging task. We propose a novel method, called Task Graph Embedding (TGE), that harnesses the observable graph structures of parallel applications and network topologies. TGE employs a machine-learning-based algorithm to find the best representation of a graph, called an embedding, in a space in which the task-to-processor mapping problem can be solved. We evaluate and demonstrate the effectiveness of TGE experimentally using communication patterns extracted from runs of XGC, a large-scale fusion simulation code, on Titan.},
doi = {10.1109/CLUSTER.2017.67},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2017},
month = {9}
}

