OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Distributed OpenCL Framework using Redundant Computation and Data Replication

Abstract

Applications written solely in OpenCL or CUDA cannot execute on a cluster as a whole. Most previous approaches that extend these programming models to clusters are based on a common idea: designating a centralized host node and coordinating the other nodes with the host for computation. However, the centralized host node is a serious performance bottleneck when the number of nodes is large. In this paper, we propose a scalable and distributed OpenCL framework called SnuCL-D for large-scale clusters. SnuCL-D's remote device virtualization provides an OpenCL application with an illusion that all compute devices in a cluster are confined in a single node. To reduce the amount of control-message and data communication between nodes, SnuCL-D replicates the OpenCL host program execution and data in each node. We also propose a new OpenCL host API function and a queueing optimization technique that significantly reduce the overhead incurred by the previous centralized approaches. To show the effectiveness of SnuCL-D, we evaluate SnuCL-D with a microbenchmark and eleven benchmark applications on a large-scale CPU cluster and a medium-scale GPU cluster.

Authors:
Kim, Junghyun [1]; Jo, Gangwon [1]; Jung, Jaehoon [1]; Lee, Jaejin [1]
  1. Seoul National University, Korea
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1295154
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: ACM SIGPLAN Conference on Programming Language Design and Implementation, Santa Barbara, CA, USA, June 13-17, 2016
Country of Publication:
United States
Language:
English

Citation Formats

Kim, Junghyun, Jo, Gangwon, Jung, Jaehoon, and Lee, Jaejin. A Distributed OpenCL Framework using Redundant Computation and Data Replication. United States: N. p., 2016. Web.
Kim, Junghyun, Jo, Gangwon, Jung, Jaehoon, & Lee, Jaejin. A Distributed OpenCL Framework using Redundant Computation and Data Replication. United States.
Kim, Junghyun, Jo, Gangwon, Jung, Jaehoon, and Lee, Jaejin. "A Distributed OpenCL Framework using Redundant Computation and Data Replication". United States. https://www.osti.gov/servlets/purl/1295154.
@article{osti_1295154,
title = {A Distributed OpenCL Framework using Redundant Computation and Data Replication},
author = {Kim, Junghyun and Jo, Gangwon and Jung, Jaehoon and Lee, Jaejin},
abstractNote = {Applications written solely in OpenCL or CUDA cannot execute on a cluster as a whole. Most previous approaches that extend these programming models to clusters are based on a common idea: designating a centralized host node and coordinating the other nodes with the host for computation. However, the centralized host node is a serious performance bottleneck when the number of nodes is large. In this paper, we propose a scalable and distributed OpenCL framework called SnuCL-D for large-scale clusters. SnuCL-D's remote device virtualization provides an OpenCL application with an illusion that all compute devices in a cluster are confined in a single node. To reduce the amount of control-message and data communication between nodes, SnuCL-D replicates the OpenCL host program execution and data in each node. We also propose a new OpenCL host API function and a queueing optimization technique that significantly reduce the overhead incurred by the previous centralized approaches. To show the effectiveness of SnuCL-D, we evaluate SnuCL-D with a microbenchmark and eleven benchmark applications on a large-scale CPU cluster and a medium-scale GPU cluster.},
url = {https://www.osti.gov/biblio/1295154},
place = {United States},
year = {2016},
month = {1}
}

