skip to main content

SciTech ConnectSciTech Connect

Title: A survey of CPU-GPU heterogeneous computing techniques

As both CPU and GPU become employed in a wide range of applications, it has been acknowledged that both of these processing units (PUs) have their unique features and strengths and hence, CPU-GPU collaboration is inevitable to achieve high-performance computing. This has motivated significant amount of research on heterogeneous computing techniques, along with the design of CPU-GPU fused chips and petascale heterogeneous supercomputers. In this paper, we survey heterogeneous computing techniques (HCTs) such as workload-partitioning which enable utilizing both CPU and GPU to improve performance and/or energy efficiency. We review heterogeneous computing approaches at runtime, algorithm, programming, compiler and application level. Further, we review both discrete and fused CPU-GPU systems; and discuss benchmark suites designed for evaluating heterogeneous computing systems (HCSs). Furthermore, we believe that this paper will provide insights into working and scope of applications of HCTs to researchers and motivate them to further harness the computational powers of CPUs and GPUs to achieve the goal of exascale performance.
 [1] ;  [2]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Georgia Inst. of Technology, Atlanta, GA (United States)
Publication Date:
OSTI Identifier:
Grant/Contract Number:
Accepted Manuscript
Journal Name:
ACM Computing Surveys
Additional Journal Information:
Journal Volume: 47; Journal Issue: 4; Journal ID: ISSN 0360-0300
Association for Computing Machinery (ACM)
Research Org:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Org:
USDOE Office of Science (SC)
Country of Publication:
United States
97 MATHEMATICS AND COMPUTING experimentation; management; measurement; performance; analysis; CPU-GPU heterogeneous/hybrid/collaborative computing; workload division/partitioning; dynamic/static load-balancing; pipelining; programming frameworks; fused CPU-GPU chip