DOE PAGES
U.S. Department of Energy
Office of Scientific and Technical Information

Title: PPT-GPU: Scalable GPU Performance Modeling

Abstract

Performance modeling is a challenging problem due to the complexities of hardware architectures. In this paper, we present PPT-GPU, a scalable and accurate simulation framework that enables GPU code developers and architects to predict the performance of applications quickly and accurately on different GPU architectures. PPT-GPU is part of the open-source Performance Prediction Toolkit (PPT) project developed at Los Alamos National Laboratory. We extend the old GPU model in PPT, which predicts the runtimes of computational physics codes, to offer better prediction accuracy; to that end, we add models for the different memory hierarchies found in GPUs and latencies for different instructions. To further show the utility of PPT-GPU, we compare our model against real GPU device(s) and the widely used cycle-accurate simulator GPGPU-Sim, using different workloads from the RODINIA and Parboil benchmarks. The results indicate that the predicted performance of PPT-GPU is within 10% error of the real device(s). In addition, PPT-GPU is highly scalable: it is up to 450x faster than GPGPU-Sim, with more accurate results.

Authors:
Arafa, Yehia [1]; Badawy, Abdel Hameed A. [1]; Chennupati, Gopinath [2]; Santhi, Nandakishore [2]; Eidenbenz, Stephan Johannes [2]
  1. New Mexico State Univ., Las Cruces, NM (United States)
  2. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1504654
Report Number(s):
LA-UR-18-30853
Journal ID: ISSN 1556-6056
Grant/Contract Number:  
89233218CNA000001; AC52-06NA25396
Resource Type:
Accepted Manuscript
Journal Name:
IEEE Computer Architecture Letters
Additional Journal Information:
Journal Volume: 18; Journal Issue: 1; Journal ID: ISSN 1556-6056
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Computer Hardware; Computer Science

Citation Formats

Arafa, Yehia, Badawy, Abdel Hameed A., Chennupati, Gopinath, Santhi, Nandakishore, and Eidenbenz, Stephan Johannes. PPT-GPU: Scalable GPU Performance Modeling. United States: N. p., 2019. Web. doi:10.1109/LCA.2019.2904497.
Arafa, Yehia, Badawy, Abdel Hameed A., Chennupati, Gopinath, Santhi, Nandakishore, & Eidenbenz, Stephan Johannes. PPT-GPU: Scalable GPU Performance Modeling. United States. https://doi.org/10.1109/LCA.2019.2904497
Arafa, Yehia, Badawy, Abdel Hameed A., Chennupati, Gopinath, Santhi, Nandakishore, and Eidenbenz, Stephan Johannes. 2019. "PPT-GPU: Scalable GPU Performance Modeling". United States. https://doi.org/10.1109/LCA.2019.2904497. https://www.osti.gov/servlets/purl/1504654.
@article{osti_1504654,
title = {PPT-GPU: Scalable GPU Performance Modeling},
author = {Arafa, Yehia and Badawy, Abdel Hameed A. and Chennupati, Gopinath and Santhi, Nandakishore and Eidenbenz, Stephan Johannes},
abstractNote = {Performance modeling is a challenging problem due to the complexities of hardware architectures. In this paper, we present PPT-GPU, a scalable and accurate simulation framework that enables GPU code developers and architects to predict the performance of applications quickly and accurately on different GPU architectures. PPT-GPU is part of the open-source Performance Prediction Toolkit (PPT) project developed at Los Alamos National Laboratory. We extend the old GPU model in PPT, which predicts the runtimes of computational physics codes, to offer better prediction accuracy; to that end, we add models for the different memory hierarchies found in GPUs and latencies for different instructions. To further show the utility of PPT-GPU, we compare our model against real GPU device(s) and the widely used cycle-accurate simulator GPGPU-Sim, using different workloads from the RODINIA and Parboil benchmarks. The results indicate that the predicted performance of PPT-GPU is within 10% error of the real device(s). In addition, PPT-GPU is highly scalable: it is up to 450x faster than GPGPU-Sim, with more accurate results.},
doi = {10.1109/LCA.2019.2904497},
journal = {IEEE Computer Architecture Letters},
number = 1,
volume = 18,
place = {United States},
year = {2019},
month = {1}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 22 works
Citation information provided by
Web of Science
