U.S. Department of Energy
Office of Scientific and Technical Information

PPT-GPU: Scalable GPU Performance Modeling

Journal Article · IEEE Computer Architecture Letters
Performance modeling is a challenging problem due to the complexity of hardware architectures. In this paper, we present PPT-GPU, a scalable and accurate simulation framework that enables GPU code developers and architects to predict application performance quickly and accurately on different GPU architectures. PPT-GPU is part of the open-source Performance Prediction Toolkit (PPT) developed at Los Alamos National Laboratory. We extend the earlier GPU model in PPT, which predicts the runtimes of computational physics codes, to offer better prediction accuracy; to that end, we add models for the different memory hierarchies found in GPUs and latencies for different instructions. To further show the utility of PPT-GPU, we compare our model against real GPU device(s) and the widely used cycle-accurate simulator GPGPU-Sim, using different workloads from the RODINIA and Parboil benchmarks. The results indicate that the performance predicted by PPT-GPU is within 10% error of the real device(s). In addition, PPT-GPU is highly scalable: it is up to 450x faster than GPGPU-Sim while producing more accurate results.
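The two headline metrics in the abstract, prediction error against a real device and simulation speedup over a baseline simulator, can be illustrated with a minimal sketch. The function names and all numbers below are hypothetical and are not taken from the paper or from the PPT code base; the sketch only shows how such metrics are conventionally computed.

```python
def relative_error(predicted, measured):
    """Absolute relative error of a prediction against a measured reference."""
    if measured <= 0:
        raise ValueError("measured value must be positive")
    return abs(predicted - measured) / measured


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate runs than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate time must be positive")
    return baseline_seconds / candidate_seconds


# Hypothetical values, chosen only for illustration (not results from the paper).
measured_cycles = 1_000_000    # cycles observed on a real GPU
predicted_cycles = 1_080_000   # cycles predicted by a performance model
baseline_sim_seconds = 600.0   # wall-clock time of a cycle-accurate simulator
model_sim_seconds = 3.0        # wall-clock time of the faster model

error = relative_error(predicted_cycles, measured_cycles)
print(f"prediction error: {error:.1%}")
print(f"simulation speedup: {speedup(baseline_sim_seconds, model_sim_seconds):.0f}x")
```

Under this convention, a claim of "within 10% error" corresponds to `relative_error(...) <= 0.10` for each evaluated workload.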
Research Organization:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
89233218CNA000001; AC52-06NA25396
OSTI ID:
1504654
Report Number(s):
LA-UR--18-30853
Journal Information:
Journal Name: IEEE Computer Architecture Letters; Journal Volume: 18; Journal Issue: 1; ISSN 1556-6056
Publisher:
IEEE
Country of Publication:
United States
Language:
English

Similar Records

A performance model for GPUs with caches
Journal Article · Tue Jun 24 2014 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1333005

GPU code optimization using abstract kernel emulation and sensitivity analysis
Journal Article · Mon Jun 11 2018 · ACM SIGPLAN Notices · OSTI ID:1582638

PPT-Multicore: performance prediction of OpenMP applications using reuse profiles and analytical modeling
Journal Article · Mon Jun 28 2021 · Journal of Supercomputing · OSTI ID:1922761