DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

This content will become publicly available on November 11, 2020

Title: Hierarchical Roofline analysis for GPUs: Accelerating performance optimization for the NERSC‐9 Perlmutter system

Authors:
Yang, Charlene [1]; Kurth, Thorsten [1]; Williams, Samuel [2]
  1. National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory, Berkeley, California
  2. Computational Research Division (CRD), Lawrence Berkeley National Laboratory, Berkeley, California
Publication Date:
Sponsoring Org.:
USDOE
OSTI Identifier:
1574050
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Publisher's Accepted Manuscript
Journal Name:
Concurrency and Computation. Practice and Experience
Additional Journal Information:
Journal ID: ISSN 1532-0626
Publisher:
Wiley Blackwell (John Wiley & Sons)
Country of Publication:
United Kingdom
Language:
English

Citation Formats

Yang, Charlene, Kurth, Thorsten, and Williams, Samuel. Hierarchical Roofline analysis for GPUs: Accelerating performance optimization for the NERSC‐9 Perlmutter system. United Kingdom: N. p., 2019. Web. doi:10.1002/cpe.5547.
Yang, Charlene, Kurth, Thorsten, & Williams, Samuel. Hierarchical Roofline analysis for GPUs: Accelerating performance optimization for the NERSC‐9 Perlmutter system. United Kingdom. doi:10.1002/cpe.5547.
Yang, Charlene, Kurth, Thorsten, and Williams, Samuel. "Hierarchical Roofline analysis for GPUs: Accelerating performance optimization for the NERSC‐9 Perlmutter system". United Kingdom. doi:10.1002/cpe.5547.
@article{osti_1574050,
title = {Hierarchical Roofline analysis for GPUs: Accelerating performance optimization for the NERSC‐9 Perlmutter system},
author = {Yang, Charlene and Kurth, Thorsten and Williams, Samuel},
abstractNote = {},
doi = {10.1002/cpe.5547},
journal = {Concurrency and Computation. Practice and Experience},
number = {},
volume = {},
place = {United Kingdom},
year = {2019},
month = {11}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Works referenced in this record:

Roofline: an insightful visual performance model for multicore architectures
journal, April 2009

  • Williams, Samuel; Waterman, Andrew; Patterson, David
  • Communications of the ACM, Vol. 52, Issue 4
  • DOI: 10.1145/1498765.1498785

Electron self-energy calculation using a general multi-pole approximation
journal, April 2003

  • Soininen, J. A.; Rehr, J. J.; Shirley, Eric L.
  • Journal of Physics: Condensed Matter, Vol. 15, Issue 17
  • DOI: 10.1088/0953-8984/15/17/312

Demystifying Parallel and Distributed Deep Learning: An In-depth Concurrency Analysis
journal, August 2019

  • Ben-Nun, Tal; Hoefler, Torsten
  • ACM Computing Surveys, Vol. 52, Issue 4
  • DOI: 10.1145/3320060