Roofline model toolkit: A practical tool for architectural and program analysis

Lo, Yu Jung; Williams, Samuel; Van Straalen, Brian; Ligocki, Terry J.; Cordery, Matthew J.; Wright, Nicholas J.; Hall, Mary W.; Oliker, Leonid

doi:10.1007/978-3-319-17248-4_7

Conference · April 18, 2015

DOI: https://doi.org/10.1007/978-3-319-17248-4_7 · OSTI ID: 1407288

Lo, Yu Jung ^[1]; Williams, Samuel ^[1]; Van Straalen, Brian ^[1]; Ligocki, Terry J. ^[1]; Cordery, Matthew J. ^[1]; Wright, Nicholas J. ^[1]; Hall, Mary W. ^[1]; Oliker, Leonid ^[1]

Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated architectures. This paper focuses on the processor architecture characterization engine, a collection of portable, instrumented microbenchmarks implemented with the Message Passing Interface (MPI) and with OpenMP, which is used to express thread-level parallelism. These benchmarks are specialized to quantify the behavior of different architectural features. Compared to previous work on performance characterization, these microbenchmarks focus on capturing the performance of each level of the memory hierarchy, along with thread-level parallelism, instruction-level parallelism, and explicit SIMD parallelism, measured in the context of the compilers and run-time environments. We also measure sustained PCIe throughput with four GPU memory management mechanisms. By combining results from the architecture characterization with the Roofline model based solely on architectural specifications, this work offers insights for performance prediction of current and future architectures and their software systems. To that end, we instrument three applications and plot their resultant performance on the corresponding Roofline model when run on a Blue Gene/Q architecture.
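For readers unfamiliar with the Roofline model referenced in the abstract, the following is a minimal sketch of its basic bound: attainable performance is the smaller of peak compute throughput and arithmetic intensity multiplied by memory bandwidth. This is not code from the Roofline Toolkit. The function names and all machine numbers below are hypothetical placeholders chosen for illustration, not measurements from the paper.

```python
def roofline_bound(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Return the Roofline ceiling in GFLOP/s.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte)
    peak_gflops:          peak compute throughput (GFLOP/s)
    bandwidth_gbs:        sustained bandwidth of one memory level (GB/s)
    """
    if arithmetic_intensity < 0:
        raise ValueError("arithmetic intensity must be non-negative")
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine description (assumed values, not from the paper).
PEAK_GFLOPS = 200.0
BANDWIDTH_GBS = {"L1": 800.0, "L2": 400.0, "DRAM": 25.0}

if __name__ == "__main__":
    for level, bw in BANDWIDTH_GBS.items():
        print(f"{level}: ridge point at {ridge_point(PEAK_GFLOPS, bw):.2f} FLOP/byte")
        for ai in (0.25, 1.0, 8.0, 32.0):
            bound = roofline_bound(ai, PEAK_GFLOPS, bw)
            print(f"  AI={ai:>5.2f} FLOP/byte -> at most {bound:.2f} GFLOP/s")
```

Using one bandwidth per memory level mirrors the abstract's emphasis on characterizing each level of the memory hierarchy. In practice the bandwidth and peak values would come from measured microbenchmark results rather than the constants assumed here.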

Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)

DOE Contract Number: AC02-05CH11231

OSTI ID: 1407288

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
CUDA Unified Memory
Memory Bandwidth
Roofline