U.S. Department of Energy
Office of Scientific and Technical Information

Roofline model toolkit: A practical tool for architectural and program analysis

Conference

We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated architectures. This paper focuses on the processor architecture characterization engine, a collection of portable, instrumented microbenchmarks implemented with the Message Passing Interface (MPI) and with OpenMP, which is used to express thread-level parallelism. These benchmarks are specialized to quantify the behavior of different architectural features. Compared to previous work on performance characterization, these microbenchmarks focus on capturing the performance of each level of the memory hierarchy, along with thread-level parallelism, instruction-level parallelism, and explicit SIMD parallelism, measured in the context of the compilers and run-time environments. We also measure sustained PCIe throughput with four GPU memory management mechanisms. By combining results from the architecture characterization with the Roofline model based solely on architectural specifications, this work offers insights for performance prediction of current and future architectures and their software systems. To that end, we instrument three applications and plot their resultant performance on the corresponding Roofline model when run on a Blue Gene/Q architecture.
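
For readers unfamiliar with the Roofline model referenced in the abstract, the following is a minimal sketch of its standard bound: attainable performance is the lesser of the compute peak and the product of arithmetic intensity and memory bandwidth. The function names and the machine numbers are hypothetical illustrations, not values or code from the Roofline Toolkit or the paper.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound in GFLOP/s.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte)
    peak_gflops:          peak compute rate (GFLOP/s)
    bandwidth_gbs:        sustained memory bandwidth (GB/s)
    """
    if arithmetic_intensity <= 0:
        raise ValueError("arithmetic intensity must be positive")
    # A kernel is limited either by compute or by data movement.
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine parameters, chosen only for illustration.
PEAK_GFLOPS = 200.0
BANDWIDTH_GBS = 25.0

if __name__ == "__main__":
    for ai in (0.25, 1.0, 8.0, 32.0):
        bound = attainable_gflops(ai, PEAK_GFLOPS, BANDWIDTH_GBS)
        print(f"AI = {ai:5.2f} FLOP/byte -> bound = {bound:6.1f} GFLOP/s")
```

In this sketch, kernels with arithmetic intensity below the ridge point are bounded by bandwidth and those above it by peak compute. Measured bandwidth and peak values, such as those a characterization engine produces, could be substituted for the specification-based numbers assumed here.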

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
DOE Contract Number:
AC02-05CH11231
OSTI ID:
1407288
Country of Publication:
United States
Language:
English

