U.S. Department of Energy
Office of Scientific and Technical Information

Roofline scaling trajectories: A method for parallel application and architectural performance analysis

Conference

The end of Dennard scaling signaled a shift in HPC supercomputer architectures from systems built from single-core processor architectures to systems built from multicore and eventually manycore architectures. This transition substantially complicated performance optimization and analysis as new programming models were created, new scaling methodologies deployed, and on-chip contention became a bottleneck to performance. Existing distributed memory performance models like LogP and LogGP were unable to capture this contention. The Roofline model was created to address this contention and its interplay with locality. However, to date, the Roofline model has focused on full-node concurrency. In this paper, we extend the Roofline model to capture the effects of concurrency on data locality and on-chip contention. We demonstrate the value of this new technique by evaluating the NAS parallel benchmarks on both multicore and manycore architectures under both strong- and weak-scaling regimes. In order to quantify the interplay between programming model and locality, we evaluate scaling under both the OpenMP and flat MPI programming models.
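
For readers unfamiliar with the Roofline model, the following is a minimal sketch of the standard Roofline bound (attainable performance is the lesser of peak compute and arithmetic intensity times peak memory bandwidth), together with one plausible way to turn per-concurrency measurements into points on a Roofline plot. It is an illustration only and is not taken from the paper: the function names, the sample tuple layout, and all machine and measurement numbers are hypothetical assumptions, and the paper's actual methodology may differ.

```python
def roofline_bound(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Standard Roofline bound in GFLOP/s.

    arithmetic_intensity: FLOPs per byte moved to/from memory.
    peak_gflops, peak_bandwidth_gbs: machine ceilings (hypothetical inputs).
    """
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def trajectory(samples):
    """Convert measurements into Roofline coordinates.

    samples: iterable of (concurrency, flops, bytes_moved, seconds).
    Returns a list of (concurrency, arithmetic_intensity, gflops), one
    point per concurrency level. This tuple layout is an assumption.
    """
    points = []
    for concurrency, flops, bytes_moved, seconds in samples:
        ai = flops / bytes_moved          # FLOPs per byte
        gflops = flops / seconds / 1e9    # achieved GFLOP/s
        points.append((concurrency, ai, gflops))
    return points


if __name__ == "__main__":
    # Hypothetical machine: 1000 GFLOP/s peak compute, 100 GB/s bandwidth.
    # Made-up measurements at 1, 2, and 4 threads.
    samples = [
        (1, 2e9, 1e9, 1.0),
        (2, 2e9, 1e9, 0.5),
        (4, 2e9, 2e9, 0.4),
    ]
    for concurrency, ai, gflops in trajectory(samples):
        bound = roofline_bound(ai, 1000.0, 100.0)
        print(concurrency, ai, gflops, bound)
```

In this sketch, a drop in arithmetic intensity as concurrency rises (the third made-up sample) would indicate additional data movement per FLOP, which is the kind of concurrency-dependent locality effect the abstract describes.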

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
DOE Contract Number:
AC02-05CH11231
OSTI ID:
1827650
Country of Publication:
United States
Language:
English

Similar Records

Roofline model toolkit: A practical tool for architectural and program analysis
Conference · Sat Apr 18 00:00:00 EDT 2015 · OSTI ID:1407288

A Locality-Based Threading Algorithm for the Configuration-Interaction Method
Journal Article · Mon Jul 03 00:00:00 EDT 2017 · IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum · OSTI ID:1393243

Thread-Level Parallelization and Optimization of NWChem for the Intel MIC Architecture
Technical Report · Fri Oct 10 00:00:00 EDT 2014 · OSTI ID:1163233
