Locality Aware Concurrent Start for Stencil Applications

Shrestha, Sunil; Gao, Guang R.; Manzano Franco, Joseph B.; Marquez, Andres; Feo, John T.

doi:10.1109/CGO.2015.7054196

Conference · Mon Feb 09 23:00:00 EST 2015

DOI:https://doi.org/10.1109/CGO.2015.7054196 · OSTI ID:1194299

Stencil computations are at the heart of many physical simulations used in scientific codes. Thus, there exists a plethora of optimization efforts for this family of computations. Among these, tiling techniques that allow concurrent start have proven very effective at improving the performance of these critical kernels. Nevertheless, with many-core designs being the norm, these optimization techniques might not be able to fully exploit locality (both spatial and temporal) on multiple levels of the memory hierarchy without compromising parallelism. It is no longer true that the machine can be seen as a homogeneous collection of nodes with caches, main memory and an interconnect network. New architectural designs exhibit complex groupings of nodes, cores, threads, caches and memory connected by an ever-evolving network-on-chip design. These new designs may benefit greatly from carefully crafted schedules and groupings that encourage parallel actors (i.e., threads, cores or nodes) to be aware of the computational history of other actors in close proximity. In this paper, we provide an efficient tiling technique that allows hierarchical concurrent start for memory-hierarchy-aware tile groups. Each execution schedule and tile shape exploits the available parallelism, load balance and locality present in the given applications. We demonstrate our technique on the Intel Xeon Phi architecture with selected and representative stencil kernels. We show improvements ranging from 5.58% to 31.17% over existing state-of-the-art techniques.

OSTI does not have a digital full text copy available. For more information, please see document availability, search WorldCat, or search Google Scholar.

Research Organization: Pacific Northwest National Laboratory (PNNL), Richland, WA (US)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-76RL01830

OSTI ID: 1194299

Report Number(s): PNNL-SA-108612; KJ0402000

Country of Publication: United States

Language: English

Similar Records

Implicit and explicit optimizations for stencil computations

Conference · Sun Oct 22 00:00:00 EDT 2006 · OSTI ID:1407050

Gregarious Data Re-structuring in a Many Core Architecture

Conference · Mon Aug 24 00:00:00 EDT 2015 · OSTI ID:1236328

Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors

Journal Article · Fri Jun 01 00:00:00 EDT 2007 · SIAM Review (SIREV) Journal · OSTI ID:961524

Related Subjects

Locality Aware execution
jagged tiling
polyhedral framework
