U.S. Department of Energy
Office of Scientific and Technical Information

Implicit and explicit optimizations for stencil computations

Conference
 [1];  [2];  [2];  [1];  [1];  [3]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  2. Univ. of California, Berkeley, CA (United States)
  3. Univ. of California, Berkeley, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Stencil-based kernels constitute the core of many scientific applications on block-structured grids. Unfortunately, these codes achieve a low fraction of peak performance, primarily because of the disparity between processor and main memory speeds. We examine several optimizations on the conventional cache-based memory systems of the Itanium 2, Opteron, and Power5, and on the heterogeneous multicore design of the Cell processor. The optimizations target cache reuse across stencil sweeps and include both an implicit cache-oblivious approach and a cache-aware algorithm blocked to match the cache structure. Finally, we consider stencil computations on a machine with an explicitly managed memory hierarchy, the Cell processor. Overall, results show that the cache-aware approach is significantly faster than the cache-oblivious approach and that the explicitly managed memory on Cell is more efficient: relative to the Power5, Cell has almost 2x more memory bandwidth and is 3.7x faster.
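To make "cache reuse across stencil sweeps" concrete, the sketch below contrasts a naive sequence of full sweeps with a blocked traversal that advances one block of the grid through several time steps before moving on. It uses time skewing, a commonly described form of blocking across sweeps, on a 1D three-point averaging stencil. This record does not specify the authors' algorithms, so the technique choice, the function names `naive_sweeps` and `time_skewed_sweeps`, and the `block` parameter are all assumptions made for illustration. Pure Python lists will not show a speedup; the point is only the iteration order and the fact that it reproduces the naive result.

```python
def naive_sweeps(u0, steps):
    """Reference: `steps` full Jacobi sweeps of a 3-point averaging stencil.

    The two end points are treated as fixed boundary values.
    """
    n = len(u0)
    cur, nxt = list(u0), list(u0)
    for _ in range(steps):
        for i in range(1, n - 1):
            nxt[i] = (cur[i - 1] + cur[i] + cur[i + 1]) / 3.0
        cur, nxt = nxt, cur
    return cur


def time_skewed_sweeps(u0, steps, block):
    """Hypothetical cache-aware variant: time skewing with two buffers.

    Each block of `block` points is carried through all `steps` time
    levels before the next block starts. The block window shifts one
    point to the left per time step so that every value it reads has
    already been produced, either by this block or by an earlier one.
    """
    n = len(u0)
    # Both buffers start as copies so the fixed boundaries exist in each.
    bufs = [list(u0), list(u0)]
    # Block starts run past the right edge so the skewed windows still
    # cover the whole interior at the last time step.
    for start in range(1, n - 1 + steps, block):
        for t in range(steps):
            src, dst = bufs[t % 2], bufs[(t + 1) % 2]
            lo = max(1, start - t)               # clamp to the interior
            hi = min(n - 1, start + block - t)
            for i in range(lo, hi):
                dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return bufs[steps % 2]


if __name__ == "__main__":
    grid = [float((7 * i) % 11) for i in range(37)]
    assert time_skewed_sweeps(grid, 5, 8) == naive_sweeps(grid, 5)
    print("blocked and naive traversals agree")
```

In a compiled implementation, `block` would be tuned so that a block's working set fits in a given cache level, which is the sense in which such an algorithm is cache-aware. A cache-oblivious formulation would instead subdivide the space-time domain recursively without reference to any cache size.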

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
DOE Contract Number:
AC02-05CH11231
OSTI ID:
1407050
Country of Publication:
United States
Language:
English

Similar Records

Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors
Journal Article · June 1, 2007 · SIAM Review (SIREV) Journal · OSTI ID:961524

Parallelizable adjoint stencil computations using transposed forward-mode algorithmic differentiation
Journal Article · May 18, 2018 · Optimization Methods and Software · OSTI ID:1477730

Snowflake: A Lightweight Portable Stencil DSL
Journal Article · May 1, 2017 · Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017 · OSTI ID:1379895
