Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors

Datta, Kaushik; Kamil, Shoaib; Williams, Samuel; Oliker, Leonid; Shalf, John; Yelick, Katherine

Journal Article · Fri Jun 01 04:00:00 EDT 2007 · SIAM Review (SIREV) Journal

OSTI ID:961524

Stencil-based kernels constitute the core of many important scientific applications on block-structured grids. Unfortunately, these codes achieve a low fraction of peak performance, due primarily to the disparity between processor and main memory speeds. In this paper, we explore the impact of trends in memory subsystems on a variety of stencil optimization techniques and develop performance models to analytically guide our optimizations. Our work targets cache reuse methodologies across single and multiple stencil sweeps, examining cache-aware algorithms as well as cache-oblivious techniques on the Intel Itanium2, AMD Opteron, and IBM Power5. Additionally, we consider stencil computations on the heterogeneous multicore design of the Cell processor, a machine with an explicitly managed memory hierarchy. Overall, our work represents one of the most extensive analyses of stencil optimizations and performance modeling to date. Results demonstrate that recent trends in memory system organization have reduced the efficacy of traditional cache-blocking optimizations. We also show that a cache-aware implementation is significantly faster than a cache-oblivious approach, while the explicitly managed memory on Cell enables the highest overall efficiency: Cell attains 88% of algorithmic peak, while the best competing cache-based processor achieves only 54% of algorithmic peak.
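
For readers unfamiliar with the terms in the abstract, the following is a minimal illustrative sketch of a single out-of-place stencil sweep and a cache-blocked variant of it. It is not taken from the paper: the 7-point averaging stencil, the function names (`idx`, `update_point`, `sweep_naive`, `sweep_blocked`), and the choice to block the two faster-varying dimensions are all assumptions made for illustration. Pure Python will not show any cache benefit; the sketch only shows how blocking reorders the loop nest without changing the result.

```python
def idx(i, j, k, n):
    """Flatten (i, j, k) into a 1D index for an n*n*n grid, k varying fastest."""
    return (i * n + j) * n + k


def update_point(src, dst, i, j, k, n):
    """Hypothetical 7-point stencil: average a point with its six neighbors."""
    dst[idx(i, j, k, n)] = (
        src[idx(i, j, k, n)]
        + src[idx(i - 1, j, k, n)] + src[idx(i + 1, j, k, n)]
        + src[idx(i, j - 1, k, n)] + src[idx(i, j + 1, k, n)]
        + src[idx(i, j, k - 1, n)] + src[idx(i, j, k + 1, n)]
    ) / 7.0


def sweep_naive(src, n):
    """One sweep over all interior points in plain i, j, k order."""
    dst = list(src)  # boundary values are carried over unchanged
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                update_point(src, dst, i, j, k, n)
    return dst


def sweep_blocked(src, n, bj, bk):
    """Same sweep, visiting points in bj-by-bk tiles of the (j, k) plane.

    Each tile is streamed through the full i range so that, on real
    hardware, the neighboring planes of a tile can stay resident in cache.
    """
    dst = list(src)
    for jj in range(1, n - 1, bj):
        for kk in range(1, n - 1, bk):
            for i in range(1, n - 1):
                for j in range(jj, min(jj + bj, n - 1)):
                    for k in range(kk, min(kk + bk, n - 1)):
                        update_point(src, dst, i, j, k, n)
    return dst
```

Because the sweep reads only from `src` and writes only to `dst`, the blocked and naive orderings compute identical values; the block sizes `bj` and `bk` are tuning parameters that, in a compiled implementation, would be chosen to fit the target cache.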

Research Organization: Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)

Sponsoring Organization: Computational Research Division

DOE Contract Number: AC02-05CH11231

OSTI ID: 961524

Report Number(s): LBNL-63192

Journal Information: Journal Name: SIAM Review (SIREV) Journal; Journal Issue: 10; Vol. 51

Country of Publication: United States

Language: English

Similar Records

Implicit and explicit optimizations for stencil computations

Conference · Sun Oct 22 00:00:00 EDT 2006 · OSTI ID:1407050

Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures.

Conference · Thu Nov 20 23:00:00 EST 2008 · OSTI ID:1407060

Stencil Computation Optimization and Auto-tuning on State-of-the-Art Multicore Architectures

Conference · Fri Aug 22 00:00:00 EDT 2008 · OSTI ID:964371

Related Subjects

99 GENERAL AND MISCELLANEOUS
ALGORITHMS
COMPUTER ARCHITECTURE
COMPUTER CALCULATIONS
COMPUTERS
DESIGN
EFFICIENCY
GRIDS
IMPLEMENTATION
KERNELS
MEMORY MANAGEMENT
MICROPROCESSORS
OPTIMIZATION
PERFORMANCE
SIMULATION
