Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures.

Datta, K.; Murphy, M.; Volkov, V.; Williams, S.; Carter, J.; Oliker, L.; Patterson, D. A.; Shalf, J.; Yelick, K. A.

doi:10.1109/SC.2008.5222004

Conference · Fri Nov 21 04:00:00 EST 2008

DOI: https://doi.org/10.1109/SC.2008.5222004 · OSTI ID: 1407060

Datta, K. [1]; Murphy, M. [2]; Volkov, V. [2]; Williams, S. [1]; Carter, J. [3]; Oliker, L. [1]; Patterson, D. A. [1]; Shalf, J. [3]; Yelick, K. A. [1]

[1] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States)
[2] Univ. of California, Berkeley, CA (United States)
[3] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Understanding the most efficient design and utilization of emerging multicore systems is one of the most challenging questions faced by the mainstream and scientific computing industries in several decades. Our work explores multicore stencil (nearest-neighbor) computations — a class of algorithms at the heart of many structured grid codes, including PDE solvers. We develop a number of effective optimization strategies, and build an auto-tuning environment that searches over our optimizations and their parameters to minimize runtime, while maximizing performance portability. To evaluate the effectiveness of these strategies we explore the broadest set of multicore architectures in the current HPC literature, including the Intel Clovertown, AMD Barcelona, Sun Victoria Falls, IBM QS22 PowerXCell 8i, and NVIDIA GTX280. Overall, our auto-tuning optimization methodology results in the fastest multicore stencil performance to date. Finally, we present several key insights into the architectural tradeoffs of emerging multicore designs and their implications on scientific algorithm development.
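As a rough illustration of the two ideas named in the abstract, a stencil (nearest-neighbor) sweep and an auto-tuner that searches over an optimization parameter, the sketch below applies a 7-point stencil to a small 3D grid and times several loop-blocking shapes. It is not the authors' code: the function names, the weights `alpha` and `beta`, the candidate block shapes, and the exhaustive search are all assumptions made for illustration, and pure Python says nothing about the performance reported in the paper.

```python
import itertools
import time


def make_grid(n):
    """Return an n*n*n grid, flattened in row-major order, with fixed values."""
    return [float((idx * 7 + 3) % 11) for idx in range(n * n * n)]


def stencil_sweep(src, dst, n, block):
    """Apply one out-of-place 7-point stencil sweep to the interior of src.

    block = (bj, bk) tiles the j and k loops (a simple form of loop blocking).
    The result does not depend on block; only the traversal order changes.
    """
    bj, bk = block
    alpha, beta = 0.4, 0.1  # illustrative weights, not taken from the paper
    plane, row = n * n, n
    for jj in range(1, n - 1, bj):
        j_end = min(jj + bj, n - 1)
        for kk in range(1, n - 1, bk):
            k_end = min(kk + bk, n - 1)
            for i in range(1, n - 1):
                for j in range(jj, j_end):
                    for k in range(kk, k_end):
                        c = i * plane + j * row + k
                        # Center point plus its six nearest neighbors.
                        dst[c] = alpha * src[c] + beta * (
                            src[c - 1] + src[c + 1]
                            + src[c - row] + src[c + row]
                            + src[c - plane] + src[c + plane]
                        )


def autotune(n, candidates, repeats=3):
    """Time every candidate block shape and return (best_block, best_seconds)."""
    src = make_grid(n)
    dst = [0.0] * len(src)
    best_block, best_time = None, float("inf")
    for block in candidates:
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            stencil_sweep(src, dst, n, block)
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_block, best_time = block, elapsed
    return best_block, best_time


if __name__ == "__main__":
    n = 16
    candidates = list(itertools.product([2, 4, 8, 14], repeat=2))
    block, seconds = autotune(n, candidates)
    print(f"fastest block shape on this machine: {block} ({seconds:.6f} s)")
```

The winning block shape will differ from machine to machine and run to run, which is the point of searching empirically rather than fixing one shape in the source.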

Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)

DOE Contract Number: AC02-05CH11231

OSTI ID: 1407060

Country of Publication: United States

Language: English

Similar Records

Stencil Computation Optimization and Auto-tuning on State-of-the-Art Multicore Architectures

Conference · Fri Aug 22 00:00:00 EDT 2008 · OSTI ID:964371

PERI - Auto-tuning Memory Intensive Kernels for Multicore

Conference · Tue Jun 24 00:00:00 EDT 2008 · OSTI ID:936521

Optimization of a Lattice Boltzmann Computation on State-of-the-Art Multicore Platforms

Journal Article · Fri Apr 10 00:00:00 EDT 2009 · Journal of Parallel and Distributed Computing · OSTI ID:963653

Related Subjects

97 MATHEMATICS AND COMPUTING
