A Generalized Framework for Auto-tuning Stencil Computations
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
This work introduces a generalized framework for automatically tuning stencil computations to achieve superior performance on a broad range of multicore architectures. Stencil (nearest-neighbor) kernels constitute the core of many important scientific applications involving block-structured grids. Auto-tuning systems search over optimization strategies to find the combination of tunable parameters that maximizes computational efficiency for a given algorithmic kernel. Although auto-tuning has been applied successfully to libraries, generalized stencil kernels are not amenable to packaging as libraries. The kernels studied in this work include memory-bound kernels as well as a computation-bound bilateral filtering kernel. The framework takes a straightforward Fortran expression of a stencil kernel and automatically generates tuned implementations of the kernel in C or Fortran to achieve performance portability across diverse computer architectures.
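The search described in the abstract can be illustrated with a minimal sketch. This is not the framework's implementation: the framework takes Fortran input and generates C or Fortran, whereas this example uses Python for brevity. It tunes a single parameter (loop-blocking sizes) by exhaustive timing, and all function names (`laplacian_naive`, `laplacian_blocked`, `autotune`) are hypothetical.

```python
import time


def laplacian_naive(grid):
    """Apply a 5-point nearest-neighbor stencil to the interior of a 2D grid."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]
                         - 4.0 * grid[i][j])
    return out


def laplacian_blocked(grid, bi, bj):
    """Same stencil, traversed in bi-by-bj blocks (the tunable parameters)."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, bi):
        for jj in range(1, m - 1, bj):
            for i in range(ii, min(ii + bi, n - 1)):
                for j in range(jj, min(jj + bj, m - 1)):
                    out[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                                 + grid[i][j - 1] + grid[i][j + 1]
                                 - 4.0 * grid[i][j])
    return out


def autotune(grid, candidates, repeats=3):
    """Time every candidate block size and return the fastest one.

    Exhaustive search is an assumption made for this sketch; the source
    does not state which search strategy the framework uses.
    """
    best_params, best_time = None, float("inf")
    for bi, bj in candidates:
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            laplacian_blocked(grid, bi, bj)
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_params, best_time = (bi, bj), elapsed
    return best_params, best_time


def make_grid(n, m):
    """Build a small deterministic test grid."""
    return [[float((i * m + j) % 7) for j in range(m)] for i in range(n)]
```

Calling `autotune(make_grid(64, 64), [(8, 8), (16, 16), (32, 32)])` returns the fastest block size on the current machine together with its best observed time. The winner can differ between machines, which is the motivation for tuning per architecture rather than fixing parameters in advance.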
- Research Organization:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- DOE Contract Number:
- AC02-05CH11231
- OSTI ID:
- 1407077
- Resource Relation:
- Conference: Cray User Group Conference (CUG 2009), Atlanta, GA (United States), 4-7 May 2009
- Country of Publication:
- United States
- Language:
- English