Auto-Tuning Memory-Intensive Kernels for Multicore
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Univ. of California, Berkeley, CA (United States)
In this chapter, we discuss the optimization of three memory-intensive computational kernels: sparse matrix-vector multiplication, the Laplacian differential operator applied to structured grids, and the collision() operator in the lattice Boltzmann magnetohydrodynamics (LBMHD) application. All three are implemented using a single-process, POSIX-threaded, SPMD model. Unlike their computationally intense dense linear algebra cousins, their performance is ultimately limited by DRAM bandwidth and the volume of data that must be transferred. To provide performance portability across current and future multicore architectures, we utilize automatic performance tuning, or auto-tuning.
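As a rough illustration of the ideas in the abstract, the sketch below shows a sparse matrix-vector multiply in compressed sparse row (CSR) form, a back-of-the-envelope bound on its arithmetic intensity, and a tiny empirical search that times candidate variants and keeps the fastest. It is not the chapter's implementation: the chapter describes POSIX-threaded code, whereas this is single-threaded Python, and the function names (`spmv_csr`, `spmv_csr_unrolled2`, `csr_flops_per_byte`, `auto_tune`), the CSR storage choice, the 8-byte value and 4-byte index sizes, and the unroll-by-two variant are all assumptions made for the example.

```python
import time


def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y


def spmv_csr_unrolled2(row_ptr, col_idx, vals, x):
    """Same product with the inner loop unrolled by two (hypothetical variant)."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        k, end = row_ptr[i], row_ptr[i + 1]
        acc0 = acc1 = 0.0
        while k + 1 < end:
            acc0 += vals[k] * x[col_idx[k]]
            acc1 += vals[k + 1] * x[col_idx[k + 1]]
            k += 2
        if k < end:  # leftover nonzero when the row length is odd
            acc0 += vals[k] * x[col_idx[k]]
        y[i] = acc0 + acc1
    return y


def csr_flops_per_byte(value_bytes=8, index_bytes=4):
    """Upper bound on flops per byte of matrix data: one multiply and one add
    per nonzero, ignoring traffic for the vectors and row pointers."""
    return 2.0 / (value_bytes + index_bytes)


def auto_tune(variants, args, trials=3):
    """Time every candidate on the given inputs and return the fastest one."""
    best_name, best_fn, best_time = None, None, float("inf")
    for name, fn in variants.items():
        for _ in range(trials):
            start = time.perf_counter()
            fn(*args)
            elapsed = time.perf_counter() - start
            if elapsed < best_time:
                best_name, best_fn, best_time = name, fn, elapsed
    return best_name, best_fn


# Example input: the 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
ROW_PTR = [0, 2, 3, 5]
COL_IDX = [0, 2, 1, 0, 2]
VALS = [4.0, 1.0, 3.0, 2.0, 5.0]
X = [1.0, 2.0, 3.0]

VARIANTS = {"baseline": spmv_csr, "unrolled2": spmv_csr_unrolled2}
chosen_name, chosen_fn = auto_tune(VARIANTS, (ROW_PTR, COL_IDX, VALS, X))
```

With the assumed 8-byte values and 4-byte indices, the bound works out to about 0.17 flops per byte, which is why such a kernel tends to be limited by memory bandwidth rather than arithmetic. Which variant `auto_tune` selects depends on the machine and the input, which is the point of searching empirically instead of fixing one implementation.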
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number:
- AC02-05CH11231
- OSTI ID:
- 1407089
- Country of Publication:
- United States
- Language:
- English
Similar Records
PERI - Auto-tuning Memory Intensive Kernels for Multicore
Conference · Tue Jun 24 00:00:00 EDT 2008 · OSTI ID:936521
Resource-Efficient, Hierarchical Auto-Tuning of a Hybrid Lattice Boltzmann Computation on the Cray XT4
Conference · Mon May 04 00:00:00 EDT 2009 · OSTI ID:962937
Lattice Boltzmann Simulation Optimization on Leading Multicore Platforms
Conference · Thu Jan 31 23:00:00 EST 2008 · OSTI ID:964372