PERI - Auto-tuning Memory Intensive Kernels for Multicore
Conference · OSTI ID: 936521
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizations, popular in linear algebra and FFT libraries, to application-specific computational kernels. Our work applies this strategy to Sparse Matrix-Vector Multiplication (SpMV), the explicit heat equation PDE on a regular grid (Stencil), and a lattice Boltzmann application (LBMHD). We explore one of the broadest sets of multicore architectures in the HPC literature, including the Intel Xeon Clovertown, AMD Opteron Barcelona, Sun Victoria Falls, and the Sony-Toshiba-IBM (STI) Cell. Rather than hand-tuning each kernel for each system, we develop a code generator for each kernel that allows us to identify a highly optimized version for each platform, while amortizing the human programming effort. Results show that our auto-tuned kernel applications often achieve better than a 4x improvement compared with the original code. Additionally, we analyze a Roofline performance model for each platform to reveal hardware bottlenecks and software challenges for future multicore systems and applications.
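The generate-and-search idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code generator: the function names (`generate_spmv`, `autotune`, `roofline_bound`, `tridiagonal_csr`) are hypothetical, the only tuning parameter is an inner-loop unroll factor chosen purely for illustration, and the test matrix is a small tridiagonal system. The `roofline_bound` helper assumes the commonly stated form of the Roofline model, in which attainable performance is the smaller of peak compute and peak memory bandwidth times arithmetic intensity.

```python
import time


def tridiagonal_csr(n):
    """Build a small tridiagonal test matrix in CSR form (illustrative input only)."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data


def csr_spmv_reference(indptr, indices, data, x):
    """Untuned CSR sparse matrix-vector product y = A @ x."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def generate_spmv(unroll):
    """Emit and compile an SpMV variant with the inner loop unrolled `unroll` times."""
    body = "\n".join(
        f"            acc += data[k + {u}] * x[indices[k + {u}]]"
        for u in range(unroll)
    )
    source = f"""
def spmv(indptr, indices, data, x):
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        k, end = indptr[row], indptr[row + 1]
        while k + {unroll} <= end:      # unrolled main loop
{body}
            k += {unroll}
        while k < end:                  # remainder loop
            acc += data[k] * x[indices[k]]
            k += 1
        y[row] = acc
    return y
"""
    namespace = {}
    exec(source, namespace)  # compile the generated variant
    return namespace["spmv"]


def autotune(candidates, args, reference, trials=3):
    """Time each generated variant, skip incorrect ones, return (best_param, best_time)."""
    best_param, best_time = None, float("inf")
    for param in candidates:
        kernel = generate_spmv(param)
        if kernel(*args) != reference:   # reject variants that change the result
            continue
        for _ in range(trials):
            start = time.perf_counter()
            kernel(*args)
            elapsed = time.perf_counter() - start
            if elapsed < best_time:
                best_param, best_time = param, elapsed
    return best_param, best_time


def roofline_bound(peak_gflops, peak_gbytes_per_s, flops_per_byte):
    """Roofline upper bound: min(peak compute, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, peak_gbytes_per_s * flops_per_byte)


if __name__ == "__main__":
    indptr, indices, data = tridiagonal_csr(2000)
    x = [1.0] * 2000
    ref = csr_spmv_reference(indptr, indices, data, x)
    print("fastest unroll factor:", autotune([1, 2, 3, 4], (indptr, indices, data, x), ref))
```

In an interpreted language the timing differences between variants are small and noisy, so the sketch shows only the structure of the search: generate a variant, check it against a reference, time it, and keep the fastest. The winning parameter will differ from machine to machine, which is the reason for tuning per platform.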
- Research Organization: Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
- Sponsoring Organization: Computational Research Division; National Energy Research Scientific Computing Division
- DOE Contract Number: AC02-05CH11231
- OSTI ID: 936521
- Report Number(s): LBNL-845E
- Country of Publication: United States
- Language: English