Performance Optimization and Auto-Tuning
In the broader computational research community, one subject of recent research is the problem of adapting algorithms to make effective use of multi- and many-core processors. Effective use of these architectures, which have complex memory hierarchies with many layers of cache, typically requires a careful examination of how an algorithm moves data through the memory hierarchy. Unfortunately, the relationship between algorithmic parameters, such as blocking strategies, and their impact on memory utilization is often non-obvious, as is the resulting effect on runtime performance. Auto-tuning is an empirical method used to discover optimal values for tunable algorithmic parameters under such circumstances. The challenge is compounded by the fact that the settings that produce the best performance for a given problem on a given platform may not be the best for a different problem on the same platform, or for the same problem on a different platform. The high performance visualization research community has begun to explore and adapt the principles of auto-tuning for the purpose of optimizing codes on modern multi- and many-core processors. This report focuses on how performance optimization studies reveal dramatic variation in performance for two fundamental visualization algorithms: one based on a stencil operation with structured, uniform memory access, and the other, ray casting volume rendering, with unstructured memory access patterns. The two case studies highlighted in this report show that the extra effort required to optimize such codes by adjusting the tunable algorithmic parameters can return substantial gains in performance. These case studies also explore the potential impact of, and the interaction between, algorithmic optimizations and tunable algorithmic parameters, along with the potential performance gains from leveraging architecture-specific features.
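As an illustration of the empirical search that auto-tuning performs, the following is a minimal sketch and not the method used in the report. It assumes a 5-point averaging stencil on a small 2D grid and a single tunable parameter, the tile (block) size. The names `stencil_blocked`, `autotune`, and `make_grid` are hypothetical and introduced only for this example. In pure Python the timings are dominated by interpreter overhead rather than cache behavior, so the sketch shows the structure of the search, not realistic performance differences.

```python
import time


def make_grid(n):
    """Build an n x n grid of deterministic float values (illustrative data)."""
    return [[float((i * 31 + j * 17) % 7) for j in range(n)] for i in range(n)]


def stencil_blocked(grid, block):
    """Apply a 5-point average to interior cells, visiting them in block x block tiles.

    The block size changes only the traversal order, not the result.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary cells are copied unchanged
    for ii in range(1, n - 1, block):
        for jj in range(1, m - 1, block):
            for i in range(ii, min(ii + block, n - 1)):
                for j in range(jj, min(jj + block, m - 1)):
                    out[i][j] = (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                                 + grid[i][j - 1] + grid[i][j + 1]) / 5.0
    return out


def autotune(kernel, grid, candidates, repeats=3):
    """Time kernel(grid, c) for every candidate value c.

    Returns (best_candidate, timings), where timings maps each candidate
    to its fastest observed run in seconds.
    """
    timings = {}
    for c in candidates:
        fastest = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(grid, c)
            fastest = min(fastest, time.perf_counter() - start)
        timings[c] = fastest
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    grid = make_grid(64)
    best, timings = autotune(stencil_blocked, grid, [4, 8, 16, 32, 64])
    for block, seconds in sorted(timings.items()):
        print(f"block={block:3d}  {seconds:.6f} s")
    print("best block size on this machine and problem:", best)
```

Because the best setting depends on both the problem and the platform, a search like this would need to be rerun whenever either changes; the winning block size reported by one run should not be assumed to carry over.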
- Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Organization: USDOE Office of Science (SC)
- DOE Contract Number: DE-AC02-05CH11231
- OSTI ID: 1165083
- Report Number(s): LBNL-6466E
- Country of Publication: United States
- Language: English