Optimizing transformations of stencil operations for parallel object-oriented scientific frameworks on cache-based architectures
High-performance scientific computing relies increasingly on high-level large-scale object-oriented software frameworks to manage both algorithmic complexity and the complexities of parallelism: distributed data management, process management, inter-process communication, and load balancing. This encapsulation of data management, together with the prescribed semantics of a typical fundamental component of such object-oriented frameworks--a parallel or serial array-class library--provides an opportunity for increasingly sophisticated compile-time optimization techniques. This paper describes two optimizing transformations suitable for certain classes of numerical algorithms, one for reducing the cost of inter-processor communication, and one for improving cache utilization; demonstrates and analyzes the resulting performance gains; and indicates how these transformations are being automated.
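As a rough illustration of why prescribed array-class semantics leave room for compile-time optimization, the sketch below contrasts two equivalent ways of evaluating a three-point stencil. It is not taken from the paper: the function names, the weights, and the 1-D setting are assumptions chosen for brevity, and plain Python lists stand in for an array-class library. The first form mimics whole-array statements, where every shifted operand and partial sum is a temporary array; the second is the single fused loop that an optimizing transformation could in principle generate, touching each element once.

```python
# Hypothetical sketch; names and weights are illustrative, not from the paper.
# Assumes len(a) >= 2 and leaves the two boundary values unchanged.

def shifted(a, k):
    """Copy of the interior of `a`, offset by k positions."""
    n = len(a)
    return a[1 + k : n - 1 + k]

def stencil_array_style(a, w=(0.25, 0.5, 0.25)):
    """Whole-array form: each operand and partial sum is a temporary list."""
    left = [w[0] * x for x in shifted(a, -1)]
    mid = [w[1] * x for x in shifted(a, 0)]
    right = [w[2] * x for x in shifted(a, 1)]
    partial = [l + m for l, m in zip(left, mid)]
    interior = [p + r for p, r in zip(partial, right)]
    return [a[0]] + interior + [a[-1]]

def stencil_fused(a, w=(0.25, 0.5, 0.25)):
    """Fused form: one pass over the interior, no intermediate arrays."""
    out = list(a)
    for i in range(1, len(a) - 1):
        out[i] = w[0] * a[i - 1] + w[1] * a[i] + w[2] * a[i + 1]
    return out
```

Both functions compute the same values; the difference is only in how many times memory is traversed, which is the kind of cost a cache-oriented transformation targets.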
- Research Organization:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE Assistant Secretary for Human Resources and Administration, Washington, DC (United States)
- DOE Contract Number:
- W-7405-ENG-36
- OSTI ID:
- 304130
- Report Number(s):
- LA-UR-98-2404; CONF-981207-; ON: DE99001267; TRN: AHC29904%%206
- Resource Relation:
- Conference: ISCOPE '98: International Symposium on Computing in Object-Oriented Parallel Environments, Santa Fe, NM (United States), 8-11 Dec 1998; Other Information: PBD: [1998]
- Country of Publication:
- United States
- Language:
- English
Similar Records
Optimizing transformations of stencil operations for parallel cache-based architectures
Improving scalability with loop transformations and message aggregation in parallel object-oriented frameworks for scientific computing