Optimizing transformations of stencil operations for parallel cache-based architectures
Abstract
This paper describes a new technique for optimizing serial and parallel stencil and stencil-like operations for cache-based architectures. The technique takes advantage of the semantic knowledge implicit in stencil-like computations. It is implemented as a source-to-source program transformation; because of its specificity, it could not be expected of a conventional compiler. Empirical results demonstrate a uniform factor-of-two speedup. The experiments clearly show the benefits of this technique to be a consequence, as intended, of the reduction in cache misses. The test codes are based on a 5-point stencil obtained by discretizing the Poisson equation, applied to a two-dimensional uniform grid with the Jacobi method as the iterative solver. Results are presented for 1-D tiling on a single processor, and in parallel using a 1-D data partition. For the parallel case, both blocking and non-blocking communication are tested. The same set of experiments has been performed for the 2-D tiling case. However, 2-D partitioning is not discussed here, so the parallel 2-D case is 2-D tiling with 1-D data partitioning.
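For readers unfamiliar with the test problem, the following is a minimal sketch of one Jacobi sweep of the 5-point stencil, first untiled and then with a simple 1-D tiling of the column loop. It is an illustration only and not the authors' source-to-source transformation, whose details the abstract does not give. The function names, the `tile` parameter, and the sign convention (the discrete form of ∇²u = f with grid spacing `h`) are assumptions made for this example. Pure Python shows the loop structure but will not show the cache effect; that requires a compiled language and grids larger than the cache.

```python
def _update(u, f, h, i, j):
    # 5-point stencil: average of the four neighbours, corrected by the source term.
    # Assumed convention: discretization of laplacian(u) = f on a uniform grid.
    return 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                   - h * h * f[i][j])


def jacobi_sweep(u, f, h):
    """Reference (untiled) Jacobi sweep over the interior points."""
    n, m = len(u), len(u[0])
    new = [row[:] for row in u]  # boundary values are carried over unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = _update(u, f, h, i, j)
    return new


def jacobi_sweep_tiled(u, f, h, tile):
    """Same sweep with the column loop split into strips of width `tile`.

    Each strip is swept over all rows before moving to the next strip, so
    the rows touched inside a strip are short enough to stay cache-resident
    (hypothetical tile size; it would be tuned per machine).
    """
    n, m = len(u), len(u[0])
    new = [row[:] for row in u]
    for j0 in range(1, m - 1, tile):          # loop over strips (tiles)
        j1 = min(j0 + tile, m - 1)            # clip the last strip
        for i in range(1, n - 1):
            for j in range(j0, j1):
                new[i][j] = _update(u, f, h, i, j)
    return new
```

Because Jacobi reads only the old array and writes only the new one, reordering the point updates in this way does not change the result; only the memory access order differs.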
- Authors:
- Bassetti, F; Davis, K
- Publication Date:
- 1999-06-28
- Research Org.:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Org.:
- US Department of Energy (US)
- OSTI Identifier:
- 757004
- Report Number(s):
- LA-UR-99-1119
TRN: AH200021%%317
- DOE Contract Number:
- W-7405-ENG-36
- Resource Type:
- Conference
- Resource Relation:
- Conference: 1999 International Conference on Parallel and Distributed Processing Techniques and Applications, Monte Carlo Resort, Las Vegas, NV (US), 06/28/1999--07/01/1999; Other Information: PBD: 28 Jun 1999
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; PARALLEL PROCESSING; TRANSFORMATIONS; COMPUTER ARCHITECTURE; ITERATIVE METHODS; TASK SCHEDULING; OPTIMIZATION; PARALLEL STENCIL OPERATIONS; OPTIMIZING TRANSFORMATION; PARALLEL CACHE-BASED ARCHITECTURES
Citation Formats
Bassetti, F, and Davis, K. Optimizing transformations of stencil operations for parallel cache-based architectures. United States: N. p., 1999. Web.
Bassetti, F, & Davis, K. Optimizing transformations of stencil operations for parallel cache-based architectures. United States.
Bassetti, F, and Davis, K. 1999. "Optimizing transformations of stencil operations for parallel cache-based architectures". United States. https://www.osti.gov/servlets/purl/757004.
@article{osti_757004,
title = {Optimizing transformations of stencil operations for parallel cache-based architectures},
author = {Bassetti, F and Davis, K},
abstractNote = {This paper describes a new technique for optimizing serial and parallel stencil and stencil-like operations for cache-based architectures. The technique takes advantage of the semantic knowledge implicit in stencil-like computations. It is implemented as a source-to-source program transformation; because of its specificity, it could not be expected of a conventional compiler. Empirical results demonstrate a uniform factor-of-two speedup. The experiments clearly show the benefits of this technique to be a consequence, as intended, of the reduction in cache misses. The test codes are based on a 5-point stencil obtained by discretizing the Poisson equation, applied to a two-dimensional uniform grid with the Jacobi method as the iterative solver. Results are presented for 1-D tiling on a single processor, and in parallel using a 1-D data partition. For the parallel case, both blocking and non-blocking communication are tested. The same set of experiments has been performed for the 2-D tiling case. However, 2-D partitioning is not discussed here, so the parallel 2-D case is 2-D tiling with 1-D data partitioning.},
doi = {},
url = {https://www.osti.gov/biblio/757004},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1999},
month = {6}
}