Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading
In this paper, we have developed a novel methodology that takes into consideration multithreaded many-core designs to better utilize memory/processing resources and improve memory residence on tileable applications. It takes advantage of polyhedral analysis and transformation in the form of PLUTO, combined with a highly optimized finegrain tile runtime to exploit parallelism at all levels. The main contributions of this paper include the introduction of multi-hierarchical tiling techniques that increases intra tile parallelism; and a data-flow inspired runtime library that allows the expression of parallel tiles with an efficient synchronization registry. Our current implementation shows performance improvements on an Intel Xeon Phi board up to 32.25% against instances produced by state-of-the-art compiler frameworks for selected stencil applications.
- Publication Date:
- OSTI Identifier:
- Report Number(s):
- DOE Contract Number:
- Resource Type:
- Resource Relation:
- Conference: Languages and Compilers for Parallel Computing: 27th International Workshop (LCPC 2014), September 15-17, 2014, Hillsboro, Oregon. Lecture Notes in Computer Science, 8967:161-175
- J Brodman and P Tu; Springer , New York, NY, United States(US).
- Research Org:
- Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
- Sponsoring Org:
- Country of Publication:
- United States
- high performance; compiler; software optimization; locality