Gregarious Data Re-structuring in a Many Core Architecture
In this paper, we have developed a new methodology that takes into consideration the access patterns of a single parallel actor (e.g. a thread) as well as the access patterns of “grouped” parallel actors that share a resource (e.g. a distributed Level 3 cache). We start with a hierarchical tile code for our target machine and apply a series of transformations at the tile level to improve data residence in a given level of the memory hierarchy. The contributions of this paper include (a) collaborative data restructuring for group reuse and (b) a low-overhead transformation technique that improves access patterns and brings closely connected data elements together. Preliminary results on a many-core architecture, the Tilera TileGX, show promising improvements over optimized OpenMP code (up to a 31% increase in GFLOPS) and over our own previous work on fine-grained runtimes (up to 16%) for selected kernels.
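As a rough illustration of what tile-level data restructuring can look like, the sketch below copies a row-major matrix into a tile-major layout so that the elements of each tile sit next to each other in memory. It is not the transformation described in the paper: the function names (`to_tile_major`, `tile_major_index`), the flat-list representation, and the requirement that tiles evenly divide the matrix are all assumptions made for this example.

```python
def to_tile_major(matrix, rows, cols, tile_rows, tile_cols):
    """Copy a row-major flat list into tile-major order.

    Hypothetical helper: after the copy, every tile_rows x tile_cols
    tile occupies one contiguous run of the output list.
    """
    if rows % tile_rows or cols % tile_cols:
        raise ValueError("tile shape must evenly divide the matrix shape")
    out = []
    for tr in range(0, rows, tile_rows):          # top row of each tile
        for tc in range(0, cols, tile_cols):      # left column of each tile
            for r in range(tr, tr + tile_rows):   # rows inside the tile
                start = r * cols + tc
                out.extend(matrix[start:start + tile_cols])
    return out


def tile_major_index(r, c, cols, tile_rows, tile_cols):
    """Map a (row, column) coordinate to its position in the tile-major list."""
    tiles_per_row = cols // tile_cols
    tile_id = (r // tile_rows) * tiles_per_row + (c // tile_cols)
    within = (r % tile_rows) * tile_cols + (c % tile_cols)
    return tile_id * tile_rows * tile_cols + within


# Usage: a 4 x 4 matrix holding 0..15, restructured into 2 x 2 tiles.
flat = list(range(16))
tiled = to_tile_major(flat, rows=4, cols=4, tile_rows=2, tile_cols=2)
```

In a layout like this, a group of threads that shares a cache and works on the same tile touches one contiguous block rather than several strided rows, which is the general kind of locality benefit the abstract refers to.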
- Research Organization:
- Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC05-76RL01830
- OSTI ID:
- 1236328
- Report Number(s):
- PNNL-SA-110971; KJ0402000
- Country of Publication:
- United States
- Language:
- English
Similar Records
Critical Path-Based Thread Placement for NUMA Systems
Conference · Tue Nov 01 00:00:00 EDT 2011 · OSTI ID:1035298
Optimizing Irregular Applications for Energy and Performance on the Tilera Many-core Architecture
Conference · Wed May 20 00:00:00 EDT 2015 · OSTI ID:1194293
Locality Aware Concurrent Start for Stencil Applications
Conference · Mon Feb 09 23:00:00 EST 2015 · OSTI ID:1194299