Exploring performance and energy tradeoffs for irregular applications: A case study on the Tilera many-core architecture

Panyala, Ajay; Chavarría-Miranda, Daniel; Manzano, Joseph B.; Tumeo, Antonino; Halappanavar, Mahantesh

doi:10.1016/j.jpdc.2016.06.006

Journal Article · June 1, 2017 · Journal of Parallel and Distributed Computing

DOI: https://doi.org/10.1016/j.jpdc.2016.06.006 · OSTI ID: 1347851

High performance, parallel applications with irregular data accesses are becoming a critical workload class for modern systems. In particular, the execution of such workloads on emerging many-core systems is expected to be a significant component of applications in data mining, machine learning, scientific computing and graph analytics. However, power and energy constraints limit the capabilities of individual cores, memory hierarchy and on-chip interconnect of such systems, thus leading to architectural and software trade-offs that must be understood in the context of the intended application’s behavior. Irregular applications are notoriously hard to optimize given their data-dependent access patterns, lack of structured locality and complex data structures and code patterns. We have ported two irregular applications, graph community detection using the Louvain method (Grappolo) and high-performance conjugate gradient (HPCCG), to the Tilera many-core system and have conducted a detailed study of platform-independent and platform-specific optimizations that improve their performance as well as reduce their overall energy consumption. To conduct this study, we employ an auto-tuning based approach that explores the optimization design space along three dimensions: memory layout schemes, GCC compiler flag choices and OpenMP loop scheduling options. We leverage MIT’s OpenTuner auto-tuning framework to explore and recommend energy optimal choices for different combinations of parameters. We then conduct an in-depth architectural characterization to understand the memory behavior of the selected workloads. Finally, we perform a correlation study to demonstrate the interplay between the hardware behavior and application characteristics. Using auto-tuning, we demonstrate whole-node energy savings and performance improvements of up to 49.6% and 60% relative to a baseline instantiation, and up to 31% and 45.4% relative to manually optimized variants.

Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-76RL01830

OSTI ID: 1347851

Report Number(s): PNNL-SA-118976; 400470000

Journal Information: Journal of Parallel and Distributed Computing, Vol. 104; ISSN 0743-7315

Publisher: Elsevier

Country of Publication: United States

Language: English

Similar Records

Optimizing Irregular Applications for Energy and Performance on the Tilera Many-core Architecture

Conference · May 20, 2015 · OSTI ID: 1347851

Chavarría-Miranda, Daniel; Panyala, Ajay R.; Halappanavar, Mahantesh; +2 more

Scaling Graph Community Detection on the Tilera Many-core Architecture

Conference · December 1, 2014 · OSTI ID: 1347851

Chavarría-Miranda, Daniel; Halappanavar, Mahantesh; Kalyanaraman, Anantharaman

Approximate Weighted Matching On Emerging Manycore and Multithreaded Architectures

Journal Article · November 30, 2012 · International Journal of High Performance Computing Applications, 26(4):413-430 · OSTI ID: 1347851

Halappanavar, Mahantesh; Feo, John T; Villa, Oreste; +2 more

Related Subjects

auto tuning
irregular applications
community detection
sparse conjugate gradient
energy optimization
