Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach

Denoyelle, Nicolas; Goglin, Brice; Jeannot, Emmanuel; Ropars, Thomas

doi:10.1145/3337821.3337893

Conference · Tue Jan 01 04:00:00 EST 2019

DOI: https://doi.org/10.1145/3337821.3337893 · OSTI ID: 1574309

NUMA architectures are now common in compute-intensive systems. Achieving high performance for multi-threaded applications requires both a careful placement of threads on computing units and a thorough allocation of data in memory. Finding such a placement is hard because performance depends on complex interactions across several layers of the memory hierarchy. In this paper we propose a black-box approach to decide whether an application's execution time can be affected by the placement of its threads and data and, if so, to choose the best placement strategy. We show that it is possible to reach near-optimal placement policy selection. Furthermore, the solutions work across several recent processor architectures, and decisions can be made from a single run of low-overhead profiling.

Research Organization: Argonne National Laboratory (ANL)

Sponsoring Organization: Institut National de Recherche en Informatique et en Automatique (INRIA), Bordeaux Sud-Ouest; Centre National de la Recherche Scientifique (CNRS)

DOE Contract Number: AC02-06CH11357

OSTI ID: 1574309

Country of Publication: United States

Language: English

Related Subjects

data
high-performance-computing
machine-learning
multicore-processors
numa
placement
threads
