Title: Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach

Conference

NUMA architectures are now common in compute-intensive systems. Achieving high performance for multi-threaded applications requires both careful placement of threads on computing units and careful allocation of data in memory. Finding such a placement is hard because performance depends on complex interactions across several layers of the memory hierarchy. In this paper we propose a black-box approach to decide whether an application's execution time can be affected by the placement of its threads and data and, if so, to choose the best placement strategy. We show that near-optimal placement policy selection is achievable. Furthermore, the solutions work across several recent processor architectures, and decisions can be made from a single run of low-overhead profiling.
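
The abstract does not say which statistical learning model the authors use, so the following is only a minimal sketch of the general idea: map the features gathered in one profiling run to a placement policy. It assumes a nearest-centroid classifier, which may differ from the paper's method. The feature names (remote-access ratio, bandwidth utilisation), the policy labels, and all numeric values are hypothetical and invented for illustration.

```python
from math import dist

# Hypothetical training data (invented values, not from the paper).
# Each entry pairs a profiling feature vector
# (remote-access ratio, memory-bandwidth utilisation)
# with the placement policy assumed to have run fastest.
TRAINING = [
    ((0.05, 0.10), "default"),
    ((0.10, 0.15), "default"),
    ((0.60, 0.80), "interleave"),
    ((0.70, 0.90), "interleave"),
    ((0.65, 0.30), "first-touch-packed"),
    ((0.55, 0.25), "first-touch-packed"),
]


def centroids(samples):
    """Average the feature vectors of each policy label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def choose_policy(profile, model):
    """Return the policy whose centroid is closest to the profile."""
    return min(model, key=lambda label: dist(profile, model[label]))


model = centroids(TRAINING)
# A single profiling run yields one feature vector per application.
print(choose_policy((0.08, 0.12), model))
print(choose_policy((0.68, 0.85), model))
```

In this toy setup the "default" label stands in for applications whose execution time is assumed insensitive to placement, so one classifier covers both decisions the abstract describes: whether placement matters, and which policy to pick when it does.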

Research Organization:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Organization:
Institut National de Recherche en Informatique et en Automatique (INRIA), Bordeaux Sud-Ouest; Centre National de la Recherche Scientifique (CNRS)
DOE Contract Number:
AC02-06CH11357
OSTI ID:
1574309
Resource Relation:
Conference: 48th International Conference on Parallel Processing, 08/05/19 - 08/08/19, Kyoto, JP
Country of Publication:
United States
Language:
English
