U.S. Department of Energy
Office of Scientific and Technical Information

Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach

Conference

NUMA architectures are now common in compute-intensive systems. Achieving high performance for multi-threaded applications requires both a careful placement of threads on computing units and a well-chosen allocation of data in memory. Finding such a placement is hard because performance depends on complex interactions across several layers of the memory hierarchy. In this paper, we propose a black-box approach to decide whether an application's execution time can be impacted by the placement of its threads and data and, if so, to choose the best placement strategy to adopt. We show that it is possible to reach near-optimal placement policy selection. Furthermore, the solutions work across several recent processor architectures, and decisions can be taken with a single run of low-overhead profiling.

Research Organization:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Organization:
Institut National de Recherche en Informatique et en Automatique (INRIA), Bordeaux Sud-Ouest; Centre National de la Recherche Scientifique (CNRS)
DOE Contract Number:
AC02-06CH11357
OSTI ID:
1574309
Resource Relation:
Conference: 48th International Conference on Parallel Processing, August 5-8, 2019, Kyoto, JP
Country of Publication:
United States
Language:
English
