Title: Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach

Abstract

Nowadays, NUMA architectures are common in compute-intensive systems. Achieving high performance for multi-threaded applications requires both a careful placement of threads on computing units and a careful allocation of data in memory. Finding such a placement is a hard problem, because performance depends on complex interactions across several layers of the memory hierarchy. In this paper we propose a black-box approach to decide whether an application's execution time can be impacted by the placement of its threads and data and, if so, to choose the best placement strategy to adopt. We show that it is possible to reach near-optimal placement policy selection. Furthermore, the solutions work across several recent processor architectures, and decisions can be made from a single run of low-overhead profiling.

Authors:
Denoyelle, Nicolas; Goglin, Brice; Jeannot, Emmanuel; Ropars, Thomas
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
Institut National de Recherche en Informatique et en Automatique (INRIA), Bordeaux Sud-Ouest; Centre National de la Recherche Scientifique (CNRS)
OSTI Identifier:
1574309
DOE Contract Number:  
AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: 48th International Conference on Parallel Processing, 08/05/19 - 08/08/19, Kyoto, JP
Country of Publication:
United States
Language:
English
Subject:
data; high-performance-computing; machine-learning; multicore-processors; numa; placement; threads

Citation Formats

Denoyelle, Nicolas, Goglin, Brice, Jeannot, Emmanuel, and Ropars, Thomas. Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach. United States: N. p., 2019. Web. doi:10.1145/3337821.3337893.
Denoyelle, Nicolas, Goglin, Brice, Jeannot, Emmanuel, & Ropars, Thomas. Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach. United States. doi:10.1145/3337821.3337893.
Denoyelle, Nicolas, Goglin, Brice, Jeannot, Emmanuel, and Ropars, Thomas. 2019. "Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach". United States. doi:10.1145/3337821.3337893.
@article{osti_1574309,
title = {Data and Thread Placement in NUMA Architectures: A Statistical Learning Approach},
author = {Denoyelle, Nicolas and Goglin, Brice and Jeannot, Emmanuel and Ropars, Thomas},
doi = {10.1145/3337821.3337893},
place = {United States},
year = {2019},
month = {1}
}

