Empirical Memory-Access Cost Models in Multicore NUMA Architectures
Conference · OSTI ID:1011076
- Los Alamos National Laboratory
- Virginia Tech
Data location is of prime importance when scheduling tasks in a non-uniform memory access (NUMA) architecture. The characteristics of the NUMA architecture must be understood so tasks can be scheduled onto processors that are close to the task's data. However, in modern NUMA architectures, such as AMD Magny-Cours and Intel Nehalem, there may be a relatively large number of memory controllers with sockets that are connected in a non-intuitive manner, leading to performance degradation due to uninformed task-scheduling decisions. In this paper, we provide a method for experimentally characterizing memory-access costs for modern NUMA architectures via memory latency and bandwidth microbenchmarks. Using the results of these benchmarks, we propose a memory-access cost model to improve task-scheduling decisions by scheduling tasks near the data they need. Simple task-scheduling experiments using the memory-access cost models validate the use of empirical memory-access cost models to significantly improve program performance.
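The idea of turning latency and bandwidth measurements into a cost model that guides task placement can be sketched as follows. This is a minimal illustration, not the model from the paper: the 4-node topology, the numbers in `LATENCY_NS` and `BANDWIDTH_GBS`, the weighting parameter `alpha`, and the function names `access_cost` and `best_node` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical microbenchmark results for a 4-node NUMA machine.
# These numbers are illustrative only; they are not measurements from the paper.
# LATENCY_NS[i][j]: load latency (ns) from a core on node i to memory on node j.
LATENCY_NS = [
    [100, 150, 150, 200],
    [150, 100, 200, 150],
    [150, 200, 100, 150],
    [200, 150, 150, 100],
]
# BANDWIDTH_GBS[i][j]: streaming bandwidth (GB/s) from node i to memory on node j.
BANDWIDTH_GBS = [
    [20, 10, 10, 5],
    [10, 20, 5, 10],
    [10, 5, 20, 10],
    [5, 10, 10, 20],
]

BEST_LATENCY = min(min(row) for row in LATENCY_NS)
BEST_BANDWIDTH = max(max(row) for row in BANDWIDTH_GBS)


def access_cost(cpu_node: int, mem_node: int, alpha: float = 0.5) -> float:
    """Relative cost of a core on cpu_node touching memory on mem_node.

    Assumed form: a weighted blend of normalized latency and normalized
    inverse bandwidth, so the best-case (local) access costs 1.0.
    """
    latency_term = LATENCY_NS[cpu_node][mem_node] / BEST_LATENCY
    bandwidth_term = BEST_BANDWIDTH / BANDWIDTH_GBS[cpu_node][mem_node]
    return alpha * latency_term + (1.0 - alpha) * bandwidth_term


def best_node(bytes_per_node: Dict[int, int], free_nodes: List[int],
              alpha: float = 0.5) -> int:
    """Pick the free node with the lowest total cost for a task's data.

    bytes_per_node maps a memory node to the number of bytes the task
    reads from that node.
    """
    def total_cost(cpu_node: int) -> float:
        return sum(nbytes * access_cost(cpu_node, mem_node, alpha)
                   for mem_node, nbytes in bytes_per_node.items())

    return min(free_nodes, key=total_cost)


if __name__ == "__main__":
    # A task whose data lives mostly on node 3, with node 3 busy.
    print(best_node({3: 900, 0: 100}, free_nodes=[0, 1, 2]))
```

A scheduler built on this sketch would call `best_node` whenever a task becomes ready, passing the nodes that currently have idle cores. In practice the matrices would be filled from measured benchmark results rather than hard-coded values.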
- Research Organization: Los Alamos National Laboratory (LANL)
- Sponsoring Organization: DOE/LANL
- DOE Contract Number: AC52-06NA25396
- OSTI ID: 1011076
- Report Number(s): LA-UR-11-10315
- Country of Publication: United States
- Language: English
Similar Records
NUMA-Aware Thread Scheduling for Big Data Transfers over Terabits Network Infrastructure
Journal Article · Sun May 06 20:00:00 EDT 2018 · Scientific Programming · OSTI ID:1565699
Approximate Weighted Matching On Emerging Manycore and Multithreaded Architectures
Journal Article · Thu Nov 29 23:00:00 EST 2012 · International Journal of High Performance Computing Applications, 26(4):413-430 · OSTI ID:1057347
Program partitioning for NUMA multiprocessor computer systems. [Nonuniform memory access]
Journal Article · Sun Oct 31 23:00:00 EST 1993 · Journal of Parallel and Distributed Computing; (United States) · OSTI ID:5703692