Feedback-directed page placement for ccNUMA via hardware-generated memory traces
Journal Article · Journal of Parallel and Distributed Computing
Non-uniform memory architectures with cache coherence (ccNUMA) are becoming increasingly common, not just for large-scale high-performance platforms but also in the context of multi-core architectures. Under ccNUMA, data placement may influence overall application performance significantly, as references resolved locally to a processor/core impose lower latencies than remote ones. This work develops a novel hardware-assisted page placement paradigm based on automated tracing of the memory references made by application threads. Two placement schemes, modeling single-level and multi-level latencies respectively, allocate each page near the processors that access it most frequently. These schemes leverage the performance monitoring capabilities of contemporary microprocessors to efficiently extract an approximate trace of memory accesses. This information is used to decide page affinity, i.e., the node to which the page is bound. The method operates entirely in user space, is largely automated, and handles not only static but also dynamic memory allocation. Experiments show that this method, although based on lossy tracing, can efficiently and effectively improve page placement, leading to an average wall-clock execution time saving of over 20% for the tested benchmarks on the SGI Altix with a 2x remote access penalty and 12% on AMD Opterons with a 1.3–2.0x access penalty. This is accompanied by a one-time tracing overhead of 2.7% of the original program's overall wall-clock time.
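The affinity decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the trace format (pairs of accessing node and address), the 4 KiB page size, the tie-breaking rule, the relative latency matrix, and all function names are assumptions made for illustration only. The single-level variant binds a page to the node that touched it most often; the multi-level variant binds it to the node that minimizes the access count weighted by an assumed node-to-node latency.

```python
from collections import Counter, defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes


def count_accesses(trace, page_size=PAGE_SIZE):
    """Aggregate sampled (node, address) records into per-page, per-node counts."""
    counts = defaultdict(Counter)
    for node, address in trace:
        counts[address // page_size][node] += 1
    return counts


def place_single_level(counts):
    """Bind each page to its most frequent accessor (ties go to the lowest node id)."""
    return {
        page: min(per_node, key=lambda n: (-per_node[n], n))
        for page, per_node in counts.items()
    }


def place_multi_level(counts, latency):
    """Bind each page to the home node with the lowest total weighted latency.

    latency[a][h] is an assumed relative cost for node a accessing memory
    that lives on node h; ties go to the lowest node id.
    """
    nodes = range(len(latency))

    def cost(per_node, home):
        return sum(hits * latency[a][home] for a, hits in per_node.items())

    return {
        page: min(nodes, key=lambda h: (cost(per_node, h), h))
        for page, per_node in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical lossy trace: (accessing node, virtual address) samples.
    trace = [(0, 0x00), (0, 0x10), (0, 0x20), (2, 0x30),
             (2, 0x40), (3, 0x50), (3, 0x60), (1, 0x1000)]
    # Hypothetical 4-node machine: nodes {0,1} and {2,3} are close pairs.
    latency = [[1, 2, 4, 4],
               [2, 1, 4, 4],
               [4, 4, 1, 2],
               [4, 4, 2, 1]]
    counts = count_accesses(trace)
    print(place_single_level(counts))
    print(place_multi_level(counts, latency))
```

In this hypothetical trace the two variants disagree on the first page: node 0 is the single most frequent accessor, but nodes 2 and 3 together access the page more often and sit close to each other, so the latency-weighted cost favors placing the page on their side of the machine.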
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF), Oak Ridge, TN (United States); UT-Battelle LLC/ORNL, Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1564736
- Journal Information:
- Journal Name: Journal of Parallel and Distributed Computing; Journal Volume: 70; Journal Issue: 12; ISSN 0743-7315
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Similar Records
Page placement policies for NUMA multiprocessors
Journal Article · Thu Jan 31 23:00:00 EST 1991 · Journal of Parallel and Distributed Computing; (United States) · OSTI ID:5001639
Critical Path-Based Thread Placement for NUMA Systems
Journal Article · Sat Dec 31 23:00:00 EST 2011 · Performance Evaluation Review · OSTI ID:1048161
Critical Path-Based Thread Placement for NUMA Systems
Conference · Tue Nov 01 00:00:00 EDT 2011 · OSTI ID:1035298