OSTI.GOV
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Feedback-directed page placement for ccNUMA via hardware-generated memory traces

Journal Article · Journal of Parallel and Distributed Computing

Non-uniform memory architectures with cache coherence (ccNUMA) are becoming increasingly common, not just for large-scale high-performance platforms but also in the context of multi-core architectures. Under ccNUMA, data placement may significantly influence overall application performance, as references resolved locally to a processor/core impose lower latencies than remote ones. This work develops a novel hardware-assisted page placement paradigm based on automated tracing of the memory references made by application threads. Two placement schemes, modeling both single-level and multi-level latencies, allocate each page near the processor that accesses it most frequently. These schemes leverage the performance monitoring capabilities of contemporary microprocessors to efficiently extract an approximate trace of memory accesses. This information is used to decide page affinity, i.e., the node to which the page is bound. The method operates entirely in user space, is largely automated, and handles not only static but also dynamic memory allocation. Experiments show that this method, although based on lossy tracing, can efficiently and effectively improve page placement, leading to an average wall-clock execution time saving of over 20% for the tested benchmarks on the SGI Altix with a 2x remote access penalty and 12% on AMD Opterons with a 1.3–2.0x access penalty. This is accompanied by a one-time tracing overhead of 2.7% relative to the original program's overall wall-clock time.
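The affinity decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the trace format (pairs of accessing node and address), the 4 KiB page size, the function names, and the integer latency matrix are all assumptions made for illustration. The single-level variant binds each page to the node that accessed it most often; the multi-level variant is one possible reading of "multi-level latencies", picking the home node that minimizes the access-count-weighted latency.

```python
from collections import Counter, defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical)


def count_accesses(trace, page_size=PAGE_SIZE):
    """Aggregate a (possibly lossy) trace of (node, address) samples
    into per-page access counts: {page: Counter({node: count})}."""
    counts = defaultdict(Counter)
    for node, addr in trace:
        counts[addr // page_size][node] += 1
    return counts


def place_single_level(counts):
    """Bind each page to its most frequent accessor.
    Ties are broken in favor of the lowest node id."""
    return {page: min(c, key=lambda n: (-c[n], n))
            for page, c in counts.items()}


def place_multi_level(counts, latency):
    """Bind each page to the home node with the lowest total cost,
    where cost = sum over accessors of count * latency[home][accessor].
    Ties are broken in favor of the lowest node id."""
    nodes = range(len(latency))
    placement = {}
    for page, c in counts.items():
        placement[page] = min(
            nodes,
            key=lambda home: (sum(cnt * latency[home][n]
                                  for n, cnt in c.items()), home),
        )
    return placement


# Hypothetical 4-node machine; latencies in tenths of the local latency
# (10 = local, 13 = near remote, 20 = far remote).
LATENCY = [
    [10, 13, 20, 20],
    [13, 10, 20, 20],
    [20, 20, 10, 13],
    [20, 20, 13, 10],
]

# Hypothetical sampled trace: page 0 is shared by nodes 0, 2, and 3;
# page 1 is touched only by node 1.
trace = [(0, 0)] * 5 + [(2, 64)] * 4 + [(3, 128)] * 3 + [(1, 4096)] * 2

counts = count_accesses(trace)
single = place_single_level(counts)
multi = place_multi_level(counts, LATENCY)
```

In this made-up trace the two variants disagree on page 0: node 0 is the single most frequent accessor, but nodes 2 and 3 together account for more accesses and sit close to each other, so the weighted-latency variant places the page on node 2 instead.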

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); UT-Battelle LLC/ORNL, Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1564736
Journal Information:
Journal of Parallel and Distributed Computing, Vol. 70, Issue 12; ISSN 0743-7315
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

Similar Records

Page placement policies for NUMA multiprocessors
Journal Article · February 1, 1991 · Journal of Parallel and Distributed Computing; (United States) · OSTI ID:1564736

De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers
Journal Article · September 4, 2006 · The International Journal of High Performance Computing Applications, vol. 22, no. 1, February 1, 2008, pp. 113-128 · OSTI ID:1564736

Bringing large-scale multiple genome analysis one step closer: ScalaBLAST and beyond
Technical Report · June 1, 2007 · OSTI ID:1564736

Related Subjects