OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: On the memory attribution problem: A solution and case study using MPI

Journal Article · Concurrency and Computation. Practice and Experience
DOI: https://doi.org/10.1002/cpe.5159 · OSTI ID: 1495167

As parallel applications running on large–scale computing systems become increasingly memory constrained, the ability to attribute memory usage to the various components of the application is becoming increasingly important. We present the design and implementation of memnesia, a novel memory usage profiler for parallel and distributed message–passing applications. Our approach captures both application– and message–passing library–specific memory usage statistics from unmodified binaries dynamically linked to a message–passing communication library. Using microbenchmarks and proxy applications, we evaluated our profiler across three Message Passing Interface (MPI) implementations and two hardware platforms. Furthermore, the results show that our approach and the corresponding implementation can accurately quantify memory resource usage as a function of time, scale, communication workload, and software or hardware system architecture, clearly distinguishing between application and MPI library memory usage at a per–process level. With this new capability, we show that job size, communication workload, and hardware/software architecture influence peak runtime memory usage. In practice, this tool provides a potentially valuable source of information for application developers seeking to measure and optimize memory usage.

Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
89233218CNA000001; AC52‐06NA25396
OSTI ID:
1495167
Alternate ID(s):
OSTI ID: 1493495
Report Number(s):
LA-UR-18-30292
Journal Information:
Concurrency and Computation. Practice and Experience, Vol. 32, Issue 3; ISSN 1532-0626
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science
