Title: On the memory attribution problem: A solution and case study using MPI

Abstract

As parallel applications running on large-scale computing systems become increasingly memory constrained, the ability to attribute memory usage to the various components of the application is becoming increasingly important. We present the design and implementation of memnesia, a novel memory usage profiler for parallel and distributed message-passing applications. Our approach captures both application- and message-passing library-specific memory usage statistics from unmodified binaries dynamically linked to a message-passing communication library. Using microbenchmarks and proxy applications, we evaluated our profiler across three Message Passing Interface (MPI) implementations and two hardware platforms. Furthermore, the results show that our approach and the corresponding implementation can accurately quantify memory resource usage as a function of time, scale, communication workload, and software or hardware system architecture, clearly distinguishing between application and MPI library memory usage at a per-process level. With this new capability, we show that job size, communication workload, and hardware/software architecture influence peak runtime memory usage. In practice, this tool provides a potentially valuable source of information for application developers seeking to measure and optimize memory usage.

Authors:
Gutiérrez, Samuel Keith [1]; Arnold, Dorian C. [2]; Davis, Kei Marion [3]; McCormick, Patrick Sean [3]
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Univ. of New Mexico, Albuquerque, NM (United States)
  2. Emory Univ., Atlanta, GA (United States)
  3. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Publication Date:
2019-02-04
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1495167
Alternate Identifier(s):
OSTI ID: 1493495
Report Number(s):
LA-UR-18-30292
Journal ID: ISSN 1532-0626
Grant/Contract Number:  
89233218CNA000001; AC52‐06NA25396
Resource Type:
Accepted Manuscript
Journal Name:
Concurrency and Computation. Practice and Experience
Additional Journal Information:
Journal Volume: 32; Journal Issue: 3; Journal ID: ISSN 1532-0626
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Computer Science

Citation Formats

Gutiérrez, Samuel Keith, Arnold, Dorian C., Davis, Kei Marion, and McCormick, Patrick Sean. On the memory attribution problem: A solution and case study using MPI. United States: N. p., 2019. Web. doi:10.1002/cpe.5159.
Gutiérrez, Samuel Keith, Arnold, Dorian C., Davis, Kei Marion, & McCormick, Patrick Sean. On the memory attribution problem: A solution and case study using MPI. United States. https://doi.org/10.1002/cpe.5159
Gutiérrez, Samuel Keith, Arnold, Dorian C., Davis, Kei Marion, and McCormick, Patrick Sean. 2019. "On the memory attribution problem: A solution and case study using MPI". United States. https://doi.org/10.1002/cpe.5159. https://www.osti.gov/servlets/purl/1495167.
@article{osti_1495167,
title = {On the memory attribution problem: A solution and case study using MPI},
author = {Gutiérrez, Samuel Keith and Arnold, Dorian C. and Davis, Kei Marion and McCormick, Patrick Sean},
doi = {10.1002/cpe.5159},
journal = {Concurrency and Computation. Practice and Experience},
number = 3,
volume = 32,
place = {United States},
year = {2019},
month = {2}
}
