Single-node Partitioned-Memory for Huge Graph Analytics: Cost and Performance Trade-offs
- BATTELLE (PACIFIC NW LAB)
- Intel
- Washington State University
Nonvolatile memory NVDIMMs, available as Intel Optane, are less expensive than DRAM and bring large byte-addressable storage within reach of many applications. Evaluations on graph analytics have shown promising performance only when DRAM is used as a hardware cache (Memory mode). An open question is whether graph applications can exploit Optane and DRAM directly (AppDirect mode) and achieve better-than-DRAM average bandwidth and run times. We evaluate Optane as a volatile pool on two large-scale graph applications with very different computational patterns, Grappolo and Ripples. We show that AppDirect mode can deliver better-than-DRAM performance by allocating data structures to Optane and DRAM according to their access characteristics, resulting in higher average memory bandwidth and lower average latency, while Memory mode provides DRAM-competitive performance at a capacity equal to that of the persistent memory. We demonstrate occasional 4x improvements using the latest AppDirect option and frequently observe performance competitive with DRAM in both the AppDirect and Memory modes.
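The AppDirect evaluation steers each data structure to DRAM or Optane according to its access pattern. The record does not say which allocator Grappolo and Ripples use for this placement; the sketch below is a minimal illustration assuming libmemkind with an fsdax namespace mounted at /mnt/pmem0 (both assumptions), keeping randomly accessed per-vertex state in DRAM and streaming the large edge array from the cheaper Optane pool.

```c
/* Sketch: partitioned allocation of graph data between DRAM and Optane
 * (AppDirect, volatile use) via libmemkind. The mount point /mnt/pmem0
 * and the DRAM/Optane split chosen here are illustrative assumptions. */
#include <memkind.h>
#include <stdio.h>

int main(void) {
    const size_t n_vertices = 1u << 20;   /* example sizes only */
    const size_t n_edges    = 1u << 24;

    /* Volatile pool backed by an fsdax-mounted Optane namespace (assumed path). */
    memkind_t pmem_kind;
    if (memkind_create_pmem("/mnt/pmem0", 0, &pmem_kind) != 0) {
        fprintf(stderr, "failed to create PMEM kind\n");
        return 1;
    }

    /* Latency-sensitive, randomly accessed per-vertex state stays in DRAM. */
    double *vertex_state = memkind_malloc(MEMKIND_DEFAULT,
                                          n_vertices * sizeof(*vertex_state));

    /* Large, mostly streamed edge list goes to the Optane pool. */
    long *edge_targets = memkind_malloc(pmem_kind,
                                        n_edges * sizeof(*edge_targets));

    if (!vertex_state || !edge_targets) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    /* ... run the graph kernel over vertex_state / edge_targets ... */

    memkind_free(MEMKIND_DEFAULT, vertex_state);
    memkind_free(pmem_kind, edge_targets);
    memkind_destroy_kind(pmem_kind);
    return 0;
}
```

If the Optane DIMMs are instead exposed as a separate NUMA node (KMEM DAX), the same split can be expressed by replacing pmem_kind with MEMKIND_DAX_KMEM.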
- Research Organization: Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-76RL01830
- OSTI ID: 1836010
- Report Number(s): PNNL-SA-161359
- Country of Publication: United States
- Language: English
Similar Records
- Sage: parallel semi-asymmetric graph algorithms for NVRAMs · Journal Article · 2020 · Proceedings of the VLDB Endowment · OSTI ID: 1803480
- Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation · Journal Article · 2017 · Journal of Parallel and Distributed Computing · OSTI ID: 1371471
- Can high bandwidth and latency justify large cache blocks in scalable multiprocessors? · Conference · 1994 · OSTI ID: 98914