U.S. Department of Energy
Office of Scientific and Technical Information

Sage: parallel semi-asymmetric graph algorithms for NVRAMs

Journal Article · Proceedings of the VLDB Endowment
 [1];  [2];  [3];  [4];  [2];  [2];  [5]
  1. Carnegie Mellon Univ., Pittsburgh, PA (United States)
  2. Carnegie Mellon Univ., Pittsburgh, PA (United States)
  3. Tsinghua Univ., Beijing (China)
  4. Univ. of California, Riverside, CA (United States)
  5. Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Computer Science and Artificial Intelligence Lab. (CSAIL)
Non-volatile main memory (NVRAM) technologies provide an attractive set of features for large-scale graph analytics, including byte-addressability, low idle power, and improved memory density. NVRAM systems today have an order of magnitude more NVRAM than traditional memory (DRAM). NVRAM systems could therefore potentially allow very large graph problems to be solved on a single machine, at a modest cost. However, a significant challenge in achieving high performance is accounting for the fact that NVRAM writes can be much more expensive than NVRAM reads. In this paper, we propose an approach to parallel graph analytics using the Parallel Semi-Asymmetric Model (PSAM), in which the graph is stored as a read-only data structure (in NVRAM), and the amount of mutable memory is kept proportional to the number of vertices. Similar to the popular semi-external and semi-streaming models for graph analytics, the PSAM approach assumes that the vertices of the graph fit in a fast read-write memory (DRAM), but the edges do not. In NVRAM systems, our approach eliminates writes to the NVRAM, among other benefits. To experimentally study this new setting, we develop Sage, a parallel semi-asymmetric graph engine with which we implement provably efficient (and often work-optimal) PSAM algorithms for over a dozen fundamental graph problems. We experimentally study Sage on a 48-core machine equipped with Optane DC Persistent Memory, using the largest publicly available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges), and show that Sage outperforms the fastest prior systems designed for NVRAM. Importantly, we also show that Sage nearly matches the fastest prior systems running solely in DRAM, by effectively hiding the costs of repeatedly accessing NVRAM versus DRAM.
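The PSAM discipline described in the abstract can be illustrated with a small sketch. This is not Sage's implementation or API: it is a sequential Python toy, and the names `ReadOnlyCSR` and `bfs_levels` are hypothetical. It only shows the two constraints the abstract states, under the assumption that an immutable in-memory structure stands in for the graph held in NVRAM: the graph is never written after construction, and the algorithm's mutable state is proportional to the number of vertices.

```python
from typing import List, Sequence, Tuple


class ReadOnlyCSR:
    """Immutable compressed-sparse-row graph.

    Hypothetical stand-in for the read-only graph kept in NVRAM:
    after construction, nothing in this object is ever modified.
    """

    def __init__(self, n: int, edges: Sequence[Tuple[int, int]]) -> None:
        # One-time build step (outside the model's cost accounting).
        offsets = [0] * (n + 1)
        for u, _ in edges:
            offsets[u + 1] += 1
        for i in range(n):
            offsets[i + 1] += offsets[i]
        targets = [0] * len(edges)
        cursor = offsets[:-1]  # slicing copies, so offsets is untouched
        for u, v in edges:
            targets[cursor[u]] = v
            cursor[u] += 1
        # Tuples make the read-only property explicit.
        self._offsets = tuple(offsets)
        self._targets = tuple(targets)

    @property
    def num_vertices(self) -> int:
        return len(self._offsets) - 1

    def neighbors(self, u: int):
        """Yield the out-neighbors of u; reads only, no writes."""
        for i in range(self._offsets[u], self._offsets[u + 1]):
            yield self._targets[i]


def bfs_levels(graph: ReadOnlyCSR, source: int) -> List[int]:
    """Breadth-first search whose mutable state is O(n).

    `level` has one entry per vertex, and each frontier holds at most
    n vertices, so the read-write ("DRAM") footprint does not depend
    on the number of edges.
    """
    level = [-1] * graph.num_vertices  # -1 marks "not yet reached"
    level[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for u in frontier:
            for v in graph.neighbors(u):
                if level[v] == -1:
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level
```

For example, `bfs_levels(ReadOnlyCSR(3, [(0, 1), (1, 2)]), 0)` visits the path 0, 1, 2 in order while writing only to the per-vertex `level` array and the frontier lists. The parallelism and work-efficiency guarantees claimed for Sage are outside the scope of this sketch.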
Research Organization:
Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)
Sponsoring Organization:
USDOE Office of Science (SC); National Science Foundation (NSF); Defense Advanced Research Projects Agency (DARPA)
Grant/Contract Number:
SC0018947
OSTI ID:
1803480
Journal Information:
Proceedings of the VLDB Endowment, Vol. 13, Issue 9; ISSN 2150-8097
Publisher:
Association for Computing Machinery (ACM)
Country of Publication:
United States
Language:
English
