A Block-Based Triangle Counting Algorithm on Heterogeneous Environments

Yasar, Abdurrahman; Rajamanickam, Sivasankaran; Berry, Jonathan W.; Catalyurek, Umit V.

doi:10.1109/tpds.2021.3093240

Journal Article · February 1, 2022 · IEEE Transactions on Parallel and Distributed Systems

DOI: https://doi.org/10.1109/tpds.2021.3093240 · OSTI ID: 1810367

Yasar, Abdurrahman [1]; Rajamanickam, Sivasankaran [2]; Berry, Jonathan W. [2]; Catalyurek, Umit V. [1]

[1] Georgia Institute of Technology, Atlanta, GA (United States)
[2] Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Center for Computing Research

Triangle counting is a fundamental building block in graph algorithms. In this article, we propose a block-based triangle counting algorithm to reduce data movement during both sequential and parallel execution. Our block-based formulation makes the algorithm naturally suitable for heterogeneous architectures. The problem of partitioning the adjacency matrix of a graph is well-studied. Our task decomposition goes one step further: it partitions the set of triangles in the graph. By streaming these small tasks to compute resources, we can solve problems that do not fit on a device. We demonstrate the effectiveness of our approach by providing an implementation on a compute node with multiple sockets, cores and GPUs. The current state-of-the-art in triangle enumeration processes the Friendster graph in 2.1 seconds, not including data copy time between CPU and GPU. Using that metric, our approach is 20 percent faster. When copy times are included, our algorithm takes 3.2 seconds. This is 5.6 times faster than the fastest published CPU-only time.
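The idea of partitioning the set of triangles, rather than only the adjacency matrix, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `block_triangle_count`, the contiguous equal-size vertex blocks, and the low-to-high edge orientation are assumptions made for illustration only. Each task is a triple of blocks (i, j, k) with i <= j <= k, and every triangle is counted by exactly one task, so tasks could in principle be handed to different compute resources independently.

```python
from itertools import combinations_with_replacement


def block_triangle_count(n, edges, num_blocks):
    """Count triangles by splitting the work into block-triple tasks.

    Illustrative sketch only: vertices are 0..n-1 and are split into
    contiguous blocks (an assumed partition, not the paper's).
    """
    size = -(-n // num_blocks)  # ceiling division: vertices per block

    def block_of(v):
        return v // size

    # Orient every edge from the lower to the higher vertex id and bucket it
    # by (source block, destination block). Self-loops and duplicates vanish.
    blocks = {}
    for u, v in edges:
        if u == v:
            continue
        u, v = min(u, v), max(u, v)
        rows = blocks.setdefault((block_of(u), block_of(v)), {})
        rows.setdefault(u, set()).add(v)

    total = 0
    # One task per block triple i <= j <= k. A triangle u < v < w appears only
    # in task (block(u), block(v), block(w)), so tasks partition the triangles.
    for i, j, k in combinations_with_replacement(range(num_blocks), 3):
        a_ij = blocks.get((i, j), {})
        a_ik = blocks.get((i, k), {})
        a_jk = blocks.get((j, k), {})
        for u, neighbors in a_ij.items():
            u_to_k = a_ik.get(u, set())
            for v in neighbors:
                # Common higher-numbered neighbors of u and v inside block k.
                total += len(u_to_k & a_jk.get(v, set()))
    return total


# A 4-clique on vertices 0..3 (4 triangles) plus a path 3-4-5 (no triangles).
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
print(block_triangle_count(6, example_edges, num_blocks=3))  # prints 4
```

The result does not depend on `num_blocks`; only the number and size of tasks change. In this sketch a task touches at most three edge blocks, which is what would allow streaming tasks to a device that cannot hold the whole graph.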

Research Organization: Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

Sponsoring Organization: USDOE National Nuclear Security Administration (NNSA); National Science Foundation (NSF)

Grant/Contract Number: AC04-94AL85000; CF-1919021; NA0003525

OSTI ID: 1810367

Report Number(s): SAND-2021-7901J; 697218

Journal Information: IEEE Transactions on Parallel and Distributed Systems, Vol. 33, Issue 2; ISSN 1045-9219

Publisher: IEEE

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
triangle counting
task-based
block-based
sub-graph
multi-core
multi-GPU
