SPARTA: High-Level Synthesis of Parallel Multi-Threaded Accelerators

Gozzi, Giovanni; Fiorito, Michele; Curzel, Serena; Barone, Claudio; Castellana, Vito Giovanni; Minutoli, Marco; Tumeo, Antonino; Ferrandi, Fabrizio

doi:10.1145/3677035

Journal Article · December 25, 2024 · ACM Transactions on Reconfigurable Technology and Systems

DOI: https://doi.org/10.1145/3677035 · OSTI ID: 2570267

Affiliations (in author order): [1]; [1]; [1]; [2]; [2]; [2]; [2]; [1]

[1] Politecnico di Milano (Italy)
[2] Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)

This article presents a methodology for the Synthesis of PARallel multi-Threaded Accelerators (SPARTA) from OpenMP-annotated C/C++ specifications. SPARTA extends an open-source HLS tool, enabling the generation of accelerators that provide latency tolerance for irregular memory accesses through multithreading, support fine-grained memory-level parallelism through a hot-potato deflection-based network-on-chip (NoC), support synchronization constructs, and can instantiate memory-side caches. Our approach is based on a custom runtime OpenMP library, providing flexibility and extensibility. Experimental results show high scalability when synthesizing irregular graph kernels. The accelerators generated with our approach are, on average, 2.29x faster than state-of-the-art HLS methodologies.
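
As an illustration of the kind of input the abstract describes, the following is a minimal sketch of an OpenMP-annotated C kernel with irregular memory accesses. It is not taken from the article: the function name `neighbor_sum`, the CSR array names, and the choice of the `parallel for` pragma are assumptions made for this example, and this record does not state which OpenMP constructs SPARTA accepts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example kernel (not from the article): for each vertex of a
 * graph stored in CSR form, sum the values held by its neighbours. The
 * indirect read value[col_idx[e]] is data-dependent, which is the kind of
 * irregular memory access the abstract refers to. */
void neighbor_sum(size_t num_vertices,
                  const size_t *row_ptr, /* num_vertices + 1 edge offsets */
                  const size_t *col_idx, /* neighbour id for every edge */
                  const int *value,      /* one input value per vertex */
                  int *out)              /* one result per vertex */
{
    assert(row_ptr && col_idx && value && out);

    /* Each iteration writes only out[v], so iterations are independent
     * and can be distributed across threads. */
    #pragma omp parallel for
    for (size_t v = 0; v < num_vertices; ++v) {
        int sum = 0;
        for (size_t e = row_ptr[v]; e < row_ptr[v + 1]; ++e) {
            sum += value[col_idx[e]];
        }
        out[v] = sum;
    }
}
```

Compiled as ordinary software (for example with an OpenMP-enabled C compiler), the pragma spreads vertices across threads; without OpenMP support the pragma is ignored and the loop runs sequentially with the same result.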

Research Organization: Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE Laboratory Directed Research and Development (LDRD) Program

Grant/Contract Number: AC05-76RL01830

OSTI ID: 2570267

Report Number(s): PNNL-SA--185565

Journal Information: ACM Transactions on Reconfigurable Technology and Systems, Vol. 18, Issue 1; ISSN 1936-7414; ISSN 1936-7406

Publisher: Association for Computing Machinery (ACM)

Country of Publication: United States

Language: English

Related Subjects

Design automation
FPGA architecture
graph algorithms
parallelism
