Athena: High-Performance Sparse Tensor Contraction Sequence on Heterogeneous Memory
- University of California, Merced
- BATTELLE (PACIFIC NW LAB)
Sparse tensor contraction (SpTC) sequences are widely used in many fields, such as chemistry and physics. However, implementing such a sequence efficiently faces multiple challenges, including redundant computations and memory operations, massive memory consumption, and inefficient utilization of hardware. To address these challenges, we introduce Athena, a high-performance framework for SpTC sequences. Athena introduces new data structures, leverages the emerging Optane-based heterogeneous memory (HM) architecture, and adopts stage parallelism. In particular, Athena introduces a shared, hash table-represented sparse accumulator to eliminate unnecessary input processing and data migration; it uses a novel data-semantic guided dynamic migration solution to make the best use of the Optane-based HM for high performance; and it co-runs execution phases with different characteristics to enable high hardware utilization. Evaluating with 12 datasets, we show that Athena brings 327-7362× speedup over the state-of-the-art SpTC algorithm. With the dynamic data placement guided by data semantics, Athena improves performance on Optane-based HM over a state-of-the-art software-based data management solution, a hardware-based data management solution, and a PMM-only configuration by 1.58×, 1.82×, and 2.34×, respectively.
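As a rough illustration of the idea of a hash table-represented sparse accumulator, the following minimal Python sketch contracts two sparse tensors stored as dictionaries from index tuples to values, then reuses the resulting hash table directly as the input to the next contraction in a sequence. It is not Athena's implementation: the function name `contract`, the dictionary-based storage, and the example data are all assumptions made for illustration, and the sketch ignores heterogeneous memory placement and stage parallelism entirely.

```python
from collections import defaultdict


def contract(x, y, x_modes, y_modes):
    """Contract sparse tensors x and y over the paired modes.

    x and y are dicts mapping index tuples to nonzero values
    (a hypothetical storage format chosen for this sketch).
    x_modes[i] in x is contracted with y_modes[i] in y.
    """
    if not x or not y:
        return {}
    x_order = len(next(iter(x)))
    y_order = len(next(iter(y)))
    x_free = [m for m in range(x_order) if m not in x_modes]
    y_free = [m for m in range(y_order) if m not in y_modes]

    # Group the nonzeros of y by their contracted indices so that
    # matching entries can be found with one hash lookup.
    y_by_key = defaultdict(list)
    for idx, val in y.items():
        key = tuple(idx[m] for m in y_modes)
        free = tuple(idx[m] for m in y_free)
        y_by_key[key].append((free, val))

    # Hash table used as the sparse accumulator: output index -> partial sum.
    acc = defaultdict(float)
    for idx, val in x.items():
        key = tuple(idx[m] for m in x_modes)
        matches = y_by_key.get(key)
        if matches is None:
            continue
        x_part = tuple(idx[m] for m in x_free)
        for free, y_val in matches:
            acc[x_part + free] += val * y_val
    return dict(acc)


# Two small sparse matrices; contracting mode 1 of X with mode 0 of Y
# is an ordinary matrix product.
X = {(0, 0): 1.0, (0, 1): 2.0, (1, 1): 3.0}
Y = {(0, 0): 4.0, (1, 0): 5.0, (1, 1): 6.0}
W = {(0, 0): 1.0, (1, 0): 1.0}

# A two-step sequence: the accumulator produced by the first contraction
# is passed as-is to the second, with no conversion to another format.
Z = contract(X, Y, x_modes=[1], y_modes=[0])
R = contract(Z, W, x_modes=[1], y_modes=[0])
```

Keeping the intermediate result in the same hash-table form as the accumulator is what lets the second contraction start without re-processing its input, which is the kind of redundant work the abstract refers to.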
- Research Organization:
- Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC05-76RL01830
- OSTI ID:
- 1813585
- Report Number(s):
- PNNL-SA-159741
- Resource Relation:
- Conference: Proceedings of the ACM International Conference on Supercomputing (ICS 2021) June 14-17, 2021, Virtual, Online
- Country of Publication:
- United States
- Language:
- English