A GEMM interface and implementation on NVIDIA GPUs for multiple small matrices
Journal Article
·
· Journal of Parallel and Distributed Computing
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- SC0004439
- OSTI ID:
- 1250088
- Journal Information:
- Journal of Parallel and Distributed Computing, Vol. 75, Issue C; ISSN 0743-7315
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Cited by: 12 works
Citation information provided by
Web of Science
Similar Records
Performance Analysis of Memory Transfers and GEMM Subroutines on NVIDIA Tesla GPU Cluster
Conference
·
Mon Aug 31 00:00:00 EDT 2009
·
OSTI ID:1250088
Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs
Journal Article
·
Fri Jul 01 00:00:00 EDT 2016
· IEEE Transactions on Parallel and Distributed Systems
·
OSTI ID:1250088
Performance Portable Sparse Matrix-Matrix Multiplication on Intel Knights Landing and NVIDIA GPUs.
Conference
·
Tue Nov 01 00:00:00 EDT 2016
·
OSTI ID:1250088