LU Factorization with Partial Pivoting for a Multi-CPU, Multi-GPU Shared Memory System
- Univ. of Tennessee, Knoxville, TN (United States)
- Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Univ. of Manchester (United Kingdom)
LU factorization with partial pivoting is a canonical numerical procedure and the main component of the High Performance LINPACK benchmark. This article presents an implementation of the algorithm for a hybrid shared-memory system with standard CPU cores and GPU accelerators. Performance in excess of one TeraFLOPS is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.
- Research Organization: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Organization: USDOE Office of Science (SC)
- DOE Contract Number: AC02-05CH11231
- OSTI ID: 1173291
- Report Number(s): LBNL-5787E
- Country of Publication: United States
- Language: English