An efficient parallel algorithm for matrix-vector multiplication
The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high-performance parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n × n matrix on p processors, the communication cost of this algorithm is O(n/√p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
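The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of a generic two-dimensional block decomposition, which is one common way a per-processor communication volume proportional to n/√p can arise. It is a serial simulation, not the report's implementation. The function name `block_matvec` and the restrictions that p be a perfect square and that √p divide n are assumptions made for illustration.

```python
import math


def block_matvec(A, x, p):
    """Serially simulate y = A x on a hypothetical sqrt(p) x sqrt(p) processor grid.

    Assumption for this sketch: p is a perfect square and sqrt(p) divides n.
    """
    n = len(A)
    q = math.isqrt(p)
    if q * q != p or n % q != 0:
        raise ValueError("sketch assumes p is a perfect square and sqrt(p) divides n")
    b = n // q  # block edge length, n / sqrt(p)

    # Step 1: simulated processor (i, j) owns the b x b block of A in block-row i
    # and block-column j. It needs only the j-th segment of x (b values) and
    # produces a partial result of length b for block-row i.
    partial = [[None] * q for _ in range(q)]
    for i in range(q):
        for j in range(q):
            x_seg = x[j * b:(j + 1) * b]
            partial[i][j] = [
                sum(A[i * b + r][j * b + c] * x_seg[c] for c in range(b))
                for r in range(b)
            ]

    # Step 2: sum the partial results across each processor row to form y.
    # On a real machine this would be a reduction among the sqrt(p) processors
    # in the row; here it is an ordinary loop.
    y = []
    for i in range(q):
        for r in range(b):
            y.append(sum(partial[i][j][r] for j in range(q)))
    return y


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
print(block_matvec(A, [1, 2, 3, 4], p=4))  # [30, 70, 110, 150]
```

In this sketch every simulated processor reads and writes vector segments of length n/√p only, regardless of which entries of its block are nonzero. That is consistent with a communication volume that does not depend on the sparsity pattern, although the report's actual communication scheme on a hypercube may differ.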
- Research Organization: Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization: USDOE; USDOE, Washington, DC (United States)
- DOE Contract Number: AC04-76DP00789
- OSTI ID: 6519330
- Report Number(s): SAND-92-2765; ON: DE93015125
- Country of Publication: United States
- Language: English
Similar Records
A Work-Efficient Parallel Sparse Matrix-Sparse Vector Multiplication Algorithm
Parallel and fault-tolerant algorithms for hypercube multiprocessors