U.S. Department of Energy
Office of Scientific and Technical Information

Sparse matrix-vector multiplication on a reconfigurable supercomputer

Journal Article · ACM Transactions on Reconfigurable Technology and Systems (TRETS)

Double-precision floating-point Sparse Matrix-Vector Multiplication (SMVM) is a critical computational kernel in iterative solvers for sparse systems of linear equations. The poor data locality exhibited by sparse matrices, along with the high memory bandwidth requirements of SMVM, results in poor performance on general-purpose processors. Field Programmable Gate Arrays (FPGAs) offer a possible alternative with their customizable, application-targeted memory subsystem and processing elements. In this work we investigate two separate implementations of SMVM on an SRC-6 MAPStation workstation. The first implementation investigates the peak performance capability, while the second balances the amount of instantiated logic against the available sustained bandwidth of the FPGA subsystem. Both implementations yield the same sustained performance, with the second producing a much more efficient solution. The metrics of processor and application balance are introduced to provide insight into the efficiencies of the FPGA- and CPU-based solutions, explicitly showing the tight coupling of available bandwidth to peak floating-point performance. Because the FPGA can balance the amount of implemented logic against the available memory bandwidth, it can provide a much more efficient solution. Finally, making use of the lessons learned implementing the SMVM, we present a fully implemented nonpreconditioned Conjugate Gradient algorithm utilizing the second SMVM design.
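
For readers unfamiliar with the kernel named in the abstract, the following is a minimal software sketch of SMVM and of a nonpreconditioned Conjugate Gradient loop built on it. It is not the FPGA design described in the article. The compressed sparse row (CSR) storage layout, the function names (`csr_matvec`, `dot`, `conjugate_gradient`) and the parameters `tol` and `max_iter` are assumptions chosen for illustration; the abstract does not state which storage format or stopping rule the authors used.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed layout).

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # One multiply and one add per nonzero; the read of x is
            # indirect through col_idx, which is the irregular access
            # pattern behind the poor data locality.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def dot(a, b):
    """Dense dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Nonpreconditioned CG for a symmetric positive definite CSR matrix.

    tol and max_iter are illustrative defaults, not values from the article.
    """
    x = [0.0] * len(b)
    r = list(b)          # residual b - A @ x, with x = 0 initially
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = csr_matvec(values, col_idx, row_ptr, p)   # the SMVM kernel
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

Each CG iteration calls the SMVM once and otherwise performs only dense vector updates and dot products, which is why the sparse product is singled out as the critical kernel.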

Research Organization:
Los Alamos National Laboratory (LANL)
Sponsoring Organization:
DOE
DOE Contract Number:
AC52-06NA25396
OSTI ID:
962276
Report Number(s):
LA-UR-08-06989; LA-UR-08-6989
Journal Information:
Journal Name: ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Country of Publication:
United States
Language:
English
