
SciTech Connect

Title: Mixed Mode Matrix Multiplication

In modern cluster environments, where the memory hierarchy has many layers (distributed memory, shared memory, cache, ...), an important question is how to fully utilize all available resources and identify the layer that dominates a given computation. When algorithms on all layers are combined, which method yields the best performance from the available resources? The mixed-mode programming model, which uses thread programming on the shared-memory layer and message passing on the distributed-memory layer, is a method many researchers use to exploit these memory resources. In this paper, the authors take an algorithmic approach, using matrix multiplication as a tool to show how cache algorithms affect the performance of both shared-memory and distributed-memory algorithms. They show that with a good underlying cache algorithm, overall performance is stable. When the underlying cache algorithm is bad, superlinear speedup may occur, and increasing the number of threads may also improve performance.
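The abstract does not say which cache algorithms or threading libraries the paper uses, so the following is only an illustrative sketch under stated assumptions. It takes loop tiling (blocking) as one common cache-aware technique and splits the rows of the result across threads to mimic the shared-memory layer. The function names, the `block` and `threads` parameters, and the row-band partitioning are all hypothetical. The message-passing layer is not shown. In CPython the global interpreter lock prevents any real parallel speedup, so the code illustrates only the structure of the computation, not its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def naive_matmul(a, b):
    """Reference triple loop: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def blocked_band(a, b, c, row_start, row_end, block):
    """Tiled multiply for rows [row_start, row_end) of C, written into c.

    Each (ii, kk, jj) tile touches at most block x block elements of
    A, B, and C, which is the idea behind cache blocking.
    """
    m, p = len(b), len(b[0])
    for ii in range(row_start, row_end, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                for i in range(ii, min(ii + block, row_end)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, m)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, p)):
                            row_c[j] += a_ik * row_b[j]


def threaded_blocked_matmul(a, b, block=2, threads=2):
    """Hypothetical shared-memory layer: one band of rows per thread."""
    n, p = len(a), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    band = max(1, (n + threads - 1) // threads)  # rows per thread, rounded up
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(blocked_band, a, b, c, start, min(start + band, n), block)
            for start in range(0, n, band)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    b = [[9.0, 8.0], [7.0, 6.0], [5.0, 4.0]]
    print(threaded_blocked_matmul(a, b) == naive_matmul(a, b))
```

Each thread writes to a disjoint set of rows of `c`, so no locking is needed. In a mixed-mode setting, a band like this would typically be the portion of the matrix owned by one message-passing process, with the threads subdividing it further.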
Authors:
; ;
Publication Date:
30 Sep 2004
OSTI Identifier:
832904
Report Number(s):
IS-M 921
TRN: US200430%%917
DOE Contract Number:
W-7405-Eng-82
Resource Type:
Conference
Resource Relation:
Conference: Conference title not supplied, Conference location not supplied, Conference dates not supplied; Other Information: PBD: 30 Sep 2004
Research Org:
Ames Laboratory, Ames, IA (US)
Sponsoring Org:
USDOE Office of Science (SC) (US)
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; ALGORITHMS; PERFORMANCE; PROGRAMMING; MEMORY MANAGEMENT