
Title: Experience with mixed MPI/threaded programming models

Conference

A shared memory cluster is a parallel computer that consists of multiple nodes connected through an interconnection network. Each node is a symmetric multiprocessor (SMP) unit in which multiple CPUs share uniform access to a pool of main memory. The SGI Origin 2000, Compaq (formerly DEC) AlphaServer Cluster, and recent IBM RS6000/SP systems are all variants of this architecture. The SGI Origin 2000 has hardware that allows tasks running on any processor to access any main memory location in the system, so all the memory in the nodes forms a single shared address space. This is called a nonuniform memory access (NUMA) architecture because it gives programs a single shared address space, but the access time to different memory locations varies. In the IBM and Compaq systems, each node's memory forms a separate address space, and tasks communicate between nodes by passing messages or using other explicit mechanisms.

Many large parallel codes use standard MPI calls to exchange data between tasks in a parallel job, and this is a natural programming model for distributed memory architectures. On a shared memory architecture, message passing is unnecessary if the code is written to use multithreading: threads run in parallel on different processors, and they exchange data simply by reading and writing shared memory locations. Shared memory clusters combine architectural elements of both distributed memory and shared memory systems, and they support both message passing and multithreaded programming models.

Application developers are now trying to determine which programming model is best for these machines. This paper presents initial results of a study aimed at answering that question. We interviewed developers representing nine scientific code groups at Lawrence Livermore National Laboratory (LLNL). All of these groups are attempting to optimize their codes to run on shared memory clusters, specifically the IBM and DEC platforms at LLNL. This paper focuses on ease-of-use issues; we plan to analyze the performance of the various programming models in a future paper. Section 2 describes the common programming models available on shared memory clusters. In Section 3 we briefly describe the architectures of the IBM and DEC machines at LLNL. Section 4 describes the codes we surveyed and the parallel programming models they use. We conclude in Section 5 with a summary of the lessons we have learned so far about multilevel parallelism.
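As an illustration of the mixed model the abstract describes, the sketch below combines MPI between nodes with threads within a node. This is a minimal, hypothetical example, not code from any of the surveyed LLNL groups: the abstract does not name a specific threading package, so OpenMP is assumed here, and the array size and the sum reduction are arbitrary choices.

/*
 * Hypothetical sketch of a mixed MPI/threaded program (assumes OpenMP):
 * MPI carries data between SMP nodes, while OpenMP threads share
 * memory within each node.
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

enum { N = 1 << 20 };  /* elements per MPI rank (arbitrary) */

int main(int argc, char **argv)
{
    int provided, rank;
    double sum = 0.0, total = 0.0;
    double *local;

    /* MPI_THREAD_FUNNELED: only the main thread makes MPI calls,
       the usual pattern when OpenMP handles the intra-node work. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = malloc(N * sizeof *local);
    for (int i = 0; i < N; i++)
        local[i] = (double)(rank + 1);

    /* Intra-node level: OpenMP threads reduce over shared memory. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += local[i];

    /* Inter-node level: MPI combines the per-rank partial sums. */
    MPI_Reduce(&sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %.0f\n", total);

    free(local);
    MPI_Finalize();
    return 0;
}

Built with, for example, mpicc -fopenmp hybrid.c (flag names vary by compiler and MPI implementation) and launched with one MPI rank per node, the outer level runs over the cluster's distributed memory and the inner level over each node's shared memory.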

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE Office of Defense Programs (DP) (US)
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
11308
Report Number(s):
UCRL-JC-133213; YN0100000; 99-ERD-009; TRN: AH200128%%752
Resource Relation:
Conference: High Performance Scientific Computation with Applications, Las Vegas, NV (US), 28 Jun 1999 - 1 Jul 1999; Other Information: PBD: 1 Apr 1999
Country of Publication:
United States
Language:
English