QMP: LQCD Message Passing API
Recent changes: (1) There is no longer a logical node number, only a node number, which does not change as the logical machine is defined. Thus there are two styles of messaging: messages are sent to a node by node number, or to a relative (logical) node. (2) Methods related to node numbers have been changed (some dropped, some added). This note presents: (1) the requirements for message passing within Lattice QCD applications; (2) a draft message passing API for both C and C++; and (3) implementation design ideas. The API is intended to be flexible enough to be used by all Lattice QCD applications and to execute efficiently on all existing and anticipated platforms, so that there is no need to call non-portable message passing routines directly. Because LQCD communication follows a highly regular grid pattern, MPI calls (which are more general) impose additional overhead that is predicted to be non-negligible on large machines. Depending upon demand, a subset of MPI could be implemented above this new API so that legacy codes that use MPI could run on new architectures that implement only the new API. Conversely, the new API has been implemented atop MPI so that new applications using it can still run on older machines for which only MPI is available. Interspersed with the API description are descriptions of how the API could be implemented for Myrinet clusters and the QCDOC machine. These are meant to illustrate the functionality more fully and are not intended as the final design. At the time of writing, the following implementations exist: (1) QMP-GM, which uses GM; (2) QMP-MPI, which uses MPI and has been tested above MPICH-GM, MPICH-SM (shared memory), and MPICH-P4 (sockets).
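The two addressing styles mentioned in the abstract (a fixed node number versus a relative position in the logical machine) can be illustrated with a minimal C sketch. Every name below is hypothetical and is not taken from the QMP API. The sketch also assumes a periodic logical grid in which axis 0 varies fastest, which the abstract does not specify.

```c
#include <assert.h>

/* Hypothetical names throughout: this is an illustration, NOT the QMP API. */
#define SKETCH_MAX_DIM 4

/* A logical machine: an ndim-dimensional periodic grid of nodes (assumed layout). */
typedef struct {
    int ndim;                  /* number of logical axes in use */
    int dims[SKETCH_MAX_DIM];  /* extent of the grid along each axis */
} sketch_grid;

/* Convert a fixed node number into logical coordinates (axis 0 varies fastest). */
void sketch_node_to_coords(const sketch_grid *g, int node, int coords[])
{
    assert(g->ndim >= 1 && g->ndim <= SKETCH_MAX_DIM);
    for (int i = 0; i < g->ndim; i++) {
        coords[i] = node % g->dims[i];
        node /= g->dims[i];
    }
}

/* Convert logical coordinates back into the fixed node number. */
int sketch_coords_to_node(const sketch_grid *g, const int coords[])
{
    int node = 0;
    for (int i = g->ndim - 1; i >= 0; i--)
        node = node * g->dims[i] + coords[i];
    return node;
}

/* Relative addressing: return the node number of the neighbour one step
 * along `axis` in direction `dir` (+1 or -1), wrapping at the grid edge. */
int sketch_relative_node(const sketch_grid *g, int node, int axis, int dir)
{
    int coords[SKETCH_MAX_DIM];
    assert(axis >= 0 && axis < g->ndim);
    assert(dir == 1 || dir == -1);
    sketch_node_to_coords(g, node, coords);
    coords[axis] = (coords[axis] + dir + g->dims[axis]) % g->dims[axis];
    return sketch_coords_to_node(g, coords);
}
```

Under these assumptions, on a 4 x 2 logical grid, node 3 sits at coordinates (3, 0); its +1 neighbour along axis 0 wraps around to node 0, and its +1 neighbour along axis 1 is node 7. The node numbers themselves never change; only the relative lookup depends on how the logical machine was declared.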
- Research Organization: Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States)
- Sponsoring Organization: USDOE Office of Energy Research (ER) (US)
- DOE Contract Number: AC05-84ER40150
- OSTI ID: 808810
- Report Number(s): JLAB-THY-03-24; DOE/ER/40150-2454; TRN: US200306%%698
- Resource Relation: Other Information: PBD: 1 Mar 2003
- Country of Publication: United States
- Language: English