U.S. Department of Energy
Office of Scientific and Technical Information

Message Passing for Linux Clusters with Gigabit Ethernet Mesh Connections

Conference

Multiple copper-based commodity Gigabit Ethernet (GigE) adapters on a single host make it possible to build Linux clusters with mesh/torus connections without expensive switches or high-speed network interface cards (NICs). However, traditional message passing systems based on TCP for GigE do not perform well on this type of cluster because of the overhead TCP incurs across multiple GigE links. In this paper, we present two OS-bypass message passing systems based on a modified M-VIA (an implementation of the VIA specification) for two production GigE mesh clusters: one is constructed as a 4 x 8 x 8 (256-node) torus and has been in production use for a year; the other is constructed as a 6 x 8 x 8 (384-node) torus and was deployed recently. One of the message passing systems, called QMP, targets a specific application domain; the other is an implementation of the MPI 1.1 specification. The GigE mesh clusters using these two message passing systems achieve about 18.5 µs half round-trip latency and 400 MB/s total bandwidth, which compares reasonably well with systems that use specialized high-speed adapters in a switched architecture, at much lower cost.
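
As an illustration of the torus layouts named in the abstract, the following minimal Python sketch computes the node count and the nearest neighbors of a node in a 3D torus. It is not taken from the paper: the row-major rank numbering, the function names, and the assumption that each node links directly to its two nearest neighbors along each of the three axes (six links in total) are all illustrative choices.

```python
from typing import List, Tuple

Dims = Tuple[int, int, int]


def node_count(dims: Dims) -> int:
    """Total number of nodes in a torus with the given dimensions."""
    return dims[0] * dims[1] * dims[2]


def rank_to_coord(rank: int, dims: Dims) -> Tuple[int, int, int]:
    """Map a linear rank to (x, y, z); row-major order is an assumed convention."""
    _, ny, nz = dims
    if not 0 <= rank < node_count(dims):
        raise ValueError("rank out of range")
    return (rank // (ny * nz), (rank // nz) % ny, rank % nz)


def coord_to_rank(coord: Tuple[int, int, int], dims: Dims) -> int:
    """Inverse of rank_to_coord under the same assumed row-major convention."""
    x, y, z = coord
    _, ny, nz = dims
    return (x * ny + y) * nz + z


def torus_neighbors(rank: int, dims: Dims) -> List[int]:
    """Ranks of the neighbors one hop away in the -/+ direction of each axis.

    The modulo gives the wraparound that distinguishes a torus from a plain mesh.
    """
    coord = rank_to_coord(rank, dims)
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            shifted = list(coord)
            shifted[axis] = (shifted[axis] + step) % dims[axis]
            neighbors.append(coord_to_rank(tuple(shifted), dims))
    return neighbors


if __name__ == "__main__":
    for dims in ((4, 8, 8), (6, 8, 8)):
        print(dims, node_count(dims), torus_neighbors(0, dims))
```

Under these assumptions, every node in either cluster has six distinct neighbors, so no switch is needed to reach them; traffic to more distant nodes would have to be forwarded hop by hop.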

Research Organization:
Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States)
Sponsoring Organization:
USDOE; USDOE Office of Energy Research (ER) (US)
DOE Contract Number:
AC05-84ER40150
OSTI ID:
840531
Report Number(s):
JLAB-CIO-05-01; DOE/ER/40150-3423; TRN: US200512%%73
Resource Relation:
Conference: The Workshop on Communication Architecture for Clusters of IPDPS 05, Denver, CO (US), 04/04/2005; Other Information: PBD: 1 Apr 2005
Country of Publication:
United States
Language:
English