Certification of Completion of Level-2 Milestone 464: Complete Phase 1 Integration of Site-Wide Global Parallel File System (SWGPFS)
Substantial development of the Lustre parallel filesystem preceded the configuration described below for this milestone. The initial Lustre filesystems deployed at LLNL were connected directly to the cluster interconnect, i.e., Quadrics Elan3: the clients, the Object Storage Servers (OSSes), and the Metadata Server (MDS) were all attached to the cluster's internal high-speed interconnect. This configuration serves a single cluster very well, but it does not allow the filesystem to be shared among clusters. LLNL therefore funded the development of high-efficiency "portals router" code by CFS (the company that develops Lustre) so that the Lustre servers could be moved to a GigE-connected network, making them reachable from several clusters.

With portals routing available, the configuration changes as follows: (1) a storage-only cluster is deployed to front the Lustre storage devices (its nodes become the Lustre OSSes and MDS); (2) this "Lustre cluster" is attached via GigE links to a large GigE switch/router cloud; (3) a small number of compute-cluster nodes are designated as "gateway" or "portals router" nodes; and (4) these router nodes are GigE-connected to the same switch/router cloud. The Lustre configuration is then updated to reflect the new network paths.

A typical example is a compute cluster paired with a related visualization cluster: the compute cluster produces the data (writes it to the Lustre filesystem), and the visualization cluster consumes some of the data (reads it from the Lustre filesystem). The concept scales further by aggregating several collections of Lustre back-end storage into one or more "centralized" Lustre filesystems and arranging for several "client" clusters to mount them. The client clusters can be any combination of compute, visualization, archiving, or other cluster types.
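The routed topology in steps (1)-(4) can be sketched with the LNET module options that later Lustre releases expose for this purpose. This is a minimal illustration only: the network names, node IDs, and interface names below are assumptions for the sketch, not the milestone's actual configuration.

```
# /etc/modprobe.conf fragments -- illustrative sketch, not the real site config.

# Compute-cluster client node: lives on the Elan network only; reaches the
# GigE (tcp0) network through the cluster's designated router nodes,
# assumed here to be Elan node IDs 2 and 4.
options lnet networks="elan0" routes="tcp0 [2,4]@elan0"

# Gateway ("portals router") node: a member of both networks, with
# forwarding enabled so it relays traffic between them.
options lnet networks="elan0,tcp0(eth1)" forwarding="enabled"

# Lustre-cluster server (OSS or MDS): GigE only; reaches Elan-side clients
# back through the routers' assumed tcp0 addresses.
options lnet networks="tcp0(eth0)" routes="elan0 192.168.1.[2,4]@tcp0"
```

Each `routes` entry names a remote network and the gateway NIDs on the local network through which it is reachable, which is how the compute, gateway, and storage nodes each see a consistent path across the switch/router cloud.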
This milestone demonstrates the operation and performance of a scaled-down version of such a large, centralized, shared Lustre filesystem concept.
- Research Organization:
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- W-7405-ENG-48
- OSTI ID:
- 898470
- Report Number(s):
- UCRL-TR-218537; TRN: US200708%%108
- Country of Publication:
- United States
- Language:
- English