I/O Router Placement and Fine-Grained Routing on Titan to Support Spider II
- ORNL
The Oak Ridge Leadership Computing Facility (OLCF) introduced the concept of fine-grained routing in 2008 to improve I/O performance between the Jaguar supercomputer and Spider, OLCF's center-wide Lustre file system. Fine-grained routing organizes I/O paths to minimize congestion. Jaguar has since been upgraded to Titan, providing more than a tenfold improvement in peak performance. To support the center's increased computational capacity and I/O demand, the Spider file system has been replaced with Spider II. Building on the lessons learned from Spider, an improved method for placing LNET routers was developed and implemented for Spider II. The fine-grained routing scripts and configuration have been updated to provide additional optimizations and better match the system setup. This paper presents a brief history of fine-grained routing at OLCF, an introduction to the architectures of Titan and Spider II, methods for placing routers in Titan, and details about the fine-grained routing configuration.
- Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Organization: USDOE Office of Science (SC)
- DOE Contract Number: DE-AC05-00OR22725
- OSTI ID: 1131532
- Resource Relation: Conference: Cray User Group 2014, Lugano, Switzerland, 2014-05-05 to 2014-05-08
- Country of Publication: United States
- Language: English