
Title: I/O Router Placement and Fine-Grained Routing on Titan to Support Spider II

The Oak Ridge Leadership Computing Facility (OLCF) introduced the concept of Fine-Grained Routing in 2008 to improve I/O performance between the Jaguar supercomputer and Spider, OLCF's center-wide Lustre file system. Fine-grained routing organizes I/O paths to minimize congestion. Jaguar has since been upgraded to Titan, providing more than a ten-fold improvement in peak performance. To support the center's increased computational capacity and I/O demand, the Spider file system has been replaced with Spider II. Building on the lessons learned from Spider, an improved method for placing LNET routers was developed and implemented for Spider II. The fine-grained routing scripts and configuration have been updated to provide additional optimizations and better match the system setup. This paper presents a brief history of fine-grained routing at OLCF, an introduction to the architectures of Titan and Spider II, methods for placing routers in Titan, and details about the fine-grained routing configuration.
 [1] ;  [2] ;  [1] ;  [1] ;  [1] ;  [1] ;  [1] ;  [1]
  1. ORNL
  2. None
Resource Relation:
Conference: Cray User Group 2014, Lugano, Switzerland, 2014-05-05 to 2014-05-08
Research Org:
Oak Ridge National Laboratory (ORNL); Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org:
USDOE Office of Science (SC)
Country of Publication:
United States