The Spider Center Wide File System; From Concept to Reality
Abstract
The Leadership Computing Facility (LCF) at Oak Ridge National Laboratory (ORNL) has a diverse portfolio of computational resources ranging from a petascale XT4/XT5 simulation system (Jaguar) to numerous other systems supporting development, visualization, and data analytics. To support the vastly different I/O needs of these systems, Spider, a Lustre-based center-wide file system, was designed and deployed to provide over 240 GB/s of aggregate throughput with over 10 petabytes of formatted capacity. A multi-stage InfiniBand network, dubbed the Scalable I/O Network (SION), with over 889 GB/s of bisectional bandwidth was deployed as part of Spider to provide connectivity to our simulation, development, visualization, and other platforms. To our knowledge, at the time of writing, Spider is the largest and fastest POSIX-compliant parallel file system in production. This paper details the overall architecture of the Spider system, the challenges in deploying and initially testing a file system of this scale, and novel solutions to these challenges, which offer key insights into future file system design.
- Authors:
- Shipman, Galen M; Dillow, David A; Oral, H Sarp; Wang, Feiyi
- ORNL
- Publication Date:
- 2009
- Research Org.:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1016038
- DOE Contract Number:
- DE-AC05-00OR22725
- Resource Type:
- Conference
- Resource Relation:
- Conference: CUG 2009, Atlanta, GA, USA, 20090404, 20090404
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; COMPUTER ARCHITECTURE; CAPACITY; COMPUTER CODES; DESIGN; ORNL; PROGRAMMING; PRODUCTION; SIMULATION
Citation Formats
Shipman, Galen M, Dillow, David A, Oral, H Sarp, and Wang, Feiyi. The Spider Center Wide File System; From Concept to Reality. United States: N. p., 2009. Web.
Shipman, Galen M, Dillow, David A, Oral, H Sarp, & Wang, Feiyi. The Spider Center Wide File System; From Concept to Reality. United States.
Shipman, Galen M, Dillow, David A, Oral, H Sarp, and Wang, Feiyi. 2009. "The Spider Center Wide File System; From Concept to Reality". United States.
@article{osti_1016038,
title = {The Spider Center Wide File System; From Concept to Reality},
author = {Shipman, Galen M and Dillow, David A and Oral, H Sarp and Wang, Feiyi},
abstractNote = {The Leadership Computing Facility (LCF) at Oak Ridge National Laboratory (ORNL) has a diverse portfolio of computational resources ranging from a petascale XT4/XT5 simulation system (Jaguar) to numerous other systems supporting development, visualization, and data analytics. To support the vastly different I/O needs of these systems, Spider, a Lustre-based center-wide file system, was designed and deployed to provide over 240 GB/s of aggregate throughput with over 10 petabytes of formatted capacity. A multi-stage InfiniBand network, dubbed the Scalable I/O Network (SION), with over 889 GB/s of bisectional bandwidth was deployed as part of Spider to provide connectivity to our simulation, development, visualization, and other platforms. To our knowledge, at the time of writing, Spider is the largest and fastest POSIX-compliant parallel file system in production. This paper details the overall architecture of the Spider system, the challenges in deploying and initially testing a file system of this scale, and novel solutions to these challenges, which offer key insights into future file system design.},
doi = {},
url = {https://www.osti.gov/biblio/1016038},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2009},
month = {1}
}