DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Ad Hoc File Systems for High-Performance Computing

Authors:
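Brinkmann, André [1]; Mohror, Kathryn [2]; Yu, Weikuan [3]; Carns, Philip [4]; Cortes, Toni [5]; Klasky, Scott A. [6]; Miranda, Alberto [7]; Pfreundt, Franz-Josef [8]; Ross, Robert B. [4]; Vef, Marc-André [1]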
  1. Johannes Gutenberg Univ., Mainz (Germany)
  2. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  3. Florida State Univ., Tallahassee, FL (United States)
  4. Argonne National Lab. (ANL), Argonne, IL (United States)
  5. Univ. Politecnica de Catalunya, Barcelona (Spain)
  6. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  7. Barcelona Supercomputing Center, Barcelona (Spain)
  8. Fraunhofer Inst. for Industrial Mathematics ITWM, Kaiserslautern (Germany)
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States); Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); German Research Foundation (DFG); European Union (EU); Spanish Ministry of Science and Innovation (MICINN); National Science Foundation (NSF); USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1596689
Alternate Identifier(s):
OSTI ID: 1606092
Report Number(s):
LLNL-JRNL-779789
Journal ID: ISSN 1000-9000; 155300
Grant/Contract Number:  
AC02-06CH11357; 1561041; 1564647; 1744336; 1763547; 1822737; 2014-SGR-1051; TIN2015-65316; 671591; AC52-07NA27344
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Computer Science and Technology
Additional Journal Information:
Journal Volume: 35; Journal Issue: 1; Journal ID: ISSN 1000-9000
Publisher:
Springer Nature
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Burst Buffers; Distributed File Systems; High-Performance Computing; POSIX; Parallel Architectures

Citation Formats

Brinkmann, André, Mohror, Kathryn, Yu, Weikuan, Carns, Philip, Cortes, Toni, Klasky, Scott A., Miranda, Alberto, Pfreundt, Franz-Josef, Ross, Robert B., and Vef, Marc-André. Ad Hoc File Systems for High-Performance Computing. United States: N. p., 2020. Web. doi:10.1007/s11390-020-9801-1.
Brinkmann, André, Mohror, Kathryn, Yu, Weikuan, Carns, Philip, Cortes, Toni, Klasky, Scott A., Miranda, Alberto, Pfreundt, Franz-Josef, Ross, Robert B., & Vef, Marc-André. Ad Hoc File Systems for High-Performance Computing. United States. doi:10.1007/s11390-020-9801-1.
Brinkmann, André, Mohror, Kathryn, Yu, Weikuan, Carns, Philip, Cortes, Toni, Klasky, Scott A., Miranda, Alberto, Pfreundt, Franz-Josef, Ross, Robert B., and Vef, Marc-André. 2020. "Ad Hoc File Systems for High-Performance Computing". United States. doi:10.1007/s11390-020-9801-1. https://www.osti.gov/servlets/purl/1596689.
@article{osti_1596689,
title = {Ad Hoc File Systems for High-Performance Computing},
author = {Brinkmann, André and Mohror, Kathryn and Yu, Weikuan and Carns, Philip and Cortes, Toni and Klasky, Scott A. and Miranda, Alberto and Pfreundt, Franz-Josef and Ross, Robert B. and Vef, Marc-André},
abstractNote = {},
doi = {10.1007/s11390-020-9801-1},
journal = {Journal of Computer Science and Technology},
number = 1,
volume = 35,
place = {United States},
year = {2020},
month = {1}
}

Works referenced in this record:

Characterizing output bottlenecks in a supercomputer
conference, November 2012

  • Xie, Bing; Chase, Jeffrey; Dillow, David
  • 2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
  • DOI: 10.1109/SC.2012.28

TRIO: Burst Buffer Based I/O Orchestration
conference, September 2015

  • Wang, Teng; Oral, Sarp; Pritchard, Michael
  • 2015 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2015.38

Scaling Embedded In-Situ Indexing with DeltaFS
conference, November 2018

  • Zheng, Qing; Cranor, Charles D.; Guo, Danhao
  • SC18: International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2018.00006

An introduction to disk drive modeling
journal, March 1994


‘Big data’, Hadoop and cloud computing in genomics
journal, October 2013

  • O’Driscoll, Aisling; Daugelaite, Jurate; Sleator, Roy D.
  • Journal of Biomedical Informatics, Vol. 46, Issue 5
  • DOI: 10.1016/j.jbi.2013.07.001

Flexible IO and integration for scientific codes through the adaptable IO system (ADIOS)
conference, January 2008

  • Lofstead, Jay F.; Klasky, Scott; Schwan, Karsten
  • Proceedings of the 6th international workshop on Challenges of large applications in distributed environments - CLADE '08
  • DOI: 10.1145/1383529.1383533

The IBM Blue Gene/Q interconnection network and message unit
conference, January 2011

  • Chen, Dong; Parker, Jeffrey J.; Eisley, Noel A.
  • Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC '11)
  • DOI: 10.1145/2063384.2063419

Entropy-Aware I/O Pipelining for Large-Scale Deep Learning on HPC Systems
conference, September 2018

  • Zhu, Yue; Chowdhury, Fahim; Fu, Huansong
  • 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS)
  • DOI: 10.1109/MASCOTS.2018.00023

LPCC: hierarchical persistent client caching for lustre
conference, November 2019

  • Qian, Yingjin; Li, Xi; Ihara, Shuichi
  • SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1145/3295500.3356139

Task-based programming in COMPSs to converge from HPC to big data
journal, April 2017

  • Conejero, Javier; Corella, Sandra; Badia, Rosa M.
  • The International Journal of High Performance Computing Applications, Vol. 32, Issue 1
  • DOI: 10.1177/1094342017701278

On the role of burst buffers in leadership-class storage systems
conference, April 2012

  • Liu, Ning; Cope, Jason; Carns, Philip
  • 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST)
  • DOI: 10.1109/MSST.2012.6232369

FusionFS: Toward supporting data-intensive scientific applications on extreme-scale high-performance computing systems
conference, October 2014


MCREngine: A scalable checkpointing system using data-aware aggregation and compression
conference, November 2012

  • Islam, Tanzima Zerin; Mohror, Kathryn; Bagchi, Saurabh
  • 2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
  • DOI: 10.1109/SC.2012.77

Direct lookup and hash-based metadata placement for local file systems
conference, January 2013

  • Lensing, Paul Hermann; Cortes, Toni; Brinkmann, André
  • Proceedings of the 6th International Systems and Storage Conference (SYSTOR '13)
  • DOI: 10.1145/2485732.2485741

An overview of the HDF5 technology suite and its applications
conference, January 2011

  • Folk, Mike; Heber, Gerd; Koziol, Quincey
  • Proceedings of the EDBT/ICDT 2011 Workshop on Array Databases - AD '11
  • DOI: 10.1145/1966895.1966900

Cray Cascade: A scalable HPC system based on a Dragonfly network
conference, November 2012

  • Faanes, Greg; Bataineh, Abdulla; Roweth, Duncan
  • 2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
  • DOI: 10.1109/SC.2012.39

Efficient Data-Movement for Lightweight I/O
conference, September 2006

  • Oldfield, Ron; Widener, Patrick; Maccabe, Arthur
  • 2006 IEEE International Conference on Cluster Computing
  • DOI: 10.1109/CLUSTR.2006.311897

Harmonia: An Interference-Aware Dynamic I/O Scheduler for Shared Non-volatile Burst Buffers
conference, September 2018

  • Kougkas, Anthony; Devarajan, Hariharan; Sun, Xian-He
  • 2018 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2018.00046

Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System
conference, November 2010

  • Moody, Adam; Bronevetsky, Greg; Mohror, Kathryn
  • 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
  • DOI: 10.1109/SC.2010.18

Methodology for the Rapid Development of Scalable HPC Data Services
conference, November 2018

  • Dorier, Matthieu; Settlemyer, Brad; Shipman, Galen
  • 2018 IEEE/ACM 3rd International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS)
  • DOI: 10.1109/PDSW-DISCS.2018.00013

Data Elevator: Low-Contention Data Movement in Hierarchical Storage System
conference, December 2016

  • Dong, Bin; Byna, Suren; Wu, Kesheng
  • 2016 IEEE 23rd International Conference on High Performance Computing (HiPC)
  • DOI: 10.1109/HiPC.2016.026

Stacker: An Autonomic Data Movement Engine for Extreme-Scale Data Staging-Based In-Situ Workflows
conference, November 2018

  • Subedi, Pradeep; Davis, Philip; Duan, Shaohua
  • SC18: International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2018.00076

ROOT — A C++ framework for petabyte data storage, statistical analysis and visualization
journal, June 2011

  • Antcheva, I.; Ballintijn, M.; Bellenot, B.
  • Computer Physics Communications, Vol. 182, Issue 6
  • DOI: 10.1016/j.cpc.2011.02.008

A Brief Introduction to the OpenFabrics Interfaces - A New Network API for Maximizing High Performance Application Efficiency
conference, August 2015

  • Grun, Paul; Hefty, Sean; Sur, Sayantan
  • 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects (HOTI)
  • DOI: 10.1109/HOTI.2015.19

Deduplication Potential of HPC Applications’ Checkpoints
conference, September 2016

  • Kaiser, Jürgen; Gad, Ramy; Süß, Tim
  • 2016 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2016.32

UCX: An Open Source Framework for HPC Network APIs and Beyond
conference, August 2015

  • Shamis, Pavel; Venkata, Manjunath Gorentla; Lopez, M. Graham
  • 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects (HOTI)
  • DOI: 10.1109/HOTI.2015.13

GekkoFS - A Temporary Distributed File System for HPC Applications
conference, September 2018

  • Vef, Marc-André; Moti, Nafiseh; Süß, Tim
  • 2018 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2018.00049

High-Performance Design of YARN MapReduce on Modern HPC Clusters with Lustre and RDMA
conference, May 2015

  • Wasi-ur-Rahman, Md.; Lu, Xiaoyi; Islam, Nusrat Sharmin
  • 2015 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
  • DOI: 10.1109/IPDPS.2015.83

Improving Collective I/O Performance Using Non-volatile Memory Devices
conference, September 2016

  • Congiu, Giuseppe; Narasimhamurthy, Sai; Süß, Tim
  • 2016 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2016.37

On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems
conference, May 2016

  • Yildiz, Orcun; Dorier, Matthieu; Ibrahim, Shadi
  • 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
  • DOI: 10.1109/IPDPS.2016.50

NetCDF: an interface for scientific data access
journal, July 1990

  • Rew, R.; Davis, G.
  • IEEE Computer Graphics and Applications, Vol. 10, Issue 4
  • DOI: 10.1109/38.56302

Optimizing a hybrid SSD/HDD HPC storage system based on file size distributions
conference, May 2013

  • Welch, Brent; Noer, Geoffrey
  • 2013 IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST)
  • DOI: 10.1109/MSST.2013.6558449

Apache Spark: a unified engine for big data processing
journal, October 2016

  • Zaharia, Matei; Franklin, Michael J.; Ghodsi, Ali
  • Communications of the ACM, Vol. 59, Issue 11
  • DOI: 10.1145/2934664

Search and clustering orders of magnitude faster than BLAST
journal, August 2010


A configurable rule based classful token bucket filter network request scheduler for the lustre file system
conference, January 2017

  • Qian, Yingjin; Li, Xi; Ihara, Shuichi
  • Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '17)
  • DOI: 10.1145/3126908.3126932

The Hadoop Distributed File System
conference, May 2010

  • Shvachko, Konstantin; Kuang, Hairong; Radia, Sanjay
  • 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)
  • DOI: 10.1109/MSST.2010.5496972

Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web
conference, January 1997

  • Karger, David; Lehman, Eric; Leighton, Tom
  • Proceedings of the twenty-ninth annual ACM symposium on Theory of computing - STOC '97
  • DOI: 10.1145/258533.258660

Scientific computing meets big data technology: An astronomy use case
conference, October 2015

  • Zhang, Zhao; Barbary, Kyle; Nothaft, Frank Austin
  • 2015 IEEE International Conference on Big Data (Big Data)
  • DOI: 10.1109/BigData.2015.7363840

Challenges and Solutions for Tracing Storage Systems: A Case Study with Spectrum Scale
journal, April 2018

  • Vef, Marc-André; Tarasov, Vasily; Hildebrand, Dean
  • ACM Transactions on Storage, Vol. 14, Issue 2
  • DOI: 10.1145/3149376

Understanding and Improving Computational Science Storage Access through Continuous Characterization
journal, October 2011

  • Carns, Philip; Harms, Kevin; Allcock, William
  • ACM Transactions on Storage, Vol. 7, Issue 3, p. 1-26
  • DOI: 10.1145/2027066.2027068

PLFS: a checkpoint filesystem for parallel applications
conference, January 2009


An Overview of the Atmospheric Component of the Energy Exascale Earth System Model
journal, August 2019

  • Rasch, P. J.; Xie, S.; Ma, P.-L.
  • Journal of Advances in Modeling Earth Systems, Vol. 11, Issue 8
  • DOI: 10.1029/2019MS001629

On the Quality of Wall Time Estimates for Resource Allocation Prediction
conference, January 2019

  • Soysal, Mehmet; Berghoff, Marco; Klusáček, Dalibor
  • Proceedings of the 48th International Conference on Parallel Processing: Workshops - ICPP 2019
  • DOI: 10.1145/3339186.3339204

Mercury: Enabling remote procedure call for high-performance computing
conference, September 2013

  • Soumagne, Jerome; Kimpe, Dries; Zounmevo, Judicael
  • 2013 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2013.6702617

Qthreads: An API for programming with millions of lightweight threads
conference, April 2008

  • Wheeler, Kyle B.; Murphy, Richard C.; Thain, Douglas
  • 2008 IEEE International Symposium on Parallel and Distributed Processing (IPDPS)
  • DOI: 10.1109/IPDPS.2008.4536359

Managing Variability in the IO Performance of Petascale Storage Systems
conference, November 2010

  • Lofstead, Jay; Zheng, Fang; Liu, Qing
  • 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
  • DOI: 10.1109/SC.2010.32

Snakemake—a scalable bioinformatics workflow engine
journal, May 2018


Managing I/O Interference in a Shared Burst Buffer System
conference, August 2016

  • Thapaliya, Sagar; Bangalore, Purushotham; Lofstead, Jay
  • 2016 45th International Conference on Parallel Processing (ICPP)
  • DOI: 10.1109/ICPP.2016.54

SSD Failures in Datacenters: What? When? and Why?
conference, January 2016

  • Narayanan, Iyswarya; Vaid, Kushagra; Wang, Di
  • Proceedings of the 9th ACM International Systems and Storage Conference (SYSTOR '16)
  • DOI: 10.1145/2928275.2928278

Argobots: A Lightweight Low-Level Threading and Tasking Framework
journal, March 2018

  • Seo, Sangmin; Amer, Abdelhalim; Balaji, Pavan
  • IEEE Transactions on Parallel and Distributed Systems, Vol. 29, Issue 3
  • DOI: 10.1109/TPDS.2017.2766062

Exascale Deep Learning for Climate Analytics
conference, November 2018

  • Kurth, Thorsten; Treichler, Sean; Romero, Joshua
  • SC18: International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2018.00054

Poster: Portals 4 Network Programming Interface
conference, November 2012

  • Barrett, Brian; Brightwell, Ron; Underwood, Keith
  • 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC)
  • DOI: 10.1109/SC.Companion.2012.264

NORNS: Extending Slurm to Support Data-Driven Workflows through Asynchronous Data Staging
conference, September 2019

  • Miranda, Alberto; Jackson, Adrian; Tocci, Tommaso
  • 2019 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2019.8891014

A Large-Scale Study of Flash Memory Failures in the Field
journal, June 2015

  • Meza, Justin; Wu, Qiang; Kumar, Sanjeev
  • ACM SIGMETRICS Performance Evaluation Review, Vol. 43, Issue 1
  • DOI: 10.1145/2796314.2745848

High Performance RDMA-Based MPI Implementation over InfiniBand
journal, June 2004


A 1 PB/s file system to checkpoint three million MPI tasks
conference, January 2013

  • Rajachandrasekar, Raghunath; Moody, Adam; Mohror, Kathryn
  • Proceedings of the 22nd international symposium on High-performance parallel and distributed computing - HPDC '13
  • DOI: 10.1145/2493123.2462908

Parallel netCDF: A High-Performance Scientific I/O Interface
conference, January 2003

  • Li, Jianwei; Zingale, Michael; Liao, Wei-keng
  • Proceedings of the 2003 ACM/IEEE conference on Supercomputing - SC '03
  • DOI: 10.1145/1048935.1050189

Performance and extension of user space file systems
conference, January 2010

  • Rajgarhia, Aditya; Gehani, Ashish
  • Proceedings of the 2010 ACM Symposium on Applied Computing - SAC '10
  • DOI: 10.1145/1774088.1774130

An Ephemeral Burst-Buffer File System for Scientific Applications
conference, November 2016

  • Wang, Teng; Mohror, Kathryn; Moody, Adam
  • SC16: International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2016.68

File System Scalability with Highly Decentralized Metadata on Independent Storage Devices
conference, May 2016

  • Lensing, Paul Hermann; Cortes, Toni; Hughes, Jim
  • 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)
  • DOI: 10.1109/CCGrid.2016.28

A 1 PB/s file system to checkpoint three million MPI tasks
conference, October 2018

  • Rajachandrasekar, Raghunath; Moody, Adam; Mohror, Kathryn
  • Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC '13)
  • DOI: 10.1145/2462902.2462908