Data Management in the Continuum: Cross-facility Object-based Data Transfers
Scientific workflows are evolving from relying on a monolithic storage subsystem at a single High-Performance Computing (HPC) facility to using geographically distributed file systems, repositories, and cloud storage. As a result, storing, accessing, transferring, and managing scientific data have become highly complex and prone to performance inefficiencies. This paper addresses these challenges by exploring an optimized end-to-end interface that connects local and remote storage systems and enables efficient movement of data objects across HPC–Cloud and HPC–HPC environments. We demonstrate this capability through an object-focused data management runtime system, discuss the effects of relaxed consistency semantics in distributed object scenarios, and illustrate its application in an earthquake simulation workflow. In addition to reducing data volume by selectively transferring regions of interest, the new interface in PDC-XF achieved facility-local speedups of 45× over optimized HDF5 usage and 15× over HDF5 with caching.
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- US Department of Energy; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-ASCR)
- DOE Contract Number:
- AC02-05CH11231
- OSTI ID:
- 3024529
- Resource Type:
- Conference paper
- Conference Information:
- 2025 IEEE/SBC 37th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
- Country of Publication:
- United States
- Language:
- English
Similar Records
Interfacing HDF5 with a scalable object‐centric storage system on hierarchical storage
Journal Article · Sun Mar 08 20:00:00 EDT 2020 · Concurrency and Computation. Practice and Experience · OSTI ID:1603709
Proactive Data Containers for Scientific Storage (Final Report)
Technical Report · Mon Dec 09 23:00:00 EST 2019 · OSTI ID:1577855
Programming Abstractions for Managing Workflows on Tiered Storage Systems
Journal Article · Sun Oct 24 20:00:00 EDT 2021 · ACM Transactions on Storage · OSTI ID:1898543