Enforcing End-to-end I/O Policies for Scientific Workflows using Software-Defined Storage Resource Enclaves
- Washington State Univ., Vancouver, WA (United States)
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Data-intensive knowledge discovery requires scientific applications to run concurrently with analytics and visualization codes executing in situ for timely output inspection and knowledge extraction. Consequently, I/O pipelines of scientific workflows can be long and complex because they comprise many stages of analytics across different layers of the I/O stack of high-performance computing systems. A performance limitation at any I/O layer or stage can create an I/O bottleneck, resulting in greater than expected end-to-end I/O latency. In this paper, we present the design and implementation of a novel system-level data management infrastructure called Software-Defined Storage Resource Enclaves (SIREN), which enforces end-to-end policies that dictate an I/O pipeline's performance. SIREN provides an I/O performance interface for users to specify the desired storage resources in the context of in-situ analytics. If suboptimal performance of analytics is caused by an I/O bottleneck when data are transferred between simulations and analytics, schedulers in different layers of the I/O stack automatically enforce the guaranteed lower bounds on I/O throughput. Our experimental results demonstrate that SIREN provides performance isolation among scientific workflows sharing multiple storage servers across two I/O layers (burst buffer and parallel file systems) while maintaining high system scalability and resource utilization.
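The idea of a guaranteed lower bound on I/O throughput can be illustrated with a minimal sketch. This is not SIREN's scheduler or interface, which the abstract does not detail; the function names `allocate_bandwidth` and `end_to_end_throughput`, the workflow names, and the proportional sharing of spare capacity are all assumptions made for illustration. Each workflow first receives its reserved minimum on a shared storage server, and any remaining capacity is then split in proportion to unmet demand. The end-to-end rate of a pipeline is taken as the minimum across layers, since the slowest layer is the bottleneck.

```python
def allocate_bandwidth(capacity, min_bounds, demands):
    """Hypothetical per-server allocator (MB/s), not SIREN's actual algorithm.

    Every workflow gets at least min(lower bound, demand); spare capacity
    is shared in proportion to each workflow's remaining unmet demand.
    """
    # Reserve the guaranteed minimum, capped at what the workflow asks for.
    reserved = {w: min(min_bounds[w], demands[w]) for w in min_bounds}
    total_reserved = sum(reserved.values())
    if total_reserved > capacity:
        raise ValueError("lower bounds exceed server capacity")

    spare = capacity - total_reserved
    unmet = {w: demands[w] - reserved[w] for w in reserved}
    total_unmet = sum(unmet.values())

    shares = dict(reserved)
    if total_unmet > 0:
        # Never hand out more than the spare capacity or more than is wanted.
        scale = min(1.0, spare / total_unmet)
        for w in shares:
            shares[w] += unmet[w] * scale
    return shares


def end_to_end_throughput(per_layer_shares, workflow):
    """The slowest layer bounds the rate of the whole I/O pipeline."""
    return min(layer[workflow] for layer in per_layer_shares)


if __name__ == "__main__":
    # Assumed example values: two workflows sharing two I/O layers.
    bounds = {"sim": 400, "viz": 200}
    demands = {"sim": 900, "viz": 500}
    burst_buffer = allocate_bandwidth(1000, bounds, demands)
    parallel_fs = allocate_bandwidth(800, bounds, demands)
    print(burst_buffer, parallel_fs)
    print(end_to_end_throughput([burst_buffer, parallel_fs], "sim"))
```

In this sketch a workflow can never fall below its reserved minimum regardless of how much the other workflows request, which is the isolation property the abstract describes; a real system would also need admission control and enforcement at each storage server.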
- Research Organization:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- Grant/Contract Number:
- 89233218CNA000001
- OSTI ID:
- 1482940
- Report Number(s):
- LA-UR-18-22116
- Journal Information:
- IEEE Transactions on Multi-Scale Computing Systems, Vol. 4, Issue 4; ISSN 2372-207X
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English