Title: RISE: Reducing I/O Contention in Staging-based Extreme-Scale In-situ Workflows

Journal Article · Proceedings - IEEE International Conference on Cluster Computing (Online)

While in-situ workflow formulations have addressed some of the data-related challenges associated with extreme-scale scientific workflows, these workflows involve complex interactions and different modes of data exchange. In the context of increasing system complexity, such workflows present significant resource management challenges, requiring complex cost-performance tradeoffs. This paper presents RISE, an intelligent staging-based data management middleware, which builds on the DataSpaces framework and performs intelligent scheduling of data management operations to reduce I/O contention. In RISE, data are always written immediately to local buffers to reduce the impact of data transfers on application performance. RISE identifies applications’ data access patterns and moves data towards data consumers only when the network is expected to be idle, reducing the impact of asynchronous background data movement on critical data read/write requests. Here, we experimentally demonstrate that RISE can take advantage of staging nodes to offload data during writes without degrading application data movement performance.
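
The two behaviors described in the abstract (writes land in a local buffer first, and background movement towards consumers happens only in windows expected to be idle) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not RISE's implementation and not the DataSpaces API: the class `StagingSketch`, its methods `write` and `tick`, and the `max_moves` parameter are hypothetical names introduced here. The abstract does not say how idle periods are predicted, so the prediction is simply passed in as a boolean.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StagingSketch:
    """Toy model (hypothetical): local buffering plus idle-time data movement."""

    # Data written by the application but not yet moved anywhere.
    local_buffer: deque = field(default_factory=deque)
    # Data that has been moved towards its consumers.
    staged: list = field(default_factory=list)

    def write(self, name, payload):
        # A write only touches the local buffer, so it never waits on the network.
        self.local_buffer.append((name, payload))

    def tick(self, network_expected_idle, max_moves=1):
        # Background movement is skipped unless the network is predicted idle.
        # The predictor itself is out of scope; its result is an input here.
        if not network_expected_idle:
            return 0
        moved = 0
        while self.local_buffer and moved < max_moves:
            # Oldest data first (FIFO order is an assumption of this sketch).
            self.staged.append(self.local_buffer.popleft())
            moved += 1
        return moved


if __name__ == "__main__":
    s = StagingSketch()
    s.write("field_a", b"step0")
    s.write("field_b", b"step0")
    print(s.tick(network_expected_idle=False))              # busy window: nothing moves
    print(s.tick(network_expected_idle=True, max_moves=2))  # idle window: buffer drains
```

In this sketch, data stays in the local buffer during busy windows and drains only when an idle window is signalled, which is the mechanism the abstract credits with keeping background movement away from critical read/write requests.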

Research Organization:
Rutgers Univ., Piscataway, NJ (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR). Scientific Discovery through Advanced Computing (SciDAC)
Grant/Contract Number:
SC0021326; AC05-00OR22725
OSTI ID:
1907691
Journal Information:
Proceedings - IEEE International Conference on Cluster Computing (Online), Vol. 2021; Conference: 2021 IEEE International Conference on Cluster Computing (CLUSTER), Portland, OR (United States), 7-10 Sep 2021; ISSN 2168-9253
Publisher:
IEEE
Country of Publication:
United States
Language:
English
