Scalability of Several Asynchronous Many-Task Models for In Situ Statistical Analysis.
- Sandia National Lab. (SNL-CA), Livermore, CA (United States)
This report is a sequel to [PB16], in which we provided a first progress report on research and development towards a scalable, asynchronous many-task, in situ statistical analysis engine using the Legion runtime system. This earlier work included a prototype implementation of a proposed solution, using a proxy mini-application as a surrogate for a full-scale scientific simulation code. The first scalability studies were conducted with the above on modestly-sized experimental clusters. In contrast, in the current work we have integrated our in situ analysis engines with a full-size scientific application (S3D, using the Legion-SPMD model), and have conducted numerical tests on the largest computational platform currently available for DOE science applications. We also provide details regarding the design and development of a light-weight asynchronous collectives library. We describe how this library is utilized within our SPMD-Legion S3D workflow, and compare the data aggregation technique deployed herein to the approach taken within our previous work.
- Research Organization:
- Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States); Sandia National Laboratories, Livermore, CA
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number:
- AC04-94AL85000
- OSTI ID:
- 1367233
- Report Number(s):
- SAND--2017-5220; 653374
- Country of Publication:
- United States
- Language:
- English
Similar Records
An Asynchronous Many-Task Implementation of In-Situ Statistical Analysis using Legion.
Technical Report · Sun Nov 01 2015 · OSTI ID: 1227237
Asynchronous Checkpoint Migration with MRNet in the Scalable Checkpoint / Restart Library
Conference · Tue Mar 20 2012 · OSTI ID: 1047769
ASC ATDM Level 2 Milestone #6015: Asynchronous Many-Task Software Stack Demonstration
Technical Report · Fri Sep 01 2017 · OSTI ID: 1596197