Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation

Carothers, Christopher D.; Meredith, Jeremy S.; Blanco, Marc; Vetter, Jeffrey S.; Mubarak, Misbah; LaPre, Justin; Moore, Shirley V.

doi:10.1145/3064911.3064923

Conference · May 1, 2017

DOI: https://doi.org/10.1145/3064911.3064923 · OSTI ID: 1423024

Carothers, Christopher D. ^[1]; Meredith, Jeremy S. ^[2]; Blanco, Marc ^[1]; Vetter, Jeffrey S. ^[2]; Mubarak, Misbah ^[3]; LaPre, Justin ^[1]; Moore, Shirley V. ^[2]

^[1] Rensselaer Polytechnic Institute (RPI)
^[2] Oak Ridge National Laboratory (ORNL)
^[3] Argonne National Laboratory

Performance modeling of extreme-scale applications on accurate representations of potential architectures is critical for designing next-generation supercomputing systems, because it is impractical to construct prototype systems at scale with new network hardware in order to explore designs and policies. However, these simulations often rely on static application traces that can be difficult to work with because of their size and their lack of flexibility to extend or scale up without rerunning the original application. To address this problem, we have created a new technique for generating scalable, flexible workloads from real applications, and we have implemented a prototype, called Durango, that combines a proven analytical performance modeling language, Aspen, with the massively parallel HPC network modeling capabilities of the CODES framework.

Our models are compact, parameterized, and representative of real applications with computation events. They are not resource intensive to create and are portable across simulator environments. We demonstrate the utility of Durango by simulating the LULESH application in the CODES simulation environment on several topologies and show that Durango is practical to use for simulation without loss of fidelity, as quantified by simulation metrics. During our validation of Durango's generated communication model of LULESH, we found that the original LULESH miniapp code had a latent bug in which the MPI_Waitall operation was used incorrectly. This finding underscores the potential need for a tool such as Durango, beyond its benefits for flexible workload generation and modeling.

Additionally, we demonstrate the efficacy of Durango's direct integration approach, which links Aspen into CODES as part of the running network simulation model. Here, Aspen generates the application-level computation timing events, which in turn drive the start of a network communication phase. Results show that Durango's performance scales well when executing both torus and dragonfly network models on up to 4K Blue Gene/Q nodes using 32K MPI ranks. Durango also avoids the overheads and complexities associated with extreme-scale trace files.

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)

DOE Contract Number: AC05-00OR22725

OSTI ID: 1423024

Resource Relation: Conference: SIGSIM-PADS '17, Singapore, 5/24/2017-5/26/2017

Country of Publication: United States

Language: English

Similar Records

Fit Fly: A Case Study of Interconnect Innovation through Parallel Simulation

Conference · Jan 1, 2019 · OSTI ID:1423024

McGlohon, Neil; Wolfe, Noah; Mubarak, Misbah; +1 more

Union: An Automatic Workload Manager for Accelerating Network Simulation

Conference · Jan 1, 2020 · OSTI ID:1423024

Wang, Xin; Mubarak, Misbah; Kang, Yao; +2 more

Proxy Applications for Converged Workloads: DMC LDRD Initiative

Technical Report · Sep 26, 2023 · OSTI ID:1423024

Ghosh, Sayan; Jain, Milan; Lee, Hyungro; +1 more
