OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation

Abstract

Performance modeling of extreme-scale applications on accurate representations of potential architectures is critical for designing next-generation supercomputing systems because it is impractical to construct prototype systems at scale with new network hardware in order to explore designs and policies. However, these simulations often rely on static application traces that can be difficult to work with because of their size and their lack of flexibility to extend or scale up without rerunning the original application. To address this problem, we have created a new technique for generating scalable, flexible workloads from real applications. We have implemented a prototype, called Durango, that combines a proven analytical performance modeling language, Aspen, with the massively parallel HPC network modeling capabilities of the CODES framework. Our models are compact, parameterized, and representative of real applications with computation events. They are not resource intensive to create and are portable across simulator environments. We demonstrate the utility of Durango by simulating the LULESH application in the CODES simulation environment on several topologies and show that Durango is practical to use for simulation without loss of fidelity, as quantified by simulation metrics. During our validation of Durango's generated communication model of LULESH, we found that the original LULESH miniapp code had a latent bug in which the MPI_Waitall operation was used incorrectly. This finding underscores the potential need for a tool such as Durango, beyond its benefits for flexible workload generation and modeling. Additionally, we demonstrate the efficacy of Durango's direct integration approach, which links Aspen into CODES as part of the running network simulation model. Here, Aspen generates the application-level computation timing events, which in turn drive the start of a network communication phase. Results show that Durango's performance scales well when executing both torus and dragonfly network models on up to 4K Blue Gene/Q nodes using 32K MPI ranks. Durango also avoids the overheads and complexities associated with extreme-scale trace files.

Authors:
Carothers, Christopher D. [1]; Meredith, Jeremy S. [2]; Blanco, Marc [1]; Vetter, Jeffrey S. [2]; Mubarak, Misbah [3]; LaPre, Justin [1]; Moore, Shirley V. [2]
  1. Rensselaer Polytechnic Institute (RPI)
  2. ORNL
  3. Argonne National Laboratory
Publication Date:
May 2017
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
OSTI Identifier:
1423024
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: SIGSIM-PADS '17 - Singapore, Singapore - 5/24/2017-5/26/2017
Country of Publication:
United States
Language:
English

Citation Formats

Carothers, Christopher D., Meredith, Jeremy S., Blanco, Marc, Vetter, Jeffrey S., Mubarak, Misbah, LaPre, Justin, and Moore, Shirley V. Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation. United States: N. p., 2017. Web. doi:10.1145/3064911.3064923.
Carothers, Christopher D., Meredith, Jeremy S., Blanco, Marc, Vetter, Jeffrey S., Mubarak, Misbah, LaPre, Justin, & Moore, Shirley V. Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation. United States. doi:10.1145/3064911.3064923.
Carothers, Christopher D., Meredith, Jeremy S., Blanco, Marc, Vetter, Jeffrey S., Mubarak, Misbah, LaPre, Justin, and Moore, Shirley V. 2017. "Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation". United States. doi:10.1145/3064911.3064923. https://www.osti.gov/servlets/purl/1423024.
@article{osti_1423024,
title = {Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation},
author = {Carothers, Christopher D. and Meredith, Jeremy S. and Blanco, Marc and Vetter, Jeffrey S. and Mubarak, Misbah and LaPre, Justin and Moore, Shirley V.},
abstractNote = {Performance modeling of extreme-scale applications on accurate representations of potential architectures is critical for designing next-generation supercomputing systems because it is impractical to construct prototype systems at scale with new network hardware in order to explore designs and policies. However, these simulations often rely on static application traces that can be difficult to work with because of their size and their lack of flexibility to extend or scale up without rerunning the original application. To address this problem, we have created a new technique for generating scalable, flexible workloads from real applications. We have implemented a prototype, called Durango, that combines a proven analytical performance modeling language, Aspen, with the massively parallel HPC network modeling capabilities of the CODES framework. Our models are compact, parameterized, and representative of real applications with computation events. They are not resource intensive to create and are portable across simulator environments. We demonstrate the utility of Durango by simulating the LULESH application in the CODES simulation environment on several topologies and show that Durango is practical to use for simulation without loss of fidelity, as quantified by simulation metrics. During our validation of Durango's generated communication model of LULESH, we found that the original LULESH miniapp code had a latent bug in which the MPI_Waitall operation was used incorrectly. This finding underscores the potential need for a tool such as Durango, beyond its benefits for flexible workload generation and modeling. Additionally, we demonstrate the efficacy of Durango's direct integration approach, which links Aspen into CODES as part of the running network simulation model. Here, Aspen generates the application-level computation timing events, which in turn drive the start of a network communication phase. Results show that Durango's performance scales well when executing both torus and dragonfly network models on up to 4K Blue Gene/Q nodes using 32K MPI ranks. Durango also avoids the overheads and complexities associated with extreme-scale trace files.},
doi = {10.1145/3064911.3064923},
place = {United States},
year = {2017},
month = {5}
}
