Simulating Billion-Task Parallel Programs
- ORNL
In simulating large parallel systems, bottom-up approaches exercise detailed hardware models with effects from simplified software models or traces, whereas top-down approaches evaluate the timing and functionality of detailed software models over coarse hardware models. Here, we focus on the top-down approach and significantly advance the scale of the simulated parallel programs. By combining the direct execution technique with parallel discrete event simulation, we stretch the limits of the top-down approach and simulate message passing interface (MPI) programs with millions of tasks. Using a timing-validated benchmark application, we reach a proof-of-concept scale of over 0.22 billion virtual MPI processes on 216,000 cores of a Cray XT5 supercomputer, with a multiplexing ratio of 1024 simulated tasks per real task. This represents one of the largest direct execution simulations to date.
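The scale quoted in the abstract follows from the multiplexing ratio. The short sketch below reproduces that arithmetic. It assumes one real task per core, which the abstract implies but does not state. The helper names and the block mapping from virtual rank to real task are hypothetical illustrations, not the simulator's actual scheme.

```python
# Figures taken from the abstract.
REAL_CORES = 216_000      # cores of the Cray XT5 used for the run
MULTIPLEX_RATIO = 1024    # simulated (virtual) MPI tasks per real task


def total_virtual_tasks(real_tasks: int, ratio: int) -> int:
    """Number of virtual MPI processes hosted by `real_tasks` real tasks."""
    return real_tasks * ratio


def host_of(virtual_rank: int, ratio: int) -> int:
    """Real task hosting a virtual rank under a simple block mapping.

    The block mapping is an assumption made for illustration only.
    """
    return virtual_rank // ratio


# Assumption: one real task runs on each core.
total = total_virtual_tasks(REAL_CORES, MULTIPLEX_RATIO)
print(f"{total:,} virtual MPI processes ({total / 1e9:.2f} billion)")
```

Under these assumptions the product is 221,184,000, consistent with the "over 0.22 billion" figure.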
- Research Organization:
- Oak Ridge National Laboratory (ORNL)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1143564
- Country of Publication:
- United States
- Language:
- English
Similar Records
μπ: A Scalable and Transparent System for Simulating MPI Programs
Discrete Event Execution with One-Sided and Two-Sided GVT Algorithms on 216,000 Processor Cores