Title: Simulating Billion-Task Parallel Programs

In simulating large parallel systems, bottom-up approaches exercise detailed hardware models with effects from simplified software models or traces, whereas top-down approaches evaluate the timing and functionality of detailed software models over coarse hardware models. Here, we focus on the top-down approach and significantly advance the scale of the simulated parallel programs. By combining the direct execution technique with parallel discrete event simulation, we stretch the limits of the top-down approach to simulate Message Passing Interface (MPI) programs with millions of tasks. Using a timing-validated benchmark application, we demonstrate proof-of-concept scaling to over 0.22 billion virtual MPI processes on 216,000 cores of a Cray XT5 supercomputer, at a multiplexing ratio of 1024 simulated tasks per real task. This represents one of the largest direct execution simulations to date.
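
The scale figure in the abstract can be checked with a little arithmetic. The sketch below is illustrative only: it assumes one real task per core (the abstract does not state this explicitly), and the block mapping in `host_of` is a hypothetical placement scheme, not the simulator's documented one.

```python
# Figures quoted in the abstract.
REAL_CORES = 216_000      # cores used on the Cray XT5
MULTIPLEX_RATIO = 1024    # simulated (virtual) tasks per real task


def total_virtual_processes(real_tasks: int, ratio: int) -> int:
    """Number of virtual MPI processes hosted by `real_tasks` real tasks."""
    return real_tasks * ratio


def host_of(virtual_rank: int, ratio: int) -> int:
    """Hypothetical block mapping: consecutive virtual ranks share a real task."""
    return virtual_rank // ratio


# Assumption: one real task runs on each core.
total = total_virtual_processes(REAL_CORES, MULTIPLEX_RATIO)
print(total)                 # 221184000
print(total / 1e9)           # 0.221184, i.e. "over 0.22 billion"
print(host_of(total - 1, MULTIPLEX_RATIO))  # 215999, the last real task
```

Under these assumptions the product is 221,184,000 virtual processes, which agrees with the "over 0.22 billion" figure.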
Authors:
 [1] ;  [1]
  1. ORNL
Publication Date:
OSTI Identifier:
1143564
DOE Contract Number:
DE-AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: International Symposium on Performance Evaluation of Computer and Telecommunication Systems, Monterey, CA, USA, July 6-10, 2014
Research Org:
Oak Ridge National Laboratory (ORNL)
Sponsoring Org:
USDOE
Country of Publication:
United States
Language:
English