OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: Fit Fly: A Case Study of Interconnect Innovation through Parallel Simulation

Abstract

To meet the demand for exascale-level performance from high-performance computing (HPC) interconnects, many system architects are turning to simulation results for accurate and reliable predictions of the performance of prospective technologies. Testing full-scale networks with a variety of benchmarking tools, including synthetic workloads and application traces, can give crucial insight into what ideas are most promising without needing to physically construct a test network. While flexible, however, this approach is extremely compute time intensive. We address this time complexity challenge through the use of large-scale, optimistic parallel simulation that ultimately leads to faster HPC network architecture innovations. In this paper we demonstrate this innovation capability through a real-world network design case study. Specifically, we have simulated and compared four extreme-scale interconnects: Dragonfly, Megafly, Slim Fly, and a new dual-rail-dual-plane variation of the Slim Fly network topology. We present this new variant of Slim Fly, dubbed Fit Fly, to show how interconnect innovation and evaluation, beyond what is possible through analytic methods, can be achieved through parallel simulation. We validate and compare the model with various network designs using the CODES interconnect simulation framework. By running large-scale simulations in a parallel environment, we are able to quickly generate reliable performance results that can help network designers break ground on the next generation of high-performance network designs.

Authors:
McGlohon, Neil; Wolfe, Noah; Mubarak, Misbah; Carothers, Christopher
Publication Date:
2019
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science - Office of Advanced Scientific Computing Research
OSTI Identifier:
1574984
DOE Contract Number:  
AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: 2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, 06/03/19 - 06/05/19, Chicago, IL, US
Country of Publication:
United States
Language:
English
Subject:
High Performance Computing; Interconnection Networks; Modeling; Parallel Discrete Event Simulation

Citation Formats

McGlohon, Neil, Wolfe, Noah, Mubarak, Misbah, and Carothers, Christopher. Fit Fly: A Case Study of Interconnect Innovation through Parallel Simulation. United States: N. p., 2019. Web. doi:10.1145/3316480.3325515.
McGlohon, Neil, Wolfe, Noah, Mubarak, Misbah, & Carothers, Christopher. Fit Fly: A Case Study of Interconnect Innovation through Parallel Simulation. United States. doi:10.1145/3316480.3325515.
McGlohon, Neil, Wolfe, Noah, Mubarak, Misbah, and Carothers, Christopher. 2019. "Fit Fly: A Case Study of Interconnect Innovation through Parallel Simulation". United States. doi:10.1145/3316480.3325515.
@inproceedings{osti_1574984,
title = {Fit Fly: A Case Study of Interconnect Innovation through Parallel Simulation},
author = {McGlohon, Neil and Wolfe, Noah and Mubarak, Misbah and Carothers, Christopher},
abstractNote = {To meet the demand for exascale-level performance from high-performance computing (HPC) interconnects, many system architects are turning to simulation results for accurate and reliable predictions of the performance of prospective technologies. Testing full-scale networks with a variety of benchmarking tools, including synthetic workloads and application traces, can give crucial insight into what ideas are most promising without needing to physically construct a test network. While flexible, however, this approach is extremely compute time intensive. We address this time complexity challenge through the use of large-scale, optimistic parallel simulation that ultimately leads to faster HPC network architecture innovations. In this paper we demonstrate this innovation capability through a real-world network design case study. Specifically, we have simulated and compared four extreme-scale interconnects: Dragonfly, Megafly, Slim Fly, and a new dual-rail-dual-plane variation of the Slim Fly network topology. We present this new variant of Slim Fly, dubbed Fit Fly, to show how interconnect innovation and evaluation, beyond what is possible through analytic methods, can be achieved through parallel simulation. We validate and compare the model with various network designs using the CODES interconnect simulation framework. By running large-scale simulations in a parallel environment, we are able to quickly generate reliable performance results that can help network designers break ground on the next generation of high-performance network designs.},
doi = {10.1145/3316480.3325515},
booktitle = {2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation},
place = {United States},
year = {2019},
month = {1}
}
