TPR: Traffic Pattern-based Adaptive Routing for Dragonfly Networks
- Florida State Univ., Tallahassee, FL (United States). Dept. of Computer Science
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
The Cray Cascade architecture uses Dragonfly as its interconnect topology and employs a globally adaptive routing scheme called UGAL. UGAL directs traffic based on link loads but can make inappropriate adaptive routing decisions in various situations, which degrades its performance. In this work, we propose traffic pattern-based adaptive routing (TPR) for Dragonfly, which improves on UGAL by incorporating a traffic pattern-based adaptation mechanism. The idea is to explicitly use the link usage statistics collected in performance counters to infer the traffic pattern, and to take both the inferred traffic pattern and the link loads into consideration when making adaptive routing decisions. Our performance evaluation on a diverse set of traffic conditions indicates that, by incorporating the traffic pattern-based adaptation mechanism, TPR is much more effective at making adaptive routing decisions and achieves significantly lower latency under low load and higher throughput under high load than its underlying UGAL.
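The decision structure described in the abstract can be illustrated with a minimal sketch. It is not the algorithm from the paper: the record does not give TPR's inference rule or parameters, so the skew test, the thresholds, and every name below (`PathEstimate`, `infer_pattern`, `route`, `bias_magnitude`) are assumptions made for illustration. The only part taken from the general literature is the UGAL-style comparison of queue length times hop count for a minimal and a non-minimal candidate path.

```python
from dataclasses import dataclass


@dataclass
class PathEstimate:
    """Locally visible estimate for one candidate path (hypothetical structure)."""
    queue_len: int  # occupancy of the output queue toward the path's first hop
    hops: int       # path length in hops


def ugal_prefers_minimal(minimal: PathEstimate, nonminimal: PathEstimate, bias: int = 0) -> bool:
    """UGAL-style test: keep the minimal path unless its estimated delay
    (queue length x hop count) exceeds the non-minimal estimate plus a bias."""
    return minimal.queue_len * minimal.hops <= nonminimal.queue_len * nonminimal.hops + bias


def infer_pattern(link_counts: list[int], skew_threshold: float = 2.0) -> str:
    """Assumed inference rule: call the traffic 'adversarial' when one link's
    counter is far above the mean, and 'uniform' otherwise."""
    total = sum(link_counts)
    if total == 0:
        return "uniform"  # no traffic observed yet (also covers an empty list)
    mean = total / len(link_counts)
    return "adversarial" if max(link_counts) / mean > skew_threshold else "uniform"


def route(minimal: PathEstimate, nonminimal: PathEstimate,
          link_counts: list[int], bias_magnitude: int = 4) -> str:
    """Combine the inferred pattern with link loads: bias toward minimal paths
    under uniform traffic and toward non-minimal paths under skewed traffic."""
    pattern = infer_pattern(link_counts)
    bias = bias_magnitude if pattern == "uniform" else -bias_magnitude
    return "minimal" if ugal_prefers_minimal(minimal, nonminimal, bias) else "nonminimal"
```

In this sketch the pattern only shifts the threshold of the load comparison, so link loads still drive the final choice; the bias value and the skew threshold would need to be tuned or replaced by the scheme defined in the paper itself.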
- Research Organization:
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA), Office of Defense Programs (DP) (NA-10)
- Grant/Contract Number:
- AC52-06NA25396
- OSTI ID:
- 1481984
- Report Number(s):
- LA-UR--18-20582
- Journal Information:
- IEEE Transactions on Multi-Scale Computing Systems, Vol. 4, Issue 4; ISSN 2372-207X
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English