OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Evaluating On-Node GPU Interconnects for Deep Learning Workloads

Abstract

Scaling deep learning workloads across multiple GPUs on a single node has become increasingly important in data analytics. A key question is how well a PCIe-based GPU interconnect can perform relative to a custom high-performance interconnect such as NVIDIA's NVLink. This paper evaluates two such on-node interconnects for eight NVIDIA Pascal P100 GPUs: (a) the NVIDIA DGX-1's NVLink 1.0 "hybrid cube mesh"; and (b) the Cirrascale GX8's two-level PCIe tree using dual SR3615 switch risers. To show the effects of a range of neural network workloads, we define a parameterized version of the popular ResNet. We define a workload intensity metric that characterizes the expected computation/communication ratio; we also locate AlexNet and GoogLeNet within that space. As expected, the DGX-1 typically has superior performance. However, when equalizing GPU SM frequencies, the GX8 is very competitive on all ResNet workloads. With 8 GPUs, the GX8 can outperform the DGX-1 on all-to-all reductions by 10% for medium-sized payloads; and in rare cases, the GX8 slightly outperforms on ResNet.

Authors:
Tallent, Nathan R. [1]; Gawande, Nitin A. [1]; Siegel, Charles M. [1]; Vishnu, Abhinav [1]; Hoisie, A. [2]
  1. BATTELLE (PACIFIC NW LAB)
  2. Brookhaven National Laboratory
Publication Date:
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1525777
Report Number(s):
PNNL-SA-129849
Journal ID: ISSN 0302-9743
DOE Contract Number:  
AC05-76RL01830
Resource Type:
Conference
Resource Relation:
Journal Volume: 10724; Conference: Proceedings of the 8th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS 2017), November 13, 2017, Denver, CO. Lecture Notes in Computer Science
Country of Publication:
Switzerland
Language:
English

Citation Formats

Tallent, Nathan R., Gawande, Nitin A., Siegel, Charles M., Vishnu, Abhinav, and Hoisie, A. Evaluating On-Node GPU Interconnects for Deep Learning Workloads. Switzerland: N. p., 2018. Web. doi:10.1007/978-3-319-72971-8_1.
Tallent, Nathan R., Gawande, Nitin A., Siegel, Charles M., Vishnu, Abhinav, & Hoisie, A. Evaluating On-Node GPU Interconnects for Deep Learning Workloads. Switzerland. doi:10.1007/978-3-319-72971-8_1.
Tallent, Nathan R., Gawande, Nitin A., Siegel, Charles M., Vishnu, Abhinav, and Hoisie, A. "Evaluating On-Node GPU Interconnects for Deep Learning Workloads". Switzerland. doi:10.1007/978-3-319-72971-8_1.
@article{osti_1525777,
title = {Evaluating On-Node GPU Interconnects for Deep Learning Workloads},
author = {Tallent, Nathan R. and Gawande, Nitin A. and Siegel, Charles M. and Vishnu, Abhinav and Hoisie, A},
abstractNote = {Scaling deep learning workloads across multiple GPUs on a single node has become increasingly important in data analytics. A key question is how well a PCIe-based GPU interconnect can perform relative to a custom high-performance interconnect such as NVIDIA's NVLink. This paper evaluates two such on-node interconnects for eight NVIDIA Pascal P100 GPUs: (a) the NVIDIA DGX-1's NVLink 1.0 `hybrid cube mesh'; and (b) the Cirrascale GX8's two-level PCIe tree using dual SR3615 switch risers. To show the effects of a range of neural network workloads, we define a parameterized version of the popular ResNet. We define a workload intensity metric that characterizes the expected computation/communication ratio; we also locate AlexNet and GoogLeNet within that space. As expected, the DGX-1 typically has superior performance. However, when equalizing GPU SM frequencies, the GX8 is very competitive on all ResNet workloads. With 8 GPUs, the GX8 can outperform the DGX-1 on all-to-all reductions by 10% for medium-sized payloads; and in rare cases, the GX8 slightly outperforms on ResNet.},
doi = {10.1007/978-3-319-72971-8_1},
journal = {},
issn = {0302-9743},
number = {},
volume = 10724,
place = {Switzerland},
year = {2018},
month = {1}
}

