U.S. Department of Energy
Office of Scientific and Technical Information

Scaling Deep Learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing

Journal Article · Future Generation Computer Systems

Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors, including NVIDIA, Intel, AMD, and IBM, have architectural roadmaps influenced by DL workloads, and several vendors have recently advertised new computing products as accelerating large DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these different products. This article provides a performance and power analysis of important DL workloads on two major parallel architectures: NVIDIA DGX-1 (eight Pascal P100 GPUs interconnected with NVLink) and Intel Knights Landing (KNL) CPUs interconnected with Intel Omni-Path or Cray Aries. Our evaluation consists of a cross section of convolutional neural network workloads: the CifarNet, AlexNet, GoogLeNet, and ResNet50 topologies, using the Cifar10 and ImageNet datasets. The workloads are vendor-optimized for each architecture, and we use sequentially equivalent implementations to maintain iso-accuracy between parallel and sequential DL models. Our analysis indicates that although GPUs provide the highest overall performance, the gap can close for some convolutional networks, and KNL can be competitive in performance per watt. We find that NVLink facilitates scaling efficiency on GPUs, but its importance depends heavily on the neural network architecture. For weak scaling, which restricted GPU memory sometimes encourages, NVLink matters less.
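To make the abstract's terminology concrete: under weak scaling, each device keeps a fixed per-device minibatch, so the global minibatch grows with the device count; under strong scaling, the global minibatch is fixed and sharded across devices. The Python sketch below illustrates this arithmetic along with a throughput-per-watt figure of merit. It is illustrative only, not taken from the paper; all batch sizes, throughput numbers, and power figures are hypothetical placeholders.

    # Illustrative sketch (hypothetical values, not from the paper).
    # Weak scaling keeps the per-device minibatch fixed, so the global
    # minibatch grows with device count; strong scaling keeps the global
    # minibatch fixed and shards it across devices.
    def global_batch(per_device_batch: int, devices: int, weak_scaling: bool) -> int:
        if weak_scaling:
            return per_device_batch * devices
        return per_device_batch  # strong scaling: each device sees batch/devices samples

    # Performance-per-watt figure of merit: measured training throughput
    # divided by average power draw during the run.
    def perf_per_watt(images_per_sec: float, avg_power_watts: float) -> float:
        return images_per_sec / avg_power_watts

    # Hypothetical example: 8 GPUs with a per-GPU minibatch of 32 under
    # weak scaling give a global minibatch of 256 images per step.
    print(global_batch(32, 8, weak_scaling=True))   # 256
    # Hypothetical measurements: 2000 img/s at 1500 W vs. 400 img/s at 250 W.
    print(perf_per_watt(2000.0, 1500.0))            # ~1.33 images/s per watt
    print(perf_per_watt(400.0, 250.0))              # 1.60 images/s per watt

A system with lower absolute throughput can still win on this metric, which is how a KNL-class CPU can be competitive in performance per watt while the GPUs retain the raw performance lead.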

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
Grant/Contract Number:
AC05-76RL01830
OSTI ID:
1617450
Report Number(s):
PNNL-SA-134513
Journal Information:
Future Generation Computer Systems, Vol. 108, Issue C; ISSN 0167-739X
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

References (13)

ImageNet Large Scale Visual Recognition Challenge · journal · April 2015
Searching for exotic particles in high-energy physics with deep learning · journal · July 2014
Benchmarking State-of-the-Art Deep Learning Software Tools · conference · November 2016
Going deeper with convolutions · conference · June 2015
FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters · conference · June 2016
Deep Residual Learning for Image Recognition · conference · June 2016
Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing · conference · May 2017
  • Gawande, Nitin A.; Landwehr, Joshua B.; Daily, Jeff A.
  • 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). https://doi.org/10.1109/IPDPSW.2017.36
Large Minibatch Training on Supercomputers with Improved Accuracy and Reduced Time to Train · conference · November 2018
Knights Landing: Second-Generation Intel Xeon Phi Product · journal · March 2016
80 Million Tiny Images: A Large Data Set for Nonparametric Object and Scene Recognition · journal · November 2008
RAPL: memory power estimation and capping · conference · January 2010
  • David, Howard; Gorbatov, Eugene; Hanebutte, Ulf R.
  • Proceedings of the 16th ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED '10). https://doi.org/10.1145/1840845.1840883
Caffe: Convolutional Architecture for Fast Feature Embedding · conference · January 2014
Theano: A CPU and GPU Math Compiler in Python · conference · January 2010

Cited By (2)

Applications of Artificial Intelligence Methodologies to Behavioral and Social Sciences · journal · December 2019
A Framework for Memory Oversubscription Management in Graphics Processing Units · conference · April 2019
  • Li, Chen; Ausavarungnirun, Rachata; Rossbach, Christopher J.
  • ASPLOS '19: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems. https://doi.org/10.1145/3297858.3304044

Figures / Tables (19)


Similar Records

Scaling deep learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing
Conference · August 24, 2017 · OSTI ID: 1411927

Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing
Conference · July 3, 2017 · OSTI ID: 1373860

Evaluating On-Node GPU Interconnects for Deep Learning Workloads
Conference · December 31, 2017 · OSTI ID: 1525777