OSTI.GOV U.S. Department of Energy
Office of Scientific and Technical Information

Title: Performance Analysis of Deep Learning Workloads on Leading-edge Systems

Abstract

This work examines the performance of leading-edge systems designed for machine learning computing, including the NVIDIA DGX-2, Amazon Web Services (AWS) P3, IBM Power System Accelerated Compute Server AC922, and a consumer-grade Exxact TensorEX TS4 GPU server. The analysis focuses on representative deep learning workloads from the fields of computer vision and natural language processing. Performance is analyzed along a number of important dimensions. The performance of the communication interconnects and of large and high-throughput deep learning models is considered. Different potential use models for the systems, both standalone and in the cloud, are also examined. The effects of various optimizations of the deep learning models and system configurations are included in the analysis.

Authors:
;
Publication Date:
Research Org.:
Brookhaven National Lab. (BNL), Upton, NY (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
OSTI Identifier:
1571428
Report Number(s):
BNL-212208-2019-COPA
DOE Contract Number:  
SC0012704
Resource Type:
Conference
Resource Relation:
Conference: SC19, Denver, CO, United States, 11/17/2019 - 11/22/2019
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Ren, Yihui, and Ren, Yihui. Performance Analysis of Deep Learning Workloads on Leading-edge Systems. United States: N. p., 2019. Web.
Ren, Yihui, & Ren, Yihui. Performance Analysis of Deep Learning Workloads on Leading-edge Systems. United States.
Ren, Yihui, and Ren, Yihui. "Performance Analysis of Deep Learning Workloads on Leading-edge Systems". United States. https://www.osti.gov/servlets/purl/1571428.
@article{osti_1571428,
title = {Performance Analysis of Deep Learning Workloads on Leading-edge Systems},
author = {Ren, Yihui and Ren, Yihui},
abstractNote = {This work examines the performance of leading-edge systems designed for machine learning computing, including the NVIDIA DGX-2, Amazon Web Services (AWS) P3, IBM Power System Accelerated Compute Server AC922, and a consumer-grade Exxact TensorEX TS4 GPU server. The analysis focuses on representative deep learning workloads from the fields of computer vision and natural language processing. Performance is analyzed along a number of important dimensions. The performance of the communication interconnects and of large and high-throughput deep learning models is considered. Different potential use models for the systems, both standalone and in the cloud, are also examined. The effects of various optimizations of the deep learning models and system configurations are included in the analysis.},
place = {United States},
year = {2019},
month = {11}
}
