DOE PAGES
U.S. Department of Energy
Office of Scientific and Technical Information

Title: TensorFlow at Scale: Performance and productivity analysis of distributed training with Horovod, MLSL, and Cray PE ML

Authors:
Kurth, Thorsten [1]; Smorkalov, Mikhail [2]; Mendygral, Peter [3]; Sridharan, Srinivas [4]; Mathuriya, Amrita [5]
  1. National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, Berkeley, California
  2. Software and Services Group, Intel Corporation, Moscow, Russia
  3. Cray Programming Environments Performance Engineering, Cray Inc., Bloomington, Minnesota
  4. Parallel Computing Labs, Intel Corporation, Karnataka, India
  5. Data Center Group, Intel Corporation, Hillsboro, Oregon
Publication Date:
2018-10
Sponsoring Org.:
USDOE
OSTI Identifier:
1479562
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Publisher's Accepted Manuscript
Journal Name:
Concurrency and Computation. Practice and Experience
Additional Journal Information:
Journal Name: Concurrency and Computation. Practice and Experience; Journal Volume: 31; Journal Issue: 16; Journal ID: ISSN 1532-0626
Publisher:
Wiley Blackwell (John Wiley & Sons)
Country of Publication:
United Kingdom
Language:
English

Citation Formats

Kurth, Thorsten, Smorkalov, Mikhail, Mendygral, Peter, Sridharan, Srinivas, and Mathuriya, Amrita. TensorFlow at Scale: Performance and productivity analysis of distributed training with Horovod, MLSL, and Cray PE ML. United Kingdom: N. p., 2018. Web. doi:10.1002/cpe.4989.
Kurth, Thorsten, Smorkalov, Mikhail, Mendygral, Peter, Sridharan, Srinivas, & Mathuriya, Amrita. TensorFlow at Scale: Performance and productivity analysis of distributed training with Horovod, MLSL, and Cray PE ML. United Kingdom. https://doi.org/10.1002/cpe.4989
Kurth, Thorsten, Smorkalov, Mikhail, Mendygral, Peter, Sridharan, Srinivas, and Mathuriya, Amrita. 2018. "TensorFlow at Scale: Performance and productivity analysis of distributed training with Horovod, MLSL, and Cray PE ML". United Kingdom. https://doi.org/10.1002/cpe.4989.
@article{osti_1479562,
title = {TensorFlow at Scale: Performance and productivity analysis of distributed training with Horovod, MLSL, and Cray PE ML},
author = {Kurth, Thorsten and Smorkalov, Mikhail and Mendygral, Peter and Sridharan, Srinivas and Mathuriya, Amrita},
abstractNote = {},
doi = {10.1002/cpe.4989},
journal = {Concurrency and Computation. Practice and Experience},
number = 16,
volume = 31,
place = {United Kingdom},
year = {2018},
month = {10}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1002/cpe.4989

Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science

Works referenced in this record:

Backpropagation and stochastic gradient descent method
journal, June 1993


Deep learning at 15PF: supervised and semi-supervised classification for scientific data
conference, January 2017

  • Kurth, Thorsten; Smorkalov, Mikhail; Deslippe, Jack
  • Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '17)
  • DOI: 10.1145/3126908.3126916

DELPHES 3: a modular framework for fast simulation of a generic collider experiment
journal, February 2014

  • de Favereau, J.; Delaere, C.; Demin, P.
  • Journal of High Energy Physics, Vol. 2014, Issue 2
  • DOI: 10.1007/JHEP02(2014)057