DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Fast neural network training on a cluster of GPUs for action recognition with high accuracy

Abstract

In this work, we propose algorithms and techniques to accelerate training of deep neural networks for action recognition on a cluster of GPUs. The convergence analysis of our algorithm shows that it is possible to reduce communication cost and, at the same time, minimize the number of iterations needed for convergence. We customize the Adam optimizer for our distributed algorithm to improve efficiency. In addition, we employ transfer learning to further reduce training time while improving validation accuracy. For the UCF101 and HMDB51 datasets, the validation accuracies achieved are 93.1% and 67.9%, respectively. With an additional end-to-end trained temporal stream, the validation accuracies achieved for UCF101 and HMDB51 are 93.47% and 81.24%, respectively. As far as we know, these are the highest accuracies achieved with a two-stream ResNet approach that involves neither computationally expensive 3D convolutions nor pretraining on much larger datasets.
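
The record gives only the abstract, but the data-parallel pattern it describes (a customized Adam optimizer combined with reduced inter-GPU communication) can be illustrated with a minimal sketch. The PyTorch code below shows the generic baseline such a scheme starts from: gradients averaged across workers with one all-reduce per parameter, followed by a local Adam step. The function name distributed_adam_step and the all-reduce placement are illustrative assumptions; the paper's actual Adam customization and communication-reduction scheme are not detailed in this record.

    import torch
    import torch.distributed as dist

    def distributed_adam_step(model, optimizer):
        # Average gradients across all workers: sum via one all-reduce per
        # parameter, then divide by the number of workers. Every replica
        # then applies an identical local Adam update.
        world_size = dist.get_world_size()
        for p in model.parameters():
            if p.grad is not None:
                dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
                p.grad /= world_size
        optimizer.step()      # local Adam update, identical on all ranks
        optimizer.zero_grad()

    # Typical use, one process per GPU (e.g. launched with torchrun):
    #   dist.init_process_group("nccl")
    #   optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    #   loss.backward()
    #   distributed_adam_step(model, optimizer)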

Authors:
Cong, G. [1]; Domeniconi, G. [1]; Yang, C. [1]; Shapiro, J. [2]; Zhou, F. [3]; Chen, B. Y. [4]
  1. IBM TJ Watson Research Center, Yorktown Heights, NY (United States)
  2. ASAPP, New York, NY (United States)
  3. Baidu Research USA, Bellevue, WA (United States)
  4. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Publication Date:
August 29, 2019
Research Org.:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1669241
Alternate Identifier(s):
OSTI ID: 2325457
Report Number(s):
LLNL-JRNL-814435
Journal ID: ISSN 0743-7315; 1021930
Grant/Contract Number:  
AC52-07NA27344
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Parallel and Distributed Computing
Additional Journal Information:
Journal Volume: 134; Journal Issue: na; Journal ID: ISSN 0743-7315
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Machine learning; video analytics; distributed training; transfer learning; GPU

Citation Formats

Cong, G, Domeniconi, G, Yang, C, Shapiro, J, Zhou, F, and Chen, B Y. Fast neural network training on a cluster of GPUs for action recognition with high accuracy. United States: N. p., 2019. Web. doi:10.1016/j.jpdc.2019.07.009.
Cong, G, Domeniconi, G, Yang, C, Shapiro, J, Zhou, F, & Chen, B Y. Fast neural network training on a cluster of GPUs for action recognition with high accuracy. United States. https://doi.org/10.1016/j.jpdc.2019.07.009
Cong, G, Domeniconi, G, Yang, C, Shapiro, J, Zhou, F, and Chen, B Y. 2019. "Fast neural network training on a cluster of GPUs for action recognition with high accuracy". United States. https://doi.org/10.1016/j.jpdc.2019.07.009. https://www.osti.gov/servlets/purl/1669241.
@article{osti_1669241,
title = {Fast neural network training on a cluster of GPUs for action recognition with high accuracy},
author = {Cong, G and Domeniconi, G and Yang, C and Shapiro, J and Zhou, F and Chen, B Y},
abstractNote = {In this work, we propose algorithms and techniques to accelerate training of deep neural networks for action recognition on a cluster of GPUs. The convergence analysis of our algorithm shows that it is possible to reduce communication cost and, at the same time, minimize the number of iterations needed for convergence. We customize the Adam optimizer for our distributed algorithm to improve efficiency. In addition, we employ transfer learning to further reduce training time while improving validation accuracy. For the UCF101 and HMDB51 datasets, the validation accuracies achieved are 93.1% and 67.9%, respectively. With an additional end-to-end trained temporal stream, the validation accuracies achieved for UCF101 and HMDB51 are 93.47% and 81.24%, respectively. As far as we know, these are the highest accuracies achieved with a two-stream ResNet approach that involves neither computationally expensive 3D convolutions nor pretraining on much larger datasets.},
doi = {10.1016/j.jpdc.2019.07.009},
journal = {Journal of Parallel and Distributed Computing},
number = {na},
volume = {134},
place = {United States},
year = {2019},
month = {8}
}

Figures / Tables:

Figure 1: Two-stream training architecture
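
The figure itself is not reproduced in this record. As a rough illustration of the two-stream design the caption refers to, the sketch below builds a spatial stream over RGB frames and a temporal stream over stacked optical-flow fields, each a ResNet initialized from ImageNet weights (the transfer learning the abstract mentions), and averages their class scores. ResNet-34, the 10-frame flow stack (20 input channels), and score-level fusion are common two-stream conventions assumed here, not details confirmed by the record.

    import torch
    import torch.nn as nn
    import torchvision.models as models

    class TwoStreamResNet(nn.Module):
        def __init__(self, num_classes=101, flow_channels=20):
            super().__init__()
            # Transfer learning: initialize both streams from ImageNet weights.
            weights = models.ResNet34_Weights.IMAGENET1K_V1
            self.spatial = models.resnet34(weights=weights)
            self.temporal = models.resnet34(weights=weights)
            # The temporal stream consumes stacked optical flow, not 3-channel
            # images, so its first convolution is re-created and trained anew.
            self.temporal.conv1 = nn.Conv2d(flow_channels, 64, kernel_size=7,
                                            stride=2, padding=3, bias=False)
            # Replace the 1000-way ImageNet heads with the action classes.
            self.spatial.fc = nn.Linear(self.spatial.fc.in_features, num_classes)
            self.temporal.fc = nn.Linear(self.temporal.fc.in_features, num_classes)

        def forward(self, rgb, flow):
            # Late fusion: average the two streams' class scores.
            return (self.spatial(rgb) + self.temporal(flow)) / 2

    model = TwoStreamResNet(num_classes=101)      # UCF101 has 101 classes
    scores = model(torch.randn(2, 3, 224, 224),   # batch of RGB frames
                   torch.randn(2, 20, 224, 224))  # batch of flow stacks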

Works referenced in this record:

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
conference, July 2017

  • Carreira, Joao; Zisserman, Andrew
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • DOI: 10.1109/CVPR.2017.502

Deep Residual Learning for Image Recognition
conference, June 2016

  • He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing
  • 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • DOI: 10.1109/CVPR.2016.90

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
conference, July 2017

  • Ilg, Eddy; Mayer, Nikolaus; Saikia, Tonmoy
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • DOI: 10.1109/CVPR.2017.179

ImageNet classification with deep convolutional neural networks
journal, May 2017

  • Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E.
  • Communications of the ACM, Vol. 60, Issue 6
  • DOI: 10.1145/3065386

TV-L1 Optical Flow Estimation
journal, January 2013

  • Sánchez Pérez, Javier; Meinhardt-Llopis, Enric; Facciolo, Gabriele
  • Image Processing On Line, Vol. 3
  • DOI: 10.5201/ipol.2013.26

Going deeper with convolutions
conference, June 2015

  • Szegedy, Christian; Liu, Wei; Jia, Yangqing
  • 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • DOI: 10.1109/CVPR.2015.7298594