DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Abstract

Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel® Xeon Phi™ processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. Here, these enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters ΩM, σ8, and ns with unprecedented accuracy.
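The abstract's headline scaling figures are related by a simple definition: parallel efficiency is the measured aggregate throughput divided by the ideal linear scaling of a single node's rate. A minimal sketch in Python, assuming a hypothetical per-node rate back-computed from the reported 8192-node numbers (the function and variable names are illustrative, not taken from the CosmoFlow code):

```python
def parallel_efficiency(per_node_rate, n_nodes, measured_rate):
    """Measured aggregate throughput divided by ideal linear scaling."""
    ideal = per_node_rate * n_nodes  # throughput under perfect scaling
    return measured_rate / ideal

# Back-of-envelope per-node rate implied by the reported results:
# 3.5 Pflop/s sustained at 8192 nodes with 77% parallel efficiency.
measured = 3.5e15                    # flop/s, aggregate
per_node = measured / (0.77 * 8192)  # roughly 0.55 Tflop/s per node
print(f"{parallel_efficiency(per_node, 8192, measured):.2f}")  # 0.77
```

This is the standard definition for fully synchronous data-parallel training, where every node processes a shard of each batch and gradients are averaged across all nodes each step, so efficiency losses come mainly from communication.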

Authors:
 Mathuriya, Amrita [1]; Bard, Deborah [2]; Mendygral, Peter [3]; Meadows, Lawrence [1]; Arnemann, James [4]; Shao, Lei [5]; He, Siyu [6]; Karna, Tuomas [1]; Moise, Diana [3]; Pennycook, Simon J. [5]; Maschhoff, Kristyn [3]; Sewall, Jason [5]; Kumar, Nalini [5]; Ho, Shirley [6]; Ringenburg, Michael F. [3]; Prabhat, Prabhat [2]; Lee, Victor [5]
  1. Intel Corp., Hillsboro, OR (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  3. Cray Inc., Seattle, WA (United States)
  4. Univ. of California, Berkeley, CA (United States)
  5. Intel Corp., Santa Clara, CA (United States)
  6. Flatiron Inst., New York, NY (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Carnegie Mellon Univ., Pittsburgh, PA (United States)
Publication Date:
March 14, 2019
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1510756
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
International Conference for High Performance Computing, Networking, Storage and Analysis
Additional Journal Information:
Journal Volume: 2018; Conference: SC18: International Conference for High Performance Computing, Networking, Storage and Analysis, Dallas, TX (United States), 11-16 Nov 2018; Journal ID: ISSN 2167-4329
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
98 NUCLEAR DISARMAMENT, SAFEGUARDS, AND PHYSICAL PROTECTION; Cosmology; Deep Learning; Machine Learning; TensorFlow; High Performance Computing

Citation Formats

Mathuriya, Amrita, Bard, Deborah, Mendygral, Peter, Meadows, Lawrence, Arnemann, James, Shao, Lei, He, Siyu, Karna, Tuomas, Moise, Diana, Pennycook, Simon J., Maschhoff, Kristyn, Sewall, Jason, Kumar, Nalini, Ho, Shirley, Ringenburg, Michael F., Prabhat, Prabhat, and Lee, Victor. CosmoFlow: Using Deep Learning to Learn the Universe at Scale. United States: N. p., 2019. Web. doi:10.1109/sc.2018.00068.
Mathuriya, Amrita, Bard, Deborah, Mendygral, Peter, Meadows, Lawrence, Arnemann, James, Shao, Lei, He, Siyu, Karna, Tuomas, Moise, Diana, Pennycook, Simon J., Maschhoff, Kristyn, Sewall, Jason, Kumar, Nalini, Ho, Shirley, Ringenburg, Michael F., Prabhat, Prabhat, & Lee, Victor. CosmoFlow: Using Deep Learning to Learn the Universe at Scale. United States. https://doi.org/10.1109/sc.2018.00068
Mathuriya, Amrita, Bard, Deborah, Mendygral, Peter, Meadows, Lawrence, Arnemann, James, Shao, Lei, He, Siyu, Karna, Tuomas, Moise, Diana, Pennycook, Simon J., Maschhoff, Kristyn, Sewall, Jason, Kumar, Nalini, Ho, Shirley, Ringenburg, Michael F., Prabhat, Prabhat, and Lee, Victor. 2019. "CosmoFlow: Using Deep Learning to Learn the Universe at Scale". United States. https://doi.org/10.1109/sc.2018.00068. https://www.osti.gov/servlets/purl/1510756.
@article{osti_1510756,
title = {CosmoFlow: Using Deep Learning to Learn the Universe at Scale},
author = {Mathuriya, Amrita and Bard, Deborah and Mendygral, Peter and Meadows, Lawrence and Arnemann, James and Shao, Lei and He, Siyu and Karna, Tuomas and Moise, Diana and Pennycook, Simon J. and Maschhoff, Kristyn and Sewall, Jason and Kumar, Nalini and Ho, Shirley and Ringenburg, Michael F. and Prabhat, Prabhat and Lee, Victor},
abstractNote = {Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel® Xeon Phi™ processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. Here, these enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters ΩM, σ8, and ns with unprecedented accuracy.},
doi = {10.1109/sc.2018.00068},
journal = {International Conference for High Performance Computing, Networking, Storage and Analysis},
volume = {2018},
place = {United States},
year = {2019},
month = {mar}
}

Works referenced in this record:

Distributed Deep Learning Using Synchronous Stochastic Gradient Descent
preprint, January 2016


Solving large scale structure in ten easy steps with COLA
journal, June 2013

  • Tassev, Svetlin; Zaldarriaga, Matias; Eisenstein, Daniel J.
  • Journal of Cosmology and Astroparticle Physics, Vol. 2013, Issue 06
  • DOI: 10.1088/1475-7516/2013/06/036

sCOLA: The N-body COLA Method Extended to the Spatial Domain
preprint, January 2015


Cosmological model discrimination with Deep Learning
preprint, January 2017


cuDNN: Efficient Primitives for Deep Learning
preprint, January 2014


FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters
conference, June 2016

  • Iandola, Forrest N.; Moskewicz, Matthew W.; Ashraf, Khalid
  • 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • DOI: 10.1109/cvpr.2016.284

Enabling Dark Energy Science with Deep Generative Models of Galaxy Images
preprint, January 2016



WOMBAT: A Scalable and High Performance Astrophysical MHD Code
text, January 2017


Planck 2015 results. XIII. Cosmological parameters
journal, September 2016


Multi-scale initial conditions for cosmological simulations: Multi-scale initial conditions
journal, July 2011


WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code
journal, February 2017

  • Mendygral, P. J.; Radcliffe, N.; Kandalla, K.
  • The Astrophysical Journal Supplement Series, Vol. 228, Issue 2
  • DOI: 10.3847/1538-4365/aa5b9c

In-Datacenter Performance Analysis of a Tensor Processing Unit
conference, January 2017

  • Jouppi, Norman P.; Borchers, Al; Boyle, Rick
  • Proceedings of the 44th Annual International Symposium on Computer Architecture - ISCA '17
  • DOI: 10.1145/3079856.3080246

Deep Residual Learning for Image Recognition
conference, June 2016

  • He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing
  • 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • DOI: 10.1109/CVPR.2016.90

Evolving Deep Networks Using HPC
conference, January 2017

  • Young, Steven R.; Rose, Derek C.; Johnston, Travis
  • Proceedings of the Machine Learning on HPC Environments - MLHPC'17
  • DOI: 10.1145/3146347.3146355

Deep learning at 15PF: supervised and semi-supervised classification for scientific data
conference, January 2017

  • Kurth, Thorsten; Smorkalov, Mikhail; Deslippe, Jack
  • Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '17
  • DOI: 10.1145/3126908.3126916

Rotation-invariant convolutional neural networks for galaxy morphology prediction
journal, April 2015

  • Dieleman, Sander; Willett, Kyle W.; Dambre, Joni
  • Monthly Notices of the Royal Astronomical Society, Vol. 450, Issue 2
  • DOI: 10.1093/mnras/stv632

Distributed asynchronous deterministic and stochastic gradient optimization algorithms
journal, September 1986

  • Tsitsiklis, J.; Bertsekas, D.; Athans, M.
  • IEEE Transactions on Automatic Control, Vol. 31, Issue 9
  • DOI: 10.1109/TAC.1986.1104412

Works referencing / citing this record:

Parallelizing Training of Deep Generative Models on Massive Scientific Datasets
preprint, January 2019


Derivation and Analysis of Fast Bilinear Algorithms for Convolution
preprint, January 2019


Quasar Detection using Linear Support Vector Machine with Learning From Mistakes Methodology
text, January 2020


Exascale Deep Learning for Climate Analytics
conference, November 2018

  • Kurth, Thorsten; Treichler, Sean; Romero, Joshua
  • SC18: International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/sc.2018.00054

DisCo: Physics-Based Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems
conference, November 2019

  • Rupe, Adam; Prabhat, Mr; Crutchfield, James P.
  • 2019 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC)
  • DOI: 10.1109/mlhpc49564.2019.00013

The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
preprint, January 2020


Learning to Predict the Cosmological Structure Formation
text, January 2018


HPC AI500: A Benchmark Suite for HPC AI Systems
preprint, January 2019


Learning to predict the cosmological structure formation
journal, June 2019

  • He, Siyu; Li, Yin; Feng, Yu
  • Proceedings of the National Academy of Sciences, Vol. 116, Issue 28
  • DOI: 10.1073/pnas.1821458116

A computational-graph partitioning method for training memory-constrained DNNs
journal, July 2021


Clairvoyant prefetching for distributed machine learning I/O
conference, November 2021

  • Dryden, Nikoli; Böhringer, Roman; Ben-Nun, Tal
  • Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1145/3458817.3476181