OSTI.GOV U.S. Department of Energy
Office of Scientific and Technical Information

Title: CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Journal Article · 2018 · International Conference for High Performance Computing, Networking, Storage and Analysis
Author affiliations:
  1. Intel Corp., Hillsboro, OR (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  3. Cray Inc., Seattle, WA (United States)
  4. Univ. of California, Berkeley, CA (United States)
  5. Intel Corp., Santa Clara, CA (United States)
  6. Flatiron Inst., New York, NY (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Carnegie Mellon Univ., Pittsburgh, PA (United States)

Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel® Xeon Phi™ processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. These enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters ΩM, σ8, and ns with unprecedented accuracy.
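
As a concrete illustration of the kind of network described above, the sketch below builds a small 3D convolutional regressor in TensorFlow/Keras that maps a single-channel dark-matter density cube to the three target parameters (ΩM, σ8, ns). The cube resolution (128³ voxels), the number and width of the convolution/pooling blocks, and the choice of optimizer are assumptions made for illustration; they are not the paper's actual CosmoFlow architecture, nor do they reflect its Xeon Phi-specific primitives or Cray PE ML Plugin integration.

```python
# Illustrative sketch only: a small 3D CNN regressor in TensorFlow/Keras that
# maps a single-channel dark-matter density cube to (Omega_M, sigma_8, n_s).
# Cube size, layer widths, and optimizer are assumptions, not the paper's
# actual CosmoFlow architecture or its Xeon Phi-optimized primitives.
import tensorflow as tf
from tensorflow.keras import layers, models


def build_cosmoflow_like_model(cube_size: int = 128) -> tf.keras.Model:
    """Build a 3D conv + pooling regressor with a 3-parameter output head."""
    inputs = layers.Input(shape=(cube_size, cube_size, cube_size, 1))
    x = inputs
    # Stacked 3D convolution and pooling blocks: the two primitive types the
    # abstract says were optimized for Intel Xeon Phi.
    for filters in (16, 32, 64, 128):
        x = layers.Conv3D(filters, kernel_size=3, padding="same",
                          activation="relu")(x)
        x = layers.MaxPooling3D(pool_size=2)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)
    # Regression head: three outputs for Omega_M, sigma_8, and n_s.
    outputs = layers.Dense(3)(x)

    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model


if __name__ == "__main__":
    build_cosmoflow_like_model().summary()
```

For multi-node scaling, the paper relies on fully synchronous data-parallel training through the Cray PE Machine Learning Plugin; an analogous open-source setup would wrap the optimizer in a gradient-allreduce layer (for example Horovod) so that gradients are averaged across all workers at every step.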

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1510756
Journal Information:
International Conference for High Performance Computing, Networking, Storage and Analysis, Vol. 2018; Conference: SC18: International Conference for High Performance Computing, Networking, Storage and Analysis, Dallas, TX (United States), 11-16 Nov 2018; ISSN 2167-4329
Publisher:
IEEE
Country of Publication:
United States
Language:
English

References (19)

Distributed Deep Learning Using Synchronous Stochastic Gradient Descent preprint January 2016
Solving large scale structure in ten easy steps with COLA journal June 2013
sCOLA: The N-body COLA Method Extended to the Spatial Domain preprint January 2015
Cosmological model discrimination with Deep Learning preprint January 2017
cuDNN: Efficient Primitives for Deep Learning preprint January 2014
FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters conference June 2016
Enabling Dark Energy Science with Deep Generative Models of Galaxy Images preprint January 2016
Hsp90 is important for fecundity, longevity, and buffering of cryptic deleterious variation in wild fly populations journal January 2012
WOMBAT: A Scalable and High Performance Astrophysical MHD Code text January 2017
Planck 2015 results: XIII. Cosmological parameters journal September 2016
Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC journal September 2017
Multi-scale initial conditions for cosmological simulations journal July 2011
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code journal February 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit conference January 2017
Deep Residual Learning for Image Recognition conference June 2016
Evolving Deep Networks Using HPC conference January 2017
Deep learning at 15PF: supervised and semi-supervised classification for scientific data conference January 2017
  • Kurth, Thorsten; Smorkalov, Mikhail; Deslippe, Jack
  • Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '17), https://doi.org/10.1145/3126908.3126916
Rotation-invariant convolutional neural networks for galaxy morphology prediction journal April 2015
Distributed asynchronous deterministic and stochastic gradient optimization algorithms journal September 1986

Cited By (12)

Response to NITRD, NCO, NSF Request for Information on "Update to the 2016 National Artificial Intelligence Research and Development Strategic Plan" preprint January 2019
Parallelizing Training of Deep Generative Models on Massive Scientific Datasets preprint January 2019
Derivation and Analysis of Fast Bilinear Algorithms for Convolution preprint January 2019
Quasar Detection using Linear Support Vector Machine with Learning From Mistakes Methodology text January 2020
Exascale Deep Learning for Climate Analytics conference November 2018
DisCo: Physics-Based Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems conference November 2019
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism preprint January 2020
Learning to Predict the Cosmological Structure Formation text January 2018
HPC AI500: A Benchmark Suite for HPC AI Systems preprint January 2019
Learning to predict the cosmological structure formation journal June 2019
A computational-graph partitioning method for training memory-constrained DNNs journal July 2021
Clairvoyant prefetching for distributed machine learning I/O conference November 2021
  • Dryden, Nikoli; Böhringer, Roman; Ben-Nun, Tal
  • Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, https://doi.org/10.1145/3458817.3476181

Similar Records

Cataloging the visible universe through Bayesian inference in Julia at petascale
Journal Article · January 21, 2019 · Journal of Parallel and Distributed Computing

Cataloging the visible universe through Bayesian inference in Julia at petascale
Journal Article · May 1, 2019 · Journal of Parallel and Distributed Computing

Preparing NERSC users for Cori, a Cray XC40 system with Intel many integrated cores
Journal Article · August 25, 2017 · Concurrency and Computation. Practice and Experience