Title: Cataloging the visible universe through Bayesian inference in Julia at petascale

Journal Article · Journal of Parallel and Distributed Computing

A key task in astronomy is to locate astronomical objects in images and to characterize them according to physical parameters such as brightness, color, and morphology. This task, known as cataloging, is challenging for several reasons: many astronomical objects are much dimmer than the sky background, labeled data is generally unavailable, overlapping astronomical objects must be resolved collectively, and the datasets are enormous – terabytes now, petabytes soon. In this work, we infer an astronomical catalog from 55 TB of imaging data using Celeste, a Bayesian variational inference code written entirely in the high-productivity programming language Julia. Using over 1.3 million threads on 650,000 Intel Xeon Phi cores of the Cori Phase II supercomputer, Celeste achieves a peak rate of 1.54 DP PFLOP/s. Celeste is able to jointly optimize parameters for 188 M stars and galaxies, loading and processing 178 TB across 8192 nodes in 14.6 min. To achieve this, Celeste exploits parallelism at multiple levels (cluster, node, and thread) and accelerates I/O through Cori’s burst buffer. Julia’s native performance enables Celeste to employ high-level constructs without resorting to hand-written or generated low-level code (C/C++/Fortran) while still achieving petascale performance.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
AC02-05CH11231
OSTI ID:
1527362
Journal Information:
Journal of Parallel and Distributed Computing, Vol. 127, Issue C; ISSN 0743-7315
Country of Publication:
United States
Language:
English
