U.S. Department of Energy
Office of Scientific and Technical Information

SAGIPS: a physics-inspired scalable asynchronous generative inverse-problem solver

Journal Article · Machine Learning: Science and Technology
Solving large-scale inverse problems with deep-learning algorithms has become an essential part of modern research and industrial applications. The complexity of the underlying inverse problem may require the use of high-performance computing systems, which poses a challenge for the algorithmic design of the inverse problem solver. By design, most deep-learning algorithms require custom parallelization techniques in order to be resource efficient while still converging reasonably. In this paper we introduce a Scalable Asynchronous Generative Inverse Problem Solver (SAGIPS) for high-performance computing systems. We present a workflow that uses an asynchronous ring-allreduce algorithm to transfer the gradients of the generator network across multiple GPUs. Experiments with a scientific proxy application demonstrate that SAGIPS shows near-linear weak scaling, together with a convergence quality that is comparable to traditional methods. The approach presented here allows Generative Adversarial Networks to be leveraged across multiple GPUs, promising advancements in solving complex inverse problems at scale.
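For readers unfamiliar with the ring-allreduce pattern named in the abstract, the following is a minimal, single-process sketch of the textbook (synchronous) variant. It is not the SAGIPS implementation: the asynchronous scheduling described in the paper is not modeled, and the function names `split_chunks` and `ring_allreduce` are hypothetical, chosen only for this illustration. Each "worker" stands in for one GPU holding a local gradient vector.

```python
from typing import List


def split_chunks(vec: List[float], n: int) -> List[List[float]]:
    """Split vec into n contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(vec), n)
    chunks, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        chunks.append(list(vec[start:start + size]))
        start += size
    return chunks


def ring_allreduce(grads: List[List[float]]) -> List[List[float]]:
    """Simulate a synchronous ring-allreduce (sum) over equal-length vectors.

    grads[r] is the local gradient of worker r. Returns one vector per
    worker; every worker ends up with the element-wise sum.
    Illustrative assumption: workers are simulated in a single process.
    """
    n = len(grads)
    # data[r][c] is chunk c as currently held by worker r.
    data = [split_chunks(g, n) for g in grads]

    # Phase 1 (reduce-scatter): in step s, worker r sends chunk (r - s) mod n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot all messages first so the sends of one step are simultaneous.
        outgoing = [(r, (r - step) % n) for r in range(n)]
        payloads = [list(data[r][c]) for r, c in outgoing]
        for (r, c), payload in zip(outgoing, payloads):
            dst = (r + 1) % n
            data[dst][c] = [a + b for a, b in zip(data[dst][c], payload)]

    # After phase 1, worker r holds the fully reduced chunk (r + 1) mod n.
    # Phase 2 (all-gather): pass the finished chunks around the ring.
    for step in range(n - 1):
        outgoing = [(r, (r + 1 - step) % n) for r in range(n)]
        payloads = [list(data[r][c]) for r, c in outgoing]
        for (r, c), payload in zip(outgoing, payloads):
            data[(r + 1) % n][c] = payload

    # Concatenate each worker's chunks back into a flat vector.
    return [[x for chunk in worker for x in chunk] for worker in data]
```

In this scheme each worker sends and receives 2(n - 1) chunks of roughly 1/n of the vector, so the per-worker traffic stays nearly constant as workers are added. Dividing the result by the number of workers would give an averaged gradient instead of a sum.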
Research Organization:
Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE Office of Science (SC), Nuclear Physics (NP)
Grant/Contract Number:
AC02-06CH11357; AC05-06OR23177; SC0023472
OSTI ID:
2557554
Alternate ID(s):
OSTI ID: 2549470
OSTI ID: 2563545
Report Number(s):
DOE/OR/23177--7637; JLAB-CST--24-4174; arXiv:2407.00051
Journal Information:
Machine Learning: Science and Technology, Vol. 6, Issue 2; ISSN 2632-2153
Publisher:
IOP Publishing
Country of Publication:
United States
Language:
English

