Scalable and accurate multi-GPU-based image reconstruction of large-scale ptychography data
Abstract
While the advances in synchrotron light sources, together with the development of focusing optics and detectors, allow nanoscale ptychographic imaging of materials and biological specimens, the corresponding experiments can yield terabyte-scale volumes of data that can impose a heavy burden on the computing platform. Although graphics processing units (GPUs) provide high performance for such large-scale ptychography datasets, a single GPU is typically insufficient for analysis and reconstruction. Several works have considered leveraging multiple GPUs to accelerate the ptychographic reconstruction. However, most of these works utilize only the Message Passing Interface to handle the communications between GPUs. This approach poses inefficiency for a hardware configuration that has multiple GPUs in a single node, especially while reconstructing a single large projection, since it provides no optimizations to handle the heterogeneous GPU interconnections containing both low-speed (e.g., PCIe) and high-speed links (e.g., NVLink). In this paper, we provide an optimized intranode multi-GPU implementation that can efficiently solve large-scale ptychographic reconstruction problems. We focus on the maximum likelihood reconstruction problem using a conjugate gradient (CG) method for the solution and propose a novel hybrid parallelization model to address the performance bottlenecks in the CG solver. Accordingly, we have developed a tool, called PtyGer (Ptychographic GPU(multiple)-based reconstruction), implementing our hybrid parallelization model design. A comprehensive evaluation verifies that PtyGer can fully preserve the original algorithm’s accuracy while achieving outstanding intranode GPU scalability.
- Authors:
- Yu, Xiaodong; Nikitin, Viktor; Ching, Daniel J.; Aslan, Selin; Gürsoy, Doğa; Biçer, Tekin
- Publication Date:
- 2022-03-29
- Research Org.:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE Office of Science (SC), Basic Energy Sciences (BES); USDOE National Nuclear Security Administration (NNSA); US Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA)
- OSTI Identifier:
- 1860447
- Alternate Identifier(s):
- OSTI ID: 1901717
- Grant/Contract Number:
- AC02-06CH11357; 89233218CNA000001; D2019-1903270004
- Resource Type:
- Published Article
- Journal Name:
- Scientific Reports
- Additional Journal Information:
- Journal Name: Scientific Reports; Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 2045-2322
- Publisher:
- Nature Publishing Group
- Country of Publication:
- United Kingdom
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Yu, Xiaodong, Nikitin, Viktor, Ching, Daniel J., Aslan, Selin, Gürsoy, Doğa, and Biçer, Tekin. Scalable and accurate multi-GPU-based image reconstruction of large-scale ptychography data. United Kingdom: N. p., 2022. Web. doi:10.1038/s41598-022-09430-3.
Yu, Xiaodong, Nikitin, Viktor, Ching, Daniel J., Aslan, Selin, Gürsoy, Doğa, & Biçer, Tekin. Scalable and accurate multi-GPU-based image reconstruction of large-scale ptychography data. United Kingdom. https://doi.org/10.1038/s41598-022-09430-3
Yu, Xiaodong, Nikitin, Viktor, Ching, Daniel J., Aslan, Selin, Gürsoy, Doğa, and Biçer, Tekin. 2022. "Scalable and accurate multi-GPU-based image reconstruction of large-scale ptychography data". United Kingdom. https://doi.org/10.1038/s41598-022-09430-3.
@article{osti_1860447,
title = {Scalable and accurate multi-GPU-based image reconstruction of large-scale ptychography data},
author = {Yu, Xiaodong and Nikitin, Viktor and Ching, Daniel J. and Aslan, Selin and Gürsoy, Doğa and Biçer, Tekin},
abstractNote = {While the advances in synchrotron light sources, together with the development of focusing optics and detectors, allow nanoscale ptychographic imaging of materials and biological specimens, the corresponding experiments can yield terabyte-scale volumes of data that can impose a heavy burden on the computing platform. Although graphics processing units (GPUs) provide high performance for such large-scale ptychography datasets, a single GPU is typically insufficient for analysis and reconstruction. Several works have considered leveraging multiple GPUs to accelerate the ptychographic reconstruction. However, most of these works utilize only the Message Passing Interface to handle the communications between GPUs. This approach poses inefficiency for a hardware configuration that has multiple GPUs in a single node, especially while reconstructing a single large projection, since it provides no optimizations to handle the heterogeneous GPU interconnections containing both low-speed (e.g., PCIe) and high-speed links (e.g., NVLink). In this paper, we provide an optimized intranode multi-GPU implementation that can efficiently solve large-scale ptychographic reconstruction problems. We focus on the maximum likelihood reconstruction problem using a conjugate gradient (CG) method for the solution and propose a novel hybrid parallelization model to address the performance bottlenecks in the CG solver. Accordingly, we have developed a tool, called PtyGer (Ptychographic GPU(multiple)-based reconstruction), implementing our hybrid parallelization model design. A comprehensive evaluation verifies that PtyGer can fully preserve the original algorithm’s accuracy while achieving outstanding intranode GPU scalability.},
doi = {10.1038/s41598-022-09430-3},
journal = {Scientific Reports},
number = 1,
volume = 12,
place = {United Kingdom},
year = {2022},
month = {3}
}
https://doi.org/10.1038/s41598-022-09430-3