OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Ultrafast convolution/superposition using tabulated and exponential kernels on GPU

Abstract

Purpose: Collapsed-cone convolution/superposition (CCCS) is the workhorse algorithm for IMRT dose calculation. The authors present a novel algorithm for computing CCCS dose on a modern graphics processing unit (GPU). Methods: The GPU algorithm includes a novel TERMA calculation that has no write conflicts and has linear computational complexity. The CCCS algorithm uses either tabulated or exponential cumulative-cumulative kernels (CCKs), as reported in the literature. The authors demonstrate that exponential kernels reduce the computational complexity by the order of one dimension while achieving excellent accuracy. Special attention is paid to the unique architecture of the GPU, especially the memory access pattern, which increases performance more than tenfold. Results: The tabulated-kernel implementation on the GPU is two to three times faster than other GPU implementations reported in the literature. The CCCS implementation shows significant speedup on the GPU over a single-core CPU: speedups as high as 70 are observed with the tabulated CCK and as high as 90 with the exponential CCK. Conclusions: Overall, the GPU algorithm using the exponential CCK is 1000-3000 times faster than a highly optimized single-threaded CPU implementation using the tabulated CCK, while dose differences are within 0.5% and 0.5 mm. This ultrafast CCCS algorithm will allow many time-sensitive applications to use accurate dose calculation.

Authors:
Quan, Chen; Mingli, Chen; Weiguo, Lu [1]
  1. TomoTherapy Inc., 1240 Deming Way, Madison, Wisconsin 53717 (United States)
Publication Date:
2011-03-15
OSTI Identifier:
22098540
Resource Type:
Journal Article
Journal Name:
Medical Physics
Additional Journal Information:
Journal Volume: 38; Journal Issue: 3; Other Information: (c) 2011 American Association of Physicists in Medicine; Country of input: International Atomic Energy Agency (IAEA); Journal ID: ISSN 0094-2405
Country of Publication:
United States
Language:
English
Subject:
62 RADIOLOGY AND NUCLEAR MEDICINE; 61 RADIATION PROTECTION AND DOSIMETRY; ACCURACY; ALGORITHMS; CALCULATION METHODS; COMPUTER GRAPHICS; DOSIMETRY; KERNELS; PERFORMANCE; PLANNING; RADIATION DOSES; RADIOTHERAPY

Citation Formats

Quan, Chen, Mingli, Chen, and Weiguo, Lu. Ultrafast convolution/superposition using tabulated and exponential kernels on GPU. United States: N. p., 2011. Web. doi:10.1118/1.3551996.
Quan, Chen, Mingli, Chen, & Weiguo, Lu. Ultrafast convolution/superposition using tabulated and exponential kernels on GPU. United States. https://doi.org/10.1118/1.3551996
Quan, Chen, Mingli, Chen, and Weiguo, Lu. 2011. "Ultrafast convolution/superposition using tabulated and exponential kernels on GPU". United States. https://doi.org/10.1118/1.3551996.
@article{osti_22098540,
title = {Ultrafast convolution/superposition using tabulated and exponential kernels on GPU},
author = {Quan, Chen and Mingli, Chen and Weiguo, Lu},
abstractNote = {Purpose: Collapsed-cone convolution/superposition (CCCS) is the workhorse algorithm for IMRT dose calculation. The authors present a novel algorithm for computing CCCS dose on a modern graphics processing unit (GPU). Methods: The GPU algorithm includes a novel TERMA calculation that has no write conflicts and has linear computational complexity. The CCCS algorithm uses either tabulated or exponential cumulative-cumulative kernels (CCKs), as reported in the literature. The authors demonstrate that exponential kernels reduce the computational complexity by the order of one dimension while achieving excellent accuracy. Special attention is paid to the unique architecture of the GPU, especially the memory access pattern, which increases performance more than tenfold. Results: The tabulated-kernel implementation on the GPU is two to three times faster than other GPU implementations reported in the literature. The CCCS implementation shows significant speedup on the GPU over a single-core CPU: speedups as high as 70 are observed with the tabulated CCK and as high as 90 with the exponential CCK. Conclusions: Overall, the GPU algorithm using the exponential CCK is 1000-3000 times faster than a highly optimized single-threaded CPU implementation using the tabulated CCK, while dose differences are within 0.5% and 0.5 mm. This ultrafast CCCS algorithm will allow many time-sensitive applications to use accurate dose calculation.},
doi = {10.1118/1.3551996},
url = {https://www.osti.gov/biblio/22098540},
journal = {Medical Physics},
issn = {0094-2405},
number = 3,
volume = 38,
place = {United States},
year = {2011},
month = {3}
}