OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Toward Large-Scale Image Segmentation on Summit

Conference

Semantic segmentation of images is an important computer vision task that arises in application domains such as medical imaging, robotic vision, and autonomous vehicles. While these domain-specific image analysis tasks involve relatively small image sizes (~10² × 10²), many applications need to train machine learning models on image data with extents that are orders of magnitude larger (~10⁴ × 10⁴). Training deep neural network (DNN) models on large-extent images is extremely memory-intensive and often exceeds the memory limits of a single graphics processing unit (GPU), the hardware accelerator of choice for computer vision workloads. Here, an efficient, sample-parallel approach to train U-Net models on large-extent image data sets is presented. Its advantages and limitations are analyzed, and near-linear strong-scaling speedup is demonstrated on 256 nodes (1536 GPUs) of the Summit supercomputer. Using a single node of the Summit supercomputer, an early evaluation of a recently released model-parallel framework called GPipe is demonstrated to deliver ~2× speedup in executing a U-Net model with an order of magnitude more trainable parameters than reported before. Performance bottlenecks for pipelined training of U-Net models are identified, and mitigation strategies to improve the speedups are discussed. Together, these results open up the possibility of combining both approaches into a unified, scalable, pipelined and data-parallel algorithm to efficiently train U-Net models with very large receptive fields on data sets of ultra-large-extent images.
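
To give a rough sense of why image extent drives the memory pressure described in the abstract, the following Python sketch estimates the size of a single activation tensor at the two image extents mentioned. The channel count (64), the float32 storage format, and the 16 GiB GPU memory budget are assumptions chosen for illustration and are not taken from the record; `feature_map_bytes` is a hypothetical helper, not part of any framework named above.

```python
# Back-of-the-envelope estimate of activation memory versus image extent.
# All constants below are illustrative assumptions, not values from the paper.

GIB = 1024 ** 3
ASSUMED_GPU_MEMORY_GIB = 16   # assumed per-GPU memory budget
ASSUMED_CHANNELS = 64         # assumed width of one convolutional layer
BYTES_PER_FLOAT32 = 4


def feature_map_bytes(height, width, channels=ASSUMED_CHANNELS,
                      bytes_per_value=BYTES_PER_FLOAT32):
    """Bytes needed to store one activation tensor of shape (channels, height, width)."""
    return height * width * channels * bytes_per_value


if __name__ == "__main__":
    for extent in (10**2, 10**4):
        gib = feature_map_bytes(extent, extent) / GIB
        fits = "fits within" if gib <= ASSUMED_GPU_MEMORY_GIB else "exceeds"
        print(f"{extent} x {extent}: {gib:.4f} GiB for one feature map, "
              f"{fits} the assumed {ASSUMED_GPU_MEMORY_GIB} GiB budget")
```

Under these assumptions a single feature map at the larger extent already exceeds the assumed budget before weights, gradients, or the other layers' activations are counted, which is the kind of constraint that motivates splitting the work across GPUs.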

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1665994
Resource Relation:
Conference: 49th International Conference on Parallel Processing (ICPP), Vancouver, Canada, 8/17/2020 - 8/20/2020
Country of Publication:
United States
Language:
English

Similar Records

Distributed Halide
Journal Article · Fri Jan 01 00:00:00 EST 2016 · SIGPLAN · OSTI ID:1665994

Exploring flexible communications for streamlining DNN ensemble training pipelines
Conference · Thu Nov 01 00:00:00 EDT 2018 · OSTI ID:1665994

Exploring Flexible Communications for Streamlining DNN Ensemble Training Pipelines
Technical Report · Wed Mar 28 00:00:00 EDT 2018 · OSTI ID:1665994
