Adaptive Patching for High-resolution Image Segmentation with Transformers

Zhang, Enzhi; Lyngaas, Isaac; Chen, Peng; Wang, Xiao; Igarashi, Jun; Huo, Yuankai; Wahib, Mohamed; Munetomo, Masaharu

Conference · Fri Nov 01 00:00:00 EDT 2024

OSTI ID:2480031

Zhang, Enzhi ^[1]; Lyngaas, Isaac ^[2]; Chen, Peng ^[3]; Wang, Xiao ^[2]; Igarashi, Jun ^[4]; Huo, Yuankai ^[5]; Wahib, Mohamed ^[4]; Munetomo, Masaharu ^[1]

[1] Hokkaido University, Japan
[2] ORNL
[3] AIST, Japan
[4] RIKEN Center for Computational Science
[5] Vanderbilt University

Attention-based models are proliferating in image analytics, including segmentation. The standard method of feeding images to transformer encoders is to divide the images into patches and then feed the patches to the model as a linear sequence of tokens. For high-resolution images, e.g. microscopic pathology images, the compute and memory cost of attention, which grows quadratically with the number of patches, prohibits the use of an attention-based model if we are to use the smaller patch sizes that are favorable in segmentation. The existing solutions are to use either custom, complex multi-resolution models or approximate attention schemes. We take inspiration from Adaptive Mesh Refinement (AMR) methods in HPC by adaptively patching the images, as a pre-processing step, based on the image details, which reduces the number of patches fed to the model by orders of magnitude. This method has negligible overhead and, being purely a pre-processing step, works seamlessly with any attention-based model. We demonstrate superior segmentation quality over SoTA segmentation models for real-world pathology datasets while gaining a geomean speedup of 6.9× for resolutions up to 64K², on up to 2,048 GPUs.
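The idea of AMR-style adaptive patching can be illustrated with a small quadtree sketch. This is not the authors' implementation: the abstract does not state how image detail is measured, so the criterion used here (pixel standard deviation within a block), the threshold, and the names `adaptive_patches` and `to_tokens` are all assumptions chosen for illustration. Regions with detail are split recursively down to a minimum patch size, flat regions stay as large patches, and every patch is then downsampled to a common token size so that the sequence can be fed to an unmodified transformer.

```python
import numpy as np

def adaptive_patches(image, min_size=4, threshold=10.0):
    """Quadtree split of a square, power-of-two grayscale image.

    Returns a list of (row, col, size) leaves. The detail test (standard
    deviation above `threshold`) is a hypothetical stand-in criterion.
    """
    leaves = []
    stack = [(0, 0, image.shape[0])]
    while stack:
        r, c, s = stack.pop()
        block = image[r:r + s, c:c + s]
        if s > min_size and block.std() > threshold:
            h = s // 2  # split the block into four quadrants
            stack.extend([(r, c, h), (r, c + h, h),
                          (r + h, c, h), (r + h, c + h, h)])
        else:
            leaves.append((r, c, s))
    return leaves

def to_tokens(image, leaves, token_size=4):
    """Downsample every leaf to token_size x token_size by block averaging."""
    tokens = []
    for r, c, s in leaves:
        f = s // token_size  # downsampling factor for this leaf
        block = image[r:r + s, c:c + s].astype(np.float64)
        tokens.append(block.reshape(token_size, f, token_size, f).mean(axis=(1, 3)))
    return np.stack(tokens)

# Toy input: a flat 64x64 image with a detailed 8x8 checkerboard in one corner.
image = np.zeros((64, 64))
image[:8, :8] = 255 * (np.indices((8, 8)).sum(axis=0) % 2)

leaves = adaptive_patches(image)
tokens = to_tokens(image, leaves)
uniform_count = (64 // 4) ** 2  # patches needed by uniform 4x4 patching

print(len(leaves), "adaptive patches vs", uniform_count, "uniform patches")
# Attention cost scales with the square of the sequence length.
print("attention pair ratio:", (uniform_count / len(leaves)) ** 2)
```

On this toy image the quadtree keeps fine 4×4 patches only inside the detailed corner and covers the rest with a few large patches, so the sequence shrinks from 256 tokens to 13. Because attention cost grows with the square of the sequence length, even a modest reduction in patch count translates into a much larger reduction in attention work.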

Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-00OR22725

OSTI ID: 2480031

Country of Publication: United States

Language: English

Similar Records

SHF: Symmetrical Hierarchical Forest with Pretrained Vision Transformer Encoder for High-Resolution Medical Segmentation

Conference · Sun Nov 30 23:00:00 EST 2025 · OSTI ID:3009458

Automated bone segmentation from dental CBCT images using patch-based sparse representation and convex optimization

Journal Article · Tue Apr 15 00:00:00 EDT 2014 · Medical Physics · OSTI ID:22250792

Semantic Stealth: Crafting Covert Adversarial Patches for Sentiment Classifiers Using Large Language Models

Conference · Fri Nov 01 00:00:00 EDT 2024 · OSTI ID:2480040
