UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation
Unpaired image-to-image translation has broad applications in art, design, and scientific simulations. One early breakthrough was CycleGAN, which emphasizes one-to-one mappings between two unpaired image domains via generative adversarial networks (GANs) coupled with the cycle-consistency constraint, while more recent works promote one-to-many mappings to boost the diversity of the translated images. Motivated by scientific simulation and one-to-one needs, this work revisits the classic CycleGAN framework and boosts its performance to outperform more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated images. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained model are available at https://github.com/LS4GAN/uvcgan.
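The cycle-consistency constraint mentioned in the abstract can be illustrated with a minimal sketch. It follows the standard CycleGAN formulation (an L1 penalty on the round trip A -> B -> A and B -> A -> B) and is not taken from the UVCGAN repository. The names `l1_distance`, `cycle_consistency_loss`, `gen_ab`, and `gen_ba` are hypothetical, images are represented as flat lists of floats purely for illustration, and the toy "generators" are simple brightness shifts rather than neural networks.

```python
from typing import Callable, Sequence

# Assumption: an "image" is a flat sequence of pixel values, for illustration only.
Image = Sequence[float]
Generator = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(
    batch_a: Sequence[Image],
    batch_b: Sequence[Image],
    gen_ab: Generator,
    gen_ba: Generator,
) -> float:
    """Average L1 error of the round trips A -> B -> A and B -> A -> B."""
    loss_a = sum(l1_distance(gen_ba(gen_ab(x)), x) for x in batch_a) / len(batch_a)
    loss_b = sum(l1_distance(gen_ab(gen_ba(y)), y) for y in batch_b) / len(batch_b)
    return loss_a + loss_b


# Toy stand-ins for the two generators (hypothetical, not neural networks).
def brighten(img: Image) -> list:
    return [p + 0.5 for p in img]


def darken(img: Image) -> list:
    return [p - 0.5 for p in img]


def identity(img: Image) -> list:
    return list(img)


batch_a = [[0.0, 0.25], [0.5, 1.0]]
batch_b = [[0.5, 0.75], [1.0, 1.5]]

# Exact inverses give a zero loss; a mismatched pair is penalized.
consistent = cycle_consistency_loss(batch_a, batch_b, brighten, darken)
inconsistent = cycle_consistency_loss(batch_a, batch_b, brighten, identity)
print(consistent, inconsistent)
```

In an actual GAN setup this term would be added to the adversarial losses, and driving it toward zero is what pushes the two generators toward a one-to-one mapping between the domains.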
- Research Organization: Brookhaven National Laboratory (BNL), Upton, NY (United States)
- Sponsoring Organization: Laboratory-Directed Research and Development (LDRD)
- DOE Contract Number: SC0012704
- OSTI ID: 1895074
- Report Number(s): BNL-223609-2022-COPA
- Country of Publication: United States
- Language: English