Improved Architectures and Training Algorithms for Deep Operator Networks
- Univ. of Pennsylvania, Philadelphia, PA (United States); University of Pennsylvania
- Univ. of Pennsylvania, Philadelphia, PA (United States)
Operator learning techniques have recently emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces. Trained under appropriate constraints, they can also be effective in learning the solution operator of partial differential equations (PDEs) in an entirely self-supervised manner. In this work we analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel theory, and reveal a bias that favors the approximation of functions with larger magnitudes. To correct this bias we propose to adaptively re-weight the importance of each training example, and demonstrate how this procedure can effectively balance the magnitude of back-propagated gradients during training via gradient descent. We also propose a novel network architecture that is more resilient to vanishing gradient pathologies. Taken together, our developments provide new insights into the training of DeepONets and consistently improve their predictive accuracy by a factor of 10-50x, demonstrated in the challenging setting of learning PDE solution operators in the absence of paired input-output observations.
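The abstract does not give the re-weighting formula, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each training example has a scalar "scale" (here its current mean-squared error; an NTK-derived quantity could be substituted) and assigns weights inversely proportional to that scale, so that large-magnitude functions no longer dominate the loss. The function names `balancing_weights`, `per_example_mse`, and `weighted_loss`, and the exponent `alpha`, are hypothetical.

```python
def balancing_weights(scales, alpha=1.0, eps=1e-12):
    """Hypothetical weights: proportional to 1 / scale**alpha, normalized to mean 1."""
    raw = [1.0 / (s + eps) ** alpha for s in scales]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


def per_example_mse(preds, targets):
    """Mean-squared error of each example (one sampled function per row)."""
    return [
        sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
        for pred, target in zip(preds, targets)
    ]


def weighted_loss(per_example, weights):
    """Average of the per-example losses after re-weighting."""
    return sum(w * l for w, l in zip(weights, per_example)) / len(per_example)


# Two target functions sampled at three points, differing in magnitude by 100x.
targets = [[1.0, -1.0, 1.0], [100.0, -100.0, 100.0]]
preds = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

losses = per_example_mse(preds, targets)  # [1.0, 10000.0]: the second example dominates
weights = balancing_weights(losses)       # assumption: scale = current per-example loss
contributions = [w * l for w, l in zip(weights, losses)]
total = weighted_loss(losses, weights)
```

With `alpha=1.0` both examples contribute essentially equally to `total`, whereas the unweighted loss is dominated by the larger function. In an actual training loop the weights would be recomputed periodically and treated as constants when differentiating the loss.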
- Research Organization:
- Univ. of Pennsylvania, Philadelphia, PA (United States)
- Sponsoring Organization:
- US Air Force Office of Scientific Research (AFOSR); USDOE Advanced Research Projects Agency - Energy (ARPA-E)
- Grant/Contract Number:
- SC0019116
- OSTI ID:
- 2339531
- Journal Information:
- Journal of Scientific Computing, Vol. 92, Issue 2; ISSN 0885-7474
- Publisher:
- Springer
- Country of Publication:
- United States
- Language:
- English