DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Improved Architectures and Training Algorithms for Deep Operator Networks

Journal Article · Journal of Scientific Computing
 [1]; [2]; [2]
  1. Univ. of Pennsylvania, Philadelphia, PA (United States)
  2. Univ. of Pennsylvania, Philadelphia, PA (United States)

Operator learning techniques have recently emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces. Trained under appropriate constraints, they can also be effective in learning the solution operator of partial differential equations (PDEs) in an entirely self-supervised manner. In this work we analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel theory, and reveal a bias that favors the approximation of functions with larger magnitudes. To correct this bias we propose to adaptively re-weight the importance of each training example, and demonstrate how this procedure can effectively balance the magnitude of back-propagated gradients during training via gradient descent. We also propose a novel network architecture that is more resilient to vanishing gradient pathologies. Taken together, our developments provide new insights into the training of DeepONets and consistently improve their predictive accuracy by a factor of 10-50x, demonstrated in the challenging setting of learning PDE solution operators in the absence of paired input-output observations.
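The re-weighting idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: this record does not state their weighting rule. The sketch assumes a hypothetical rule in which each training example is weighted inversely to the diagonal entry of the Neural Tangent Kernel at that example, K_ii = ||grad_theta f(x_i)||^2, so that examples with large gradient magnitudes no longer dominate training. The function names, the exponent `alpha`, and the toy gradients are all illustrative assumptions.

```python
def ntk_diagonal(grads):
    """Diagonal NTK entries: K_ii = ||grad_theta f(x_i)||^2 per example."""
    return [sum(g * g for g in grad) for grad in grads]


def adaptive_weights(ntk_diag, alpha=1.0, eps=1e-12):
    """Hypothetical rule (an assumption, not taken from the paper):
    lambda_i = (max_j K_jj / K_ii) ** alpha.
    eps guards against division by zero."""
    k_max = max(ntk_diag)
    return [(k_max / (k + eps)) ** alpha for k in ntk_diag]


def weighted_mse(residuals, weights):
    """Weighted mean squared error over the training examples."""
    return sum(w * r * r for w, r in zip(weights, residuals)) / len(residuals)


# Toy data: parameter gradients of the model output at three training examples.
grads = [[1.0, 0.0], [0.0, 10.0], [3.0, 4.0]]
diag = ntk_diagonal(grads)
weights = adaptive_weights(diag)

# After re-weighting, weight * K_ii is approximately equal for every example.
balanced = [w * k for w, k in zip(weights, diag)]
loss = weighted_mse([0.5, 0.5, 0.5], weights)
```

In this toy case the example with the smallest kernel entry receives the largest weight, and the products in `balanced` come out nearly equal, which is the sense in which such a scheme balances gradient magnitudes across examples.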

Research Organization:
Univ. of Pennsylvania, Philadelphia, PA (United States)
Sponsoring Organization:
US Air Force Office of Scientific Research (AFOSR); USDOE Advanced Research Projects Agency - Energy (ARPA-E)
Grant/Contract Number:
SC0019116
OSTI ID:
2339531
Journal Information:
Journal of Scientific Computing, Vol. 92, Issue 2; ISSN 0885-7474
Publisher:
Springer
Country of Publication:
United States
Language:
English
