Adaptive Activation Functions Accelerate Convergence in Deep and Physics-informed Neural Networks
Abstract
We employ adaptive activation functions for regression in deep and physics-informed neural networks (PINNs) to approximate smooth and discontinuous functions as well as solutions of linear and nonlinear partial differential equations. In particular, we solve the nonlinear Klein-Gordon equation, which has smooth solutions, the nonlinear Burgers equation, which can admit high-gradient solutions, and the Helmholtz equation. We introduce a scalable hyperparameter in the activation function, which can be optimized to achieve best performance of the network as it changes dynamically the topology of the loss function involved in the optimization process. The adaptive activation function has better learning capabilities than the traditional one (fixed activation) as it improves greatly the convergence rate, especially at early training, as well as the solution accuracy. To better understand the learning process, we plot the neural network solution in the frequency domain to examine how the network captures successively different frequency bands present in the solution. We consider both forward problems, where the approximate solutions are obtained, as well as inverse problems, where parameters involved in the governing equation are identified. Our simulation results show that the proposed method is a very simple and effective approach to increase the efficiency, robustness and accuracy of the neural network approximation of nonlinear functions as well as solutions of partial differential equations, especially for forward problems. We theoretically prove that in the proposed method, gradient descent algorithms are not attracted to suboptimal critical points or local minima.
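The "scalable hyperparameter in the activation function" described above can be illustrated with a minimal sketch: the activation takes the form σ(n·a·x), where n ≥ 1 is a fixed scale factor and a is a trainable slope parameter optimized by gradient descent alongside the network weights. The toy fit below (names and values are illustrative assumptions, not the authors' code) trains only the slope a of a single adaptive-tanh unit to match a target of known slope:

```python
import numpy as np

# Sketch of the adaptive-activation idea from the abstract: sigma(n * a * x)
# with fixed scale n and trainable slope a. Here we train only a, by plain
# gradient descent, to recover a target effective slope n * a = 2.

def adaptive_tanh(x, a, n=5):
    """tanh activation with fixed scale factor n and trainable slope a."""
    return np.tanh(n * a * x)

def fit_slope(x, y, n=5, lr=0.05, steps=2000):
    """Gradient descent on a for the loss L = mean((tanh(n*a*x) - y)^2)."""
    a = 0.1  # small initial slope; n * a starts at 0.5
    for _ in range(steps):
        z = adaptive_tanh(x, a, n)
        # dL/da, using d tanh(u)/du = 1 - tanh(u)^2 and du/da = n * x
        grad = np.mean(2.0 * (z - y) * (1.0 - z ** 2) * n * x)
        a -= lr * grad
    return a

x = np.linspace(-1.0, 1.0, 200)
y = np.tanh(2.0 * x)        # target generated with effective slope 2
a_hat = fit_slope(x, y, n=5)
# the learned effective slope n * a_hat should approach 2
```

In the paper's setting a is trained jointly with the weights and biases, and the factor n rescales its gradient, which is what accelerates convergence relative to a fixed activation.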
 Authors:
 Jagtap, Ameya (Brown University); Kawaguchi, Kenji (Massachusetts Institute of Technology); Karniadakis, George E. (Brown University)
 Publication Date:
 March 2020
 Research Org.:
 Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
 Sponsoring Org.:
 USDOE
 OSTI Identifier:
 1617451
 Report Number(s):
 PNNL-SA-152708
 DOE Contract Number:
 AC05-76RL01830
 Resource Type:
 Journal Article
 Journal Name:
 Journal of Computational Physics
 Additional Journal Information:
 Journal Volume: 404
 Country of Publication:
 United States
 Language:
 English
 Subject:
 machine learning, bad minima, inverse problems, physics-informed neural networks, partial differential equations, deep learning benchmarks
Citation Formats
Jagtap, Ameya, Kawaguchi, Kenji, and Karniadakis, George E. Adaptive Activation Functions Accelerate Convergence in Deep and Physics-informed Neural Networks. United States: N. p., 2020.
Web. doi:10.1016/j.jcp.2019.109136.
Jagtap, Ameya, Kawaguchi, Kenji, & Karniadakis, George E. Adaptive Activation Functions Accelerate Convergence in Deep and Physics-informed Neural Networks. United States. doi:10.1016/j.jcp.2019.109136.
Jagtap, Ameya, Kawaguchi, Kenji, and Karniadakis, George E. 2020.
"Adaptive Activation Functions Accelerate Convergence in Deep and Physics-informed Neural Networks". United States. doi:10.1016/j.jcp.2019.109136.
@article{osti_1617451,
title = {Adaptive Activation Functions Accelerate Convergence in Deep and Physics-informed Neural Networks},
author = {Jagtap, Ameya and Kawaguchi, Kenji and Karniadakis, George E.},
abstractNote = {We employ adaptive activation functions for regression in deep and physics-informed neural networks (PINNs) to approximate smooth and discontinuous functions as well as solutions of linear and nonlinear partial differential equations. In particular, we solve the nonlinear Klein-Gordon equation, which has smooth solutions, the nonlinear Burgers equation, which can admit high-gradient solutions, and the Helmholtz equation. We introduce a scalable hyperparameter in the activation function, which can be optimized to achieve best performance of the network as it changes dynamically the topology of the loss function involved in the optimization process. The adaptive activation function has better learning capabilities than the traditional one (fixed activation) as it improves greatly the convergence rate, especially at early training, as well as the solution accuracy. To better understand the learning process, we plot the neural network solution in the frequency domain to examine how the network captures successively different frequency bands present in the solution. We consider both forward problems, where the approximate solutions are obtained, as well as inverse problems, where parameters involved in the governing equation are identified. Our simulation results show that the proposed method is a very simple and effective approach to increase the efficiency, robustness and accuracy of the neural network approximation of nonlinear functions as well as solutions of partial differential equations, especially for forward problems. We theoretically prove that in the proposed method, gradient descent algorithms are not attracted to suboptimal critical points or local minima.},
doi = {10.1016/j.jcp.2019.109136},
journal = {Journal of Computational Physics},
volume = 404,
place = {United States},
year = {2020},
month = {3}
}