
Adaptive activation functions accelerate convergence in deep and physics-informed neural networks

Journal Article · Journal of Computational Physics
Jagtap, Ameya D. [1]; Kawaguchi, Kenji [2]; Karniadakis, George Em [3]
  1. Brown Univ., Providence, RI (United States)
  2. Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)
  3. Brown Univ., Providence, RI (United States); Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)

Here we employ adaptive activation functions for regression in deep and physics-informed neural networks (PINNs) to approximate smooth and discontinuous functions as well as solutions of linear and nonlinear partial differential equations. In particular, we solve the nonlinear Klein-Gordon equation, which has smooth solutions; the nonlinear Burgers equation, which can admit high-gradient solutions; and the Helmholtz equation. We introduce a scalable hyper-parameter in the activation function, which can be optimized to achieve the best performance of the network, as it dynamically changes the topology of the loss function involved in the optimization process. The adaptive activation function has better learning capabilities than a traditional fixed activation, as it greatly improves the convergence rate, especially during early training, as well as the solution accuracy. To better understand the learning process, we plot the neural network solution in the frequency domain to examine how the network successively captures the different frequency bands present in the solution. We consider both forward problems, where approximate solutions are obtained, and inverse problems, where parameters involved in the governing equation are identified. Our simulation results show that the proposed method is a simple and effective approach for increasing the efficiency, robustness, and accuracy of the neural network approximation of nonlinear functions as well as of solutions of partial differential equations, especially for forward problems. We theoretically prove that, in the proposed method, gradient descent algorithms are not attracted to suboptimal critical points or local minima. Furthermore, the proposed adaptive activation functions are shown to accelerate the minimization of the loss in standard deep learning benchmarks on the CIFAR-10, CIFAR-100, SVHN, MNIST, KMNIST, Fashion-MNIST, and Semeion datasets, with and without data augmentation.
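As a concrete illustration of the idea summarized in the abstract (a trainable scale parameter inside each activation, optimized jointly with the network weights), here is a minimal sketch in PyTorch. The class name AdaptiveTanh, the parameter names a and n, the initialization, and the network layout are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class AdaptiveTanh(nn.Module):
    """Tanh with a trainable input scale: x -> tanh(n * a * x).

    `a` is a trainable parameter updated by gradient descent along with
    the weights; `n` is a fixed scaling factor. Names and defaults are
    illustrative assumptions based on the abstract's description of a
    "scalable hyper-parameter" in the activation.
    """
    def __init__(self, n: float = 10.0, a_init: float = 0.1):
        super().__init__()
        self.n = n  # fixed scaling factor (assumption)
        self.a = nn.Parameter(torch.tensor(a_init))  # trainable scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # With a_init = 1/n, this starts out as plain tanh.
        return torch.tanh(self.n * self.a * x)

# A small fully connected regression network using the adaptive activation.
model = nn.Sequential(
    nn.Linear(1, 50), AdaptiveTanh(),
    nn.Linear(50, 50), AdaptiveTanh(),
    nn.Linear(50, 1),
)

# The trainable scales appear in model.parameters(), so a standard
# optimizer updates them together with the weights and biases.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

Since a rescales the input of tanh, updating it adjusts the effective slope of the nonlinearity at every training step, which is one way to read the abstract's claim that the hyper-parameter dynamically changes the topology of the loss function; choosing a_init = 1/n makes the network coincide with a standard tanh network at initialization.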

Research Organization:
Brown Univ., Providence, RI (United States)
Sponsoring Organization:
USDOE; Defense Advanced Research Projects Agency (DARPA)
Grant/Contract Number:
SC0019453
OSTI ID:
2282002
Alternate ID(s):
OSTI ID: 1617451
OSTI ID: 1775904
Journal Information:
Journal of Computational Physics, Vol. 404; ISSN 0021-9991
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

Similar Records

Adaptive Activation Functions Accelerate Convergence in Deep and Physics-informed Neural Networks
Journal Article · February 29, 2020 · Journal of Computational Physics · OSTI ID: 1617451

Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems
Journal Article · March 18, 2022 · Computer Methods in Applied Mechanics and Engineering · OSTI ID: 1976976

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions
Journal Article · October 14, 2021 · Neurocomputing · OSTI ID: 1977480