Accelerating gradient descent and Adam via fractional gradients
- Korea Advanced Inst. of Science and Technology (KAIST), Daejeon (Korea, Republic of); Brown Univ., Providence, RI (United States)
- Brown Univ., Providence, RI (United States)
Here we propose a novel class of fractional-order optimization algorithms. We define a fractional-order gradient via Caputo fractional derivatives that generalizes the integer-order gradient. We refer to it as the Caputo fractional-based gradient and develop an efficient implementation to compute it. A general class of fractional-order optimization methods is then obtained by replacing integer-order gradients with Caputo fractional-based gradients. To give concrete algorithms, we consider gradient descent (GD) and Adam, and extend them to the Caputo fractional GD (CfGD) and the Caputo fractional Adam (CfAdam). We demonstrate the superiority of CfGD and CfAdam on several large-scale optimization problems that arise in scientific machine learning applications, such as ill-conditioned least squares problems on real-world data and the training of neural networks with non-convex objective functions. Numerical examples show that both CfGD and CfAdam accelerate over GD and Adam, respectively. We also derive error bounds of CfGD for quadratic functions, which further indicate that CfGD could mitigate the dependence on the condition number in the rate of convergence and result in significant acceleration over GD.
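As context for the abstract, the display below is a minimal sketch: the standard one-dimensional Caputo fractional derivative of order α ∈ (0, 1), a componentwise fractional gradient built from it, and the generic CfGD-style update in which this fractional gradient replaces the usual gradient. The componentwise form, the lower terminals a_i, and the step size η are illustrative assumptions; the record above does not specify the authors' exact construction or their efficient implementation.

\[
{}^{C}\!D_{a}^{\alpha} f(x)
  = \frac{1}{\Gamma(1-\alpha)} \int_{a}^{x} \frac{f'(t)}{(x-t)^{\alpha}}\, dt,
  \qquad 0 < \alpha < 1,
\]
\[
\bigl(\nabla^{\alpha} f(x)\bigr)_{i}
  = \frac{1}{\Gamma(1-\alpha)}
    \int_{a_i}^{x_i}
    \frac{\partial_i f(x_1,\dots,x_{i-1},t,x_{i+1},\dots,x_d)}{(x_i - t)^{\alpha}}\, dt,
  \qquad
  x_{k+1} = x_k - \eta\, \nabla^{\alpha} f(x_k).
\]

For α → 1 the Caputo derivative recovers the ordinary derivative, so the update above reduces to standard GD; CfAdam is obtained analogously by feeding the fractional gradient into the Adam moment estimates.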
- Research Organization:
- Brown Univ., Providence, RI (United States)
- Sponsoring Organization:
- USDOE; US Army Research Office (ARO); US Air Force Office of Scientific Research (AFOSR)
- Grant/Contract Number:
- SC0019453
- OSTI ID:
- 2282013
- Journal Information:
- Neural Networks, Vol. 161; ISSN 0893-6080
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Similar Records
Convergence of Hyperbolic Neural Networks Under Riemannian Stochastic Gradient Descent
Stochastic gradient descent algorithm for stochastic optimization in solving analytic continuation problems