Summary: The chaotic nature of faster gradient descent
Kees van den Doel and Uri Ascher
July 19, 2011
The steepest descent method for large linear systems is well known to
converge very slowly in many cases: the number of iterations required is about
the same as for a gradient descent method with the best constant step size,
and it grows proportionally to the condition number.
Faster gradient descent methods must occasionally resort to significantly larger
step sizes, which in turn yield a markedly non-monotone decrease pattern in the
residual vector norm.
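The dependence of the iteration count on the condition number can be illustrated with a minimal sketch. It is not taken from the paper; the diagonal test matrix, the linearly spaced eigenvalues, the right-hand side of ones, the tolerance, and the function names (`run_gradient_descent`, `steepest_descent_step`, `iteration_counts`) are all assumptions chosen for illustration. A diagonal matrix is used because gradient descent on a symmetric positive definite system behaves the same way in the eigenvector basis.

```python
import math


def run_gradient_descent(eigs, b, step_rule, tol=1e-6, max_iter=100_000):
    """Iterate r <- r - alpha * A r for A = diag(eigs), starting from x = 0.

    Returns the number of iterations needed to reduce the residual norm
    by the factor tol (or max_iter if that is never reached).
    """
    r = list(b)  # residual r = b - A x with x = 0
    r0_norm = math.sqrt(sum(ri * ri for ri in r))
    for k in range(1, max_iter + 1):
        Ar = [lam * ri for lam, ri in zip(eigs, r)]
        alpha = step_rule(r, Ar)
        r = [ri - alpha * ari for ri, ari in zip(r, Ar)]
        if math.sqrt(sum(ri * ri for ri in r)) <= tol * r0_norm:
            return k
    return max_iter


def steepest_descent_step(r, Ar):
    """Exact line search step: alpha = (r . r) / (r . A r)."""
    return sum(ri * ri for ri in r) / sum(ri * ari for ri, ari in zip(r, Ar))


def iteration_counts(kappa, n=50):
    """Return (steepest descent, best constant step) iteration counts."""
    # Assumed test problem: eigenvalues spaced linearly on [1, kappa].
    eigs = [1.0 + (kappa - 1.0) * i / (n - 1) for i in range(n)]
    b = [1.0] * n
    # Best constant step size for gradient descent: 2 / (lambda_min + lambda_max).
    best_const = 2.0 / (eigs[0] + eigs[-1])
    sd = run_gradient_descent(eigs, b, steepest_descent_step)
    const = run_gradient_descent(eigs, b, lambda r, Ar: best_const)
    return sd, const


if __name__ == "__main__":
    for kappa in (10.0, 100.0, 1000.0):
        print(kappa, iteration_counts(kappa))
```

Running the script prints both iteration counts for each condition number, so the two methods can be compared as the condition number grows.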
We show that such faster gradient descent methods in fact generate chaotic
dynamical systems for the normalized residual vectors. Very little is required
to generate chaos here: simply damping steepest descent by a constant factor
close to 1 will do.
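A damped steepest descent iteration of this kind might be sketched as follows. This is an illustrative sketch rather than the authors' code: the damping factor `omega = 0.9`, the diagonal test matrix, and the helper names (`damped_sd_residual_norms`, `count_increases`) are assumptions. The sketch records the residual norm at every step so that increases in the norm can be counted.

```python
import math


def damped_sd_residual_norms(eigs, b, omega=0.9, tol=1e-6, max_iter=100_000):
    """Damped steepest descent on A x = b with A = diag(eigs), x0 = 0.

    Each step uses alpha = omega * (r . r) / (r . A r), i.e. the steepest
    descent step scaled by a constant factor omega (an assumed value).
    Returns the list of residual norms, one entry per iterate.
    """
    r = list(b)  # residual r = b - A x with x = 0
    norms = [math.sqrt(sum(ri * ri for ri in r))]
    for _ in range(max_iter):
        Ar = [lam * ri for lam, ri in zip(eigs, r)]
        alpha_sd = sum(ri * ri for ri in r) / sum(
            ri * ari for ri, ari in zip(r, Ar)
        )
        alpha = omega * alpha_sd
        r = [ri - alpha * ari for ri, ari in zip(r, Ar)]
        norms.append(math.sqrt(sum(ri * ri for ri in r)))
        if norms[-1] <= tol * norms[0]:
            break
    return norms


def count_increases(norms):
    """Count the steps at which the residual norm went up."""
    return sum(1 for prev, cur in zip(norms, norms[1:]) if cur > prev)


if __name__ == "__main__":
    n, kappa = 50, 100.0  # assumed problem size and condition number
    eigs = [1.0 + (kappa - 1.0) * i / (n - 1) for i in range(n)]
    b = [1.0] * n
    for omega in (1.0, 0.9):
        norms = damped_sd_residual_norms(eigs, b, omega=omega)
        print(omega, len(norms) - 1, count_increases(norms))
```

For each value of `omega` the script prints the iteration count and the number of steps at which the residual norm increased; `omega = 1.0` corresponds to undamped steepest descent.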
Several variants of the family of faster gradient descent methods are
investigated, both experimentally and analytically. The fastest practical methods of
this family in general appear to be the known, chaotic, two-step ones. Our
results also highlight the need for better theory for existing faster gradient descent