DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Random permutations fix a worst case for cyclic coordinate descent

Abstract

Variants of the coordinate descent approach for minimizing a nonlinear function are distinguished in part by the order in which coordinates are considered for relaxation. Three common orderings are cyclic (CCD), in which we cycle through the components of $x$ in order; randomized (RCD), in which the component to update is selected randomly and independently at each iteration; and random-permutations cyclic (RPCD), which differs from CCD only in that a random permutation is applied to the variables at the start of each cycle. Known convergence guarantees are weaker for CCD and RPCD than for RCD, though in most practical cases, computational performance is similar among all these variants. There is a certain type of quadratic function for which CCD is significantly slower than RCD; a recent paper by Sun & Ye (2016, Worst-case complexity of cyclic coordinate descent: $O(n^2)$ gap with randomized version. Technical Report. Stanford, CA: Department of Management Science and Engineering, Stanford University. arXiv:1604.07130) has explored the poor behavior of CCD on functions of this type. The RPCD approach performs well on these functions, even better than RCD in a certain regime. This paper explains the good behavior of RPCD with a tight analysis.
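
The three orderings named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`coordinate_order`, `objective`, `run`), the exact coordinate-minimization step, and the test matrix (unit diagonal, equal off-diagonal entries `1 - delta`) are assumptions chosen for illustration, since the abstract does not specify the quadratic.

```python
import random


def coordinate_order(n, scheme, rng):
    """Return the n coordinate indices visited during one epoch."""
    if scheme == "CCD":
        # Cyclic: always 0, 1, ..., n-1.
        return list(range(n))
    if scheme == "RCD":
        # Randomized: n independent uniform draws (with replacement).
        return [rng.randrange(n) for _ in range(n)]
    if scheme == "RPCD":
        # Random-permutations cyclic: a fresh permutation each epoch.
        order = list(range(n))
        rng.shuffle(order)
        return order
    raise ValueError(f"unknown scheme: {scheme}")


def objective(A, x):
    """Evaluate f(x) = 0.5 * x^T A x."""
    n = len(x)
    return 0.5 * sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def run(A, x0, scheme, epochs, seed=0):
    """Run coordinate descent with exact minimization along each coordinate."""
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    for _ in range(epochs):
        for i in coordinate_order(n, scheme, rng):
            grad_i = sum(A[i][j] * x[j] for j in range(n))  # i-th entry of A x
            x[i] -= grad_i / A[i][i]  # exact minimizer along coordinate i
    return x


if __name__ == "__main__":
    # Hypothetical test problem (an assumption, not taken from the abstract).
    n, delta = 20, 0.05
    A = [[1.0 if i == j else 1.0 - delta for j in range(n)] for i in range(n)]
    x0 = [1.0] * n
    for scheme in ("CCD", "RCD", "RPCD"):
        print(scheme, objective(A, run(A, x0, scheme, epochs=50)))
```

The only difference between the three variants is the index sequence produced by `coordinate_order`; the per-coordinate update is identical in all cases.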

Authors:
 Lee, Ching-pei [1]; Wright, Stephen J. [1]
  1. Computer Sciences Department, University of Wisconsin-Madison, Madison, WI, USA
Publication Date:
July 2018
Sponsoring Org.:
USDOE
OSTI Identifier:
1462147
Grant/Contract Number:
3F-30222; 8F-30039
Resource Type:
Published Article
Journal Name:
IMA Journal of Numerical Analysis
Additional Journal Information:
Journal Name: IMA Journal of Numerical Analysis; Journal Volume: 39; Journal Issue: 3; Journal ID: ISSN 0272-4979
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English

Citation Formats

Lee, Ching-pei, and Wright, Stephen J. Random permutations fix a worst case for cyclic coordinate descent. United Kingdom: N. p., 2018. Web. doi:10.1093/imanum/dry040.
Lee, Ching-pei, & Wright, Stephen J. (2018). Random permutations fix a worst case for cyclic coordinate descent. United Kingdom. doi:10.1093/imanum/dry040.
Lee, Ching-pei, and Wright, Stephen J. 2018. "Random permutations fix a worst case for cyclic coordinate descent". United Kingdom. doi:10.1093/imanum/dry040.
@article{osti_1462147,
title = {Random permutations fix a worst case for cyclic coordinate descent},
author = {Lee, Ching-pei and Wright, Stephen J.},
abstractNote = {Variants of the coordinate descent approach for minimizing a nonlinear function are distinguished in part by the order in which coordinates are considered for relaxation. Three common orderings are cyclic (CCD), in which we cycle through the components of $x$ in order; randomized (RCD), in which the component to update is selected randomly and independently at each iteration; and random-permutations cyclic (RPCD), which differs from CCD only in that a random permutation is applied to the variables at the start of each cycle. Known convergence guarantees are weaker for CCD and RPCD than for RCD, though in most practical cases, computational performance is similar among all these variants. There is a certain type of quadratic function for which CCD is significantly slower than RCD; a recent paper by Sun & Ye (2016, Worst-case complexity of cyclic coordinate descent: $O(n^2)$ gap with randomized version. Technical Report. Stanford, CA: Department of Management Science and Engineering, Stanford University. arXiv:1604.07130) has explored the poor behavior of CCD on functions of this type. The RPCD approach performs well on these functions, even better than RCD in a certain regime. This paper explains the good behavior of RPCD with a tight analysis.},
doi = {10.1093/imanum/dry040},
journal = {IMA Journal of Numerical Analysis},
number = {3},
volume = {39},
place = {United Kingdom},
year = {2018},
month = {7}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
DOI: 10.1093/imanum/dry040
