 
Summary: Perceptron Learning with Random Coordinate Descent
Ling Li
Learning Systems Group, California Institute of Technology
Abstract. A perceptron is a linear threshold classifier that separates examples with a hyperplane. It
is perhaps the simplest learning model that is used standalone. In this paper, we propose a family of
random coordinate descent algorithms for perceptron learning on binary classification problems. Unlike
most perceptron learning algorithms, which require smooth cost functions, our algorithms directly
minimize the training error, and usually achieve the lowest training error compared with other
algorithms. The algorithms are also computationally efficient. Such advantages make them favorable for
both standalone use and ensemble learning, on problems that are not linearly separable. Experiments
show that our algorithms work very well with AdaBoost, and achieve the lowest test errors for half of
the data sets.
1 Introduction
The perceptron was first introduced by Rosenblatt (1958) as a probabilistic model for information
processing in the brain. Presented with an input vector x, a perceptron calculates a weighted sum
of x, the inner product of x and its weight vector w. If the sum is above some threshold, the
perceptron outputs 1; otherwise it outputs −1.
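The decision rule just described can be sketched as follows; this is a minimal illustration, not the paper's implementation, and the function name and zero default threshold are assumptions for the example:

```python
def perceptron_predict(w, x, threshold=0.0):
    # Weighted sum of the input: the inner product of x and the weight vector w
    s = sum(wi * xi for wi, xi in zip(w, x))
    # Output +1 if the sum is above the threshold, otherwise -1
    return 1 if s > threshold else -1

# A 2-D perceptron with weight vector (1, -1) and zero threshold
w = [1.0, -1.0]
print(perceptron_predict(w, [2.0, 0.5]))   # inner product 1.5 > 0, so +1
print(perceptron_predict(w, [0.5, 2.0]))   # inner product -1.5 <= 0, so -1
```

In practice the threshold is often absorbed into w as a bias weight by appending a constant 1 to every input vector, which keeps the decision rule a pure inner-product sign test.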
Since a perceptron separates examples with a hyperplane in the input space, it is only capable
of learning linearly separable problems.1 For problems with more complex patterns, layers of
perceptrons have to be connected to form an artificial neural network, and the backpropagation
