Summary: Variational Information Maximization in Gaussian Channels
Felix V. Agakov
School of Informatics, University of Edinburgh, EH1 2QL, UK
IDIAP, Rue du Simplon 4, CH-1920 Martigny, Switzerland
Recently, we introduced a simple variational bound on mutual information that resolves some
of the difficulties in the application of information theory to machine learning. Here we study a
specific application to Gaussian channels. It is well known that PCA may be viewed as the solution
to maximizing information transmission between a high dimensional vector x and its low dimensional
representation y. However, such results are based on assumptions of Gaussianity of the sources x.
In this paper, we show how our mutual information bound, when applied to this setting, gives PCA
solutions without the need for the Gaussian assumption. Furthermore, the bound generalizes naturally
to provide an objective function for Kernel PCA, enabling the principled selection of kernel parameters.
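
The well-known Gaussian case mentioned above can be checked numerically. The following is a minimal sketch, not the method of this paper: it assumes a Gaussian source x ~ N(0, S), a linear encoder y = Wx + eps with isotropic noise eps ~ N(0, noise_var I), and a constraint that W has orthonormal rows. Under those assumptions the mutual information is 0.5 log det(I + W S W^T / noise_var), and the principal eigenvectors of S should transmit at least as much information as any other orthonormal projection. The function name, dimensions, and noise level are illustrative choices only.

```python
import numpy as np


def gaussian_channel_information(W, S, noise_var):
    """Mutual information I(x; y) in nats for y = W x + eps,
    assuming x ~ N(0, S) and eps ~ N(0, noise_var * I)."""
    k = W.shape[0]
    M = np.eye(k) + W @ S @ W.T / noise_var
    _, logdet = np.linalg.slogdet(M)
    return 0.5 * logdet


rng = np.random.default_rng(0)
dim_x, dim_y, noise_var = 5, 2, 0.1

# Random symmetric positive semi-definite source covariance (illustrative).
A = rng.standard_normal((dim_x, dim_x))
S = A @ A.T

# PCA encoder: rows are the eigenvectors with the largest eigenvalues.
# np.linalg.eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)
W_pca = eigvecs[:, -dim_y:].T

# Baseline encoder: a random projection with orthonormal rows.
Q, _ = np.linalg.qr(rng.standard_normal((dim_x, dim_y)))
W_rand = Q.T

info_pca = gaussian_channel_information(W_pca, S, noise_var)
info_rand = gaussian_channel_information(W_rand, S, noise_var)

print(f"I(x; y) with PCA encoder:    {info_pca:.4f} nats")
print(f"I(x; y) with random encoder: {info_rand:.4f} nats")
```

The orthonormality constraint matters here: without some bound on W, the information could be increased indefinitely by scaling the encoder up relative to the noise.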
Maximization of information transmission in noisy channels is a common problem, ranging from the
construction of good error-correcting codes and feature extraction to neural sensory processing