Journal of Machine Learning Research 11 (2010) 2785-2836. Submitted 7/09; Revised 6/10; Published 10/10
Regret Bounds and Minimax Policies under Partial Monitoring
 

Jean-Yves Audibert AUDIBERT@IMAGINE.ENPC.FR
Imagine, Université Paris Est
6 avenue Blaise Pascal
77455 Champs-sur-Marne, France
Sébastien Bubeck SEBASTIEN.BUBECK@INRIA.FR
SequeL Project, INRIA Lille
40 avenue Halley
59650 Villeneuve d'Ascq, France
Editor: Nicolò Cesa-Bianchi
Abstract
This work deals with four classical prediction settings, namely full information, bandit, label
efficient and bandit label efficient, as well as four different notions of regret: pseudo-regret,
expected regret, high probability regret and tracking the best expert regret. We introduce a new
forecaster, INF (Implicitly Normalized Forecaster), based on an arbitrary function $\psi$, for which
we propose a unified analysis of its pseudo-regret in the four games we consider. In particular,
for $\psi(x) = \exp(\eta x) + \frac{\gamma}{K}$, INF reduces to the classical exponentially weighted
average forecaster and our analysis of the pseudo-regret recovers known results, while for the
expected regret we slightly tighten the bounds.
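
The reduction to the exponentially weighted average forecaster can be checked numerically. The sketch below is illustrative only and rests on assumptions not stated in the abstract: that INF assigns arm i the probability ψ(V_i − C), where V_i is a cumulative gain estimate and C is the constant that makes the probabilities sum to one, and that ψ(x) = exp(ηx) + γ/K. The function names, the bisection bracket and the sample values are hypothetical.

```python
import math

def inf_weights(v, psi, lo, hi, iters=200):
    """Assumed INF rule: p_i = psi(v_i - C), with C chosen so the p_i sum to 1."""
    # psi is assumed increasing, so sum(psi(v_i - C)) decreases as C grows
    # and the normalizing C can be bracketed and found by bisection.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(psi(x - mid) for x in v) > 1:
            lo = mid  # total mass too large: C must increase
        else:
            hi = mid  # total mass too small: C must decrease
    c = (lo + hi) / 2
    return [psi(x - c) for x in v]

def exp_weights(v, eta, gamma):
    """Exponential weights mixed with the uniform distribution over K arms."""
    k = len(v)
    m = max(v)  # subtract the maximum for numerical stability
    z = sum(math.exp(eta * (x - m)) for x in v)
    return [(1 - gamma) * math.exp(eta * (x - m)) / z + gamma / k for x in v]

# Hypothetical parameters and cumulative gain estimates.
eta, gamma = 0.5, 0.1
v = [1.0, 2.0, 0.5, 3.0]
k = len(v)

def psi(x):
    return math.exp(eta * x) + gamma / k

p_inf = inf_weights(v, psi, lo=min(v) - 50, hi=max(v) + 50)
p_exp = exp_weights(v, eta, gamma)
print(max(abs(a - b) for a, b in zip(p_inf, p_exp)))
```

Under these assumptions the normalization has a closed form: summing ψ(V_i − C) over the K arms and setting the total to one gives exp(−ηC) Σ_j exp(ηV_j) = 1 − γ, so p_i = (1 − γ) exp(ηV_i) / Σ_j exp(ηV_j) + γ/K, which is what the bisection recovers.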

  

Source: Audibert, Jean-Yves - Département d'Informatique, École Normale Supérieure

 

Collections: Computer Technologies and Information Sciences