Minimax policies for adversarial and stochastic bandits
Jean-Yves Audibert
Imagine, Université Paris Est
&
Willow, CNRS/ENS/INRIA, Paris, France
audibert@certis.enpc.fr
Sébastien Bubeck
SequeL Project, INRIA Lille
40 avenue Halley,
59650 Villeneuve d'Ascq, France
sebastien.bubeck@inria.fr
Abstract
We fill in a long open gap in the characterization of the minimax rate
for the multi-armed bandit problem. Concretely, we remove an extraneous
logarithmic factor in the previously known upper bound and propose a new
family of randomized algorithms based on an implicit normalization, as
well as a new analysis. We also consider the stochastic case, and prove
that an appropriate modification of the

  

Source: Audibert, Jean-Yves - Département d'Informatique, École Normale Supérieure
École Normale Supérieure, Laboratoire d'Informatique, WILLOW Computer vision and machine learning research laboratory

 

Collections: Computer Technologies and Information Sciences; Engineering