
Best Arm Identification in Multi-Armed Bandits

Jean-Yves Audibert
Imagine, Université Paris Est
Willow, CNRS/ENS/INRIA, Paris, France
Sébastien Bubeck, Rémi Munos
SequeL Project, INRIA Lille
40 avenue Halley,
59650 Villeneuve d'Ascq, France
{sebastien.bubeck, remi.munos}@inria.fr
This is the supplemental material of the COLT'10 paper entitled "Best Arm Identification in Multi-Armed Bandits".
A Lower bound for UCB-E
Theorem 1 If $\nu_2, \dots, \nu_K$ are Dirac distributions concentrated at $\frac{1}{2}$ and if $\nu_1$ is the Bernoulli distribution of
parameter $3/4$, the UCB-E algorithm satisfies $4\,\mathbb{E} r_n = e_n \ge 4^{-(4a+1)}$.
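The equality between the simple regret and the error probability in Theorem 1 can be checked directly. The short derivation below is a sketch under assumed notation that the excerpt does not spell out: $\mu_i$ is the mean of arm $i$, $J_n$ is the arm recommended after $n$ rounds, $r_n = \max_i \mu_i - \mu_{J_n}$ is the simple regret, and $e_n = \mathbb{P}(J_n \neq 1)$ is the probability of recommending a suboptimal arm.

```latex
% Assumed notation (not given in the excerpt): \mu_i, J_n, r_n, e_n as described above.
\begin{align*}
\mu_1 &= \tfrac{3}{4}, \qquad \mu_i = \tfrac{1}{2} \quad (i = 2, \dots, K), \\
r_n &= \mu_1 - \mu_{J_n} = \tfrac{1}{4}\,\mathbf{1}\{J_n \neq 1\}, \\
\mathbb{E}\, r_n &= \tfrac{1}{4}\,\mathbb{P}(J_n \neq 1) = \tfrac{1}{4}\, e_n
\quad\Longrightarrow\quad 4\,\mathbb{E}\, r_n = e_n .
\end{align*}
```

Every suboptimal arm has the same gap of $1/4$ to arm 1, which is why the regret and the error probability differ only by that constant factor in this instance.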
Proof: Consider the event E on which the rewards obtained from the first m = 4a draws of arm 1 are equal


Source: Audibert, Jean-Yves - Département d'Informatique, École Normale Supérieure

