Exploration-exploitation trade-off using variance estimates in multi-armed bandits
 

Summary:
Jean-Yves Audibert
Université Paris-Est, Ecole des Ponts ParisTech, CERTIS
6 avenue Blaise Pascal, 77455 Marne-la-Vallée, France
&
Willow - ENS / INRIA
45 rue d'Ulm, 75005 Paris, France
Rémi Munos
INRIA Lille - Nord Europe, SequeL project,
40 avenue Halley, 59650 Villeneuve d'Ascq, France
Csaba Szepesvári
Department of Computing Science
University of Alberta
Edmonton T6G 2E8, Canada
Abstract
Algorithms based on upper confidence bounds for balancing exploration and
exploitation are gaining popularity since they are easy to implement, efficient
and effective. This paper considers a variant of the basic algorithm for the
stochastic, multi-armed bandit problem that takes into account the empirical

  

Source: Audibert, Jean-Yves - Département d'Informatique, École Normale Supérieure

 

Collections: Computer Technologies and Information Sciences