Bandit game
 

Summary: Bandit game
Parameters available to the forecaster:
the number of arms (or actions) K and the number of rounds n
Unknown to the forecaster: the way the gain vectors
$g_t = (g_{1,t}, \dots, g_{K,t}) \in [0, 1]^K$ are generated
For each round $t = 1, 2, \dots, n$:
1. the forecaster chooses an arm $I_t \in \{1, \dots, K\}$
2. the forecaster receives the gain $g_{I_t,t}$
3. only $g_{I_t,t}$ is revealed to the forecaster
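The round-by-round protocol above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the source: the names `play_bandit_game` and `uniform_forecaster` are hypothetical, arms are indexed from 0 rather than 1, and the uniformly random strategy is a placeholder rather than a recommended forecaster. The gain table is held by the environment, and the forecaster sees only its own past arms and gains.

```python
import random
from typing import Callable, List, Sequence, Tuple

# One entry per past round: (arm played, gain observed for that arm).
History = List[Tuple[int, float]]


def play_bandit_game(
    gains: Sequence[Sequence[float]],
    choose_arm: Callable[[int, History], int],
) -> History:
    """Run the bandit protocol on an n-by-K gain table hidden from the forecaster."""
    history: History = []
    for g_t in gains:
        K = len(g_t)
        arm = choose_arm(K, history)   # step 1: the forecaster chooses I_t
        if not 0 <= arm < K:
            raise ValueError(f"arm {arm} is outside 0..{K - 1}")
        gain = g_t[arm]                # step 2: the forecaster receives g_{I_t,t}
        history.append((arm, gain))    # step 3: only that single gain is revealed
    return history


def uniform_forecaster(K: int, history: History) -> int:
    """Placeholder strategy (an assumption): ignore the history, pick uniformly."""
    return random.randrange(K)


if __name__ == "__main__":
    table = [[0.1, 0.9], [0.5, 0.2], [1.0, 0.0]]  # n = 3 rounds, K = 2 arms
    print(play_bandit_game(table, uniform_forecaster))
```

Passing the forecaster as a callable that receives only `K` and the history keeps the bandit feedback constraint explicit: the gains of the arms that were not played never reach it.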
Cumulative regret goal: maximize the cumulative gains obtained.
More precisely, minimize
$$R_n = \max_{i=1,\dots,K} \mathbb{E} \sum_{t=1}^{n} g_{i,t} - \mathbb{E} \sum_{t=1}^{n} g_{I_t,t}$$
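As a rough illustration of the regret formula, the sketch below evaluates it for one fixed gain table and one realized sequence of chosen arms, so both expectations reduce to plain sums. The function name `realized_regret` is hypothetical and arms are indexed from 0; with random gains or a randomized forecaster, one would need to average the two sums over many runs to approximate the expectations.

```python
from typing import Sequence


def realized_regret(gains: Sequence[Sequence[float]], arms: Sequence[int]) -> float:
    """Best fixed arm's total gain minus the forecaster's total gain, for one run."""
    if len(gains) != len(arms):
        raise ValueError("need exactly one chosen arm per round")
    K = len(gains[0])
    # max over i of sum_t g_{i,t}: total gain of the best single arm in hindsight
    best_fixed = max(sum(g_t[i] for g_t in gains) for i in range(K))
    # sum_t g_{I_t,t}: total gain the forecaster actually collected
    obtained = sum(g_t[arm] for g_t, arm in zip(gains, arms))
    return best_fixed - obtained


if __name__ == "__main__":
    table = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]  # n = 3 rounds, K = 2 arms
    print(realized_regret(table, [0, 0, 0]))  # → 1.0
```

For a single run this quantity can be negative, since the forecaster may switch arms while the comparator is one fixed arm.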

  

Source: Audibert, Jean-Yves - Département d'Informatique, École Normale Supérieure

 

Collections: Computer Technologies and Information Sciences