Proceedings of 25th Conference on Decision and Control
 

Summary: Proceedings of 25th Conference on Decision and Control
Athens, Greece - December 1986, WA10 - 10:15
Asymptotically Efficient Rules in Multiarmed Bandit Problems
V. Anantharam and P. Varaiya
Department of Electrical Engineering and Computer Sciences
and Electronics Research Laboratory
University of California, Berkeley CA 94720
Setup
We are given N discrete-time real-valued stochastic processes
$X^1: X^1(1), X^1(2), \ldots$
$\vdots$
$X^N: X^N(1), X^N(2), \ldots$
The essential assumption is that these processes are independent. For
historical reasons these processes are also called arms.
A fixed number $m$, $1 \le m < N$, is specified. At each time $t$ we
must select $m$ different arms. Let $T^j(t)$ be the number of times that
arm $j$ was selected during the interval $1, \ldots, t$; and let
$U(t) \subset \{1, \ldots, N\}$ be the $m$ arms that are selected at time $t$. Then
at time t we receive the reward
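
The selection bookkeeping defined in the setup can be illustrated with a minimal Python sketch. It covers only the counts $T^j(t)$ derived from a given sequence of selections $U(1), \ldots, U(t)$; it does not model the reward or any selection rule. The function name selection_counts, its parameters, and the example history are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def selection_counts(selections: List[Set[int]], n_arms: int, m: int) -> Dict[int, int]:
    """Return T^j(t) for each arm j in 1..n_arms, given U(1), ..., U(t).

    `selections[k]` plays the role of U(k + 1): the set of m distinct
    arms selected at that time step (hypothetical representation).
    """
    counts = {j: 0 for j in range(1, n_arms + 1)}
    for step, chosen in enumerate(selections, start=1):
        # Each U(t) must contain exactly m different arms from {1, ..., N}.
        if len(chosen) != m or not chosen <= set(counts):
            raise ValueError(f"invalid selection at time {step}: {sorted(chosen)}")
        for j in chosen:
            counts[j] += 1
    return counts


# Hypothetical example: N = 4 arms, m = 2 arms per step, t = 3 steps.
history = [{1, 2}, {1, 3}, {1, 4}]
T = selection_counts(history, n_arms=4, m=2)
print(T)  # {1: 3, 2: 1, 3: 1, 4: 1}
```

Because exactly $m$ arms are selected at every step, the counts always sum to $m t$; in the example above, 3 + 1 + 1 + 1 = 6 = 2 x 3.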

  

Source: Anantharam, Venkat - Department of Electrical Engineering and Computer Sciences, University of California at Berkeley

 

Collections: Engineering