Proceedings of 25th Conference on Decision and Control

Summary: Proceedings of 25th Conference
Athens, Greece - December 1986 WA10 - 10:15
Asymptotically Efficient Rules in Multiarmed Bandit Problems
V. Anantharam and P. Varaiya
Department of Electrical Engineering and Computer Sciences
and Electronics Research Laboratory
University of California, Berkeley CA 94720
We are given N discrete-time real-valued stochastic processes
\[
\begin{array}{l}
X^1 : X^1(1),\ X^1(2),\ \ldots \\
\quad\vdots \\
X^N : X^N(1),\ X^N(2),\ \ldots
\end{array}
\]
The essential assumption is that these processes are independent. For
historical reasons these processes are also called arms.
A fixed number $m$, $1 \le m < N$, is specified. At each time $t$ we
must select $m$ different arms. Let $T^j(t)$ be the number of times that
arm $j$ was selected during the interval $1, \ldots, t$, and let
$U(t) \subset \{1, \ldots, N\}$ be the $m$ arms that are selected at time $t$. Then
at time $t$ we receive the reward
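
The bookkeeping in this setup (the set of m arms chosen at each time and the running count of how often each arm has been selected) can be illustrated with a minimal Python sketch. The function name `simulate_selection`, its parameters, and the uniform-random selection rule are all assumptions made for illustration; they are not the rules studied in the paper, and the reward is deliberately left out because the excerpt does not state it.

```python
import random


def simulate_selection(num_arms, m, horizon, seed=0):
    """Track the selected sets U(t) and the per-arm selection counts.

    The selection rule here is a placeholder (uniform random choice of
    m distinct arms), NOT a rule from the paper.
    """
    if not 1 <= m < num_arms:
        raise ValueError("need 1 <= m < N")
    rng = random.Random(seed)
    counts = [0] * num_arms   # counts[j - 1]: times arm j was selected so far
    history = []              # history[t - 1]: set of arms selected at time t
    for _ in range(horizon):
        # Choose m different arms, labeled 1..N as in the text.
        chosen = set(rng.sample(range(1, num_arms + 1), m))
        for j in chosen:
            counts[j - 1] += 1
        history.append(chosen)
    return counts, history


counts, history = simulate_selection(num_arms=5, m=2, horizon=100)
```

Because exactly m arms are chosen at every step, the counts always sum to m times the horizon, whichever selection rule is plugged in.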


Source: Anantharam, Venkat - Department of Electrical Engineering and Computer Sciences, University of California at Berkeley


Collections: Engineering