Summary: Vehicle Traffic Light Control Using SARSA
Thomas L. Thorpe
April 2, 1997
SARSA (Sutton, 1996) is applied to a simulated traffic-light control problem (Thorpe, 1997)
and its performance is compared with several fixed control strategies. The performance of SARSA
with four different representations of the current state of traffic is analyzed using two reinforcement
schemes. Training on one intersection is compared to, and is as effective as, training on all
intersections in the environment. SARSA is shown to be better than fixed-duration light timing and
four-way stops for minimizing total traffic travel time, individual vehicle travel times, and vehicle
wait times. Comparisons of performance using a constant reinforcement function versus a variable
reinforcement function dependent on the number of vehicles at an intersection showed that the
variable reinforcement resulted in slightly improved performance for some cases.
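
As a rough illustration of the two reinforcement schemes compared above, the following is a minimal tabular sketch of the standard one-step SARSA update. It is not the implementation used in the study. The action names, the learning parameters (alpha, gamma, epsilon), the reward magnitudes, and the use of a lookup table are all assumptions made for the example; the summary does not specify them.

```python
import random
from collections import defaultdict

# Hypothetical action set for one light: hold the current phase or change it.
ACTIONS = ("keep", "switch")


def constant_reward(n_waiting):
    """Constant reinforcement: the same penalty every step (assumed value)."""
    return -1.0


def variable_reward(n_waiting):
    """Variable reinforcement: penalty grows with vehicles at the intersection
    (assumed to be linear for this sketch)."""
    return -float(n_waiting)


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One-step SARSA: move Q(s, a) toward r + gamma * Q(s', a')."""
    td_target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# Example step: state labels 0 and 1 stand in for whatever traffic
# representation is used; 3 vehicles are assumed to be waiting.
q = defaultdict(float)
rng = random.Random(0)
a = epsilon_greedy(q, 0, epsilon=0.1, rng=rng)
a_next = epsilon_greedy(q, 1, epsilon=0.1, rng=rng)
sarsa_update(q, 0, a, variable_reward(3), 1, a_next)
```

Swapping variable_reward for constant_reward in the last line switches between the two schemes without touching the update rule, which is the only point the sketch is meant to make.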
A variety of traffic control strategies are being studied in real traffic networks and in simulation.
The Denver Regional Council of Governments works with the Colorado Department of Transportation
and citizens to identify and modify problem intersections (Garnaas, 1996). Computers are used to
monitor the traffic flows for critical intersections throughout the Denver region. The computers have the
capability to change traffic light timing remotely but are only used to collect data for traffic analysis.
Recently a major traffic artery was retimed from 90 seconds in the heavy traffic flow direction to 100