On the Optimality of Spatial Attention for Object Detection

Jonathan Harel and Christof Koch
California Institute of Technology, Pasadena, CA 91125
Abstract. Studies on visual attention traditionally focus on its physiological
and psychophysical nature [16, 18, 19] or its algorithmic applications
[1, 9, 21]. Here we develop a simple, formal mathematical model of
the advantage of spatial attention for object detection, in which spatial
attention is defined as processing a subset of the visual input, and detection
is an abstraction with certain failure characteristics. We demonstrate
that it is suboptimal to process the entire visual input given prior
information about target locations, which in practice is almost always
available in a video setting due to tracking, motion, or saliency. This
argues for an attentional strategy independent of computational savings:
no matter how much computational power is available, it is in principle
better to dedicate it preferentially to selected portions of the scene. This
suggests, anecdotally, a form of environmental pressure for the evolution
of foveated photoreceptor densities in the retina. It also offers a general
justification for the use of spatial attention in machine vision.
1 Introduction


