Summary: Spatio-Temporal Segmentation of Video Data
John Y. A. Wang†
and Edward H. Adelson‡
†Department of Electrical Engineering and Computer Science ‡Department of Brain and Cognitive Sciences
The MIT Media Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139
M.I.T. Media Laboratory Vision and Modeling Group, Technical Report No. 262, February 1994.
Appears in Proceedings of the SPIE: Image and Video Processing II, vol. 2182, San Jose, February 1994.
Image segmentation provides a powerful semantic description of video imagery, essential for image understanding and
efficient manipulation of image data. In particular, segmentation based on image motion identifies regions undergoing similar
motion, allowing an image coding system to represent video sequences more efficiently. This paper describes a general iterative
framework for segmentation of video data. The objective of our spatiotemporal segmentation is to produce a layered image
representation of the video for image coding applications, whereby the video data is simply described as a set of moving layers.
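To make the notion of a layered representation concrete, here is a minimal sketch in Python. The field names and the compositing routine are illustrative assumptions, not the report's actual data structures: each layer is taken to carry an intensity (appearance) map, an alpha (support) map, and per-frame motion parameters, and frames are reconstructed by compositing the layers back to front.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Layer:
    """One moving layer in a layered video representation (illustrative).

    intensity: (H, W) appearance of the layer.
    alpha:     (H, W) support/opacity map in [0, 1].
    motion:    per-frame motion parameters, e.g. (T, 6) for affine motion
               (unused in this sketch, which composites a single frame).
    """
    intensity: np.ndarray
    alpha: np.ndarray
    motion: np.ndarray

def composite(layers):
    """Reconstruct one frame by blending layers back to front."""
    out = np.zeros_like(layers[0].intensity)
    for layer in layers:  # ordered back to front
        out = layer.alpha * layer.intensity + (1.0 - layer.alpha) * out
    return out

# Toy example: a uniform background plus a foreground patch.
bg = Layer(intensity=np.full((4, 4), 0.2),
           alpha=np.ones((4, 4)),
           motion=np.zeros((1, 6)))
fg_alpha = np.zeros((4, 4))
fg_alpha[:2, :2] = 1.0  # foreground occupies the top-left quadrant
fg = Layer(intensity=np.full((4, 4), 0.9),
           alpha=fg_alpha,
           motion=np.zeros((1, 6)))
out = composite([bg, fg])
```

The reconstructed frame shows the foreground intensity where its alpha map is 1 and the background elsewhere, which is the sense in which a small set of layers can describe many frames compactly.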
Segmentation is highly dependent on the model and the criteria for grouping pixels into regions. In motion segmentation,
pixels are grouped together based on the similarity of their motion. For any given application, the segmentation algorithm
must balance model complexity against analysis stability. An insufficient model will inevitably result in
over-segmentation, while a more complicated model requires more computation and additional constraints for
stability. In image coding, the objective of segmentation is to exploit the spatial and temporal coherence in the video data
by adequately identifying the coherently moving regions with simple motion models.
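The grouping criterion described above can be sketched as follows. This is a minimal illustration, not the report's algorithm: it clusters per-pixel motion vectors with plain k-means, standing in for the parametric (e.g. affine) motion models a full layered-coding system would fit to each region.

```python
import numpy as np

def segment_by_motion(flow, k=2, iters=10):
    """Group pixels into k regions by similarity of their motion vectors.

    flow: (H, W, 2) array of per-pixel (dx, dy) motion estimates.
    Returns an (H, W) integer label map. Plain k-means on raw flow
    vectors -- a simple stand-in for parametric motion clustering.
    """
    h, w, _ = flow.shape
    vecs = flow.reshape(-1, 2).astype(float)
    # Deterministic init: seed the centers with distinct observed vectors.
    centers = np.unique(vecs, axis=0)[:k].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(vecs[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = vecs[labels == j].mean(axis=0)
    return labels.reshape(h, w)

# Toy flow field: left half translates right, right half translates down.
flow = np.zeros((8, 8, 2))
flow[:, :4] = (1.0, 0.0)
flow[:, 4:] = (0.0, 1.0)
labels = segment_by_motion(flow, k=2)
```

On this toy field the two coherently moving halves receive two distinct labels, which is the sense in which grouping by motion similarity recovers coherent-motion regions.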