Andrew Stein, Robotics Institute, Carnegie Mellon University
28 August 2006
Abstract
While much focus in computer vision is placed on the processing of individual, static images, many applications actually offer video, or sequences of images, as input. The extra temporal dimension of the data allows the motion of the camera or the scene to be used in processing. In particular, this motion provides the opportunity to observe objects or surfaces occluding one another. While occlusion is often considered a nuisance to be "handled," the boundaries of objects at which it occurs can also be valuable sources of information about 3D scene structure and shape. Since most, if not all, computer vision techniques aggregate information spatially within a scene via smoothing, patches, or graphical models with neighborhood structures, information from different physical surfaces in the scene is invariably and erroneously considered together. The low-level ability to detect occlusion locally through motion should therefore benefit many different vision techniques.
To this end, we propose to use our existing low-level occlusion detector, based on local reasoning about moving edges and the patches of data on either side of them, to find those edges in a scene that show evidence of being occlusion boundaries. We also propose to tackle this problem with a learned discriminative classifier using the same motion features. Taking uncertainty into account, we will then propagate this local, low-level information more globally using random field methods or a confidence-based hysteresis thresholding approach (sketched below). With extended occlusion boundaries available, we can then develop methods for incorporating that information into existing feature-based object recognition techniques, including our own Background and Scale Invariant Feature Transform (BSIFT). Leveraging existing techniques as a foundation, we also propose the use of these boundaries in generic object detection and segmentation, which may be advantageous for unsupervised detection and learning of novel objects in general environments.
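As a rough illustration of the confidence-based hysteresis idea, the Python sketch below grows occlusion-boundary labels outward from high-confidence edge pixels into connected lower-confidence ones. The function name, the two thresholds, and the use of 8-connectivity are assumptions made for the example only, not the specifics of the proposed method.

import numpy as np
from collections import deque

def hysteresis_boundaries(confidence, edge_mask, t_low=0.3, t_high=0.7):
    # Label edge pixels as occlusion boundaries using two confidence thresholds:
    # pixels at or above t_high seed boundaries, and connected edge pixels at or
    # above t_low are absorbed, so weak-but-contiguous evidence survives while
    # isolated weak responses are discarded.
    h, w = confidence.shape
    labels = np.zeros((h, w), dtype=bool)
    seeds = edge_mask & (confidence >= t_high)
    labels[seeds] = True
    queue = deque(zip(*np.where(seeds)))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):            # grow along 8-connected neighbors
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w and not labels[ny, nx]
                        and edge_mask[ny, nx] and confidence[ny, nx] >= t_low):
                    labels[ny, nx] = True
                    queue.append((ny, nx))
    return labels

Applied to per-pixel occlusion confidences from a local detector or classifier, this produces extended boundary fragments rather than isolated responses.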
This thesis therefore seeks to contribute to both the low- and high-level aspects of reasoning about occlusion:
- We will develop and compare a novel model-based detector and a learned discriminative classifier for extracting local occlusion boundaries in short video clips, both based on local motion features (see the sketch after this list).
- We will show how to use occlusion boundary information to benefit the high-level tasks of feature-based object recognition and object detection/segmentation, possibly for unsupervised learning of object models.
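To make the classifier idea concrete, here is a minimal sketch of training a discriminative classifier on per-edge-point motion features. The feature layout, the synthetic labels, and the choice of logistic regression are placeholders assumed for illustration; the proposal does not commit to these specifics.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training setup: each row of X holds motion features computed
# around one edge point (e.g., flow statistics of the patches on either side
# of the edge), and y marks whether that point lies on an occlusion boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))              # placeholder feature vectors
y = (X[:, 0] - X[:, 1] > 0).astype(int)    # placeholder labels

clf = LogisticRegression().fit(X, y)

# At test time the classifier scores new edge points; the predicted
# probability can serve as the per-point occlusion confidence.
confidence = clf.predict_proba(X[:5])[:, 1]
print(confidence)

The per-point probabilities could then feed directly into the propagation step, for example the hysteresis sketch shown above.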
We have existing work completed at either end of the spectrum (model-based detection and boundary-respecting recognition). Future work includes improvements to each, the connection of the two, and further research on the segmentation and learning tasks.
A copy of the thesis proposal document can be found at http://www.andrewstein.net/proposall.pdf.