Trevor Darrell
May 8, 2006, 4:15PM
Abstract
Devices should be perceptive, and respond directly to their human user and/or environment. In this talk I'll present new computer vision algorithms for fast recognition, indexing, and tracking that make this possible, enabling multimodal interfaces which respond to users' conversational gesture and body language, robots which recognize common object categories, and mobile devices which can search using visual cues of specific objects of interest. I'll describe in detail a method for image indexing and recognition of object categories based on a new kernel function over sets of local features that approximates the true correspondence-based similarity between set elements. Our pyramid match efficiently forms an implicit partial matching between two sets of feature vectors. The matching has linear time complexity and is robust to clutter or outlier features--a critical advantage for handling images with variable backgrounds, occlusions, and viewpoint changes. With this technique, mobile devices can recognize locations and gather information about newly encountered objects by finding matching images on the web or other available databases.
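To make the pyramid match idea concrete, here is a minimal sketch that follows the general published formulation (multi-resolution histograms compared by histogram intersection). It is not the speaker's implementation. The function names, the doubling grid cells, the 1/2^i level weights, and the `diameter` parameter are illustrative assumptions, and features are assumed to be tuples of non-negative numbers.

```python
from collections import Counter
from math import ceil, log2


def histogram(points, side):
    """Count points per grid cell, where every cell has the given side length."""
    return Counter(tuple(int(c // side) for c in p) for p in points)


def intersection(h1, h2):
    """Histogram intersection: number of points that can be paired within shared cells."""
    return sum(min(count, h2[cell]) for cell, count in h1.items())


def pyramid_match(xs, ys, diameter):
    """Approximate partial-matching similarity between two feature sets.

    Assumption: every coordinate lies in [0, diameter). Level i uses cells of
    side 2**i, and matches first formed at level i are weighted by 1 / 2**i,
    so matches found in finer cells count more.
    """
    levels = ceil(log2(diameter)) + 1
    score = 0.0
    prev = 0  # matches already counted at finer levels
    for i in range(levels):
        side = 2 ** i
        cur = intersection(histogram(xs, side), histogram(ys, side))
        score += (cur - prev) / side  # only newly formed matches contribute
        prev = cur
    return score


# Hypothetical 2-D features; the third point in `ys` plays the role of clutter.
xs = [(0, 0), (5, 5)]
ys = [(0, 0), (6, 6), (100, 100)]
print(pyramid_match(xs, ys, diameter=128))
```

Each level makes one pass over both sets, so the cost grows linearly with the number of features for a fixed number of levels. The unmatched point in `ys` contributes nothing to the score, which illustrates the tolerance to clutter described above.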
About the Speaker
Trevor Darrell is an Associate Professor of Electrical Engineering and Computer Science at MIT. He leads the Vision Interface Group at the Computer Science and Artificial Intelligence Laboratory. His interests include computer vision, interactive graphics, and machine learning. Prior to joining the faculty of MIT he worked as a Member of the Research Staff at Interval Research in Palo Alto, CA, researching vision-based interface algorithms for consumer applications. He received his PhD and SM from the MIT Media Lab in 1996 and 1991, respectively, and the BSE while working at the GRASP Robotics Laboratory at the University of Pennsylvania in 1988.