Wednesday, November 25, 2009

CMU talk: Unsupervised Detection of Regions of Interest Using Iterative Link Analysis

CMU VASC Seminar
Monday, November 30, 2009

Unsupervised Detection of Regions of Interest Using Iterative Link Analysis
Gunhee Kim
Ph.D. Student, Computer Science Department

Abstract:

This work is a joint project with Antonio Torralba during my visit to MIT and will be presented as a poster at the upcoming NIPS 2009 Conference.

This talk will discuss a fast and scalable alternating optimization technique to detect regions of interest (ROIs) in cluttered Web images without labels. The proposed approach discovers highly probable regions of object instances by iteratively repeating the following two functions: (1) choose the exemplar set (i.e. a small number of highly ranked reference ROIs) across the dataset and (2) refine the ROIs of each image with respect to the exemplar set. These two subproblems are formulated as ranking in two different similarity networks of ROI hypotheses by link analysis. The experiments with the PASCAL 06 dataset show that our unsupervised localization performance is better than one of the state-of-the-art techniques and comparable to supervised methods. Also, we test the scalability of our approach with five objects in a Flickr dataset consisting of more than 200K images.
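Step (1) above, ranking ROI hypotheses in a similarity network and keeping the top few as exemplars, can be illustrated with a PageRank-style power iteration. This is only a minimal sketch: the abstract does not specify the link-analysis method, so the damping factor, the toy similarity matrix, and the function names rank_by_link_analysis and choose_exemplars are all assumptions made for illustration.

```python
def rank_by_link_analysis(similarity, damping=0.85, iterations=100):
    """PageRank-style power iteration over a weighted similarity graph.

    similarity[i][j] is a non-negative similarity between ROI hypotheses
    i and j (hypothetical input, not the paper's actual features).
    """
    n = len(similarity)
    # Total outgoing weight of each node, used to row-normalise the graph
    # so node i spreads its score in proportion to similarity[i][j].
    out_weight = [sum(similarity[i][j] for j in range(n)) for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            incoming = sum(
                scores[i] * similarity[i][j] / out_weight[i]
                for i in range(n)
                if out_weight[i] > 0
            )
            new_scores.append((1.0 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def choose_exemplars(similarity, k):
    """Return the indices of the k highest-ranked hypotheses."""
    scores = rank_by_link_analysis(similarity)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy network: hypotheses 0-2 resemble each other, hypothesis 3 is clutter.
TOY_SIMILARITY = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
exemplars = choose_exemplars(TOY_SIMILARITY, k=2)
```

In the alternating scheme described in the abstract, a second ranking of the same kind would then refine each image's ROIs against the chosen exemplars; that step is omitted here.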

Bio: Gunhee Kim is a Ph.D. student in CMU's Computer Science Department advised by Takeo Kanade. He received his master's degree under the supervision of Martial Hebert in 2008 from the Robotics Institute at CMU. His research interests are computer vision, machine learning, data mining, and biomedical imaging.

Monday, November 23, 2009

Lab Meeting November 25, 2009 (Alan): Navigating, Recognizing and Describing Urban Spaces With Vision and Lasers (IJRR 2009)

Title: Navigating, Recognizing and Describing Urban Spaces With Vision and Lasers (IJRR 2009)

Authors: Paul Newman, Gabe Sibley, Mike Smith, Mark Cummins, Alastair Harrison, Chris Mei, Ingmar Posner, Robbie Shade, Derik Schroeter, Liz Murphy, Winston Churchill, Dave Cole, Ian Reid

Abstract:
In this paper we describe a body of work aimed at extending the reach of mobile navigation and mapping. We describe how running topological and metric mapping and pose estimation processes concurrently, using vision and laser ranging, has produced a full six degree-of-freedom outdoor navigation system. It is capable of producing intricate three-dimensional maps over many kilometers and in real time. We consider issues concerning the intrinsic quality of the built maps and describe our progress towards adding semantic labels to maps via scene de-construction and labeling. We show how our choices of representation, inference methods and use of both topological and metric techniques naturally allow us to fuse maps built from multiple sessions with no need for manual frame alignment or data association.


Sunday, November 22, 2009

CMU PhD Thesis Proposal: Learning Methods for Thought Recognition

CMU RI PhD Thesis Proposal
Mark Palatucci
Learning Methods for Thought Recognition
November 18, 2009, 3:00 p.m., NSH 3305

Abstract
This thesis proposal considers the problem of training machine learning classifiers in domains where data are very high dimensional and training examples are extremely limited or impossible to collect for all classes of interest. As a case study, we focus on the application of thought recognition, where the objective is to classify a person’s cognitive state from a recorded image of that person’s neural activity. Machine learning and pattern recognition methods have already made a large impact on this field, but most prior work has focused on classification studies with small numbers of classes and moderate amounts of training data. In this thesis, we focus on thought recognition in a limited data setting, where there are few, if any, training examples for the classes we wish to discriminate, and the number of possible classes can be in the thousands.

Despite these constraints, this thesis seeks to demonstrate that it is possible to classify noisy, high dimensional data with extremely few training examples by using spatial and temporal domain knowledge, intelligent feature selection, semantic side information, and large quantities of unlabeled data from related tasks.

In our preliminary work, we showed that it is possible to build a binary classifier that accurately discriminates between cognitive states with more than 80,000 features and only two training examples per class. We also showed how classification can be improved using principled feature selection, and derived a significance test using order statistics that is appropriate for very high-dimensional problems with small numbers of training examples.

We have also explored the most extreme case of limited data, the zero-shot learning setting, where we do not have any training examples for the classes we wish to discriminate. We showed that by using a knowledge base of semantic side information to create intermediate features, we can build a classifier that identifies the words people are thinking about, even without training data for those words and even when the classifier is forced to choose among nearly 1,000 different candidate words.
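The last stage of such a zero-shot classifier can be pictured as a nearest-neighbour lookup in semantic feature space. The sketch below assumes that an earlier stage (not shown) has already mapped a neural image to a predicted semantic feature vector. The three-feature knowledge base, the cosine similarity measure, and the name decode_word are hypothetical and are not taken from the proposal.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def decode_word(predicted_features, knowledge_base):
    """Return the candidate word whose semantic vector best matches the prediction.

    No training example of the returned word is needed; only its entry
    in the knowledge base.
    """
    return max(
        knowledge_base,
        key=lambda word: cosine_similarity(predicted_features, knowledge_base[word]),
    )


# Hypothetical semantic features: (is_animal, is_tool, is_edible).
KNOWLEDGE_BASE = {
    "dog": [1.0, 0.0, 0.0],
    "hammer": [0.0, 1.0, 0.0],
    "celery": [0.0, 0.0, 1.0],
}

# A noisy prediction that leans towards "tool".
guess = decode_word([0.1, 0.8, 0.2], KNOWLEDGE_BASE)
```

New candidate words can be added by extending the knowledge base alone, which is what lets the candidate set grow to hundreds of words without new neural training data.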

Finally, we showed how multi-task learning can be used to learn useful semantic features directly from data. We formulated the semantic feature learning problem as a Multi-task Lasso and presented an extremely fast and highly scalable algorithm for solving the resulting optimization.
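The abstract does not say which norm the Multi-task Lasso penalty uses or how the optimization is solved. Purely as an illustration, and assuming the commonly used penalty that sums the Euclidean norm of each feature's weights across tasks, the building block of many solvers is a group soft-threshold that keeps or discards a feature for all tasks at once. The function name and the threshold value are hypothetical.

```python
import math


def group_soft_threshold(row, threshold):
    """Shrink one feature's weights (one entry per task) as a single group.

    If the Euclidean norm of the row is at most `threshold`, the feature is
    dropped for every task; otherwise all of its weights are scaled down
    by the same factor.
    """
    norm = math.sqrt(sum(w * w for w in row))
    if norm <= threshold:
        return [0.0] * len(row)
    scale = 1.0 - threshold / norm
    return [scale * w for w in row]


# A feature that matters for both tasks is kept but shrunk.
kept = group_soft_threshold([3.0, 4.0], threshold=1.0)

# A weak feature is removed from both tasks together.
dropped = group_soft_threshold([0.3, 0.4], threshold=1.0)
```

Because a whole row is zeroed at once, the tasks end up sharing one sparse set of features, which is the sense in which the features are learned jointly.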

We propose work to extend our zero-shot learning setting by optimizing semantic feature sets and by using an active learning framework to choose the most informative training examples. We also propose to use latent feature models such as components analysis and sparse coding in a self-taught learning framework to improve decoding by leveraging data from additional neural imaging experiments.

[PDF]

Thesis Committee
Tom Mitchell, Chair
Dean Pomerleau
J. Andrew Bagnell
Andrew Ng, Stanford University

Saturday, November 21, 2009

CMU talk: Imitation Learning and Purposeful Prediction

Machine Learning Lunch (http://www.cs.cmu.edu/~learning/)
Speaker: Prof. Drew Bagnell
Date: Monday, November 23, 2009

Imitation Learning and Purposeful Prediction

Programming robots is hard. While demonstrating a desired behavior may be easy, designing a system that behaves this way is often difficult, time consuming, and ultimately expensive. Machine learning promises to enable "programming by demonstration" for developing high-performance robotic systems. Unfortunately, many approaches that utilize the classical tools of supervised learning fail to meet the needs of imitation learning. Perhaps foremost, classical statistics and supervised machine learning exist in a vacuum: predictions made by these algorithms are explicitly assumed to not affect the world in which they operate. I'll discuss the problems that result from ignoring the effect of actions influencing the world, and I'll highlight simple "reduction-based" approaches that, both in theory and in practice, mitigate these problems.

Additionally, robotic systems are often built atop sophisticated planning algorithms that efficiently reason far into the future; consequently, ignoring these planning algorithms in lieu of a supervised learning approach often leads to poor and myopic performance. While planners have demonstrated dramatic success in applications ranging from legged locomotion to outdoor unstructured navigation, such algorithms rely on fully specified cost functions that map sensor readings and environment models to a scalar cost. Such cost functions are usually manually designed and programmed. Recently, our group has developed a set of techniques that learn these functions from human demonstration by applying an Inverse Optimal Control (IOC) approach to find a cost function for which planned behavior mimics an expert's demonstration. These approaches shed new light on the intimate connections between probabilistic inference and optimal control. I'll consider case studies in activity forecasting of drivers and pedestrians as well as the imitation learning of robotic locomotion and rough-terrain navigation. These case studies highlight key challenges in applying the algorithms in practical settings.

Friday, November 20, 2009

Lab Meeting November 25, 2009 (KuoHuei): You’ll Never Walk Alone: Modeling Social Behavior for Multi-target Tracking (ICCV 2009)

Title: You’ll Never Walk Alone: Modeling Social Behavior for Multi-target Tracking
The Twelfth IEEE International Conference on Computer Vision (ICCV 2009)
Authors: S. Pellegrini, A. Ess, K. Schindler, and L. van Gool

Abstract:
Object tracking typically relies on a dynamic model to predict the object’s location from its past trajectory. In crowded scenarios a strong dynamic model is particularly important, because more accurate predictions allow for smaller search regions, which greatly simplifies data association.
Traditional dynamic models predict the location for each target solely based on its own history, without taking into account the remaining scene objects. Collisions are resolved only when they happen. Such an approach ignores important aspects of human behavior: people are driven by their future destination, take into account their environment, anticipate collisions, and adjust their trajectories at an early stage in order to avoid them. In this work, we introduce a model of dynamic social behavior, inspired by models developed for crowd simulation. The model is trained with videos recorded from a bird’s-eye view at busy locations, and applied as a motion model for multi-people tracking from a vehicle-mounted camera. Experiments on real sequences show that accounting for social interactions and scene knowledge improves tracking performance, especially during occlusions.
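One simple way to make "anticipating collisions" concrete is to extrapolate two pedestrians at constant velocity and ask when and how closely they would pass. The sketch below does only that geometric step. It is an illustrative assumption rather than the model from the paper, and the function name and the 2D tuple representation are hypothetical.

```python
def closest_approach(p1, v1, p2, v2):
    """Return (time, distance) of closest approach for two constant-velocity agents.

    Positions and velocities are (x, y) tuples. Only future times (t >= 0)
    are considered, so agents that are already moving apart give t = 0.
    """
    dx, dy = p1[0] - p2[0], p1[1] - p2[1]      # relative position
    dvx, dvy = v1[0] - v2[0], v1[1] - v2[1]    # relative velocity
    speed_sq = dvx * dvx + dvy * dvy
    if speed_sq == 0.0:
        t = 0.0  # identical velocities: the separation never changes
    else:
        t = max(0.0, -(dx * dvx + dy * dvy) / speed_sq)
    cx, cy = dx + t * dvx, dy + t * dvy
    return t, (cx * cx + cy * cy) ** 0.5


# Two pedestrians walking straight at each other along the x axis.
t_hit, d_hit = closest_approach((0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (-1.0, 0.0))

# The same pair walking away from each other.
t_away, d_away = closest_approach((0.0, 0.0), (-1.0, 0.0), (10.0, 0.0), (1.0, 0.0))
```

A motion model could penalise candidate velocities whose predicted closest distance is small, which steers the prediction away from a collision well before it would happen.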



Wednesday, November 11, 2009

Lab Meeting November 11, 2009 (swem): Avoiding Negative Depth in Inverse Depth Bearing-Only SLAM (IROS 2008)

Title: Avoiding Negative Depth in Inverse Depth Bearing-Only SLAM
(2008 IEEE/RSJ International Conference on Intelligent Robots and Systems)
Authors: Martin P. Parsley and Simon J. Julier

Abstract:
In this paper we consider ways to alleviate negative estimated depth for the inverse depth parameterisation of bearing-only SLAM. This problem, which can arise even if the beacons are far from the platform, can cause catastrophic failure of the filter. We consider three strategies to overcome this difficulty: applying inequality constraints, the use of truncated second order filters, and a reparameterisation using the negative logarithm of depth. We show that both a simple inequality method and the use of truncated second order filters are successful. However, the most robust performance is achieved using the negative log parameterisation.
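Why the negative log parameterisation cannot produce a negative depth follows from the mapping itself. The sketch below shows only the scalar conversion for a single landmark. The full filter state in the paper is not described in the abstract, and the function names are hypothetical.

```python
import math


def depth_from_inverse_depth(rho):
    """Inverse depth rho = 1/d: an estimate with rho < 0 gives a negative depth."""
    return 1.0 / rho


def neg_log_from_depth(depth):
    """Negative log parameter l = -ln(d), defined for any positive depth."""
    return -math.log(depth)


def depth_from_neg_log(l):
    """Recover depth d = exp(-l): positive for every real l."""
    return math.exp(-l)


# A filter update that pushes the inverse depth slightly below zero
# corresponds to a landmark behind the camera.
bad_depth = depth_from_inverse_depth(-0.1)

# The same kind of overshoot in l still maps to a positive depth.
depths = [depth_from_neg_log(l) for l in (-5.0, 0.0, 5.0)]
```

The filter is then free to move l anywhere on the real line, so the positivity constraint holds by construction rather than being enforced after each update.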



Tuesday, November 10, 2009

ICCV'09 Oral Paper: You’ll Never Walk Alone: Modeling Social Behavior for Multi-target Tracking

You’ll Never Walk Alone: Modeling Social Behavior for Multi-target Tracking

S. Pellegrini, A. Ess, K. Schindler and L. van Gool
ICCV 2009 (oral)

Abstract:
Object tracking typically relies on a dynamic model to predict the object’s location from its past trajectory. In crowded scenarios a strong dynamic model is particularly important, because more accurate predictions allow for smaller search regions, which greatly simplifies data association. Traditional dynamic models predict the location for each target solely based on its own history, without taking into account the remaining scene objects. Collisions are resolved only when they happen. Such an approach ignores important aspects of human behavior: people are driven by their future destination, take into account their environment, anticipate collisions, and adjust their trajectories at an early stage in order to avoid them. In this work, we introduce a model of dynamic social behavior, inspired by models developed for crowd simulation. The model is trained with videos recorded from a bird’s-eye view at busy locations, and applied as a motion model for multi-people tracking from a vehicle-mounted camera. Experiments on real sequences show that accounting for social interactions and scene knowledge improves tracking performance, especially during occlusions. [PDF]