Sunday, March 30, 2008

[Robot Perception and Learning] Lab Meeting March 31st, 2008 (Hero)

Design and evaluation of a reactive and deliberative collision avoidance and escape architecture for autonomous robots

We present the design and evaluation of an architecture for collision avoidance and escape of mobile autonomous robots operating in unstructured environments. The approach mixes both reactive and deliberative components. This provides the vehicle’s behavior designers with an explicit means to design-in avoidance strategies that match system requirements in concepts of operations and for robot certification. The now traditional three layer architecture is extended to include a fourth Scenario layer, where scripts describing specific responses are selected and parameterized on the fly. A local map is maintained using available sensor data, and adjacent objects are combined as they are observed. This has been observed to create safer trajectories. Objects have persistence and fade if not re-observed over time. In common with behavior based approaches, a reactive layer is maintained containing pre-defined knee jerk responses for extreme situations. The reactive layer can inhibit outputs from above. Path planning of updated goal point outputs from the Scenario layer is performed using a fast marching method made more efficient through lifelong planning techniques. The architecture is applied to applications with Autonomous Underwater Vehicles. Both simulated and open water tests are carried out to establish the performance and usefulness of the approach.
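The abstract does not say how object persistence, fading and merging are implemented. As a rough illustration only, the sketch below keeps circular obstacles whose confidence decays exponentially unless re-observed, and merges a new observation with the first object it lies adjacent to. The class names, the half-life decay and the enclosing-circle merge are all assumptions made for this example, not details from the paper.

```python
from dataclasses import dataclass
import math


@dataclass
class MapObject:
    x: float                 # position in the local map frame (metres)
    y: float
    radius: float            # bounding radius of the obstacle
    confidence: float = 1.0  # 1.0 when just observed, decays toward 0


class LocalMap:
    """Toy local obstacle map with persistence, fading and merging (hypothetical)."""

    def __init__(self, half_life=5.0, drop_below=0.1, merge_gap=0.5):
        self.half_life = half_life    # seconds for confidence to halve (assumed)
        self.drop_below = drop_below  # forget objects below this confidence
        self.merge_gap = merge_gap    # merge objects whose gap is at most this
        self.objects = []

    def fade(self, dt):
        """Decay every object's confidence and drop the ones that faded out."""
        factor = 0.5 ** (dt / self.half_life)
        for obj in self.objects:
            obj.confidence *= factor
        self.objects = [o for o in self.objects if o.confidence >= self.drop_below]

    def observe(self, x, y, radius):
        """Insert an observation, merging it with the first adjacent object."""
        new = MapObject(x, y, radius)
        for obj in self.objects:
            gap = math.hypot(obj.x - x, obj.y - y) - obj.radius - radius
            if gap <= self.merge_gap:
                self.objects.remove(obj)
                new = self._merge(obj, new)
                break
        self.objects.append(new)

    @staticmethod
    def _merge(a, b):
        """Replace two circles by the smallest circle that encloses both."""
        d = math.hypot(a.x - b.x, a.y - b.y)
        if d + b.radius <= a.radius:  # b lies inside a
            return MapObject(a.x, a.y, a.radius)
        if d + a.radius <= b.radius:  # a lies inside b
            return MapObject(b.x, b.y, b.radius)
        r = (d + a.radius + b.radius) / 2.0
        t = (r - a.radius) / d        # fraction of the way from a to b
        return MapObject(a.x + t * (b.x - a.x), a.y + t * (b.y - a.y), r)
```

Merging adjacent objects closes small gaps that a planner might otherwise try to thread, which is one plausible reading of why the authors observed safer trajectories.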

Lab Meeting March 31st, 2008 (Ekker)

Title: Visual Odometry System Using Multiple Stereo Cameras and Inertial Measurement
Authors: Taragay Oskiper, Zhiwei Zhu, Supun Samarasekera, Rakesh Kumar
From: Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference, Publication Date: -22 June 2007

Over the past decade, a tremendous amount of research activity has focused on the problem of localization in GPS-denied environments. Challenges with localization are highlighted in human wearable systems where the operator can move freely through both indoor and outdoor environments. In this paper, we present a robust method that addresses these challenges using a human wearable system with two pairs of backward and forward looking stereo cameras together with an inertial measurement unit (IMU). This algorithm can run in real time with a 15 Hz update rate on a dual-core 2 GHz laptop PC, and it is designed to be a highly accurate local (relative) pose estimation mechanism acting as the front-end to a Simultaneous Localization and Mapping (SLAM) type method capable of global corrections through landmark matching. Extensive tests of our prototype system so far reveal that, without any global landmark matching, we achieve between 0.5% and 1% accuracy in localizing a person over a 500 meter travel indoors and outdoors. To our knowledge, such performance results with a real-time system have not been reported before.

Saturday, March 29, 2008

Lab Meeting March 31st, 2008 (Stanley)

Title: User-adapted plan recognition and user-adapted shared control: A Bayesian approach to semi-autonomous wheelchair driving

Authors: Eric Demeester · Alexander Hüntemann · Dirk Vanhooydonck · Gerolf Vanacker · Hendrik Van Brussel · Marnix Nuttin

From: Autonomous Robots (2008) 24: 193–211

Many elderly and physically impaired people experience difficulties when maneuvering a powered wheelchair. In order to ease maneuvering, powered wheelchairs have been equipped with sensors, additional computing power and intelligence by various research groups.
This paper presents a Bayesian approach to maneuvering assistance for wheelchair driving, which can be adapted to a specific user. The proposed framework is able to model and estimate even complex user intents, i.e. wheelchair maneuvers that the driver has in mind. Furthermore, it explicitly takes the uncertainty on the user's intent into account. In addition to their use during intent estimation, user-specific properties and uncertainty on the user's intent are incorporated when taking assistive actions, such that assistance is tailored to the user's driving skills. This decision making is modeled as a greedy Partially Observable Markov Decision Process (POMDP).
Benefits of this approach are shown using experimental results in simulation and on our wheelchair platform Sharioto.
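To make the idea of user-adapted intent estimation concrete, here is a minimal sketch of a Bayes update over a small discrete set of intents, followed by a greedy one-step assistive action. It is not the paper's model: the intents are reduced to target headings, the user model is a Gaussian on heading error whose width `sigma` stands in for the user-specific driving skill, and every function name is hypothetical.

```python
import math


def angle_diff(a, b):
    """Smallest signed difference a - b between two angles, in radians."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def update_belief(belief, intent_headings, joystick_heading, sigma):
    """Bayes rule: posterior(i) is proportional to likelihood(u | i) * prior(i).

    sigma is an assumed user-specific noise level: a less precise driver
    gets a larger sigma, so a single input moves the belief less.
    """
    posterior = {}
    for intent, prior in belief.items():
        err = angle_diff(joystick_heading, intent_headings[intent])
        likelihood = math.exp(-0.5 * (err / sigma) ** 2)
        posterior[intent] = likelihood * prior
    total = sum(posterior.values())
    return {intent: p / total for intent, p in posterior.items()}


def assist(belief, intent_headings, joystick_heading):
    """Greedy one-step action: steer toward the most probable intent,
    scaled by how certain the belief is about that intent."""
    best = max(belief, key=belief.get)
    correction = angle_diff(intent_headings[best], joystick_heading)
    return joystick_heading + belief[best] * correction
```

For example, with intents `{"door": 0.0, "desk": math.pi / 2}` and a uniform prior, a joystick heading of 0.2 rad shifts most of the probability to "door", and `assist` then pulls the commanded heading most of the way toward it. With a larger `sigma` the same input leaves the belief closer to uniform, so the assistance is weaker.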

Friday, March 28, 2008

News: All terrain Roomba/iRobot

Byron Lahey and other researchers at Arizona State University are working on this all terrain iRobot modification. He writes:
Robot Create robots are easily programmable and expandable in their functionality, but can only travel on a very limited variety of terrain (carpet, hard flat floors and other typical interior domestic surfaces). As part of the Arts, Media and Engineering program we are working on systems related to the Mars exploration rovers. To teach about these systems and allow students to test and explore terrain navigation and mapping with robots, we needed robots with expanded terrain navigation capabilities. This video demonstrates an early prototype modification of the iRobot, comparing the performance of a modified and unmodified robot traveling on a rocky surface.
This video first shows how epically the standard iRobot Create fails on a non-flat surface, then shows how his mods make it work better.

Check here for photos of the modifications.

Thursday, March 27, 2008

VASC Seminar: Free Space computation using Stochastic Occupancy Grids and Dynamic Programming

Hernán Badino
Goethe Frankfurt University (Germany)
Monday March 31 @ 3:30pm

Abstract--The computation of free space available in an environment is an essential task for many intelligent automotive and robotic applications. In this talk, I propose a new approach, which builds a stochastic occupancy grid to address the free space problem as a dynamic programming task. Stereo measurements are integrated over time reducing disparity uncertainty. These integrated measurements are entered into an occupancy grid, taking into account the noise properties of the measurements. In order to cope with real-time requirements of the application, three occupancy grid types are proposed. Their applicabilities and implementations are also discussed. Experimental results with real stereo sequences show the robustness and accuracy of the method. The current implementation of the method runs on off-the-shelf hardware at 20 Hz.
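The abstract frames free-space computation as a dynamic programming problem on an occupancy grid but gives no formulation. The sketch below is one simple way such a problem could be posed, assuming a polar-style grid where `grid[c][r]` is the occupancy probability of range cell `r` along bearing column `c`. It selects one boundary cell per column by trading occupancy evidence against a smoothness penalty between neighbouring columns. The cost terms and the function name are assumptions for illustration, not the speaker's method.

```python
def free_space_boundary(grid, smoothness=1.0):
    """Pick one range index per column by Viterbi-style dynamic programming.

    grid[c][r]: occupancy probability in [0, 1] of range cell r in column c.
    The data cost is low where a cell is likely occupied; the smoothness
    term penalises jumps in range between adjacent columns.
    """
    n_cols = len(grid)
    n_rows = len(grid[0])
    data = [[1.0 - p for p in col] for col in grid]

    cost = [data[0][:]]  # cost[c][r]: best total cost of a path ending at (c, r)
    back = []            # back[c - 1][r]: best predecessor row in column c - 1
    for c in range(1, n_cols):
        prev = cost[-1]
        col_cost, col_back = [], []
        for r in range(n_rows):
            q = min(range(n_rows), key=lambda k: prev[k] + smoothness * abs(r - k))
            col_cost.append(data[c][r] + prev[q] + smoothness * abs(r - q))
            col_back.append(q)
        cost.append(col_cost)
        back.append(col_back)

    # Backtrack from the cheapest cell in the last column.
    r = min(range(n_rows), key=lambda k: cost[-1][k])
    path = [r]
    for col_back in reversed(back):
        r = col_back[r]
        path.append(r)
    path.reverse()
    return path
```

With `smoothness=0` every column is decided independently, so an isolated noisy cell can pull the boundary toward the sensor. A positive penalty lets consistent evidence in neighbouring columns override such a spike. This version costs O(columns × rows²), which is fine for a sketch but would need a cheaper transition step to reach real-time rates.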

Bio--Hernán Badino received his degree of Engineer from the National Technological University, Córdoba, Argentina, in 2002. He is at the moment presenting his doctoral thesis at the J. W. Goethe Frankfurt University, Germany. Mr. Badino has worked during his Ph.D. with the Image Based Environment Perception Group at Daimler AG, in Stuttgart, Germany. He is currently a member of the Visual Sensorics and Information Processing Group at the Frankfurt University, engaged in a project of camera-based urban traffic sensing for driver assistance systems. Mr. Badino's particular research interests in the area of computer vision include the computation of ego-motion from sequences of stereo images and the development of stereo vision algorithms for the real-time detection and tracking of static and moving objects for automotive applications.

Tuesday, March 25, 2008

CMU RI Thesis: Studies in using image segmentation to improve object recognition

Caroline Pantofaru
Robotics Institute
Carnegie Mellon University


Recognizing object classes is a central problem in computer vision, and recently there has been renewed interest in also precisely localizing objects with pixel-accurate masks. Since classes of deformable objects can take a very large number of shapes in any given image, a requirement for recognizing and generating masks for such objects is a method for reducing the number of pixel sets which need to be examined. One method for proposing accurate spatial support for objects and features is data-driven pixel grouping through unsupervised image segmentation. The goals of this thesis are to define and address the issues associated with incorporating image segmentation into an object recognition framework.

The first part of this thesis examines the nature of image segmentation and the implications for an object recognition system. We develop a scheme for comparing and evaluating image segmentation algorithms which includes the definition of criteria that an algorithm must satisfy to be a useful black box, experiments for evaluating these criteria, and a measure of automatic segmentation correctness versus human image labeling. This evaluation scheme is used to perform experiments with popular segmentation algorithms, the results of which motivate our work in the remainder of this thesis.

The second part of this thesis explores approaches to incorporating the regions generated by unsupervised image segmentation into an object recognition framework. Influenced by our experiments with segmentation, we propose principled methods for describing such regions. Given the instability inherent in image segmentation, we experiment with increasing robustness by integrating the information from multiple segmentations. Finally, we examine the possibility of learning explicit spatial relationships between regions. The efficacy of these techniques is demonstrated on a number of challenging data sets.


Further details: here

CMU RI Thesis: Statistical Approaches to Multi-scale Point Cloud Processing

Ranjith Unnikrishnan
Robotics Institute
Carnegie Mellon University

In recent years, 3D geometry has gained increasing popularity as the new form of digital media content. Due to advances in sensor technology, it is now feasible to acquire highly detailed 3D scans of complex scenes to obtain millions of data points at high sampling rates over large spatial extents. This ability to acquire high-resolution depth information brings with it the possibility of using 3D geometric data to construct detailed shape models and of perhaps combining 3D depth with visual appearance from images to address challenging problems in computer vision.

However, geometric information represented as a 3D point cloud presents challenges uniquely different from other data modalities such as images or audio. Due to a combination of reasons such as the spatial irregularity of the data and the implicit nature of 3D observations, an easy substitution of traditional signal processing operators from images for processing unorganized 3D points is not possible. Furthermore, traditional estimators from classical statistics are not suitable for processing data in this domain, and new algorithms as well as different criteria for evaluating these algorithms are necessary.

This dissertation contributes towards the development of two fundamental building blocks for processing point clouds. The first is of geometric model fitting, where we present a class of locally semi-parametric estimators that allows finite-sample analysis of accuracy and also explicitly addresses the problem of support-radius selection in local fitting. The second is of multi-scale filtering operators for point clouds that allow detection of interest regions whose locations as well as spatial extent are completely data-driven. The proposed approaches are distinguished from related work by operating directly in the input 3D space on unorganized points without assuming an available mesh or resorting to an intermediate global 2D parameterization.

Results are presented for several applications including surface reconstruction, accurate shape descriptor computation and repeatable interest region detection, on synthetic data, as well as outdoor aerial and ground-based data obtained with a laser scanner.

Check here for further details.

Monday, March 24, 2008

Lab Meeting March 24th, 2008 (Dwayne Yu): 3D Reconstruction of Environments for Planetary Exploration

In this paper we present our approach to 3D surface reconstruction from large sparse range data sets. In space robotics, constructing an accurate model of the environment is very important for a variety of reasons. In particular, the constructed model can be used for safe tele-operation, path planning, planetary exploration and mapping of points of interest. Our approach is based on acquiring range scans from different viewpoints with overlapping regions, merging them into a single data set, and fitting a triangular mesh to the merged data points. We demonstrate the effectiveness of our approach in a path planning scenario and also by creating the accessibility map for a portion of the Mars Yard located at the Canadian Space Agency.


Sunday, March 23, 2008

Lab Meeting March 24th, 2008 (Kuo-Hwei Lin): Recent work

I will present recent results from my work. I used the SCRIM algorithm to determine the possible movement of segment pairs, then built the full set of hypotheses about whether each segment is static. After taking a weighted average of the SCRIM results under each hypothesis, I will show the sorted hypotheses applied to one step (two scans).

VASC Seminar: Toward a Perceptual Space for Reflectance

Title: Toward a Perceptual Space for Reflectance
Speaker: Sameer Agarwal, Univ. of Washington

Location: NSH 1507
Time: 3:30pm Monday, 24 March

Abstract -- As we make progress in measuring and modeling reflectance, it is also important that we develop a better understanding of how the human visual system perceives the reflection of light. Such a development not only has implications for efficient image synthesis, but also for computer vision where an understanding of reflectance perception will give us insight into the priors and constraints used by humans to solve various shading related problems, e.g., shape from shading and object recognition over variable and unknown lighting.

In this talk I will present a study of the perception of reflectance. I will argue that our methodology based on paired comparisons is better suited for capturing human perception and is less susceptible to experimental errors than previously used methods. The analysis of paired comparisons required the development of a new data analysis tool. In the second part of the talk I will present a new multidimensional scaling algorithm for analyzing paired comparisons. Based on semi-definite programming, this algorithm is a more general and efficient replacement for the widely used Non-metric MDS algorithm.

Using this algorithm we obtain a perceptual embedding of BRDFs from the MIT/MERL Database. This embedding, constructed purely from psychophysical data, exhibits some striking correlations with the material appearance standards that have been developed independently in the paper and paint industries. Finally, I will describe a novel perceptual interpolation scheme that uses this embedding to provide the user with an intuitive interface for navigating the space of reflectances and constructing new ones.

Friday, March 21, 2008

News: People prefer robots that do small talk

* 20 March 2008

ROBOTS of the future may have to learn to make small talk if humans are to accept them.

To find out how quickly domestic robots should respond to their owners' requests, Toshiyuki Shiwa and colleagues at the ATR laboratories in Kyoto, Japan, asked 38 students to give orders such as "take out the trash" to a robot, which took between zero and 5 seconds to respond.

The students liked delays of no more than 1 second best, with 2 seconds being their limit. However, when the robot took longer, impatient students were assuaged if it filled the time with words such as "well" or "er". "When the robot used conversational fillers to buy time until it could respond, people didn't notice the delay," Shiwa says. He presented the study last week at Human-Robot Interaction 2008 in Amsterdam, the Netherlands.

related paper

Monday, March 17, 2008

Effective Computer Agents for Interacting With People

CSAIL Event Calendar

Speaker: Ya'akov (Kobi) Gal, MIT CSAIL and Harvard School of Applied Sciences
Date: Wednesday, March 5 2008
Time: 2:00PM to 3:30PM

This talk will present work that addresses challenges by synthesizing techniques from computer science with insights from the behavioral and social sciences.
The work includes a language that makes explicit the different mental models agents use to make their decisions, described as nodes in a graphical network. The language defines an equilibrium that makes a distinction between agents' optimal strategies and the way they actually behave in reality.
For best performance, computers participating in mixed human-computer settings must model human behavior in a way that reflects the contextual setting in which the decision is presented to people.

Link to the speaker:

Monday, March 10, 2008

Lab Meeting March 10th, 2008 (Leo): Recent work

I will show some initial results of my recent work.

Lab Meeting March 10th, 2008 (Yu-Hsiang): Interacting Object Tracking in Crowded Urban Areas

Authors: Chieh-Chih Wang, Tzu-Chien Lo and Shao-Wen Yang


Tracking in crowded urban areas is a daunting task. High crowdedness causes challenging data association problems. Different motion patterns from a wide variety of moving objects make motion modeling difficult. Along with traditional motion modeling techniques, this paper introduces a scene interaction model and a neighboring object interaction model to take into account, respectively, long-term and short-term interactions between the tracked objects and their surroundings. With the use of the interaction models, anomalous activity recognition is accomplished easily. In addition, move-stop hypothesis tracking is applied to deal with move-stop-move maneuvers. All these approaches are seamlessly integrated under the variable-structure multiple-model estimation framework. The proposed approaches have been demonstrated using data from a laser scanner mounted on the PAL1 robot at a crowded intersection. Interacting pedestrians, bicycles, motorcycles, cars and trucks are successfully tracked in difficult situations with occlusion.


Sunday, March 09, 2008

Lab Meeting March 10th, 2008 (Jeff): Progress report

I will try to do a live demo, if possible.

Or, I will show some previous data set results.

Saturday, March 08, 2008

News: Snake robot uses obstacles for propulsion

A new snake-like robot can replicate a trick of real snakes, pushing off obstacles it encounters to move forwards.

A virtual double of the robot that accurately predicts its real life behaviour has also been developed, something not achieved for a realistic snake robot before.

Researchers have been working on snake-inspired robots for decades, but they usually have wheels or treads on their body to help them move. These make it easier for a snake robot to slither forward, by converting its writhing motion into a forward slide.

But that approach works best on smooth surfaces. "In a collapsed building where there's a lot of rubble, for example after an earthquake, a wheeled snake would probably get stuck," says Aksel Transeth of Norwegian research organization SINTEF in Trondheim.

Look, no wheels

A more versatile snake robot would move in a truly snaky way, pushing off of obstacles, such as rocks, that it encounters, Transeth argues. Along with colleagues at the Norwegian University of Science and Technology, also in Trondheim, he developed a wheelless snake robot that can do just that...

Link to the full article: NewScientist

Link to the related IEEE Paper

... and finally a video

MIT Brains & Machines Seminar Series: Online Learning with Limited Feedback: An Efficient and Optimal Algorithm

Speaker: Alexander Rakhlin, Berkeley, Dpt. Computer Science
Date: Tuesday, March 11 2008
Host: Tomaso Poggio, CSAIL, BCS
Relevant URL:

Title: "Online Learning with Limited Feedback: An Efficient and Optimal Algorithm"

One's ability to learn and make decisions rests heavily on the availability of feedback. In sequential decision-making problems such feedback is often limited. A gambler, for example, can observe entirely the outcome of a horse race regardless of where he placed his bet; however, when the same gambler chooses his route to travel to the race track, perhaps at a busy hour, he will likely never learn the outcome of possible alternatives. The latter limited-feedback problem is the focus of this talk.

The problem can be phrased as an Online Linear Optimization game with "bandit" feedback. The existence of a low-regret algorithm has been an open question since the work of Awerbuch and Kleinberg in 2004. We present the first known efficient algorithm for bandit Online Linear Optimization over arbitrary convex decision sets. We show how the difficulties encountered by previous approaches are overcome by employing Regularization -- a method well-known in statistical learning, but under-appreciated in online learning. Furthermore, our solution reveals surprising connections between online learning and Interior Point methods in Optimization.

In particular, our method solves the Online Shortest Path problem: at each round, a path from source to sink is chosen and only the total length (delay) of this path is revealed. Our method has numerous applications in network routing, resource allocation, dynamic treatment of patients, and many more. The worst-case guarantees imply robustness with respect to noise and malicious adversary.
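To make the bandit feedback setting concrete, the sketch below plays the Online Shortest Path game with the classic EXP3 algorithm run over an explicitly enumerated list of paths. This is not the algorithm from the talk: EXP3 over paths is a simpler, well-known baseline that becomes impractical when the number of paths is large, which is part of what an efficient method has to avoid. The function name, the `delay_of` callback and the parameter defaults are assumptions for illustration.

```python
import math
import random


def exp3_paths(paths, delay_of, rounds, gamma=0.1, seed=0):
    """EXP3 over an explicit list of paths with bandit feedback.

    delay_of(t, path) must return a loss in [0, 1]. Only the loss of the
    path actually chosen in round t is ever observed.
    Returns how many times each path was played.
    """
    rng = random.Random(seed)
    k = len(paths)
    weights = [1.0] * k
    counts = [0] * k
    for t in range(rounds):
        total = sum(weights)
        # Mix the weight distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / k for w in weights]
        i = rng.choices(range(k), weights=probs)[0]
        loss = delay_of(t, paths[i])
        # Importance weighting keeps the loss estimate unbiased even
        # though the other paths' losses are never seen.
        estimate = loss / probs[i]
        weights[i] *= math.exp(-gamma * estimate / k)
        # Rescale so the weights never underflow to zero.
        top = max(weights)
        weights = [w / top for w in weights]
        counts[i] += 1
    return counts
```

A call such as `exp3_paths([("s", "a", "t"), ("s", "b", "t"), ("s", "t")], delay_of, rounds=2000)` should concentrate most plays on whichever path has the lowest average delay, while still exploring the others a fraction `gamma` of the time.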

Joint work with Jacob Abernethy and Elad Hazan.

MIT CSAIL Theory Colloquium: Conditional Computational Entropy, or Towards Separating Pseudoentropy from Compressibility

Theory Colloquium: Conditional Computational Entropy, or Towards Separating Pseudoentropy from Compressibility

Speaker: Leonid Reyzin, Boston University
Date: Tuesday, March 11 2008

Computational entropy measures the amount of randomness a distribution appears to have to a computationally bounded observer. It is an open question whether two definitions of this entropy -- the so-called "HILL entropy" (based on indistinguishability from truly random distributions) and "Yao entropy" (based on incompressibility) are equivalent, as they are in the information-theoretic setting. We observe that most of the time the observer has some correlated information, and thus define and study _conditional_ computational entropy. By considering conditional versions of HILL and Yao entropies, we obtain:
-- a separation between conditional HILL and Yao entropies;
-- the first demonstration of a distribution from which extraction techniques based on Yao entropy produce more pseudorandom bits than appears possible by the traditional HILL-entropy-based techniques;
-- a new, natural notion of unpredictability entropy, which, in particular, can handle entropy of singleton distributions, and allows for known extraction and hardcore bit results to be stated and used more generally.

Joint work with Chun-Yuan Hsiao and Chi-Jen Lu.

Friday, March 07, 2008

MIT talk: Translating reactive tasks to reactive controllers

Hadas Kress-Gazit
University of Pennsylvania
March 5, 2008, 11:00 a.m. 32-G449 (Stata Center)

Translating reactive tasks to reactive controllers

Abstract: How can we automatically create controllers for our system, be it a robot or a team of UAVs, that are guaranteed to satisfy reactive high level tasks such as "Search for Nemo and if you find him transmit his location"? How do we transition from systems that require each behavior or task to be hand coded, tested and verified to systems that allow anyone to just specify an abstract mission and then the system takes care of the rest while taking into account dynamically changing environments? As systems become more sophisticated and mechanically complex, providing theory and tools that answer these questions is crucial for creating truly autonomous systems. In this talk, Hadas Kress-Gazit will present a formal approach to creating robot controllers that ensure the robot satisfies a given high level task. She will describe a framework in which a user specifies a complex and reactive task in structured English. This task is then automatically translated, using logic and tools from the formal methods world, into a hybrid controller. This controller is guaranteed to control the robot such that its motion and actions satisfy the intended task, under some assumptions, in a variety of different environments. As an example, she will show how tasks related to DARPA's Urban challenge can be handled using this framework.

MIT talk: From Detection to Tracking

Speaker: Dr. Fatih Porikli, Mitsubishi Electric Research Laboratory (MERL)
Date: Wednesday, March 5 2008
Time: 3:00PM to 4:00PM Refreshments: 2:45PM
Location: Star Seminar Room (32-D463)
Host: C. Mario Christoudias, Gerald Dalley, MIT CSAIL
Contact: C. Mario Christoudias, Gerald Dalley, 3-4278, 3-6095

Object detection and tracking, as our eyes do so innately, are among the most challenging tasks in computer vision. In general, natural objects belonging to the same class exhibit a large variance in their appearance. Besides, varying imaging conditions, partial occlusions, non-rigid shape deformations, multifaceted profiles and insufficient image resolutions make the detection more difficult. Similarly, a tracked object may undergo severe appearance transformations, suddenly change its motion, become fully occluded, congregate into a group of identical objects, etc. Traditional approaches tend to address these issues separately, often out of context by aiming for fixed generic solutions. Recently, there has been a push towards making use of any useful bit of information embedded in a priori and contextual cues. More systems seek to provide online adaptation to local conditions. In this talk, various aspects of the conventional and contextual detection and tracking methods will be dissected and a unifying statistical descriptor will be examined.


Thursday, March 06, 2008

CMU Intelligence Seminar: Hearing the Shape of a State Space: New Frontiers in Representation Discovery

Hearing the Shape of a State Space: New Frontiers in Representation Discovery
Sridhar Mahadevan
University of Massachusetts
Tuesday 3/18

In this talk, we will explore new frontiers in representation discovery, where agents construct a basis for approximation of functions on a space well-adapted to its nonlinear geometry. For example, most spatial environments contain significant bottlenecks (e.g. doors, elevators, exits) that factor into our daily decision-making. Similarly, discovery of latent structure in collections of images or text documents is also facilitated by a deeper understanding of the geometry of particular document or image spaces.

We will describe an approach to representation discovery where agents construct novel bases by "hearing the shape" of the underlying state space. Formally, the proposed framework builds on recent advances in harmonic analysis, specifically Fourier and wavelet analysis on graphs, which transform spatial and temporal structure to frequency-oriented representations. Efficient algorithms for basis construction involve computational challenges, which will be addressed by sampling, matrix compression, and domain knowledge.

A range of case studies will be presented, including a novel paradigm for solving Markov decision processes where representation and control are learned simultaneously; a novel multiscale wavelet method for clustering of text documents where the topic hierarchy is automatically constructed; and a new compression method for computer graphics based on multiscale analysis of object geometry.

Speaker Bio
Professor Sridhar Mahadevan is Co-Director of the Autonomous Learning Laboratory at the Department of Computer Science, University of Massachusetts, Amherst. His research interests are in artificial intelligence and machine learning. He is an associate editor of the Journal of Machine Learning Research, and was a tutorial speaker at AAAI 2007, IJCAI 2007, and ICML 2006.

Wednesday, March 05, 2008

MIT Thesis Defense: Learning Coupled Conditional Random Field for Image Decomposition: Theory and Application in Object Categorization

Speaker: Xiaoxu Ma, MIT CSAIL
Date: Wednesday, March 5 2008

The goal of this thesis is to build a computational system that is able to identify object categories within images. To this end, this thesis proposes a computational model of "recognition-through-decomposition-and-fusion" based on the psychophysical theories of information dissociation and integration in human visual perception. At the lowest level, contour and texture processes are measured. In the mid-level, a coupled Conditional Random Field model is proposed to model and decompose the contour and texture processes in natural images. Various matching schemes are introduced to match the decomposed contour and texture channels in a dissociative manner. As a counterpart to the integrative process in the human visual system, adaptive combination is applied to fuse the perception in the decomposed contour and texture channels.

The proposed coupled Conditional Random Field model is shown to be an important extension of popular single-layer Random Field models for modeling image processes, by dedicating a separate layer of random field grid to each individual image process and explicitly capturing the distinct properties of multiple visual processes. The decomposition enables the system to fully leverage each decomposed visual stimulus to its full potential in discriminating different object classes. Adaptive combination of multiple visual cues mirrors the fact that different visual cues play different roles in distinguishing various object classes. Experimental results demonstrate that the proposed computational model of "recognition-through-decomposition-and-fusion" achieves better performance than most of the state-of-the-art methods in recognizing the objects in Caltech-101, especially when only a limited number of training samples are available, which conforms with the capability of learning to recognize a class of objects from a few sample images in the human visual system.

Tuesday, March 04, 2008

News: Which robot would you take home with you?

Do you find any one of the three robots more likable than the others? Your choice might just reveal something about your personality type, or about your current mood.
Kerstin Dautenhahn and colleagues at the University of Hertfordshire asked volunteers to watch videos of the robots as they answered the door for an owner who had missed the bell. The volunteers were then asked to fill out a questionnaire rating the likeability of each robot, and to complete a simple personality test.
Volunteers tended to prefer the Humanoid robot while "introverts and participants with lower emotional stability" warmed to the two more mechanical robots slightly more.

full text
related paper - Avoiding the uncanny valley: robot appearance, personality and consistency of behavior in an attention-seeking home scenario for a robot companion (link)

[Thesis] Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

Title: Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

Joan Solà Ortega

February 2007

Docteur de l’Institut National Polytechnique de Toulouse

Full text:

Saturday, March 01, 2008

[Lab meeting] Mar. 3, 2008 (Jim Yu): PAL2's arm progress

The new arms are on PAL2 now, and I can control them from the command line.
I will give a live demo if I can fix some basic parameters in the next few days; otherwise I will just show the video.

CMU RI PhD Thesis Proposal: Pairwise Constraints for Matching, Perceptual Grouping and Recognition

Pairwise Constraints for Matching, Perceptual Grouping and Recognition

Marius Leordeanu
Robotics Institute
Carnegie Mellon University

10 March 2008

Object category recognition is a challenging problem in computer vision, which currently receives a growing interest in the field. This problem is almost ill-posed, because there is no formal definition of what constitutes an object category. While people largely agree on common, useful categories, it is still not clear which are the objects' properties that help us group them into such categories. In this thesis we represent the object category models as graphs of features, and focus mainly on the second order relationships between them: pairwise category-dependent (e.g. shape) as well as pairwise perceptual grouping constraints (e.g. geometrical and color based). The main theme of this thesis is that higher order relationships between model features are more important for category recognition than local, first order features. We present several novel algorithms that take full advantage of such pairwise constraints. Firstly, we present our spectral matching algorithm for the Quadratic Assignment Problem (also known as Graph Matching), along with a novel, efficient method for learning the pairwise parameters. Secondly, we present a novel optimization method which can handle nonlinear, complex functions, and present some of its applications in the context of our work. Thirdly, we discuss our object category recognition approach based on shape alone, which uses pairwise geometric constraints only. Next, we explore ways (based on both color and geometry) to establish perceptual grouping relationships between pairs of features, which are category independent. And finally, we talk about how we plan on combining both the category dependent and the perceptual relationships in order to perform object category recognition.

Further Details

A copy of the thesis proposal document can be found at

Thesis Committee

* Martial Hebert, Chair
* Rahul Sukthankar
* Fernando De la Torre
* David Lowe, University of British Columbia