Sunday, October 31, 2010

Lab Meeting November 1, 2010 (Will): Visual Event Recognition in Videos by Learning from Web Data (CVPR 2010)

Title: Visual Event Recognition in Videos by Learning from Web Data (CVPR 2010)
Authors: Lixin Duan, Dong Xu, Ivor W. Tsang, Jiebo Luo

We propose a visual event recognition framework for consumer domain videos by leveraging a large amount of loosely labeled web videos (e.g., from YouTube). First, we propose a new aligned space-time pyramid matching method to measure the distances between two video clips, where each video clip is divided into space-time volumes over multiple levels. We calculate the pairwise distances between any two volumes and further integrate the information from different volumes with Integer-flow Earth Mover’s Distance (EMD) to explicitly align the volumes. Second, we propose a new cross-domain learning method in order to 1) fuse the information from multiple pyramid levels and features (i.e., space-time feature and static SIFT feature) and 2) cope with the considerable variation in feature distributions between videos from two domains (i.e., web domain and consumer domain). For each pyramid level and each type of local features, we train a set of SVM classifiers based on the combined training set from two domains using multiple base kernels of different kernel types and parameters, which are fused with equal weights to obtain an average classifier. Finally, we propose a cross-domain learning method, referred to as Adaptive Multiple Kernel Learning (A-MKL), to learn an adapted classifier based on multiple base kernels and the prelearned average classifiers by minimizing both the structural risk functional and the mismatch between data distributions from two domains. Extensive experiments demonstrate the effectiveness of our proposed framework that requires only a small number of labeled consumer videos by leveraging web data.
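
To make the volume-alignment step in the abstract above more concrete, here is a minimal Python sketch, not the authors' implementation. It assumes every volume carries equal weight and both clips have the same number of volumes, in which case an integer-flow EMD reduces to a one-to-one matching of volumes. The function name `aligned_volume_distance`, the Euclidean ground distance, and the toy feature vectors are all assumptions made for illustration.

```python
from itertools import permutations
from math import dist


def aligned_volume_distance(volumes_a, volumes_b):
    """Distance between two clips, each given as a list of per-volume feature vectors.

    Assumes equal weight per volume and equal volume counts, so the optimal
    flow is a one-to-one matching. Brute force over permutations keeps the
    sketch dependency-free; it is only practical for a handful of volumes.
    """
    n = len(volumes_a)
    if n != len(volumes_b):
        raise ValueError("both clips must have the same number of volumes")

    # Pairwise ground distances between volumes (Euclidean is an assumption).
    cost = [[dist(a, b) for b in volumes_b] for a in volumes_a]

    best_total, best_match = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_match = total, perm

    # Each matched pair carries flow 1/n, so the distance is the mean matched cost.
    return best_total / n, best_match


# Toy example: clip_b holds the same volumes as clip_a in a different order.
clip_a = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
clip_b = [(5.0, 5.0), (0.0, 0.0), (1.0, 1.0)]
distance, matching = aligned_volume_distance(clip_a, clip_b)
print(distance, matching)
```

Because the volumes are explicitly aligned, the shuffled clip is matched back onto the original at zero cost. For realistic volume counts, the permutation search would be replaced by a polynomial-time assignment solver such as the Hungarian algorithm.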

Friday, October 29, 2010

Lab meeting Nov. 01 2010, (Chih-Chung) POMDPs for robotic tasks with mixed observability (RSS 2009)

Title: POMDPs for robotic tasks with mixed observability
Authors: Sylvie C. W. Ong, Shao Wei Png, David Hsu, and Wee Sun Lee

Partially observable Markov decision processes
(POMDPs) provide a principled mathematical framework for
motion planning of autonomous robots in uncertain and dynamic
environments. They have been successfully applied to
various robotic tasks, but a major challenge is to scale up
POMDP algorithms for more complex robotic systems. Robotic
systems often have mixed observability: even when a robot’s
state is not fully observable, some components of the state
may still be fully observable. Exploiting this, we use a factored
model to represent separately the fully and partially observable
components of a robot’s state and derive a compact lower-dimensional
representation of its belief space. We then use this
factored representation in conjunction with a point-based algorithm
to compute approximate POMDP solutions. Separating
fully and partially observable state components using a factored
model opens up several opportunities to improve the efficiency
of point-based POMDP algorithms. Experiments show that on
standard test problems, our new algorithm is many times faster
than a leading point-based POMDP algorithm.
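
As a rough illustration of the factored idea in the abstract above, the following Python sketch updates a belief over only the hidden state component y, while the fully observable component x is treated as known. It is a generic Bayes filter written under that assumption, not the paper's point-based solver. The function names, argument orders, and the toy model are all hypothetical.

```python
def update_hidden_belief(belief_y, x, action, x_next, obs, trans_x, trans_y, obs_model):
    """Bayes update of the belief over the hidden component y only.

    belief_y: dict mapping hidden state y -> probability
    trans_x(x_next, x, y, action)          -> P(x' | x, y, a)
    trans_y(y_next, x, y, action, x_next)  -> P(y' | x, y, a, x')
    obs_model(obs, x_next, y_next, action) -> P(o | x', y', a)
    """
    hidden_states = list(belief_y)
    new_belief = {}
    for y_next in hidden_states:
        # Prediction: sum over the previous hidden state; x and x' are observed.
        predicted = sum(
            trans_x(x_next, x, y, action)
            * trans_y(y_next, x, y, action, x_next)
            * belief_y[y]
            for y in hidden_states
        )
        # Correction: weight by the observation likelihood.
        new_belief[y_next] = obs_model(obs, x_next, y_next, action) * predicted

    norm = sum(new_belief.values())
    if norm == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {y: p / norm for y, p in new_belief.items()}


# Toy model (entirely hypothetical): the robot position x is observed exactly,
# the side on which a target lies (y) is hidden and does not change over time.
def toy_trans_x(x_next, x, y, action):
    return 1.0 if x_next == x + action else 0.0


def toy_trans_y(y_next, x, y, action, x_next):
    return 1.0 if y_next == y else 0.0


def toy_obs_model(obs, x_next, y_next, action):
    return 0.8 if obs == y_next else 0.2  # sensor is right 80% of the time


prior = {"left": 0.5, "right": 0.5}
posterior = update_hidden_belief(
    prior, x=0, action=1, x_next=1, obs="left",
    trans_x=toy_trans_x, trans_y=toy_trans_y, obs_model=toy_obs_model,
)
print(posterior)
```

In this toy run the posterior puts roughly 0.8 on "left". The belief has one entry per hidden state rather than one per joint (x, y) state, which is the dimensionality reduction the abstract refers to.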

Thursday, October 28, 2010

News: University of Chicago, Cornell Researchers Develop Universal Robotic Gripper

Robotic hands are usually just that -- hands -- but some researchers from the University of Chicago and Cornell University (with a little help from iRobot) have taken a decidedly different approach for their so-called universal robotic gripper. The gripper is actually a balloon that can conform to and grip just about any small object, and hang onto it firmly enough to pick it up. What's the secret? After much testing, the researchers found that ground coffee was the best substance to fill the balloon with -- to grab an object, the gripper simply creates a vacuum in the balloon (much like a vacuum-sealed bag of coffee), and it's then able to let go of the object just by releasing the vacuum. Simple, but it works. [via Engadget]

Monday, October 25, 2010

Lab meeting Oct. 25 2010, (David) Threat-aware Path Planning in Uncertain Urban Environments (IROS 2010)

Title: Threat-aware Path Planning in Uncertain Urban Environments

Authors: Georges S. Aoude, Brandon D. Luders, Daniel S. Levine, and Jonathan P. How

This paper considers the path planning problem
for an autonomous vehicle in an urban environment populated
with static obstacles and moving vehicles with uncertain intents.
We propose a novel threat assessment module, consisting of
an intention predictor and a threat assessor, which augments
the host vehicle’s path planner with a real-time threat value
representing the risks posed by the estimated intentions of
other vehicles. This new threat-aware planning approach is
applied to the CL-RRT path planning framework, used by the
MIT team in the 2007 DARPA Grand Challenge. The strengths
of this approach are demonstrated through simulation and
experiments performed in the RAVEN testbed facilities.


Monday, October 11, 2010

Lab meeting Oct. 11 2010, (Shao-Chen) Consistent data association in multi-robot systems with limited communications (RSS 2010)

Title: Consistent data association in multi-robot systems with limited communications

Authors: Rosario Aragues, Eduardo Montijano, and Carlos Sagues


In this paper we address the data association
problem of features observed by a robot team with limited communications.
At every time instant, each robot can only exchange
data with a subset of the robots, its neighbors. Initially, each
robot solves a local data association with each of its neighbors.
After that, the robots execute the proposed algorithm to agree
on a data association between all their local observations which
is globally consistent. An inconsistency appears when chains of
local associations cause two features from the same robot to be
associated with each other. The contribution of this work is the
decentralized detection and resolution of these inconsistencies.
We provide a fully decentralized solution to the problem. This
solution does not rely on any particular communication topology.
Every robot plays the same role, making the system robust to
individual failures. Information is exchanged exclusively between
neighbors. In a finite number of iterations, the algorithm finishes
with a data association which is free of inconsistent associations.
In the experiments, we show the performance of the algorithm
under two scenarios. In the first one, we apply the resolution
and detection algorithm for a set of stochastic visual maps. In
the second, we solve the feature matching between a set of images
taken by a robotic team.
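
The kind of inconsistency described in the abstract above can be illustrated with a small Python sketch. Note that this is a centralized simplification using union-find, whereas the paper's contribution is a decentralized detection and resolution scheme; the function name `find_inconsistent_groups` and the `(robot_id, feature_id)` encoding are assumptions made for the example.

```python
from collections import defaultdict


def find_inconsistent_groups(features, associations):
    """Return association groups that contain two features from the same robot.

    features: iterable of (robot_id, feature_id) pairs
    associations: iterable of feature pairs matched by a local data association
    """
    parent = {f: f for f in features}

    def find(f):
        # Follow parent links to the group representative (with path halving).
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    # Merge the groups of every locally associated pair.
    for a, b in associations:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for f in parent:
        groups[find(f)].append(f)

    inconsistent = []
    for members in groups.values():
        robots = [robot for robot, _ in members]
        if len(robots) != len(set(robots)):  # some robot appears twice
            inconsistent.append(sorted(members))
    return inconsistent


# Toy example: the chain A1-B1, B1-C1, C1-A2 links two features of robot A.
features = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2), ("C", 1)]
associations = [
    (("A", 1), ("B", 1)),
    (("B", 1), ("C", 1)),
    (("C", 1), ("A", 2)),
    (("A", 3), ("B", 2)),
]
print(find_inconsistent_groups(features, associations))
```

Only the group containing both A1 and A2 is reported; the A3-B2 association is consistent because each robot contributes at most one feature to it.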


Lab meeting Oct. 11th 2010, (Nicole) Improvement in Listening Capability for Humanoid Robot HRP-2 (ICRA 2010)

Title: Improvement in Listening Capability for Humanoid Robot HRP-2 (ICRA 2010)

Authors: Toru Takahashi, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata and Hiroshi G. Okuno.


This paper describes an improvement of sound source separation for a simultaneous automatic speech recognition (ASR) system of a humanoid robot. Recognition errors in the system are caused by separation errors and by interference from other sources. To improve separability, the original geometric source separation (GSS) is improved. Our GSS uses a measured head-related transfer function (HRTF) of the robot to estimate a separation matrix. Because the original GSS uses a simulated HRTF calculated from the distance between microphone and sound source, there is a large mismatch between the simulated and the measured transfer functions. The mismatch causes a severe degradation of recognition performance.

Faster convergence of the separation matrix reduces separation error. Our approach derives the initial separation matrix from a measured transfer function, which places it nearer to the optimal separation matrix than one derived from a simulated transfer function. As a result, we expect that our GSS improves the convergence speed. Our GSS is also able to handle an adaptive step-size parameter.

These new features are added to the open source robot audition software (OSS) called "HARK", which is newly updated to version 1.0.0. HARK has been installed on an HRP-2 humanoid with an 8-element microphone array. The listening capability of HRP-2 is evaluated by recognizing a target speech signal which is separated from a simultaneous speech signal by three talkers. The word correct rate (WCR) of ASR improves by 5 points under normal acoustic environments and by 10 points under noisy environments. Experimental results show that HARK 1.0.0 improves the robustness against noise.


Lab meeting Oct. 11th 2010, (Andi) Dynamic 3D Scene Analysis for Acquiring Articulated Scene Models

Dynamic 3D Scene Analysis for Acquiring Articulated Scene Models

ICRA 2010

Agnes Swadzba, Niklas Beuter, Sven Wachsmuth, and Franz Kummert

In this paper we present a new system for a mobile robot to generate an articulated scene model by analyzing complex dynamic 3D scenes. The system extracts essential knowledge about the foreground, like moving persons, and the background, which consists of all visible static scene parts. In contrast to other 3D reconstruction approaches, we propose additionally distinguishing between static parts, like walls, and movable objects, like chairs or doors. This discrimination supports the reconstruction process and also delivers important information about interaction objects. Here, the movable object detection is realized in an object-independent way by analyzing changes in the scenery. Furthermore, in the proposed system the background scene is fed back to the tracking part, yielding much better tracking and detection results, which in turn improve the 3D reconstruction. We show in our experiments that we are able to provide a sound background model and simultaneously extract persons and object regions representing chairs, doors, and even smaller movable objects.

Sunday, October 10, 2010

News: Google's Self-Driving Cars

Google: We've Been Secretly Building And Testing Robot Cars That Drive Themselves
By Sebastian Thrun, Google
Read the full article.

Google Cars Drive Themselves, in Traffic
By John Markoff, The New York Times 
Read the full article.

Friday, October 08, 2010

News: MIT Media Lab Medical Mirror

MIT Media Lab Medical Mirror tells your pulse with a webcam

Mirror mirror on the wall, who has the highest arterial palpation of them all? If you went to MIT you might be able to answer that question thanks to the work of grad student Ming-Zher Poh, who has found a way to tell your pulse with just a simple webcam and some software. By looking at minute changes in the brightness of the face, the system can find the beating of your heart even at a low resolution, comparable to the results of a traditional FDA-approved pulse monitor. Right now the mirror is just a proof of concept, but the idea is that the hospital beds or surgery rooms of tomorrow might be able to monitor a patient's pulse without requiring any wires or physical contact, encouraging news for anyone who has ever tried to sleep whilst wearing a heart monitor. [via Engadget]

Sunday, October 03, 2010

Lab Meeting October 4th, 2010 (Jeff): Progress Report

I will present my progress on RFID SLAM, with some quantitative measures to show the performance.

Lab Meeting October 4th, 2010 (KuoHuei): Progress Report

I will present the Neighboring Object Interacting Tracking, including modeling, learning, inference, and some results.