Thursday, July 29, 2010

Lab Meeting 8/3, 2010 (Alan): Mapping Indoor Environments Based on Human Activity (ICRA 2010)

Title: Mapping Indoor Environments Based on Human Activity (ICRA 2010)
Authors: Slawomir Grzonka, Frederic Dijoux, Andreas Karwath, Wolfram Burgard

We present a novel approach to build approximate maps of structured environments utilizing human motion and activity. Our approach uses data recorded with a data suit which is equipped with several IMUs to detect movements of a person and door opening and closing events. In our approach we interpret the movements as motion constraints and door handling events as landmark detections in a graph-based SLAM framework. As we cannot distinguish between individual doors, we employ a multi-hypothesis approach on top of the SLAM system to deal with the high data-association uncertainty. As a result, our approach is able to accurately and robustly recover the trajectory of the person. We additionally take advantage of the fact that people traverse free space and that doors separate rooms to recover the geometric structure of the environment after the graph optimization. We evaluate our approach in several experiments carried out with different users and in environments of different types.
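The graph-based formulation in the abstract can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: poses lie on a 1D corridor, step estimates act as motion constraints, and a second detection of the same door adds a constraint that two poses coincide. The function name, weights, and step values are all assumptions made for illustration, and the data association is assumed known, which is exactly what the paper's multi-hypothesis layer does not assume.

```python
import numpy as np

def optimize_pose_graph(n_poses, constraints, anchor_weight=1e3):
    """Weighted linear least squares over 1D poses.

    constraints: list of (i, j, measured x_j - x_i, weight).
    Pose 0 is softly anchored at the origin to remove the gauge freedom.
    """
    rows, rhs = [], []
    for i, j, delta, weight in constraints:
        row = np.zeros(n_poses)
        row[i], row[j] = -weight, weight
        rows.append(row)
        rhs.append(weight * delta)
    anchor = np.zeros(n_poses)
    anchor[0] = anchor_weight
    rows.append(anchor)
    rhs.append(0.0)
    poses, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return poses

# Hypothetical walk: four 1 m steps out and four back, ending at the same door.
# The step estimates are biased, so dead reckoning drifts by 0.8 m.
odometry = [1.1, 1.1, 1.1, 1.1, -0.9, -0.9, -0.9, -0.9]
dead_reckoning = np.concatenate([[0.0], np.cumsum(odometry)])

constraints = [(k, k + 1, step, 1.0) for k, step in enumerate(odometry)]
constraints.append((0, 8, 0.0, 10.0))  # door seen at pose 0 and again at pose 8

optimized = optimize_pose_graph(9, constraints)
print("dead reckoning end:", dead_reckoning[-1])
print("optimized end:", optimized[-1])
```

The door constraint pulls the final pose back to the start and spreads the accumulated drift evenly over the motion constraints.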

Link: pdf

Monday, July 26, 2010

Lab Meeting 07/27, 2010 (Kuen-Han): Non-Rigid Structure from Locally-Rigid Motion (CVPR 2010)

Title: Non-Rigid Structure from Locally-Rigid Motion
Authors: Jonathan Taylor, Allan D. Jepson, Kiriakos N. Kutulakos

We introduce locally-rigid motion, a general framework for solving the M-point, N-view structure-from-motion problem for unknown bodies deforming under orthography. The key idea is to first solve many local 3-point, N-view rigid problems independently, providing a “soup” of specific, plausibly rigid, 3D triangles. The main advantage here is that the extraction of 3D triangles requires only very weak assumptions: (1) deformations can be locally approximated by near-rigid motion of three points (i.e., stretching not dominant) and (2) local motions involve some generic rotation in depth. Triangles from this soup are then grouped into bodies, and their depth flips and instantaneous relative depths are determined. Results on several sequences, both our own and from related work, suggest these conditions apply in diverse settings, including very challenging ones (e.g., multiple deforming bodies). Our starting point is a novel linear solution to 3-point structure from motion, a problem for which no general algorithms currently exist.


Saturday, July 24, 2010

Lab Meeting July 20, 2010 (fish60): What if the Irresponsible Teachers Are Dominating? A Method of Training on Samples and Clustering on Teachers

Sorry for the previous blank post.
Here's the content:

What if the Irresponsible Teachers Are Dominating? A Method of Training on Samples and Clustering on Teachers

Shuo Chen, Jianwen Zhang, Guangyun Chen, Changshui Zhang
State Key Laboratory on Intelligent Technology and Systems
Tsinghua National Laboratory for Information Science and Technology (TNList)
Department of Automation, Tsinghua University, Beijing 100084, China

Learning from multiple teachers or sources has received increasing attention from researchers in machine learning. In this setting, the learning system deals with samples and labels provided by multiple teachers, who in common cases are non-experts. Their labeling styles and behaviors are usually diverse, and some are even detrimental to the learning system. Thus, simply pooling their labels and applying algorithms designed for the single-teacher scenario would be not only improper but also damaging. Our work focuses on the case where the teachers are a mix of good ones and irresponsible ones. By irresponsible, we mean a teacher who does not take the labeling task seriously and labels samples at random without inspecting them. If we do not remove their effect, the learning system will undoubtedly be ruined. In this paper, we propose a method for picking out the good teachers, with promising experimental results. It works even when the irresponsible teachers outnumber the good ones.
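To make the setting concrete, here is a small sketch of why random labelers can be detectable at all. It uses a swapped-in heuristic for illustration, not the method proposed in the paper: a classifier trained on half of one teacher's labels should predict the other half well if the labels depend on the samples, and only at chance level if they were assigned at random. The synthetic data, the 10% noise rate, and the 0.7 threshold are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def self_consistency(features, labels):
    """Fit a least-squares linear classifier on the first half of one teacher's
    labels and score it against the second half of the same teacher's labels."""
    half = len(labels) // 2
    design = np.hstack([features, np.ones((len(features), 1))])  # add bias column
    weights, *_ = np.linalg.lstsq(design[:half], labels[:half], rcond=None)
    predicted = np.sign(design[half:] @ weights)
    return float(np.mean(predicted == labels[half:]))

true_direction = np.array([1.0, -1.0])        # hypothetical ground-truth concept
is_good = [True, True, False, False, False]   # irresponsible teachers dominate

scores = []
for good in is_good:
    features = rng.normal(size=(400, 2))
    if good:
        labels = np.sign(features @ true_direction)
        flip = rng.random(400) < 0.1          # good teachers still make mistakes
        labels[flip] *= -1
    else:
        labels = rng.choice([-1.0, 1.0], size=400)  # ignores the sample entirely
    scores.append(self_consistency(features, labels))

flagged_good = [score > 0.7 for score in scores]
print(scores)
print(flagged_good)
```

Note that this check needs no ground-truth labels and does not depend on the good teachers being in the majority, which is the regime the abstract emphasizes.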


Wednesday, July 21, 2010

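Every Generation Produces Its Own Talents: Pursuing a PhD, Don't Give Up Lightly

Author: 王榮騰, Visiting Professor, National Taiwan University

Student A was awarded a full research assistantship (RA) by an advisor at a top-tier US university. A year and a half later, Student A gave up the program and is now looking for a job!
Student B was awarded a one-year full graduate fellowship by another top-tier US university and obtained an RA in time a year later, but has always felt that the research is disconnected from reality and is deeply troubled by worries about future job prospects!

Advisors often run several research projects at the same time, so you can ask in person, explaining your reasons, whether the originally assigned research topic can be changed. Unless the request is unreasonable, most professors will agree. Bear in mind that a PhD dissertation is usually assembled from several research projects. It is therefore best to raise the request after a paper has been accepted by a journal or conference: first, this properly wraps up the current project (so the professor's research funding is not wasted), and second, it helps the progress of your own dissertation. You can also use this period to get to know the new research topic.

In other words, whether or not your personal interests are close to your advisor's research area, stay with your advisor and do not give up lightly. Also, do not raise the matter before passing the PhD qualifying examination, lest it lead to the serious consequence of dropping out.

After lengthy discussions, Student A finally accepted the advisor's suggestion to take a leave of absence and work for a while before deciding whether to complete the PhD. Student B joined another professor's research group at the same time and does not rule out an academic career after graduation. B is no longer troubled about the research topic and has since published a paper at a top conference in the newly found research area. Because the matter was handled well, both outstanding students still have a good relationship with their original advisors. After all, a good mentor is hard to find, so one must recognize and cherish one's good fortune; the bond between teacher and student is hard to build and worth treasuring for a lifetime! (王榮騰, Visiting Professor, Department of Electrical Engineering and Graduate Institute of Electronics Engineering, National Taiwan University; June 6, 2010)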

Sunday, July 18, 2010

Lab Meeting July 20, 2010 (Gary): Robust Unified Stereo-Based 3D Head Tracking and Its Application to Face Recognition (ICRA2010)

Title: Robust Unified Stereo-Based 3D Head Tracking and Its Application to Face Recognition

Authors: Kwang Ho An and Myung Jin Chung

This paper investigates the estimation of 3D head poses and its identity authentication with a partial ellipsoid model. To cope with large out-of-plane rotations and translation in-depth, we extend conventional head tracking with a single camera to a stereo-based framework. To achieve more robust motion estimation even under time-varying lighting conditions, we incorporate illumination correction into the aforementioned framework. We approximate the face image variations due to illumination changes as a linear combination of illumination bases. Also, by computing the illumination bases online from the registered face images, after estimating the 3D head poses, user-specific illumination bases can be obtained, and therefore illumination-robust tracking without a prior learning process can be possible. Furthermore, our unified stereo-based tracking is approximated as a linear least-squares problem; a closed-form solution is then provided. After recovering the full-motions of the head, we can register face images with pose variations into stabilized-view images, which are suitable for pose-robust face recognition. To verify the feasibility and applicability of our approach, we performed extensive experiments with three sets of challenging image sequences.
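The idea of modeling illumination changes as a linear combination of bases leads to a small least-squares problem. The sketch below shows only that step, under strong assumptions: the face is already registered to a template, and the bases are random vectors standing in for the ones the paper computes online from registered face images. All variable names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pixels, n_bases = 500, 3

template = rng.random(n_pixels)                # registered reference face (flattened)
bases = rng.normal(size=(n_pixels, n_bases))   # stand-in illumination bases
true_coeffs = np.array([0.5, -0.2, 0.1])       # unknown lighting change to recover

# Observed image = template + illumination change + small sensor noise.
observed = template + bases @ true_coeffs + 0.01 * rng.normal(size=n_pixels)

# Closed-form least-squares estimate of the illumination coefficients.
coeffs, *_ = np.linalg.lstsq(bases, observed - template, rcond=None)

# Remove the estimated illumination component before comparing to the template.
corrected = observed - bases @ coeffs

print("estimated coefficients:", coeffs)
print("residual before:", np.linalg.norm(observed - template))
print("residual after:", np.linalg.norm(corrected - template))
```

In the paper the illumination coefficients are estimated together with the head motion in one linear system; the sketch separates them only to keep the example short.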


Thursday, July 15, 2010

Lab Meeting July 20, 2010 (Jimmy): Group-Sensitive Multiple Kernel Learning for Object Categorization

Title: Group-Sensitive Multiple Kernel Learning for Object Categorization
Authors: Jingjing Yang, Yuanning Li, Yonghong Tian, Lingyu Duan, Wen Gao
In: ICCV 2009

In this paper, we propose a group-sensitive multiple kernel learning (GS-MKL) method to accommodate the intra-class diversity and the inter-class correlation for object categorization. By introducing an intermediate representation “group” between images and object categories, GS-MKL attempts to find an appropriate kernel combination for each group to get a finer depiction of object categories. For each category, images within a group share a set of kernel weights while images from different groups may employ distinct sets of kernel weights. In GS-MKL, such group-sensitive kernel combinations together with the multi-kernel-based classifier are optimized in a joint manner to seek a trade-off between capturing the diversity and keeping the invariance for each category. Extensive experiments show that our proposed GS-MKL method has achieved encouraging performance over three challenging datasets.


Monday, July 12, 2010

Lab Meeting July 13, 2010 (ShaoChen): Rao-Blackwellized Particle Filters Multi Robot SLAM with Unknown Initial Correspondences and Limited Communication (ICRA 2010)

Title: Rao-Blackwellized Particle Filters Multi Robot SLAM with Unknown Initial Correspondences and Limited Communication

Authors: Luca Carlone, Miguel Kaouk Ng, Jingjing Du, Basilio Bona, and Marina Indri


Multi-robot systems are envisioned to play an important role in many robotic applications. A main prerequisite for a team deployed in a wide unknown area is the capability to navigate autonomously, exploiting the information acquired through the on-line estimation of both robot poses and the surrounding environment model, according to the Simultaneous Localization And Mapping (SLAM) framework. As team coordination is improved, distributed techniques for filtering are required in order to enhance autonomous exploration and large-scale SLAM, increasing both efficiency and robustness of operation. Although Rao-Blackwellized Particle Filters (RBPF) have been demonstrated to be an effective solution to the problem of single-robot SLAM, few extensions to teams of robots exist, and these approaches are characterized by strict assumptions on both communication bandwidth and prior knowledge of the relative poses of the teammates. In the present paper we address the problem of multi-robot SLAM in the case of limited communication and unknown relative initial poses. Starting from the well-established single-robot RBPF-SLAM, we propose a simple technique which jointly estimates the SLAM posterior of the robots by fusing the proprioceptive and exteroceptive information acquired by each teammate. The approach intrinsically reduces the amount of data to be exchanged among the robots, while taking into account the uncertainty in relative pose measurements. Moreover, it can be naturally extended to different communication technologies (Bluetooth, RFID, Wi-Fi, etc.) regardless of their sensing range. The proposed approach is validated through experimental tests.


Lab Meeting July 13, 2010 (Nicole): Mutual Localization in a Team of Autonomous Robots using Acoustic Robot Detection

Title: Mutual Localization in a Team of Autonomous Robots using Acoustic Robot Detection

Authors: David Becker and Max Risler

In: RoboCup 2008: Robot Soccer World Cup XII, Volume 5399/2009

In order to improve self-localization accuracy we are exploring ways of mutual localization in a team of autonomous robots. Detecting teammates visually usually leads to inaccurate bearings and only rough distance estimates. Also, visually identifying teammates is not possible. Therefore we are investigating methods of gaining relative position information acoustically in a team of robots.
The technique introduced in this paper is a variant of code-multiplexed communication (CDMA, code division multiple access). In a CDMA system, several receivers and senders can communicate at the same time, using the same carrier frequency. Well-known examples of CDMA systems include wireless computer networks and the Global Positioning System, GPS. While these systems use electromagnetic waves, we will try to adapt the CDMA principle to acoustic pattern recognition, enabling robots to calculate distances and bearings to each other.
First, we explain the general idea of cross-correlation functions and appropriate signal pattern generation. We will further explain the importance of synchronized clocks and discuss the problems arising from clock drifts.
Finally, we describe an implementation using the Aibo ERS-7 as platform and briefly state basic results, including measurement accuracy and a runtime estimate. We will briefly discuss acoustic localization in the specific scenario of a RoboCup soccer game.
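The role of cross-correlation can be shown with a short sketch. It is not the Aibo implementation from the paper: it assumes perfectly synchronized clocks, a sender that starts emitting at sample 0 of the recording, and a random ±1 sequence as a stand-in for the signal pattern. The sample rate, noise level, and delay are made-up values.

```python
import numpy as np

SAMPLE_RATE_HZ = 16000       # assumed microphone sample rate
SPEED_OF_SOUND_M_S = 343.0   # approximate speed of sound in air

rng = np.random.default_rng(2)

# Hypothetical signature of one robot: a pseudo-random +/-1 sequence.
code = rng.choice([-1.0, 1.0], size=256)

# Simulated recording: background noise plus the code arriving after a delay.
true_delay = 140  # in samples
recording = 0.5 * rng.normal(size=2048)
recording[true_delay:true_delay + len(code)] += code

# Slide the known code over the recording; the correlation peaks at the lag
# where the code is aligned with its copy in the recording.
correlation = np.correlate(recording, code, mode="valid")
estimated_delay = int(np.argmax(correlation))

# With synchronized clocks, the delay is the time of flight.
distance_m = estimated_delay / SAMPLE_RATE_HZ * SPEED_OF_SOUND_M_S
print(estimated_delay, round(distance_m, 3))
```

At this sample rate a clock offset of a single sample already shifts the estimate by about 2 cm, which is why the abstract stresses synchronization and clock drift.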


Tuesday, July 06, 2010

Lab Meeting July 6th (Casey): Live Dense Reconstruction with a Single Moving Camera (CVPR 2010)

Authors: Richard A. Newcombe and Andrew J. Davison


We present a method which enables rapid and dense reconstruction of scenes browsed by a single live camera. We take point-based real-time structure from motion (SFM) as our starting point, generating accurate 3D camera pose estimates and a sparse point cloud. Our main novel contribution is to use an approximate but smooth base mesh generated from the SFM to predict the view at a bundle of poses around automatically selected reference frames spanning the scene, and then warp the base mesh into highly accurate depth maps based on view-predictive optical flow and a constrained scene flow update. The quality of the resulting depth maps means that a convincing global scene model can be obtained simply by placing them side by side and removing overlapping regions. We show that a cluttered indoor environment can be reconstructed from a live hand-held camera in a few seconds, with all processing performed by current desktop hardware. Real-time monocular dense reconstruction opens up many application areas, and we demonstrate both real-time novel view synthesis and advanced augmented reality where augmentations interact physically with the 3D scene and are correctly clipped by occlusions.

Monday, July 05, 2010

Lab Meeting July 6th 2010 (Andi): Upsampling Range Data in Dynamic Environments (CVPR 2010)


Jennifer Dolson, Jongmin Baek, Christian Plagemann and Sebastian Thrun (Stanford University)


We present a flexible method for fusing information from optical and range sensors based on an accelerated high-dimensional filtering approach. Our system takes as input a sequence of monocular camera images as well as a stream of sparse range measurements as obtained from a laser or other sensor system. In contrast with existing approaches, we do not assume that the depth and color data streams have the same data rates or that the observed scene is fully static. Our method produces a dense, high-resolution depth map of the scene, automatically generating confidence values for every interpolated depth point. We describe how to integrate priors on object shape, motion and appearance and how to achieve an efficient implementation using parallel processing hardware such as GPUs.
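A naive version of the fusion idea can be sketched in a few lines. This is joint-bilateral-style weighting on a 1D static signal, a swapped-in simplification rather than the accelerated high-dimensional filter described in the paper, and it ignores time and motion entirely. Each dense pixel takes a weighted average of the sparse depth samples, with weights that fall off with spatial distance and with color difference, and the total weight serves as a rough confidence. The function name, sigmas, and test signal are assumptions.

```python
import numpy as np

def upsample_depth(color, sample_idx, sample_depth,
                   sigma_space=5.0, sigma_color=0.1):
    """Interpolate sparse depth samples onto every pixel of a 1D color signal.

    Returns (dense depth, confidence), where confidence is the summed weight
    of the samples that supported each pixel.
    """
    positions = np.arange(len(color))[:, None]                 # (n_pixels, 1)
    d_space = positions - sample_idx[None, :]                  # (n_pixels, n_samples)
    d_color = color[:, None] - color[sample_idx][None, :]      # (n_pixels, n_samples)
    weights = np.exp(-0.5 * (d_space / sigma_space) ** 2
                     - 0.5 * (d_color / sigma_color) ** 2)
    confidence = weights.sum(axis=1)
    depth = (weights @ sample_depth) / np.maximum(confidence, 1e-12)
    return depth, confidence

# Toy scene: a color edge at pixel 20 that coincides with a depth edge.
color = np.where(np.arange(40) < 20, 0.2, 0.8)
sample_idx = np.arange(2, 40, 5)                        # sparse range measurements
sample_depth = np.where(sample_idx < 20, 1.0, 3.0)      # near surface, far surface

depth, confidence = upsample_depth(color, sample_idx, sample_depth)
print(depth[18:22])
print(confidence[18:22])
```

Because the color term suppresses samples from the other side of the edge, the interpolated depth stays sharp at pixel 20 instead of blending the two surfaces.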