Monday, December 17, 2012

Lab Meeting December 19th, 2012 (Jeff): Inference on networks of mixtures for robust robot mapping

Title: Inference on networks of mixtures for robust robot mapping

Authors: Edwin Olson and Pratik Agarwal

Abstract:

The central challenge in robotic mapping is obtaining reliable data associations (or “loop closures”): state-of-the-art inference algorithms can fail catastrophically if even
one erroneous loop closure is incorporated into the map. Consequently, much work has been done to push error rates closer to zero. However, a long-lived or multi-robot system will
still encounter errors, leading to system failure.


We propose a fundamentally different approach: adopt richer error models that allow the probability of a failure to be explicitly modeled. In other words, we optimize the map while simultaneously determining which loop closures are correct from within a single integrated Bayesian framework. Unlike earlier multiple-hypothesis approaches, our approach avoids exponential memory complexity and is fast enough for real-time performance.

We show that the proposed method not only allows loop closing errors to be automatically identified, but also that in extreme cases, the “front-end” loop-validation systems can be unnecessary. We demonstrate our system both on standard benchmarks and on the real-world datasets that motivated this work.
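The mechanism behind this, as I understand the paper, is a max-mixture error model: each loop closure's likelihood gets an extra broad "null hypothesis" component that can absorb outliers, and the mixture's sum is replaced by a max so a standard least-squares back-end still applies. Here is a minimal Python sketch of that idea; the component weights and sigmas are illustrative values, not the paper's.

import numpy as np

# A minimal sketch of a max-mixture cost for one loop-closure residual.
# The component weights and sigmas below are made up.

def max_mixture_nll(residual, weights=(0.9, 0.1), sigmas=(0.1, 10.0)):
    """Negative log-likelihood of a residual under a max of Gaussians.

    Taking the max (rather than the sum) over components keeps the cost
    piecewise-quadratic, so a standard least-squares optimizer can be
    used: the winning component selects which penalty applies.
    """
    best = -np.inf
    for w, s in zip(weights, sigmas):
        log_l = np.log(w) - 0.5 * (residual / s) ** 2 - np.log(s * np.sqrt(2 * np.pi))
        best = max(best, log_l)
    return -best

# A small residual is charged to the tight inlier component; a gross
# outlier falls into the broad null-hypothesis component and is
# effectively switched off instead of dragging the whole map with it.
print(max_mixture_nll(0.05))  # inlier component wins
print(max_mixture_nll(5.0))   # null-hypothesis component wins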

Link:
Robotics: Science and Systems (RSS), 2012
LocalLink
http://april.eecs.umich.edu/papers/details.php?name=olson2012rss
http://www.roboticsproceedings.org/rss08/p40.pdf

Monday, December 10, 2012

Lab meeting Dec. 12, 2012 (Alan): A Simple Prior-free Method for Non-Rigid Structure-from-Motion Factorization (CVPR 2012 Best Paper Award)

Title: A Simple Prior-free Method for Non-Rigid Structure-from-Motion Factorization (CVPR 2012 Best Paper Award)
Authors: Yuchao Dai, Hongdong Li, Mingyi He

Abstract:
This paper proposes a simple “prior-free” method for solving non-rigid structure-from-motion factorization problems. Other than using the basic low-rank condition, our method does not assume any extra prior knowledge about the nonrigid scene or about the camera motions. Yet, it runs reliably, produces optimal results, and does not suffer from the inherent basis-ambiguity issue which plagued many conventional nonrigid factorization techniques.

Our method is easy to implement, involving no more than a semi-definite program (SDP) of small and fixed size, plus a linear least-squares or trace-norm minimization. Extensive experiments have demonstrated that it outperforms most of the existing linear methods of nonrigid factorization. This paper offers not only new theoretical insight but also a practical, everyday solution to non-rigid structure-from-motion.
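As a rough illustration of the trace-norm piece mentioned above, here is a small Python sketch of singular value thresholding, the standard proximal operator for the trace (nuclear) norm. This is not the authors' pipeline; the synthetic rank-3 matrix merely stands in for a low-rank measurement matrix.

import numpy as np

# A minimal sketch of trace-norm minimization via singular value
# thresholding. Matrix sizes and the threshold tau are illustrative.

def shrink_singular_values(W, tau):
    """Proximal step for tau * ||X||_*: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return U @ np.diag(s) @ Vt

rng = np.random.default_rng(0)
# Synthetic rank-3 "measurement" matrix plus noise (K=1 basis, rank 3K = 3).
W = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 50))
W += 0.01 * rng.standard_normal(W.shape)

W_low = shrink_singular_values(W, tau=1.0)
print(np.linalg.matrix_rank(W_low, tol=1e-6))  # close to 3: noise is shrunk away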

Link

Tuesday, December 04, 2012

Lab meeting Dec 5th 2012 (Jim): Imitation Learning by Coaching

Title: Imitation Learning by Coaching
Authors: He He, Hal Daumé III and Jason Eisner
Neural Information Processing Systems (NIPS), 2012

Abstract:
... we propose to use a coach that demonstrates easy-to-learn actions for the learner and gradually approaches the oracle. ... We apply our algorithm to cost-sensitive dynamic feature selection, a hard decision problem that considers a user-specified accuracy-cost trade-off. ...
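Under my reading of the abstract, the coach can be pictured as choosing target actions that trade off the learner's current scores against the oracle's utility, with the trade-off annealed so the coach gradually approaches the oracle. A toy Python sketch under that assumption (all names and numbers are invented):

import numpy as np

def coach_action(learner_scores, oracle_utility, lam):
    """Pick the action maximizing lam * learner score + oracle utility."""
    return int(np.argmax(lam * learner_scores + oracle_utility))

learner_scores = np.array([3.0, 1.0, -2.0])   # learner finds action 0 easiest
oracle_utility = np.array([0.0, 2.5, 4.0])    # the oracle prefers action 2

for t, lam in enumerate([1.0, 0.5, 0.1]):     # anneal toward the oracle
    print(t, coach_action(learner_scores, oracle_utility, lam))
# Early rounds demonstrate an easier intermediate action (1); by the
# final round the coach's choice coincides with the oracle's (2).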

Link

Wednesday, October 24, 2012

NTU CSIE Talk: [2012-11-09] Dr. Koji Yatani, "A Ph.D. – What does it take?"


Title: A Ph.D. – What does it take?
Date: 2012-11-09 2:20pm
Location: R103
Speaker: Dr. Koji Yatani, Microsoft Research Asia
 
Abstract:
 
Getting a Ph.D. surely takes a long effort, but why? Of course, research takes time, but a Ph.D. is not just about research. A Ph.D. student needs to be more than just a researcher to be successful. This talk is not a collection of my research projects (although I will introduce some of them a bit); rather, it is a collection of my experiences in research at the University of Toronto, Microsoft Research Asia, and the industry labs where I did my internships. Through this talk, I will attempt to share my thoughts on what I believe a Ph.D. student should do and learn before getting her Ph.D. Your honest discussions, opinions, and feedback would be greatly appreciated.
 
Biography: 
 
Dr. Koji Yatani (http://yatani.jp) is an associate researcher in the Human-Computer Interaction Group at Microsoft Research Asia. His main research interests lie in Human-Computer Interaction (HCI) and its intersections with Ubiquitous Computing and Computational Linguistics. More specifically, he is interested in designing new forms of interaction with mobile devices, and in developing new hardware and sensing technologies to support user interactions in mobile/ubiquitous computing environments. He is also interested in developing interactive systems and exploring new applications using computational linguistics methods.
 
He received his B.Eng. and M.Sci. from the University of Tokyo in 2003 and 2005, respectively, and his Ph.D. in Computer Science from the University of Toronto in 2011. In November 2011, he joined the HCI group at Microsoft Research Asia in Beijing. He was a recipient of the NTT Docomo Scholarship (October 2003 -- March 2005) and the Japan Society for the Promotion of Science Research Fellowship for Young Scientists (April 2005 -- March 2006). He received the Best Paper Award at CHI 2011. He served on the program committees of CHI 2013, UbiComp 2012, and WHC 2013. He also served as a Mentoring co-chair at ITS 2012.

Tuesday, October 16, 2012

Lab meeting Oct 17th 2012 (Hank): Motion Segmentation of Multiple Objects from a Freely Moving Monocular Camera

Link

Presented by Hank Lin

From ICRA 2012

Authors: Rahul Kumar Namdev, Abhijit Kundu, K Madhava Krishna and C. V. Jawahar


Abstract:
Motion segmentation, or segmentation of moving objects, is an essential component of mobile robotic systems, such as robots performing SLAM and collision avoidance in dynamic worlds. This paper proposes an incremental motion segmentation system that efficiently segments multiple moving objects and simultaneously builds the map of the environment using visual SLAM modules. Multiple cues based on optical flow and two-view geometry are integrated to achieve this segmentation. A dense optical flow algorithm provides for dense tracking of features. Motion potentials based on geometry are computed for each of these dense tracks. These geometric potentials, along with optical flow potentials, are used to form a graph-like structure. A graph-based segmentation algorithm then clusters together nodes of similar potentials to form the eventual motion segments. Experimental results of high-quality segmentation on different publicly available datasets demonstrate the effectiveness of our method.
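As a toy illustration of the final clustering step, the sketch below thresholds differences between made-up per-track motion potentials and extracts connected components; the actual system fuses separate optical-flow and geometric potentials with a proper graph-based segmentation algorithm.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# A single scalar "motion potential" per dense track stands in for the
# paper's optical-flow and geometric potentials; values are invented.
potentials = np.array([0.1, 0.12, 0.09, 0.95, 1.02, 0.98])  # two motions
n = len(potentials)

rows, cols = [], []
for i in range(n):
    for j in range(i + 1, n):
        if abs(potentials[i] - potentials[j]) < 0.2:  # similar motion -> edge
            rows += [i, j]
            cols += [j, i]

adj = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
n_segments, labels = connected_components(adj, directed=False)
print(n_segments, labels)  # 2 segments: e.g. background vs. moving object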

Tuesday, October 02, 2012

Lab meeting Oct 3rd 2012 (Gene): Decentralised Cooperative Localisation for Heterogeneous Teams of Mobile Robots

Link

Presented by Chun-Kai (Gene) Chang

From ICRA 2011. Australian Centre for Field Robotics, University of Sydney, NSW, Australia

Authors: Tim Bailey, Mitch Bryson, Hua Mu, John Vial, Lachlan McCalman and Hugh Durrant-Whyte


Abstract:
This paper presents a distributed algorithm for performing joint localisation of a team of robots. The mobile robots have heterogeneous sensing capabilities, with some having high-quality inertial and exteroceptive sensing, while others have only low-quality sensing or none at all. By sharing information, a combined estimate of all robot poses is obtained. Inter-robot range-bearing measurements provide the mechanism for transferring pose information from well-localised vehicles to those less capable.

In our proposed formulation, high-frequency egocentric data (e.g., odometry, IMU, GPS) is fused locally on each platform. This is the distributed part of the algorithm. Inter-robot measurements, and accompanying state estimates, are communicated to a central server, which generates an optimal minimum mean-squared estimate of all robot poses. This server is easily duplicated for fully redundant decentralisation. Communication and computation are efficient due to the sparseness properties of the information-form Gaussian representation. A team of three indoor mobile robots equipped with lasers, odometry and inertial sensing provides experimental verification of the algorithm's effectiveness in combining location information.
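The efficiency claim at the end rests on a convenient property of the information (canonical) form: fusing independent Gaussian estimates is just addition of information matrices and vectors. A minimal Python sketch with made-up numbers:

import numpy as np

def to_info(mean, cov):
    """Convert a Gaussian (mean, covariance) to information form."""
    Y = np.linalg.inv(cov)       # information matrix
    return Y, Y @ mean           # information vector

mean_a, cov_a = np.array([1.0, 2.0]), np.diag([0.5, 0.5])  # robot A's own estimate
mean_b, cov_b = np.array([1.2, 1.8]), np.diag([0.1, 0.1])  # estimate transferred
                                                           # via range-bearing

Ya, ya = to_info(mean_a, cov_a)
Yb, yb = to_info(mean_b, cov_b)

Y, y = Ya + Yb, ya + yb          # fusion is simple addition in information form
fused_mean = np.linalg.solve(Y, y)
print(fused_mean)                # pulled toward the more confident estimate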

Sunday, September 09, 2012

Lab meeting Sep 12th 2012 (Chih-Chung): Global motion planning under uncertain motion, sensing, and environment map

[LINK]

Presented by Chih-Chung

From Autonomous Robots, Volume 33, No. 3, 2012, pp. 255-272

Authors: Hanna Kurniawati, Tirthankar Bandyopadhyay, and Nicholas M. Patrikalakis

Abstract:
Uncertainty in motion planning is often caused by three main sources: motion error, sensing error, and an imperfect environment map. Despite the significant effect of all three sources of uncertainty on motion planning problems, most planners take into account only one, or at most two, of them. We propose a new motion planner, called Guided Cluster Sampling (GCS), that takes into account all three sources of uncertainty for robots with active sensing capabilities. GCS uses the Partially Observable Markov Decision Process (POMDP) framework and the point-based POMDP approach. Although point-based POMDPs have shown impressive progress over the past few years, they perform poorly when the environment map is imperfect. This poor performance is due to the extremely high-dimensional state space, which translates to an extremely large belief space B.

We alleviate this problem by constructing a more suitable sampling distribution, based on the observations that when the robot has active sensing capability, B can be partitioned into a collection of much smaller sub-spaces, and an optimal policy can often be generated by sufficient sampling of a small subset of the collection. Utilizing these observations, GCS samples B in two stages: a sub-space is sampled from the collection, and then a belief is sampled from that sub-space. It uses information from the set of sampled sub-spaces and sampled beliefs to guide subsequent sampling. Simulation results on marine robotics scenarios suggest that GCS can generate reasonable policies for motion planning problems with uncertain motion, sensing, and environment maps that are unsolvable by the best point-based POMDPs available today. Furthermore, GCS handles POMDPs with continuous state, action, and observation spaces. We show that for a class of POMDPs that often occur in robot motion planning, given enough time, GCS converges to the optimal policy.

To the best of our knowledge, this is the first convergence result for point-based POMDPs with a continuous action space.
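The two-stage sampling is the heart of the method, so here is a minimal Python sketch of it with invented sub-spaces and fixed stage-one weights; the real planner partitions B using the robot's active sensing and adapts both stages using information from earlier samples.

import random

random.seed(0)
# Each "sub-space" holds candidate beliefs (distributions over states);
# the names, beliefs, and weights below are all stand-ins.
subspaces = {
    "map_hypothesis_A": [[0.7, 0.3], [0.6, 0.4]],
    "map_hypothesis_B": [[0.2, 0.8], [0.1, 0.9]],
}
subspace_weights = {"map_hypothesis_A": 0.5, "map_hypothesis_B": 0.5}

def sample_belief():
    # Stage 1: sample a sub-space from the collection.
    name = random.choices(list(subspace_weights),
                          weights=subspace_weights.values())[0]
    # Stage 2: sample a belief from within that sub-space.
    return name, random.choice(subspaces[name])

for _ in range(3):
    print(sample_belief())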

Tuesday, June 05, 2012

Lab Meeting June 6th, 2012 (Chiao-Hui): Robot Musical Accompaniment: Integrating Audio and Visual Cues for Real-time Synchronization with a Human Flutist

Chiao-Hui will present the following paper:
Robot Musical Accompaniment: Integrating Audio and Visual Cues for Real-time Synchronization with a Human Flutist

Authors: Angelica Lim, Takeshi Mizumoto, Louis-Kenzo Cahier, Takuma Otsuka, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata and Hiroshi G. Okuno

From: IROS 2010

Abstract:
Musicians often have the following problem: they have a music score that requires 2 or more players, but they have no one with whom to practice. So far, score-playing music robots exist, but they lack adaptive abilities to synchronize with fellow players’ tempo variations. In other words, if the human speeds up their play, the robot should also increase its speed. However, computer accompaniment systems allow exactly this kind of adaptive ability. We present a first step towards giving these accompaniment abilities to a music robot. We introduce a new paradigm of beat tracking using 2 types of sensory input – visual and audio – using our own visual cue recognition system and state-of-the-art acoustic onset detection techniques. Preliminary experiments suggest that by coupling these two modalities, a robot accompanist can start and stop a performance in synchrony with a flutist, and detect tempo changes within half a second.
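As a rough sketch of what coupling the two modalities could look like, the toy below fuses the tempos implied by audio onset times and visual cue times with fixed confidence weights; the beat times and the weighting scheme are invented, not the paper's integration.

import numpy as np

audio_beats  = np.array([0.00, 0.52, 1.01, 1.49])  # onset-detector times (s), made up
visual_beats = np.array([0.02, 0.50, 1.03, 1.52])  # flutist-gesture times (s), made up

def tempo(beats):
    """Tempo in beats per minute from the mean inter-beat interval."""
    return 60.0 / np.mean(np.diff(beats))

w_audio, w_visual = 0.7, 0.3                       # assumed modality confidences
bpm = w_audio * tempo(audio_beats) + w_visual * tempo(visual_beats)
print(round(bpm, 1))                               # fused tempo estimate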

Link

Tuesday, May 29, 2012

Lab Meeting May 30th, 2012 (Wei-Shin): Progress Report

I will present my current progress on chair manipulation at the lab meeting. (Borrowed account for posting.)

Tuesday, May 22, 2012

Lab Meeting May 22nd, 2012 (Mark): Strong supervision from weak annotation: Interactive training of deformable part models

We propose a framework for large-scale learning and annotation of structured models. The system interleaves interactive labeling (where the current model is used to semi-automate the labeling of a new example) and online learning (where a newly labeled example is used to update the current model parameters). This framework is scalable to large datasets and complex image models and is shown to have excellent theoretical and practical properties in terms of train time, optimality guarantees, and bounds on the amount of annotation effort per image. We apply this framework to part-based detection, and introduce a novel algorithm for interactive labeling of deformable part models. The labeling tool updates and displays in real-time the maximum likelihood location of all parts as the user clicks and drags the location of one or more parts. We demonstrate that the system can be used to efficiently and robustly train part and pose detectors on CUB Birds-200, a challenging dataset of birds in unconstrained pose and environment.
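A toy Python sketch of the interleaving itself: the current model pre-labels each new example, a simulated annotator corrects it, and an online (perceptron-style) update folds the correction back into the parameters. The linear model and the annotator rule are stand-ins, not the paper's part-based machinery.

import numpy as np

rng = np.random.default_rng(1)
w = np.zeros(2)                                  # current model parameters

def user_corrects(x):                            # stand-in for the annotator
    return 1 if x[0] + x[1] > 0 else -1          # hidden ground-truth rule

for t in range(200):
    x = rng.standard_normal(2)
    guess = 1 if w @ x > 0 else -1               # model-proposed (pre-filled) label
    label = user_corrects(x)                     # annotator keeps or fixes it
    if guess != label:                           # online perceptron update
        w += label * x
print(w)  # roughly aligned with (1, 1), the annotator's rule; as the model
          # improves, fewer labels need correcting, so annotation gets cheaper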
paper link

Tuesday, May 01, 2012

[Robot Perception and Learning] Meeting 2012/05/02 (Andi): Energy Based Multiple Model Fitting for Non-Rigid Structure from Motion

Energy Based Multiple Model Fitting for Non-Rigid Structure from Motion

Authors: Chris Russell, Joao Fayad, Lourdes Agapito
From: CVPR '11

Abstract: 
In this paper we reformulate the 3D reconstruction of deformable surfaces from monocular video sequences as a labeling problem. We solve simultaneously for the assignment of feature points to multiple local deformation models and the fitting of models to points to minimize a geometric cost, subject to a spatial constraint that neighboring points should also belong to the same model.
Piecewise reconstruction methods rely on features shared between models to enforce global consistency on the 3D surface. To account for this overlap between regions, we consider a super-set of the classic labeling problem in which a set of labels, instead of a single one, is assigned to each variable. We propose a mathematical formulation of this new model and show how it can be efficiently optimized with a variant of α-expansion. We demonstrate how this framework can be applied to Non-Rigid Structure from Motion and leads to simpler explanations of the same data. Compared to existing methods run on the same data, our approach has up to half the reconstruction error, and is more robust to over-fitting and outliers.
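For intuition about the objective, the sketch below evaluates the kind of energy such a labeling problem minimizes: a geometric unary cost for assigning each feature point to a local deformation model, plus a Potts smoothness term over neighboring points. The paper's actual model assigns sets of labels per point and optimizes with a variant of α-expansion; all costs here are invented.

import numpy as np

unary = np.array([[0.1, 0.9],      # point 0: cheap under model 0
                  [0.2, 0.8],
                  [0.7, 0.3],      # point 2: cheap under model 1
                  [0.9, 0.1]])
edges = [(0, 1), (1, 2), (2, 3)]   # neighborhood structure on the surface
potts = 0.5                        # penalty when neighbors take different models

def energy(labels):
    """Geometric unary cost plus Potts smoothness for one labeling."""
    e = sum(unary[i, l] for i, l in enumerate(labels))
    e += sum(potts for i, j in edges if labels[i] != labels[j])
    return e

print(energy([0, 0, 1, 1]))        # coherent piecewise labeling: 1.2
print(energy([0, 1, 0, 1]))        # fragmented labeling pays more: 3.2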


Wednesday, April 25, 2012

Lab Meeting April 25th, 2012 (David): Progress Report

I will present my research progress on laser scanner and stereo camera sensor fusion, with an application to pedestrian detection.

Monday, April 09, 2012

Lab Meeting April 11th, 2012 (Jeff): Progress Report

I will share the progress of my current work on Failure Tolerance SLAM with RFID tags during the meeting.

Monday, March 26, 2012

Lab Meeting, March 28, 2012 (Alan): Realtime Multibody Visual SLAM with a Smoothly Moving Monocular Camera (ICCV 2011)

Title: Realtime Multibody Visual SLAM with a Smoothly Moving Monocular Camera
In: 2011 IEEE International Conference on Computer Vision (ICCV 2011)
Authors: Abhijit Kundu, K Madhava Krishna and C. V. Jawahar

Abstract:
This paper presents a realtime, incremental multibody visual SLAM system that allows choosing between full 3D reconstruction or simply tracking of the moving objects. Motion reconstruction of dynamic points or objects from a monocular camera is considered very hard due to well-known problems of observability. We attempt to solve the problem with Bearing-only Tracking (BOT) and by integrating multiple cues to avoid observability issues. The BOT is accomplished through a particle filter, and by integrating multiple cues from the reconstruction pipeline. With the help of these cues, many real-world scenarios which are considered unobservable with a monocular camera are solved to reasonable accuracy. This enables the building of a unified dynamic 3D map of scenes involving multiple moving objects. Tracking and reconstruction are preceded by motion segmentation and detection, which makes use of efficient geometric constraints to avoid difficult degenerate motions, where objects move in the epipolar plane. Results reported on multiple challenging real-world image sequences verify the efficacy of the proposed framework.
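Here is a minimal Python sketch of the bearing-only tracking ingredient: particles over a target's 2D position are weighted by how well they explain bearings measured from a camera with known ego-motion, and the moving camera's baseline is what makes depth observable. All models and numbers are illustrative only.

import numpy as np

rng = np.random.default_rng(2)
particles = rng.uniform(0, 10, size=(500, 2))        # target position hypotheses
target, cam = np.array([6.0, 4.0]), np.array([0.0, 0.0])

for step in range(20):
    cam = cam + np.array([0.3, 0.1])                 # known camera ego-motion
    particles += rng.normal(0, 0.05, particles.shape)  # random-walk diffusion
    d = target - cam
    z = np.arctan2(d[1], d[0]) + rng.normal(0, 0.01)   # noisy bearing measurement
    pred = np.arctan2(particles[:, 1] - cam[1], particles[:, 0] - cam[0])
    w = np.exp(-0.5 * ((pred - z) / 0.05) ** 2)      # bearing likelihood
    w /= w.sum()
    idx = rng.choice(len(particles), len(particles), p=w)  # resample
    particles = particles[idx]

print(particles.mean(axis=0))  # concentrates toward the true target as the
                               # moving camera accumulates parallax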

Saturday, March 17, 2012

Lab Meeting Mar. 21, 2012 (Wang Li): Object interaction detection using hand posture cues in an office setting (IJHCS 2011)

Object interaction detection using hand posture cues in an office setting

Authors: Brandon Paulson, Danielle Cummings, and Tracy Hammond

Abstract

The goal of this paper is to determine if hand posture can be used as a cue to determine the types of interactions a user has with objects in a desk/office environment. Our experiments indicate that (a) hand posture can be used to determine object interaction, with accuracy rates around 97%, and (b) hand posture is dependent upon the individual user when users are allowed to interact with objects as they would naturally.

Paper Link

Tuesday, March 06, 2012

Lab Meeting March 07, 2012 (Jimmy): Hand-Grip and Body-Loss Impact on RSS Measurements for Localization of Mass Market Devices

Title: Hand-Grip and Body-Loss Impact on RSS Measurements for Localization of Mass Market Devices
Authors: Rosa, F.D.; Xu, L.; Nurmi, J.; Pelosi, M.; Laoudias, C.; Terrezza, A.
In: IEEE International Conference on Localization and GNSS (ICL-GNSS), 2011

Abstract
In this paper we present the effect of the hand-grip and the presence of the human body on received signal strength measurements when positioning mass-market devices in indoor environments. We demonstrate that mitigating the influence of both the human body and the hand-grip can enhance positioning accuracy, and that the human factor cannot be neglected in experimental activities with real mobile devices.

[link]

Monday, February 27, 2012

Lab Meeting Feb. 29 (Hank): Creating Household Environment Map for Environment Manipulation Using Color Range Sensors on Environment and Robot

Authors: Yohei Kakiuchi, Ryohei Ueda, Kei Okada, and Masayuki Inaba

Abstract: A humanoid robot working in a household environment with people needs to localize itself and continuously update the locations of obstacles and manipulable objects. Achieving such a system requires a strong perception method that can efficiently update the frequently changing environment.

We propose a method for mapping a household environment using multiple stereo and depth cameras located on the humanoid's head and in the environment. The method relies on colored 3D point cloud data computed from the sensors. We achieve robot localization by matching the point clouds from the robot sensor data directly with the environment sensor data. Object detection is performed using Iterative Closest Point (ICP) with a database of known point cloud models. In order to guarantee accurate object detection results, objects are only detected within the robot sensor data. Furthermore, we utilize the environment sensor data to map out the obstacles as bounding convex hulls.

We show experimental results of creating a household environment map with known object labels and of estimating the robot position in this map.
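For readers unfamiliar with ICP, here is a minimal 2D Python sketch of the match-then-align loop (closest-point correspondences followed by an SVD/Kabsch rigid-transform solve). The paper works with colored 3D point clouds and a model database; this toy keeps only the geometry.

import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=20):
    """Align point set src to dst with vanilla iterative closest point."""
    tree = cKDTree(dst)
    R, t = np.eye(2), np.zeros(2)
    for _ in range(iters):
        cur = src @ R.T + t
        _, idx = tree.query(cur)                 # closest-point matches
        matched = dst[idx]
        mu_s, mu_d = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_d)
        U, _, Vt = np.linalg.svd(H)
        dR = Vt.T @ U.T                          # Kabsch: optimal rotation
        if np.linalg.det(dR) < 0:                # guard against reflections
            Vt[-1] *= -1
            dR = Vt.T @ U.T
        R, t = dR @ R, dR @ (t - mu_s) + mu_d    # compose incremental update
    return R, t

rng = np.random.default_rng(3)
model = rng.uniform(0, 1, (200, 2))              # known object model
ang = 0.3
R_true = np.array([[np.cos(ang), -np.sin(ang)],
                   [np.sin(ang),  np.cos(ang)]])
scene = model @ R_true.T + np.array([0.5, -0.2])
R_hat, t_hat = icp(model, scene)
print(np.round(R_hat - R_true, 3))               # near zero for this easy case
print(np.round(t_hat, 3))                        # near the true offset (0.5, -0.2)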


[link]

Thursday, February 16, 2012

Lab meeting Feb 22(Chih Chung): Motion planning in urban environments (Journal of Field Robotics 2008)

Author: Dave Ferguson, Thomas M. Howard and Maxim Likhachev

Abstract
We present the motion planning framework for an autonomous vehicle navigating through urban environments. Such environments present a number of motion planning challenges, including ultrareliability, high-speed operation, complex inter-vehicle interaction, parking in large unstructured lots, and constrained maneuvers. Our approach combines a model-predictive trajectory generation algorithm for computing dynamically feasible actions with two higher-level planners for generating long-range plans in both on-road and unstructured areas of the environment. In the first part of this article, we describe the underlying trajectory generator and the on-road planning component of this system. We then describe the unstructured planning component of this system used for navigating through parking lots and recovering from anomalous on-road scenarios. Throughout, we provide examples and results from “Boss”, an autonomous sport utility vehicle that has driven itself over 3,000 km and competed in, and won, the DARPA Urban Challenge.
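In the spirit of the trajectory-generation layer described above, here is a toy Python sketch that rolls out candidate constant-curvature arcs, prunes those that hit an obstacle, and keeps the arc ending nearest the goal. Boss's planner generates richer, dynamically feasible parameterized trajectories and layers on-road and lattice planners on top; none of that is modeled here.

import numpy as np

def rollout(kappa, v=5.0, dt=0.1, steps=20):
    """Euler-integrate a constant-curvature arc from the origin pose."""
    x = y = th = 0.0
    pts = []
    for _ in range(steps):
        x += v * np.cos(th) * dt
        y += v * np.sin(th) * dt
        th += v * kappa * dt
        pts.append((x, y))
    return np.array(pts)

goal = np.array([9.0, 2.0])                       # made-up goal and obstacle
obstacle, radius = np.array([5.0, 0.0]), 1.0

best, best_cost = None, np.inf
for kappa in np.linspace(-0.2, 0.2, 9):           # candidate curvatures
    traj = rollout(kappa)
    if np.min(np.linalg.norm(traj - obstacle, axis=1)) < radius:
        continue                                  # collides: prune this arc
    cost = np.linalg.norm(traj[-1] - goal)        # distance-to-goal cost
    if cost < best_cost:
        best, best_cost = kappa, cost
print(best, round(best_cost, 2))                  # a swerving arc wins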

[LINK]