Robot Perception and Learning: November 2011

Thursday, November 24, 2011

Lab Meeting November 24, 2011 (Hank): A Large-Scale Hierarchical Multi-View RGB-D Object Dataset (ICRA 2011)

Authors: K. Lai, L. Bo, X. Ren, and D. Fox.
Title: A Large-Scale Hierarchical Multi-View RGB-D Object Dataset
In: Proc. of International Conference on Robotics and Automation (ICRA), 2011

Abstract:
Over the last decade, the availability of public image repositories and recognition benchmarks has enabled rapid progress in visual object category and instance detection. Today we are witnessing the birth of a new generation of sensing technologies capable of providing high-quality synchronized videos of both color and depth, the RGB-D (Kinect-style) camera. With its advanced sensing capabilities and the potential for mass adoption, this technology represents an opportunity to dramatically increase robotic object recognition, manipulation, navigation, and interaction capabilities. In this paper, we introduce a large-scale, hierarchical multi-view object dataset collected using an RGB-D camera. The dataset contains 300 objects organized into 51 categories and has been made publicly available to the research community so as to enable rapid progress based on this promising technology. This paper describes the dataset collection procedure and introduces techniques for RGB-D based object recognition and detection, demonstrating that combining color and depth information substantially improves quality of results.

link

Wednesday, November 23, 2011

Lab Meeting November 24, 2011 (Jimmy): Tracking Mobile Users in Wireless Networks via Semi-Supervised Co-Localization (TPAMI 2011)

Title: Tracking Mobile Users in Wireless Networks via Semi-Supervised Co-Localization
Authors: Jeffrey Junfeng Pan, Sinno Jialin Pan, Jie Yin, Lionel M. Ni, and Qiang Yang
In: TPAMI 2011

Abstract:
Recent years have witnessed growing popularity of sensor and sensor-network technologies, supporting important practical applications. One of the fundamental issues is how to accurately locate a user with few labelled data in a wireless sensor network, where a major difficulty arises from the need to label large quantities of user location data, which in turn requires knowledge about the locations of signal transmitters, or access points. To solve this problem, we have developed a novel machine-learning-based approach that combines collaborative filtering with graph-based semi-supervised learning to learn both mobile users' locations and the locations of access points. Our framework exploits both labelled and unlabelled data from mobile devices and access points. In our two-phase solution, we first build a manifold-based model from a batch of labelled and unlabelled data in an offline training phase and then use a weighted k-nearest-neighbor method to localize a mobile client in an online localization phase. We extend the two-phase co-localization to an online and incremental model that can deal with labelled and unlabelled data that come sequentially and adapt to environmental changes. Finally, we embed an action model into the framework such that additional kinds of sensor signals can be utilized to further boost the performance of mobile tracking. Compared to other state-of-the-art systems, our framework has been shown to be more accurate while requiring less calibration effort in our experiments performed at three different test-beds.
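
For readers unfamiliar with the online phase mentioned in the abstract, here is a minimal sketch of weighted k-nearest-neighbor localization. It is not the authors' implementation: it works directly on raw signal-strength vectors rather than on the learned manifold, and the inverse-distance weighting, the function name `weighted_knn_localize`, and the parameters `k` and `eps` are all assumptions made for illustration.

```python
import math


def weighted_knn_localize(query_rss, fingerprints, k=3, eps=1e-9):
    """Estimate an (x, y) position from a signal-strength vector.

    fingerprints: list of (rss_vector, (x, y)) pairs with known positions.
    Inverse-distance weighting is an assumed choice, not taken from the paper.
    """
    if not fingerprints:
        raise ValueError("at least one fingerprint is required")

    # Euclidean distance in signal space between the query and each fingerprint.
    scored = [(math.dist(query_rss, rss), pos) for rss, pos in fingerprints]
    scored.sort(key=lambda item: item[0])
    nearest = scored[:k]

    # Closer fingerprints get larger weights; eps avoids division by zero.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)

    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return (x, y)
```

A query that lies midway between two fingerprints in signal space is placed midway between their positions, while a query that matches one fingerprint exactly is placed essentially at that fingerprint's position.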

[pdf]

Wednesday, November 16, 2011

Lab Meeting November 17, 2011 (Chih-Chung): Motion Planning under Uncertainty for Robotic Tasks with Long Time Horizons (IJRR 2011)

Authors: Hanna Kurniawati, Yanzhu Du, David Hsu and Wee Sun Lee.

Abstract:
Motion planning with imperfect state information is a crucial capability for autonomous robots to operate reliably in uncertain and dynamic environments. Partially observable Markov decision processes (POMDPs) provide a principled general framework for planning under uncertainty. Using probabilistic sampling, point-based POMDP solvers have drastically improved the speed of POMDP planning, enabling us to handle moderately complex robotic tasks. However, robot motion planning tasks with long time horizons remain a severe obstacle for even the fastest point-based POMDP solvers today. This paper proposes Milestone Guided Sampling (MiGS), a new point-based POMDP solver, which exploits state space information to reduce effective planning horizons. MiGS samples a set of points, called milestones, from a robot's state space and constructs a simplified representation of the state space from the sampled milestones. It then uses this representation of the state space to guide sampling in the belief space and tries to capture the essential features of the belief space with a small number of sampled points. Preliminary results are very promising. We tested MiGS in simulation on several difficult POMDPs that model distinct robotic tasks with long time horizons in both 2-D and 3-D environments. These POMDPs are impossible to solve with the fastest point-based solvers today, but MiGS solved them in a few minutes.
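
As background on the "belief space" the abstract refers to, the sketch below shows the standard discrete Bayes belief update that point-based POMDP solvers operate on. It is generic POMDP machinery, not MiGS itself; the function name `belief_update` and the nested-list layout of the transition model `T` and observation model `Z` are assumptions made for illustration.

```python
def belief_update(belief, action, observation, T, Z):
    """Return the posterior belief after taking `action` and seeing `observation`.

    belief: list of state probabilities b[s].
    T[a][s][s2]: probability of moving from state s to s2 under action a.
    Z[a][s2][o]: probability of observing o after reaching s2 under action a.
    Implements b'(s2) proportional to Z[a][s2][o] * sum_s T[a][s][s2] * b[s].
    """
    n = len(belief)

    # Prediction step: push the belief through the transition model.
    predicted = [
        sum(T[action][s][s2] * belief[s] for s in range(n)) for s2 in range(n)
    ]

    # Correction step: weight each state by the observation likelihood.
    unnormalized = [Z[action][s2][observation] * predicted[s2] for s2 in range(n)]

    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]
```

Each belief is a point in a continuous space whose dimension grows with the number of states, which is why solvers sample only a small set of representative beliefs instead of covering the whole space.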

Link

Wednesday, November 02, 2011

Lab Meeting November 03, 2011 (David): Real-Time Multi-Person Tracking with Detector Assisted Structure Propagation (ICCV'11 Workshop)

Authors: Dennis Mitzel and Bastian Leibe

Abstract:
Classical tracking-by-detection approaches require a robust object detector that needs to be executed in each frame. However, the detector is typically the most computationally expensive component, especially if more than one object class needs to be detected. In this paper we investigate how the usage of the object detector can be reduced by using stereo range data for following detected objects over time. To this end we propose a hybrid tracking framework consisting of a stereo-based ICP (Iterative Closest Point) tracker and a high-level multi-hypothesis tracker. Initiated by a detector response, the ICP tracker follows individual pedestrians over time using just the raw depth information. Its output is then fed into the high-level tracker that is responsible for solving long-term data association and occlusion handling. In addition, we propose to constrain the detector to run only on some small regions of interest (ROIs) that are extracted from a 3D depth-based occupancy map of the scene. The ROIs are tracked over time and only newly appearing ROIs are evaluated by the detector. We present experiments on real stereo sequences recorded from a moving camera setup in urban scenarios and show that our proposed approach achieves state-of-the-art performance.
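
The ROI idea in the abstract can be illustrated with a rough sketch: project 3D points onto a ground-plane grid, keep the cells that hold enough points, and pass only cells that are not already tracked to the detector. This is a simplification under stated assumptions, not the authors' pipeline: the grid cell size, the point-count threshold, the axis convention (x and z on the ground, y up), and all function names are hypothetical.

```python
from collections import Counter


def occupancy_map(points, cell_size=0.5):
    """Count 3D points per ground-plane cell.

    points: iterable of (x, y, z) tuples; x and z are assumed to span the
    ground plane and y to point up.
    """
    counts = Counter()
    for x, _y, z in points:
        cell = (int(x // cell_size), int(z // cell_size))
        counts[cell] += 1
    return counts


def extract_rois(counts, min_points=3):
    """Treat every cell holding at least `min_points` points as an ROI."""
    return {cell for cell, n in counts.items() if n >= min_points}


def rois_needing_detection(current_rois, tracked_rois):
    """Only ROIs that are not already being tracked go to the detector."""
    return current_rois - tracked_rois
```

With this split, the expensive detector runs only on newly appearing cells, while cells that are already tracked are followed without re-detection.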

Link