This Blog is maintained by the Robot Perception and Learning lab at CSIE, NTU, Taiwan. Our scientific interests are driven by the desire to build intelligent robots and computers, which are capable of servicing people more efficiently than equivalent manned systems in a wide variety of dynamic and unstructured environments.
Wednesday, December 28, 2011
Lab Meeting Dec. 29, 2011 (David): Semantic fusion of laser and vision in pedestrian detection (PR 2010)
Luciano Oliveira, Urbano Nunes, Paulo Peixoto, Marco Silva, Fernando Moita
Abstract
Fusion of laser and vision in object detection has been accomplished by two main approaches: (1) independent integration of sensor-driven features or sensor-driven classifiers, or (2) a region of interest (ROI) is found by laser segmentation and an image classifier is used to name the projected ROI. Here, we propose a novel fusion approach based on semantic information, and embodied on many levels. Sensor fusion is based on spatial relationship of parts-based classifiers, being performed via a Markov logic network. The proposed system deals with partial segments, it is able to recover depth information even if the laser fails, and the integration is modeled through contextual information—characteristics not found on previous approaches. Experiments in pedestrian detection demonstrate the effectiveness of our method over data sets gathered in urban scenarios.
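Approach (2) in the abstract relies on projecting a laser-derived region into the image. As a rough illustration only, the sketch below uses the standard pinhole camera model. The function name and the intrinsic values are hypothetical and not taken from the paper, and the laser point is assumed to be already expressed in the camera frame.

```python
def project_to_image(point_cam, fx, fy, cx, cy):
    """Project a 3D point (camera frame, metres) to pixel coordinates.

    Uses the standard pinhole model. Returns None for points that lie
    on or behind the image plane, since they cannot be projected.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u, v)


# Hypothetical intrinsics for a 640x480 camera (assumed values).
pixel = project_to_image((1.0, 0.5, 5.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(pixel)
```

Projecting the corner points of a laser segment this way would give an image ROI that a classifier could then label.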
Paper Link
Local Link
Wednesday, December 21, 2011
Lab Meeting Dec. 22, 2011 (Wang Li): Fast Point Feature Histograms (FPFH) for 3D Registration (ICRA 2009)
Fast Point Feature Histograms (FPFH) for 3D Registration
Radu Bogdan Rusu
Nico Blodow
Michael Beetz
Abstract
In this paper, we modify the mathematical expressions of Point Feature Histograms (PFH), and perform a rigorous analysis on their robustness and complexity for the problem of 3D registration. More concretely, we present optimizations that reduce the computation times drastically by either caching previously computed values or by revising their theoretical formulations. The latter results in a new type of local features, called Fast Point Feature Histograms (FPFH), which retain most of the discriminative power of the PFH. Moreover, we propose an algorithm for the online computation of FPFH features, demonstrate their efficiency for 3D registration, and propose a new sample consensus based method for bringing two datasets into the convergence basin of a local non-linear optimizer: SAC-IA (SAmple Consensus Initial Alignment).
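The abstract does not spell out the features themselves. As commonly described for PFH and FPFH (for example in the Point Cloud Library documentation), each histogram is built from three angular features computed for a pair of oriented points. The sketch below computes that triplet in plain Python. It assumes unit-length normals, leaves out the source/target ordering rule and the histogram binning, and all function names are illustrative.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def pair_features(p_s, n_s, p_t, n_t):
    """Return (alpha, phi, theta) for a source/target pair of oriented points.

    A Darboux-style frame (u, v, w) is built at the source point, and the
    target normal is expressed relative to that frame.
    """
    diff = _sub(p_t, p_s)
    dist = math.sqrt(_dot(diff, diff))
    if dist == 0.0:
        raise ValueError("points must be distinct")
    d = [c / dist for c in diff]          # unit vector from source to target

    u = list(n_s)                         # first axis: source normal
    v = _cross(u, d)                      # second axis: orthogonal to u and d
    v_len = math.sqrt(_dot(v, v))
    if v_len == 0.0:
        raise ValueError("source normal is parallel to the connecting line")
    v = [c / v_len for c in v]
    w = _cross(u, v)                      # third axis completes the frame

    alpha = _dot(v, n_t)
    phi = _dot(u, d)
    theta = math.atan2(_dot(w, n_t), _dot(u, n_t))
    return alpha, phi, theta


# Two points on a flat surface with identical normals.
print(pair_features((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1)))
```

For two points on a plane with the same normal, all three features come out as zero, which matches the intuition that a flat patch carries no curvature information.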
Paper Link
Lab Meeting December 22nd, 2011 (Jeff): Towards Semantic SLAM using a Monocular Camera
Title: Towards Semantic SLAM using a Monocular Camera
Authors: Javier Civera, Dorian Gálvez-López, L. Riazuelo, Juan D. Tardós, and J. M. M. Montiel
Abstract:
Monocular SLAM systems have been mainly focused on producing geometric maps just composed of points or edges; but without any associated meaning or semantic content.
In this paper, we propose a semantic SLAM algorithm that merges in the estimated map traditional meaningless points with known objects. The non-annotated map is built using only the information extracted from a monocular image sequence. The known object models are automatically computed from a sparse set of images gathered by cameras that may be different from the SLAM camera. The models include both visual appearance and tridimensional information. The semantic or annotated part of the map –the objects– are estimated using the information in the image sequence and the precomputed object models.
The proposed algorithm runs an EKF monocular SLAM parallel to an object recognition thread. This latest one informs of the presence of an object in the sequence by searching for SURF correspondences and checking afterwards their geometric compatibility. When an object is recognized it is inserted in the SLAM map, being its position measured and hence refined by the SLAM algorithm in subsequent frames. Experimental results show real-time performance for a handheld camera imaging a desktop environment and for a camera mounted in a robot moving in a room-sized scenario.
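The abstract notes that an inserted object's position is measured and then refined in subsequent frames. The paper uses a full EKF over the camera and map state. The sketch below swaps in a one-dimensional Kalman measurement update, purely to illustrate how repeated measurements shrink the uncertainty of an estimate. All names and numbers are illustrative assumptions.

```python
def kalman_update(mean, var, measurement, meas_var):
    """One scalar Kalman measurement update.

    mean, var: current estimate and its variance
    measurement, meas_var: new observation and its noise variance
    """
    gain = var / (var + meas_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Hypothetical example: a coarse initial guess refined over three frames.
mean, var = 0.0, 1.0
for z in [2.0, 2.0, 2.0]:
    mean, var = kalman_update(mean, var, z, meas_var=1.0)
    print(mean, var)
```

Each update moves the estimate toward the measurement and reduces the variance, which is the sense in which later frames refine the object position.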
Link:
IEEE International Conference on Intelligent Robots and Systems (IROS), 2011
LocalLink
http://webdiis.unizar.es/~jcivera/papers/civera_etal_iros11.pdf
Thursday, December 15, 2011
Lab Meeting Dec. 15, 2011 (Alan): Two-View Motion Segmentation with Model Selection and Outlier Removal by RANSAC-Enhanced Dirichlet ... (IJCV 2010)
Title: Two-View Motion Segmentation with Model Selection and Outlier Removal by RANSAC-Enhanced Dirichlet Process Mixture Models (IJCV 2010)
Authors: Yong-Dian Jian, Chu-Song Chen
Abstract:
We propose a novel motion segmentation algorithm based on mixture of Dirichlet process (MDP) models. In contrast to previous approaches, we consider motion segmentation and its model selection regarding to the number of motion models as an inseparable problem. Our algorithm can simultaneously infer the number of motion models, estimate the cluster memberships of correspondences, and identify the outliers. The main idea is to use MDP models to fully exploit the geometric consistencies before making premature decisions about the number of motion models. To handle outliers, we incorporate RANSAC into the inference process of MDP models. In the experiments, we compare the proposed algorithm with naive RANSAC, GPCA and Schindler’s method on both synthetic data and real image data. The experimental results show that we can handle more motions and have satisfactory performance in the presence of various levels of noise and outlier.
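For readers unfamiliar with the RANSAC baseline mentioned in the abstract, here is a minimal sketch of plain RANSAC on a toy problem: fitting a single 2D line in the presence of outliers. It is not the paper's RANSAC-enhanced Dirichlet process inference and it does not fit motion models. The data, threshold, and function names are made up for illustration.

```python
import math
import random


def fit_line(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    a = q[1] - p[1]
    b = p[0] - q[0]
    length = math.hypot(a, b)
    if length == 0.0:
        return None  # identical points do not define a line
    a, b = a / length, b / length
    c = -(a * p[0] + b * p[1])
    return a, b, c


def ransac_line(points, threshold, iterations, seed=0):
    """Return the largest inlier set found over random two-point hypotheses."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        line = fit_line(p, q)
        if line is None:
            continue
        a, b, c = line
        # Points whose perpendicular distance to the line is within threshold.
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Ten points on y = 2x + 1 plus three gross outliers (synthetic data).
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(0.0, 10.0), (5.0, -7.0), (9.0, 3.0)]
inliers = ransac_line(points + outliers, threshold=0.1, iterations=100)
print(len(inliers))
```

Plain RANSAC like this needs the number of models fixed in advance, which is the limitation the abstract says the proposed method avoids by inferring the number of motions jointly.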
Link
Monday, December 05, 2011
Lab Meeting Dec. 8, 2011 (Jim): Execution of a Dual-Object (Pushing) Action with Semantic Event Chains
Title: “Execution of a Dual-Object (Pushing) Action with Semantic Event Chains”
Authors: Aksoy Eren Erdal, Dellen Babette, Tamosiunaite Minija, and Wörgötter Florentin
In IEEE-RAS Int. Conf. on Humanoid Robots, pp.576-583
Abstract:
Here we present a framework for manipulation execution based on the so called “Semantic Event Chain” which is an abstract description of relations between the objects in the scene. It captures the change of those relations during a manipulation and thereby provides the decisive temporal anchor points by which a manipulation is critically defined. Using semantic event chains a model of a manipulation can be learned. We will show that it is possible to add the required control parameters (the spatial anchor points) to this model, which can then be executed by a robot in a fully autonomous way. The process of learning and execution of semantic event chains is explained using a box pushing example.
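The core idea in the abstract, keeping only the moments at which relations between objects change, can be sketched in a few lines. This is a simplified illustration under assumed inputs: the relation labels ("apart", "touching"), the object names, and the function name are hypothetical, and the step that extracts relations from images is not shown.

```python
def compress_to_event_chain(frames):
    """Keep only the frames in which at least one pairwise relation changed.

    frames: list of dicts mapping an object pair to a relation label.
    The first frame is always kept as the starting state.
    """
    chain = []
    previous = None
    for relations in frames:
        if relations != previous:
            chain.append(dict(relations))
            previous = relations
    return chain


# Hypothetical pushing sequence: the hand approaches, pushes, and withdraws.
frames = [
    {("hand", "box"): "apart", ("box", "table"): "touching"},
    {("hand", "box"): "apart", ("box", "table"): "touching"},
    {("hand", "box"): "touching", ("box", "table"): "touching"},
    {("hand", "box"): "touching", ("box", "table"): "touching"},
    {("hand", "box"): "apart", ("box", "table"): "touching"},
]
chain = compress_to_event_chain(frames)
print(len(chain))
```

The surviving frames play the role of the temporal anchor points described in the abstract: everything between two changes is discarded.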
Link