This blog is maintained by the Robot Perception and Learning Lab at CSIE, NTU, Taiwan. Our scientific interests are driven by the desire to build intelligent robots and computers capable of serving people more efficiently than equivalent manned systems in a wide variety of dynamic and unstructured environments.
Wednesday, December 28, 2011
Lab Meeting Dec. 29, 2011 (David): Semantic fusion of laser and vision in pedestrian detection (PR 2010)
Luciano Oliveira, Urbano Nunes, Paulo Peixoto, Marco Silva, Fernando Moita
Abstract
Fusion of laser and vision for object detection has been accomplished by two main approaches: (1) independent integration of sensor-driven features or sensor-driven classifiers, or (2) a region of interest (ROI) is found by laser segmentation and an image classifier is used to label the projected ROI. Here, we propose a novel fusion approach based on semantic information, embodied at multiple levels. Sensor fusion is based on the spatial relationship of parts-based classifiers and is performed via a Markov logic network. The proposed system deals with partial segments, is able to recover depth information even if the laser fails, and models the integration through contextual information, characteristics not found in previous approaches. Experiments in pedestrian detection demonstrate the effectiveness of our method on data sets gathered in urban scenarios.
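For context, a minimal sketch (not the paper's Markov-logic fusion) of the ROI-projection baseline the abstract contrasts against: a laser segment is projected through an assumed calibrated pinhole camera to obtain the image region handed to a classifier. The function name, calibration matrices, and padding value are all hypothetical.

```python
import numpy as np

def project_laser_segment_to_roi(segment_xyz, K, R, t, image_size, pad=0.2):
    """Project a 3D laser segment into the image and return a padded bounding box.

    segment_xyz : (N, 3) laser points of one segment, in laser coordinates.
    K           : (3, 3) camera intrinsic matrix (assumed calibrated).
    R, t        : rotation (3, 3) and translation (3,) from laser to camera frame.
    image_size  : (width, height) in pixels.
    pad         : fractional padding added around the projected box.
    """
    # Transform points into the camera frame and keep those in front of the camera.
    pts_cam = segment_xyz @ R.T + t
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]
    if len(pts_cam) == 0:
        return None

    # Pinhole projection to pixel coordinates.
    uv = pts_cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]

    # Padded, image-clipped bounding box around the projections.
    u_min, v_min = uv.min(axis=0)
    u_max, v_max = uv.max(axis=0)
    w, h = u_max - u_min, v_max - v_min
    u_min, u_max = u_min - pad * w, u_max + pad * w
    v_min, v_max = v_min - pad * h, v_max + pad * h
    W, H = image_size
    return (max(0, int(u_min)), max(0, int(v_min)),
            min(W - 1, int(u_max)), min(H - 1, int(v_max)))
```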
Paper Link
Local Link
Wednesday, December 21, 2011
Lab Meeting Dec. 22, 2011 (Wang Li): Fast Point Feature Histograms (FPFH) for 3D Registration (ICRA 2009)
Fast Point Feature Histograms (FPFH) for 3D Registration
Radu Bogdan Rusu
Nico Blodow
Michael Beetz
Abstract
In this paper, we modify the mathematical expressions of Point Feature Histograms (PFH), and perform a rigorous analysis on their robustness and complexity for the problem of 3D registration. More concretely, we present optimizations that reduce the computation times drastically by either caching previously computed values or by revising their theoretical formulations. The latter results in a new type of local features, called Fast Point Feature Histograms (FPFH), which retain most of the discriminative power of the PFH. Moreover, we propose an algorithm for the online computation of FPFH features, demonstrate their efficiency for 3D registration, and propose a new sample consensus based method for bringing two datasets into the convergence basin of a local non-linear optimizer: SAC-IA (SAmple Consensus Initial Alignment).
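As a side note, a numpy sketch of the angular pair features that PFH and FPFH histogram, assuming unit normals are already estimated; the canonical implementation additionally reorders each pair so the source normal makes the smaller angle with the connecting line, and PCL's version adds caching and weighting.

```python
import numpy as np

def pair_feature(p_s, n_s, p_t, n_t):
    """Angular pair features (alpha, phi, theta) used by PFH/FPFH.

    p_s, p_t : 3D positions of the source and target points.
    n_s, n_t : their (unit) surface normals.
    """
    d_vec = p_t - p_s
    d = np.linalg.norm(d_vec)
    if d < 1e-12:
        return None
    d_unit = d_vec / d

    # Darboux frame attached to the source point.
    u = n_s
    v = np.cross(d_unit, u)
    v /= np.linalg.norm(v)
    w = np.cross(u, v)

    alpha = np.dot(v, n_t)              # cosine of angle between v and the target normal
    phi = np.dot(u, d_unit)             # cosine of angle between source normal and the line
    theta = np.arctan2(np.dot(w, n_t), np.dot(u, n_t))
    return alpha, phi, theta, d
```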
Paper Link
Lab Meeting December 22nd, 2011 (Jeff): Towards Semantic SLAM using a Monocular Camera
Title: Towards Semantic SLAM using a Monocular Camera
Authors: Javier Civera, Dorian Gálvez-López, L. Riazuelo, Juan D. Tardós, and J. M. M. Montiel
Abstract:
Monocular SLAM systems have mainly focused on producing geometric maps composed only of points or edges, without any associated meaning or semantic content.
In this paper, we propose a semantic SLAM algorithm that merges traditional, meaningless points with known objects in the estimated map. The non-annotated map is built using only the information extracted from a monocular image sequence. The known object models are automatically computed from a sparse set of images gathered by cameras that may be different from the SLAM camera. The models include both visual appearance and three-dimensional information. The semantic or annotated part of the map, the objects, is estimated using the information in the image sequence and the precomputed object models.
The proposed algorithm runs EKF monocular SLAM in parallel with an object recognition thread. The latter reports the presence of an object in the sequence by searching for SURF correspondences and then checking their geometric compatibility. When an object is recognized, it is inserted into the SLAM map, where its position is measured and hence refined by the SLAM algorithm in subsequent frames. Experimental results show real-time performance for a handheld camera imaging a desktop environment and for a camera mounted on a robot moving in a room-sized scenario.
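A rough sketch of the recognition-thread idea: feature matching followed by a geometric compatibility check. The paper matches SURF descriptors; ORB is substituted here only because it ships with stock OpenCV, and the thresholds are made up.

```python
import cv2
import numpy as np

def recognize_object(object_img, frame, min_inliers=15):
    """Detect a known object in a frame by descriptor matching plus a geometric check."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp_obj, des_obj = orb.detectAndCompute(object_img, None)
    kp_frm, des_frm = orb.detectAndCompute(frame, None)
    if des_obj is None or des_frm is None:
        return None

    # Brute-force matching of binary descriptors with cross-checking.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_obj, des_frm)
    if len(matches) < min_inliers:
        return None

    # Geometric compatibility: the matches must agree on one homography.
    src = np.float32([kp_obj[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_frm[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    if H is None or int(mask.sum()) < min_inliers:
        return None
    return H  # accepted detection; its pose would then be refined by the SLAM filter
```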
Link:
IEEE International Conference on Intelligent Robots and Systems (IROS), 2011
Local Link
http://webdiis.unizar.es/~jcivera/papers/civera_etal_iros11.pdf
Thursday, December 15, 2011
Lab Meeting Dec. 15, 2011 (Alan): Two-View Motion Segmentation with Model Selection and Outlier Removal by RANSAC-Enhanced Dirichlet ... (IJCV 2010)
Title: Two-View Motion Segmentation with Model Selection and Outlier Removal by RANSAC-Enhanced Dirichlet Process Mixture Models (IJCV 2010)
Authors: Yong-Dian Jian, Chu-Song Chen
Abstract:
We propose a novel motion segmentation algorithm based on mixture of Dirichlet process (MDP) models. In contrast to previous approaches, we consider motion segmentation and its model selection with respect to the number of motion models as an inseparable problem. Our algorithm can simultaneously infer the number of motion models, estimate the cluster memberships of correspondences, and identify the outliers. The main idea is to use MDP models to fully exploit the geometric consistencies before making premature decisions about the number of motion models. To handle outliers, we incorporate RANSAC into the inference process of MDP models. In the experiments, we compare the proposed algorithm with naive RANSAC, GPCA and Schindler's method on both synthetic data and real image data. The experimental results show that we can handle more motions and have satisfactory performance in the presence of various levels of noise and outliers.
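For reference, a plain numpy RANSAC loop for a single 2D affine motion hypothesis; the paper embeds this kind of sampling inside the Dirichlet-process inference rather than running it stand-alone, so treat this as the outlier-handling ingredient only. Iteration count and inlier threshold are arbitrary.

```python
import numpy as np

def ransac_affine(pts1, pts2, iters=500, thresh=2.0, seed=None):
    """Fit one 2D affine motion model to point correspondences with RANSAC.

    pts1, pts2 : (N, 2) matched points in two views.
    Returns the best 2x3 affine matrix (or None) and a boolean inlier mask.
    """
    rng = np.random.default_rng(seed)
    n = len(pts1)
    X = np.hstack([pts1, np.ones((n, 1))])      # (N, 3) homogeneous source points
    best_inliers = np.zeros(n, dtype=bool)
    best_A = None

    for _ in range(iters):
        sample = rng.choice(n, size=3, replace=False)
        A, *_ = np.linalg.lstsq(X[sample], pts2[sample], rcond=None)   # (3, 2)
        residuals = np.linalg.norm(X @ A - pts2, axis=1)
        inliers = residuals < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_A = inliers, A

    # Refit on all inliers of the best hypothesis.
    if best_A is not None and best_inliers.sum() >= 3:
        best_A, *_ = np.linalg.lstsq(X[best_inliers], pts2[best_inliers], rcond=None)
    return (None if best_A is None else best_A.T), best_inliers
```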
Link
Monday, December 05, 2011
Lab Meeting Dec. 8, 2011 (Jim): Execution of a Dual-Object (Pushing) Action with Semantic Event Chains
Title: “Execution of a Dual-Object (Pushing) Action with Semantic Event Chains”
Authors: Eren Erdal Aksoy, Babette Dellen, Minija Tamosiunaite, and Florentin Wörgötter
In IEEE-RAS Int. Conf. on Humanoid Robots, pp.576-583
Abstract:
Here we present a framework for manipulation execution based on the so-called "Semantic Event Chain", which is an abstract description of relations between the objects in the scene. It captures the change of those relations during a manipulation and thereby provides the decisive temporal anchor points by which a manipulation is critically defined. Using semantic event chains, a model of a manipulation can be learned. We will show that it is possible to add the required control parameters (the spatial anchor points) to this model, which can then be executed by a robot in a fully autonomous way. The process of learning and execution of semantic event chains is explained using a box-pushing example.
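A toy sketch of how a semantic event chain can be read off a relation sequence, assuming each frame has already been reduced to discrete object-pair relations (the real system derives these from tracked image segments and a richer relation alphabet). Object names and relation codes below are invented.

```python
import numpy as np

def semantic_event_chain(relation_frames):
    """Build a toy semantic event chain from per-frame object relations.

    relation_frames : list of dicts mapping an object pair, e.g. ('hand', 'box'),
                      to a discrete relation code (0 = not touching, 1 = touching).
    Returns (pairs, sec) where sec[i, j] is the relation of pair i at the j-th
    key frame, keeping only frames where at least one relation changed.
    """
    pairs = sorted({p for frame in relation_frames for p in frame})
    table = np.array([[frame.get(p, 0) for frame in relation_frames] for p in pairs])

    # Keep the first frame plus every frame where some pairwise relation changes.
    keep = [0] + [j for j in range(1, table.shape[1])
                  if not np.array_equal(table[:, j], table[:, j - 1])]
    return pairs, table[:, keep]

# Example: a hand approaches a box, pushes it, and withdraws.
frames = [{('hand', 'box'): 0}, {('hand', 'box'): 0},
          {('hand', 'box'): 1}, {('hand', 'box'): 1},
          {('hand', 'box'): 0}]
print(semantic_event_chain(frames))   # columns kept at the touch / release events
```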
Link
Thursday, November 24, 2011
Lab Meeting November 24, 2011 (Hank): A Large-Scale Hierarchical Multi-View RGB-D Object Dataset (ICRA 2011)
Authors: K. Lai, L. Bo, X. Ren, and D. Fox.
Title:A Large-Scale Hierarchical Multi-View RGB-D Object Dataset
In: Proc. of International Conference on Robotics and Automation (ICRA), 2011
Abstract:
Over the last decade, the availability of public image repositories and recognition benchmarks has enabled rapid progress in visual object category and instance detection. Today we are witnessing the birth of a new generation of sensing technologies capable of providing high-quality synchronized videos of both color and depth: the RGB-D (Kinect-style) camera. With its advanced sensing capabilities and the potential for mass adoption, this technology represents an opportunity to dramatically increase robotic object recognition, manipulation, navigation, and interaction capabilities. In this paper, we introduce a large-scale, hierarchical multi-view object dataset collected using an RGB-D camera. The dataset contains 300 objects organized into 51 categories and has been made publicly available to the research community so as to enable rapid progress based on this promising technology. This paper describes the dataset collection procedure and introduces techniques for RGB-D based object recognition and detection, demonstrating that combining color and depth information substantially improves the quality of the results.
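As an illustration of "combining color and depth", a hedged sketch that concatenates a color histogram and a depth histogram and feeds them to a k-NN classifier; this is not the descriptor set used in the paper, and the bin counts and depth range are arbitrary.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def rgbd_feature(rgb, depth, bins=8):
    """Concatenate a color histogram and a depth histogram into one descriptor.

    rgb   : (H, W, 3) uint8 image crop of the object.
    depth : (H, W) depth crop in meters (0 where invalid).
    """
    color_hist, _ = np.histogramdd(rgb.reshape(-1, 3), bins=(bins,) * 3,
                                   range=((0, 256),) * 3)
    depth_hist, _ = np.histogram(depth[depth > 0], bins=bins, range=(0.0, 4.0))
    feat = np.concatenate([color_hist.ravel(), depth_hist]).astype(float)
    return feat / (feat.sum() + 1e-9)

def train_classifier(rgb_crops, depth_crops, labels, k=5):
    """Hypothetical training hook: lists of crops plus their category labels."""
    X = np.stack([rgbd_feature(c, d) for c, d in zip(rgb_crops, depth_crops)])
    return KNeighborsClassifier(n_neighbors=k).fit(X, labels)
```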
link
Wednesday, November 23, 2011
Lab Meeting November 24, 2011 (Jimmy): Tracking Mobile Users in Wireless Networks via Semi-Supervised Co-Localization (TPAMI 2011)
Title: Tracking Mobile Users in Wireless Networks via Semi-Supervised Co-Localization
Authors: Jeffrey Junfeng Pan, Sinno Jialin Pan, Jie Yin, Lionel M. Ni, and Qiang Yang
In: TPAMI 2011
Abstract
Recent years have witnessed growing popularity of sensor and sensor-network technologies, supporting important practical applications. One of the fundamental issues is how to accurately locate a user with few labelled data in a wireless sensor network, where a major difficulty arises from the need to label large quantities of user location data, which in turn requires knowledge about the locations of signal transmitters, or access points. To solve this problem, we have developed a novel machine-learning-based approach that combines collaborative filtering with graph-based semi-supervised learning to learn both mobile-users’ locations and the locations of access points. Our framework exploits both labelled and unlabelled data from mobile devices and access points. In our two-phase solution, we first build a manifold-based model from a batch of labelled and unlabelled data in an offline training phase and then use a weighted k-nearest-neighbor method to localize a mobile client in an online localization phase. We extend the two-phase co-localization to an online and incremental model that can deal with labelled and unlabelled data that come sequentially and adapt to environmental changes. Finally, we embed an action model to the framework such that additional kinds of sensor signals can be utilized to further boost the performance of mobile tracking. Compared to other state-of-the-art systems, our framework has been shown to be more accurate while requiring less calibration effort in our experiments performed at three different test-beds.
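A compact sketch of the weighted k-nearest-neighbor step used in the online localization phase, assuming the offline phase has already produced labelled fingerprints; the manifold-based offline learning itself is not shown, and the weighting scheme here is a simple inverse-distance choice.

```python
import numpy as np

def wknn_localize(rssi, fingerprint_rssi, fingerprint_xy, k=4, eps=1e-6):
    """Weighted k-nearest-neighbor localization from Wi-Fi signal strengths.

    rssi              : (A,) signal-strength vector observed by the mobile client.
    fingerprint_rssi  : (M, A) labelled survey measurements (offline-phase output).
    fingerprint_xy    : (M, 2) known 2-D positions of those measurements.
    Returns the estimated (x, y) position of the client.
    """
    d = np.linalg.norm(fingerprint_rssi - rssi, axis=1)   # signal-space distances
    idx = np.argsort(d)[:k]                               # the k closest fingerprints
    w = 1.0 / (d[idx] + eps)                              # closer fingerprints weigh more
    w /= w.sum()
    return w @ fingerprint_xy[idx]
```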
[pdf]
Wednesday, November 16, 2011
Lab Meeting November 17, 2011 (Chih-Chung): Motion Planning under Uncertainty for Robotic Tasks with Long Time Horizons (IJRR 2011)
Authors: Hanna Kurniawati, Yanzhu Du, David Hsu and Wee Sun Lee.
Abstract:
Motion planning with imperfect state information is a crucial capability for autonomous robots to operate reliably in uncertain and dynamic environments. Partially observable Markov decision processes (POMDPs) provide a principled general framework for planning under uncertainty. Using probabilistic sampling, point-based POMDP solvers have drastically improved the speed of POMDP planning, enabling us to handle moderately complex robotic tasks. However, robot motion planning tasks with long time horizons remain a severe obstacle for even the fastest point-based POMDP solvers today. This paper proposes Milestone Guided Sampling (MiGS), a new point-based POMDP solver, which exploits state space information to reduce effective planning horizons. MiGS samples a set of points, called milestones, from a robot's state space and constructs a simplified representation of the state space from the sampled milestones. It then uses this representation of the state space to guide sampling in the belief space and tries to capture the essential features of the belief space with a small number of sampled points. Preliminary results are very promising. We tested MiGS in simulation on several difficult POMDPs that model distinct robotic tasks with long time horizons in both 2-D and 3-D environments. These POMDPs are impossible to solve with the fastest point-based solvers today, but MiGS solved them in a few minutes.
Link
Wednesday, November 02, 2011
Lab Meeting November 03, 2011 (David): Real-Time Multi-Person Tracking with Detector Assisted Structure Propagation (ICCV'11 Workshop)
Authors: Dennis Mitzel and Bastian Leibe
Abstract:
Classical tracking-by-detection approaches require a robust object detector that needs to be executed in each frame. However, the detector is typically the most computationally expensive component, especially if more than one object class needs to be detected. In this paper we investigate how the usage of the object detector can be reduced by using stereo range data for following detected objects over time. To this end we propose a hybrid tracking framework consisting of a stereo-based ICP (Iterative Closest Point) tracker and a high-level multi-hypothesis tracker. Initiated by a detector response, the ICP tracker follows individual pedestrians over time using just the raw depth information. Its output is then fed into the high-level tracker, which is responsible for solving long-term data association and occlusion handling. In addition, we propose to constrain the detector to run only on some small regions of interest (ROIs) that are extracted from a 3D depth-based occupancy map of the scene. The ROIs are tracked over time and only newly appearing ROIs are evaluated by the detector. We present experiments on real stereo sequences recorded from a moving camera setup in urban scenarios and show that our proposed approach achieves state-of-the-art performance.
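A bare-bones 2D point-to-point ICP step of the kind the low-level tracker could run on ground-plane depth points; the stereo preprocessing, the ROI handling, and the high-level multi-hypothesis logic are all omitted, and the iteration count is arbitrary.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_2d(source, target, iters=20):
    """Align a 2-D point set to a target with basic point-to-point ICP.

    source, target : (N, 2) and (M, 2) points, e.g. a pedestrian's depth points
                     projected onto the ground plane at consecutive frames.
    Returns rotation R (2x2), translation t (2,), and the transformed source.
    """
    tree = cKDTree(target)
    R, t = np.eye(2), np.zeros(2)
    src = source.copy()
    for _ in range(iters):
        _, nn = tree.query(src)                       # closest-point correspondences
        tgt = target[nn]
        mu_s, mu_t = src.mean(axis=0), tgt.mean(axis=0)
        U, _, Vt = np.linalg.svd((src - mu_s).T @ (tgt - mu_t))
        R_step = Vt.T @ U.T
        if np.linalg.det(R_step) < 0:                 # guard against reflections
            Vt[-1] *= -1
            R_step = Vt.T @ U.T
        t_step = mu_t - R_step @ mu_s
        src = src @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step        # accumulate total transform
    return R, t, src
```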
Link
Wednesday, October 26, 2011
Lab Meeting October 27, 2011 (ShaoChen): A multiple hypothesis people tracker for teams of mobile robots (ICRA 2010)
Title: A multiple hypothesis people tracker for teams of mobile robots (ICRA 2010)
Authors: Tsokas, N.A. and Kyriakopoulos, K.J.
Abstract: This paper tackles the problem of tracking walking people with multiple moving robots equipped with laser rangefinders. We present an adaptation of the classic Multiple Hypothesis Tracking method, which allows for one-to-many associations between targets and measurements in each cycle and is thus capable of operating in a multi-sensor scenario. In the context of two experiments, the successful integration of our tracking algorithm into a dual-robot setup is assessed.
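For orientation, a single-best-hypothesis gating-and-assignment step, which is the building block that MHT generalizes by keeping several association hypotheses alive; the distance metric and the gate value are illustrative only.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_positions, measurements, gate=1.0):
    """Assign laser measurements to existing person tracks (single best hypothesis).

    track_positions : (T, 2) predicted track positions in meters.
    measurements    : (Z, 2) segment centroids extracted from the laser scans.
    gate            : maximum allowed track-measurement distance in meters.
    Returns a list of (track_index, measurement_index) pairs.
    """
    if len(track_positions) == 0 or len(measurements) == 0:
        return []
    cost = np.linalg.norm(track_positions[:, None, :] - measurements[None, :, :], axis=2)
    cost[cost > gate] = 1e6                 # effectively forbid out-of-gate pairings
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < 1e6]
```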
Wednesday, October 12, 2011
Lab Meeting October 13, 2011 (Alan): A Model-Selection Framework for Multibody Structure-and-Motion of Image Sequences (IJCV 2008)
Title: A Model-Selection Framework for Multibody Structure-and-Motion of Image Sequences (IJCV 2008)
Authors: Konrad Schindler, David Suter and Hanzi Wang
Abstract: Given an image sequence of a scene consisting of multiple rigidly moving objects, multi-body structure-and-motion (MSaM) is the task to segment the image feature tracks into the different rigid objects and compute the multiple-view geometry of each object. We present a framework for multibody structure-and-motion based on model selection. In a recover-and-select procedure, a redundant set of hypothetical scene motions is generated. Each subset of this pool of motion candidates is regarded as a possible explanation of the image feature tracks, and the most likely explanation is selected with model selection. The framework is generic and can be used with any parametric camera model, or with a combination of different models. It can deal with sets of correspondences which change over time, and it is robust to realistic amounts of outliers. The framework is demonstrated for different camera and scene models.
Link
Tuesday, October 11, 2011
Lab Meeting October 13th, 2011 (Jeff): Object Mapping, Recognition, and Localization from Tactile Geometry
Title: Object Mapping, Recognition, and Localization from Tactile Geometry
Authors: Zachary Pezzementi, Caitlin Reyda, and Gregory D. Hager
Abstract:
We present a method for performing object recognition using multiple images acquired from a tactile sensor. The method relies on using the tactile sensor as an imaging device, and builds an object representation based on mosaics of tactile measurements. We then describe an algorithm that is able to recognize an object using a small number of tactile sensor readings. Our approach makes extensive use of sequential state estimation techniques from the mobile robotics literature, whereby we view the object recognition problem as one of estimating a consistent location within a set of object maps. We examine and test approaches based on both traditional particle filtering and histogram filtering. We demonstrate both the mapping and recognition/localization techniques on a set of raised letter shapes using real tactile sensor data.
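A minimal discrete Bayes (histogram) filter cycle of the kind referred to above, over a 1-D discretization of map positions; the paper's object maps and tactile measurement models are of course richer, and the drift kernel below is invented.

```python
import numpy as np

def histogram_filter_update(belief, likelihood, motion_kernel=None):
    """One predict/update cycle of a discrete Bayes (histogram) filter.

    belief        : (N,) probability over discretized positions in the object map.
    likelihood    : (N,) probability of the current tactile reading at each position.
    motion_kernel : optional (K,) kernel modelling how the contact point drifts
                    between readings, e.g. [0.1, 0.8, 0.1].
    """
    if motion_kernel is not None:
        belief = np.convolve(belief, motion_kernel, mode='same')   # prediction step
    belief = belief * likelihood                                   # measurement update
    s = belief.sum()
    return belief / s if s > 0 else np.full_like(belief, 1.0 / len(belief))
```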
Link:
IEEE International Conference on Robotics and Automation (ICRA), 2011
Local Link
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5980363
Wednesday, September 28, 2011
Lab Meeting Sep. 29, 2011 (Wang Li): A Coarse-to-fine Approach for Fast Deformable Object Detection (CVPR 2011)
A Coarse-to-fine Approach for Fast Deformable Object Detection
Marco Pedersoli
Andrea Vedaldi
Jordi González
Abstract
We present a method that can dramatically accelerate object detection with part-based models. The method is based on the observation that the cost of detection is likely to be dominated by the cost of matching each part to the image, and not by the cost of computing the optimal configuration of the parts as commonly assumed. Therefore accelerating detection requires minimizing the number of part-to-image comparisons. To this end we propose a multiple-resolution hierarchical part-based model and a corresponding coarse-to-fine inference procedure that recursively eliminates unpromising part placements from the search space. We evaluate our method extensively on the PASCAL VOC and INRIA datasets, demonstrating a very large increase in detection speed with little degradation of accuracy.
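A toy sketch of the coarse-to-fine idea: score a cheap low-resolution root filter everywhere and run the expensive part-based score only at the best coarse placements. The single-channel "feature map", the down-sampling factor, and the fine scoring hook are stand-ins, not the paper's HOG pyramid or inference procedure.

```python
import numpy as np
from scipy.ndimage import zoom
from scipy.signal import correlate2d

def coarse_to_fine_scan(feature_map, root_filter, score_fn_fine, keep=20):
    """Evaluate a coarse root filter everywhere, then refine only the best placements.

    feature_map   : (H, W) single-channel feature map (stand-in for HOG features).
    root_filter   : (h, w) coarse template.
    score_fn_fine : hypothetical hook score_fn_fine(y, x) -> expensive part-based
                    score at one placement in full-resolution coordinates.
    """
    # Coarse pass: correlate a low-resolution template over a low-resolution map.
    coarse_map = zoom(feature_map, 0.5, order=1)
    coarse_tpl = zoom(root_filter, 0.5, order=1)
    coarse = correlate2d(coarse_map, coarse_tpl, mode='valid')

    # Keep only the most promising coarse placements for the expensive fine pass.
    flat = np.argsort(coarse.ravel())[::-1][:keep]
    candidates = [np.unravel_index(i, coarse.shape) for i in flat]
    return [((2 * y, 2 * x), score_fn_fine(2 * y, 2 * x)) for y, x in candidates]
```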
Paper Link
Wednesday, September 21, 2011
Lab Meeting September 22nd, 2011 (Jimmy): Vector Field SLAM
Title: Vector Field SLAM
Authors: Jens-Steffen Gutmann, Gabriel Brisson, Ethan Eade, Philip Fong and Mario Munich
In: ICRA 2010
Abstract
Localization in unknown environments using low-cost sensors remains a challenge. This paper presents a new localization approach that learns the spatial variation of an observed continuous signal. We model the signal as a piecewise linear function and estimate its parameters using a simultaneous localization and mapping (SLAM) approach. We apply our framework to a sensor measuring bearing to active beacons where measurements are systematically distorted due to occlusion and signal reflections of walls and other objects present in the environment. Experimental results from running GraphSLAM and EKF-SLAM on manually collected sensor measurements as well as on data recorded on a vacuum-cleaner robot validate our model.
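A sketch of the piecewise linear signal model: the expected signal at a pose is a bilinear blend of the values stored at the four surrounding map nodes, and those node values are what the SLAM filter estimates. The grid spacing and node container below are assumptions, not the paper's parameterization.

```python
import numpy as np

def bilinear_signal(pose_xy, node_values, cell_size=1.0):
    """Predict a continuous signal at a robot pose from a piecewise linear map.

    pose_xy     : (x, y) robot position.
    node_values : dict mapping integer grid nodes (i, j) to learned signal values.
    The signal is interpolated bilinearly between the four surrounding nodes.
    """
    x, y = pose_xy[0] / cell_size, pose_xy[1] / cell_size
    i, j = int(np.floor(x)), int(np.floor(y))
    a, b = x - i, y - j                         # fractional position inside the cell
    v00 = node_values.get((i, j), 0.0)
    v10 = node_values.get((i + 1, j), 0.0)
    v01 = node_values.get((i, j + 1), 0.0)
    v11 = node_values.get((i + 1, j + 1), 0.0)
    return ((1 - a) * (1 - b) * v00 + a * (1 - b) * v10
            + (1 - a) * b * v01 + a * b * v11)
```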
[pdf]
Sunday, September 18, 2011
Lab Meeting September 22nd, 2011 (Jim): Learning the semantics of object–action relations by observation
Title: Learning the semantics of object–action relations by observation
Author: Eren Erdal Aksoy, Alexey Abramov, Johannes Dörr, Kejun Ning, Babette Dellen, and Florentin Wörgötter
The International Journal of Robotics Research 2011;30 1229-1249
Abstract:
Recognizing manipulations performed by a human and the transfer and execution of this by a robot is a difficult problem. We address this in the current study by introducing a novel representation of the relations between objects at decisive time points during a manipulation. Thereby, we encode the essential changes in a visual scenery in a condensed way such that a robot can recognize and learn a manipulation without prior object knowledge. To achieve this we continuously track image segments in the video and construct a dynamic graph sequence. Topological transitions of those graphs occur whenever a spatial relation between some segments has changed in a discontinuous way and these moments are stored in a transition matrix called the semantic event chain (SEC). We demonstrate that these time points are highly descriptive for distinguishing between different manipulations. Employing simple sub-string search algorithms, SECs can be compared and type-similar manipulations can be recognized with high confidence. As the approach is generic, statistical learning can be used to find the archetypal SEC of a given manipulation class. ...
http://ijr.sagepub.com/content/30/10/1229.full.pdf+html
Friday, September 09, 2011
Lab Meeting September 9th, 2011 (Steven): Learning Generic Invariances in Object Recognition: Translation and Scale
Title: Learning Generic Invariances in Object Recognition: Translation and Scale
Authors: Joel Z. Leibo, Jim Mutch, Lorenzo Rosasco, Shimon Ullman, and Tomaso Poggio
Abstract:
Invariance to various transformations is key to object recognition but existing definitions of invariance are somewhat confusing while discussions of invariance are often confused. In this report, we provide an operational definition of invariance by formally defining perceptual tasks as classification problems. The definition should be appropriate for physiology, psychophysics and computational modeling.
For any specific object, invariance can be trivially “learned” by memorizing a sufficient number of example images of the transformed object. While our formal definition of invariance also covers such cases, this report focuses instead on invariance from very few images and mostly on invariances from one example. Image-plane invariances – such as translation, rotation and scaling – can be computed from a single image for any object. They are called generic since in principle they can be hardwired or learned (during development) for any object.
In this perspective, we characterize the invariance range of a class of feedforward architectures for visual recognition that mimic the hierarchical organization of the ventral stream.
We show that this class of models achieves essentially perfect translation and scaling invariance for novel images. In this architecture a new image is represented in terms of weights of ”templates” (e.g. “centers” or “basis functions”) at each level in the hierarchy. Such a representation inherits the invariance of each template, which is implemented through replication of the corresponding “simple” units across positions or scales and their “association” in a “complex” unit. We show simulations on real images that characterize the type and number of templates needed to support the invariant recognition of novel objects. We find that 1) the templates need not be visually similar to the target objects and that 2) a very small number of them is sufficient for good recognition.
These somewhat surprising empirical results have intriguing implications for the learning of invariant recognition during the development of a biological organism, such as a human baby. In particular, we conjecture that invariance to translation and scale may be learned by the association – through temporal contiguity – of a small number of primal templates, that is patches extracted from the images of an object moving on the retina across positions and scales. The number of templates can later be augmented by bootstrapping mechanisms using the correspondence provided by the primal templates – without the need of temporal contiguity.
Link
Thursday, September 08, 2011
Lab Meeting September 9th, 2011 (Chih Chung): Identification and Representation of Homotopy (RSS 2011 Best paper)
Title: Identification and Representation of Homotopy Classes of Trajectories for Search-based Path Planning in 3D
Authors: Subhrajit Bhattacharya, Maxim Likhachev and Vijay Kumar
Abstract: There are many applications in motion planning where it is important to consider and distinguish between different homotopy classes of trajectories. Two trajectories are homotopic if one trajectory can be continuously deformed into another without passing through an obstacle, and a homotopy class is a collection of homotopic trajectories. In this paper we consider the problem of robot exploration and planning in three-dimensional configuration spaces to (a) identify and classify different homotopy classes; and (b) plan trajectories constrained to certain homotopy classes or avoiding specified homotopy classes. In previous work [1] we have solved this problem for two-dimensional, static environments using the Cauchy Integral Theorem in concert with graph search techniques. The robot workspace is mapped to the complex plane and obstacles are poles in this plane. The Residue Theorem allows the use of integration along the path to distinguish between trajectories in different homotopy classes. However, this idea is fundamentally limited to two dimensions. In this work we develop new techniques to solve the same problem, but in three dimensions, using theorems from electromagnetism. The Biot-Savart law lets us design an appropriate vector field, the line integral of which, using the integral form of Ampere's Law, encodes information about homotopy classes in three dimensions. Skeletons of obstacles in the robot world are extracted and are modeled by current-carrying conductors. We describe the development of a practical graph-search based planning tool with theoretical guarantees by combining integration theory with search techniques, and illustrate it with examples in three-dimensional spaces such as two-dimensional, dynamic environments and three-dimensional static environments.
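A numerical sketch of the signature computation: the obstacle skeleton is treated as a current-carrying conductor, a Biot-Savart-style field is evaluated (constants dropped), and its line integral along a candidate trajectory is accumulated. The discretization and midpoint rule below are my own simplification, not the paper's exact construction.

```python
import numpy as np

def h_signature(trajectory, skeleton, eps=1e-9):
    """Line integral of a Biot-Savart-style field along a trajectory.

    trajectory : (N, 3) way-points of a candidate path.
    skeleton   : (M, 3) poly-line through an obstacle, treated as a current-carrying
                 conductor (only relative values matter for comparing paths).
    Two paths with the same end points and the same signature for every obstacle
    skeleton are taken to lie in the same homotopy class.
    """
    seg_mid = 0.5 * (skeleton[1:] + skeleton[:-1])     # current elements' centers
    seg_dl = skeleton[1:] - skeleton[:-1]              # current elements dl

    def field(p):
        r = p - seg_mid                                # vectors element -> field point
        r_norm = np.linalg.norm(r, axis=1, keepdims=True) + eps
        return np.sum(np.cross(seg_dl, r) / r_norm**3, axis=0)   # sum of dl x r / |r|^3

    total = 0.0
    for a, b in zip(trajectory[:-1], trajectory[1:]):
        total += np.dot(field(0.5 * (a + b)), b - a)   # B . dl along the path segment
    return total
```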
link
Thursday, September 01, 2011
Lab Meeting September 2nd, 2011 (David): Multiclass Multimodal Detection and Tracking in Urban Environments
Title: Multiclass Multimodal Detection and Tracking in Urban Environments
Author: Luciano Spinello, Rudolph Triebel and Roland Siegwart
Abstract:
This paper presents a novel approach to detect and track people and cars based on the combined information retrieved from a camera and a laser range scanner. Laser data points are classified by using boosted Conditional Random Fields (CRF), while the image based detector uses an extension of the Implicit Shape Model (ISM), which learns a codebook of local descriptors from a set of hand-labeled images and uses them to vote for centers of detected objects. Our extensions to ISM include the learning of object parts and template masks to obtain more distinctive votes for the particular object classes. The detections from both sensors are then fused and the objects are tracked using a Kalman Filter with multiple motion models. Experiments conducted in real-world urban scenarios demonstrate the effectiveness of our approach.
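A minimal constant-velocity Kalman filter of the kind used for the tracking stage; the paper runs several motion models in parallel and fuses two detectors, whereas only one single-model filter is sketched here, with arbitrary noise values.

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal 2-D constant-velocity Kalman filter for tracking fused detections.

    State is [x, y, vx, vy]; measurements are (x, y) object positions.
    """
    def __init__(self, q=0.5, r=0.2):
        self.x = np.zeros(4)
        self.P = np.eye(4) * 10.0
        self.q, self.r = q, r
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)

    def predict(self, dt):
        F = np.eye(4)
        F[0, 2] = F[1, 3] = dt                 # position advances by velocity * dt
        Q = np.eye(4) * self.q * dt
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q

    def update(self, z):
        R = np.eye(2) * self.r
        S = self.H @ self.P @ self.H.T + R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z, dtype=float) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
```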
Link:
IJRR copy
local copy
Thursday, August 18, 2011
Lab Meeting August 19th, 2011 (Jeff): A Robust Qualitative Planner for Mobile Robot Navigation Using Human-Provided Maps
Title: A Robust Qualitative Planner for Mobile Robot Navigation Using Human-Provided Maps
Authors: Danelle C. Shah and Mark E. Campbell
Abstract:
A novel method for controlling a mobile robot using qualitative inputs in the context of an approximate map, such as one sketched by a human, is presented. By defining a desired trajectory with respect to observable landmarks, human operators can send semi-autonomous robots into areas for which a truth map is not available. Waypoint planning is formulated as a quadratic optimization problem, resulting in robot trajectories in the true environment that are qualitatively similar to those provided by the human. The algorithm is implemented both in simulation and on a mobile robot platform in several different environments. A sensitivity analysis is performed, illustrating how the method is robust to uncertainties, even large sketch distortions, and allows the robot to adapt and re-plan according to its most current perception of the world.
Link:
IEEE International Conference on Robotics and Automation (ICRA), 2011
Local Link
Monday, June 27, 2011
Lab meeting June 29th (Jim): A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Title: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Stephane Ross, Geoffrey Gordon, and J. Andrew (Drew) Bagnell
Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS), April, 2011.
Abstract:
Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. ... In this paper, we propose a new iterative algorithm, which trains a stationary deterministic policy, that can be seen as a no-regret algorithm in an online learning setting. We show that any such no-regret algorithm, combined with additional reduction assumptions, must find a policy with good performance under the distribution of observations it induces in such sequential settings.
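A toy DAgger-style loop, assuming a simulator hook and an expert oracle exist; the learner here is an off-the-shelf decision tree, and the expert/learner mixing schedule and best-policy selection of the full algorithm are dropped. The function names and the shape of the state representation are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def dagger(env_rollout, expert_policy, start_states, n_iter=10):
    """Toy dataset-aggregation imitation learning loop.

    env_rollout(policy_fn, start) -> list of visited states (hypothetical hook).
    expert_policy(state)          -> expert action for that state.
    States are assumed to be fixed-length numeric feature vectors.
    """
    # Seed the dataset with states visited by the expert itself.
    states = [s for s0 in start_states for s in env_rollout(expert_policy, s0)]
    actions = [expert_policy(s) for s in states]
    policy = DecisionTreeClassifier().fit(np.array(states), actions)

    for _ in range(n_iter):
        # Roll out the *learner*, but label every visited state with the *expert*.
        visited = [s for s0 in start_states
                   for s in env_rollout(lambda s: policy.predict([s])[0], s0)]
        states += visited
        actions += [expert_policy(s) for s in visited]
        policy = DecisionTreeClassifier().fit(np.array(states), actions)
    return policy
```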
Link
Monday, June 20, 2011
Lab Meeting June 22nd (Chih-Chung): Minimum Snap Trajectory Generation and Control for Quadrotors (ICRA 2011, best paper)
Title: Minimum Snap Trajectory Generation and Control for Quadrotors
Authors: Daniel Mellinger and Vijay Kumar
Abstract:
We address the controller design and the trajectory generation for a quadrotor maneuvering in three dimensions in a tightly constrained setting typical of indoor environments. In such settings, it is necessary to allow for significant excursions of the attitude from the hover state and small angle approximations cannot be justified for the roll and pitch. We develop an algorithm that enables the real-time generation of optimal trajectories through a sequence of 3-D positions and yaw angles, while ensuring safe passage through specified corridors and satisfying constraints on velocities, accelerations and inputs. A nonlinear controller ensures the faithful tracking of these trajectories. Experimental results illustrate the application of the method to fast motion (5-10 body lengths/second) in three-dimensional slalom courses.
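As a worked example of the "minimum snap" objective on the simplest possible case: for a single segment with position, velocity, acceleration and jerk fixed at both ends, minimizing integrated squared snap gives a 7th-order polynomial (the Euler-Lagrange equation is x^(8) = 0), so the coefficients come from one linear solve. The paper's multi-segment, corridor-constrained problem requires a full QP instead.

```python
import numpy as np
from math import factorial

def min_snap_segment(b0, bT, T):
    """Snap-optimal 7th-order polynomial for one axis of a single trajectory segment.

    b0, bT : (pos, vel, acc, jerk) boundary conditions at t = 0 and t = T.
    Returns coefficients c with x(t) = sum_i c[i] * t**i.
    """
    def deriv_row(t, k):
        # Row of the constraint matrix for the k-th derivative evaluated at time t.
        row = np.zeros(8)
        for i in range(k, 8):
            row[i] = factorial(i) / factorial(i - k) * t ** (i - k)
        return row

    A = np.array([deriv_row(0.0, k) for k in range(4)] +
                 [deriv_row(T, k) for k in range(4)])
    return np.linalg.solve(A, np.concatenate([b0, bT]))

# Example: rest-to-rest motion from x = 0 to x = 2 m in 3 s.
coeffs = min_snap_segment([0, 0, 0, 0], [2, 0, 0, 0], 3.0)
```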
[link]
Wednesday, June 15, 2011
Lab Meeting June 15th (Shao-Chen): Distributed Robust Data Fusion Based on Dynamic Voting (ICRA2011)
Title: Distributed Robust Data Fusion Based on Dynamic Voting
Authors: Eduardo Montijano, Sonia Martínez and Carlos Sagüés
Abstract:
Data association mistakes, estimation and measurement errors are some of the factors that can contribute to incorrect observations in robotic sensor networks. In order to act reliably, a robotic network must be able to fuse and correct its perception of the world by discarding any outlier information. This is a difficult task if the network is to be deployed remotely and the robots do not have access to groundtruth sites or manual calibration. In this paper, we present a novel, distributed scheme for robust data fusion in autonomous robotic networks. The proposed method adapts the RANSAC algorithm to exploit measurement redundancy, and enables robots to determine an inlier observation with local communications. Different hypotheses are generated and voted for using a dynamic consensus algorithm. As the hypotheses are computed, the robots can change their opinion, making the voting process dynamic. Assuming that at least one hypothesis is initialized with only inliers, we show that the method converges to the maximum likelihood of all the inlier observations in a general instance. Several simulations exhibit the good performance of the algorithm, which also gives acceptable results in situations where the conditions to guarantee convergence do not hold.
[link]
Tuesday, June 14, 2011
Lab Meeting June 15th (David): Sparse Scene Flow Segmentation for Moving Object Detection (Intelligent Vehicles Symposium 2011)
Title: Sparse Scene Flow Segmentation for Moving Object Detection (Intelligent Vehicles Symposium 2011)
Authors: P. Lenz, J. Ziegler, A. Geiger, M. Roser
Abstract:
Modern driver assistance systems such as collision avoidance or intersection assistance need reliable information on the current environment. Extracting such information from camera-based systems is a complex and challenging task for inner-city traffic scenarios. This paper presents an approach for object detection utilizing sparse scene flow. For consecutive stereo images taken from a moving vehicle, corresponding interest points are extracted. Thus, for every interest point, disparity and optical flow values are known and consequently, scene flow can be calculated. Adjacent interest points describing a similar scene flow are considered to belong to one rigid object. The proposed method does not rely on object classes and allows for a robust detection of dynamic objects in traffic scenes. Leading vehicles are continuously detected for several frames. Oncoming objects are detected within five frames after their appearance.
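A rough sketch of the grouping step: tracked 3-D points whose scene-flow vectors agree and that lie close together are linked, and connected components become object hypotheses. The radius and flow tolerance are invented, and ego-motion compensation is assumed to have been done already.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def group_by_scene_flow(points_t0, points_t1, radius=1.0, flow_tol=0.2):
    """Group tracked 3-D interest points whose scene flow is similar.

    points_t0, points_t1 : (N, 3) triangulated positions of the same interest
                           points at consecutive stereo frames.
    Neighboring points (within `radius`) whose flow vectors differ by less than
    `flow_tol` are linked; connected components are the object hypotheses.
    Returns (number_of_groups, label_per_point).
    """
    flow = points_t1 - points_t0
    pairs = cKDTree(points_t0).query_pairs(radius, output_type='ndarray')
    keep = np.linalg.norm(flow[pairs[:, 0]] - flow[pairs[:, 1]], axis=1) < flow_tol
    i, j = pairs[keep, 0], pairs[keep, 1]
    n = len(points_t0)
    adjacency = coo_matrix((np.ones(len(i)), (i, j)), shape=(n, n))
    return connected_components(adjacency, directed=False)
```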
Link: http://www.rainsoft.de/publications/iv11b.pdf
Tuesday, June 07, 2011
Lab Meeting June 8th, 2011 (Jeff): Incremental Construction of the Saturated-GVG for Multi-Hypothesis Topological SLAM
Title: Incremental Construction of the Saturated-GVG for Multi-Hypothesis Topological SLAM
Authors: Tong Tao, Stephen Tully, George Kantor, and Howie Choset
Abstract:
The generalized Voronoi graph (GVG) is a topological representation of an environment that can be incrementally constructed with a mobile robot using sensor-based control. However, because of sensor range limitations, the GVG control law will fail when the robot moves into a large open area. This paper discusses an extended GVG approach to topological navigation and mapping: the saturated generalized Voronoi graph (S-GVG), for which the robot employs an additional wall-following behavior to navigate along obstacles at the range limit of the sensor. In this paper, we build upon previous work related to the S-GVG and provide two important contributions: 1) a rigorous discussion of the control laws and algorithm modifications that are necessary for incremental construction of the S-GVG with a mobile robot, and 2) a method for incorporating the S-GVG into a novel multi-hypothesis SLAM algorithm for loop-closing and localization. Experiments with a wheeled mobile robot in an office-like environment validate the effectiveness of the proposed approach.
Link:
IEEE International Conference on Robotics and Automation (ICRA), 2011
http://www.cs.cmu.edu/~biorobotics/papers/icra11_tao.pdf
Local Link
Wednesday, June 01, 2011
Lab Meeting June 1, 2011 (Alan): Semantic Structure from Motion (CVPR 2011)
Title: Semantic Structure from Motion (CVPR 2011)
Authors: Sid Yingze Bao and Silvio Savarese
Abstract
Conventional rigid structure from motion (SFM) addresses the problem of recovering the camera parameters (motion) and the 3D locations (structure) of scene points, given observed 2D image feature points. In this paper, we propose a new formulation called Semantic Structure From Motion (SSFM). In addition to the geometrical constraints provided by SFM, SSFM takes advantage of both semantic and geometrical properties associated with objects in the scene (Fig. 1). These properties allow us to recover not only the structure and motion but also the 3D locations, poses, and categories of objects in the scene. We cast this problem as a max-likelihood problem where geometry (cameras, points, objects) and semantic information (object classes) are simultaneously estimated. The key intuition is that, in addition to image features, the measurements of objects across views provide additional geometrical constraints that relate cameras and scene parameters. These constraints make the geometry estimation process more robust and, in turn, make object detection more accurate. Our framework has the unique ability to: i) estimate camera poses only from object detections, ii) enhance camera pose estimation, compared to feature-point-based SFM algorithms, iii) improve object detections given multiple uncalibrated images, compared to independently detecting objects in single images. Extensive quantitative results on three datasets – LiDAR cars, street-view pedestrians, and Kinect office desktop – verify our theoretical claims.
Tuesday, May 31, 2011
Lab Meeting June 1, 2011 (Wang Li): Articulated pose estimation with flexible mixtures-of-parts (CVPR 2011)
Articulated pose estimation with flexible mixtures-of-parts
Yi Yang
Deva Ramanan
Abstract
We describe a method for human pose estimation in static images based on a novel representation of part models. Notably, we do not use articulated limb parts, but rather capture orientation with a mixture of templates for each part. We describe a general, flexible mixture model for capturing contextual co-occurrence relations between parts, augmenting standard spring models that encode spatial relations. We show that such relations can capture notions of local rigidity. When co-occurrence and spatial relations are tree-structured, our model can be efficiently optimized with dynamic programming. We present experimental results on standard benchmarks for pose estimation that indicate our approach is the state-of-the-art system for pose estimation, outperforming past work by 50% while being orders of magnitude faster.
Paper Link
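For readers unfamiliar with tree-structured part models, the sketch below (my own simplification, not the authors' release) shows the max-sum dynamic programming pass that makes exact inference tractable when co-occurrence and spatial relations form a tree; `unary`, `edges` and `pairwise` are hypothetical stand-ins for the per-part appearance scores and spring/co-occurrence costs.

```python
# Max-sum dynamic programming on a tree of parts: a leaf-to-root pass
# accumulates messages, then a root-to-leaf pass backtracks the best locations.
import numpy as np

def tree_dp(unary, edges, pairwise):
    """unary: {part: (L,) scores}; edges: list of (child, parent) ordered so that
    every child edge appears before its parent's own edge;
    pairwise: {(child, parent): (L_child, L_parent) compatibility scores}."""
    msgs = {p: u.copy() for p, u in unary.items()}
    argmax = {}
    for child, parent in edges:                       # leaves towards root
        scores = msgs[child][:, None] + pairwise[(child, parent)]
        argmax[(child, parent)] = scores.argmax(axis=0)   # best child loc per parent loc
        msgs[parent] = msgs[parent] + scores.max(axis=0)
    root = edges[-1][1]
    best = {root: int(msgs[root].argmax())}
    for child, parent in reversed(edges):             # backtrack root to leaves
        best[child] = int(argmax[(child, parent)][best[parent]])
    return best                                       # {part: chosen location index}
```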
Monday, May 16, 2011
ICRA 2011 Awards
Best Manipulation Paper
- WINNER! Characterization of Oscillating Nano Knife for Single Cell Cutting by Nanorobotic Manipulation System Inside ESEM: Yajing Shen, Masahiro Nakajima, Seiji Kojima, Michio Homma, Yasuhito Ode, Toshio Fukuda [pdf]
- Wireless Manipulation of Single Cells Using Magnetic Microtransporters: Mahmut Selman Sakar, Edward Steager, Anthony Cowley, Vijay Kumar, George J Pappas
- Hierarchical Planning in the Now: Leslie Kaelbling, Tomas Lozano-Perez
- Selective Injection and Laser Manipulation of Nanotool Inside a Specific Cell Using Optical Ph Regulation and Optical Tweezers: Hisataka Maruyama, Naoya Inoue, Taisuke Masuda, Fumihito Arai
- Configuration-Based Optimization for Six Degree-Of-Freedom Haptic Rendering for Fine Manipulation: Dangxiao Wang, Xin Zhang, Yuru Zhang, Jing Xiao
Best Vision Paper
- Model-Based Localization of Intraocular Microrobots for Wireless Electromagnetic Control: Christos Bergeles, Bradley Kratochvil, Bradley J. Nelson
- Fusing Optical Flow and Stereo in a Spherical Depth Panorama Using a Single-Camera Folded Catadioptric Rig: Igor Labutov, Carlos Jaramillo, Jizhong Xiao
- 3-D Scene Analysis Via Sequenced Predictions Over Points and Regions: Xuehan Xiong, Daniel Munoz, James Bagnell, Martial Hebert
- Fast and Accurate Computation of Surface Normals from Range Images: Hernan Badino, Daniel Huber, Yongwoon Park, Takeo Kanade
- WINNER! Sparse Distance Learning for Object Recognition Combining RGB and Depth Information: Kevin Lai, Liefeng Bo, Xiaofeng Ren, Dieter Fox [pdf]
Best Automation Paper
- WINNER! Automated Cell Manipulation: Robotic ICSI: Zhe Lu, Xuping Zhang, Clement Leung, Navid Esfandiari, Robert Casper, Yu Sun [pdf]
- Efficient AUV Navigation Fusing Acoustic Ranging and Side-Scan Sonar: Maurice Fallon, Michael Kaess, Hordur Johannsson, John Leonard
- Vision-Based 3D Bicycle Tracking Using Deformable Part Model and Interacting Multiple Model Filter: Hyunggi Cho, Paul E. Rybski, Wende Zhang
- High-Accuracy GPS and GLONASS Positioning by Multipath Mitigation Using Omnidirectional Infrared Camera: Taro Suzuki, Mitsunori Kitamura, Yoshiharu Amano, Takumi Hashizume
- Deployment of a Point and Line Feature Localization System for an Outdoor Agriculture Vehicle: Jacqueline Libby, George Kantor
Best Medical Robotics Paper
- Design of Adjustable Constant-Force Forceps for Robot-Assisted Surgical Manipulation: Chao-Chieh Lan, Jung-Yuan Wang
- Design Optimization of Concentric Tube Robots Based on Task and Anatomical Constraints: Chris Bedell, Jesse Lock, Andrew Gosline, Pierre Dupont
- GyroLock - First in Vivo Experiments of Active Heart Stabilization Using Control Moment Gyro (CMG): Julien Gagne, Olivier Piccin, Edouard Laroche, Michele Diana, Jacques Gangloff
- Metal MEMS Tools for Beating-Heart Tissue Approximation: Evan Butler, Chris Folk, Adam Cohen, Nikolay Vasilyev, Rich Chen, Pedro del Nido, Pierre Dupont
- WINNER! An Articulated Universal Joint Based Flexible Access Robot for Minimally Invasive Surgery: Jianzhong Shang, David Noonan, Christopher Payne, James Clark, Mikael Hans Sodergren, Ara Darzi, Guang-Zhong Yang [pdf]
Best Conference Paper
- WINNER! Minimum Snap Trajectory Generation and Control for Quadrotors: Daniel Mellinger, Vijay Kumar [pdf]
- Autonomous Multi-Floor Indoor Navigation with a Computationally Constrained Micro Aerial Vehicle: Shaojie Shen, Nathan Michael, Vijay Kumar
- Dexhand: A Space Qualified Multi-Fingered Robotic Hand: Maxime Chalon, Armin Wedler, Andreas Baumann, Wieland Bertleff, Alexander Beyer, Jörg Butterfass, Markus Grebenstein, Robin Gruber, Franz Hacker, Erich Krämer, Klaus Landzettel, Maximilian Maier, Hans-Juergen Sedlmayr, Nikolaus Seitz, Fabian Wappler, Bertram Willberg, Thomas Wimboeck, Frederic Didot, Gerd Hirzinger
- Time Scales and Stability in Networked Multi-Robot Systems: Mac Schwager, Nathan Michael, Vijay Kumar, Daniela Rus
- Bootstrapping Bilinear Models of Robotic Sensorimotor Cascades: Andrea Censi, Richard Murray
KUKA Service Robotics Best Paper
- Distributed Coordination and Data Fusion for Underwater Search: Geoffrey Hollinger, Srinivas Yerramalli, Sanjiv Singh, Urbashi Mitra, Gaurav Sukhatme
- WINNER! Dynamic Shared Control for Human-Wheelchair Cooperation: Qinan Li, Weidong Chen, Jingchuan Wang [pdf]
- Towards Joint Attention for a Domestic Service Robot -- Person Awareness and Gesture Recognition Using Time-Of-Flight Cameras: David Droeschel, Jorg Stuckler, Dirk Holz, Sven Behnke
- Electromyographic Evaluation of Therapeutic Massage Effect Using Multi-Finger Robot Hand: Ren C. Luo, Chih-Chia Chang
Best Video
- Catching Flying Balls and Preparing Coffee: Humanoid Rollin'Justin Performs Dynamic and Sensitive Tasks: Berthold Baeuml, Florian Schmidt, Thomas Wimboeck, Oliver Birbach, Alexander Dietrich, Matthias Fuchs, Werner Friedl, Udo Frese, Christoph Borst, Markus Grebenstein, Oliver Eiberger, Gerd Hirzinger
- Recent Advances in Quadrotor Capabilities: Daniel Mellinger, Nathan Michael, Michael Shomin, Vijay Kumar
- WINNER! High Performance of Magnetically Driven Microtools with Ultrasonic Vibration for Biomedical Innovations: Masaya Hagiwara, Tomohiro Kawahara, Lin Feng, Yoko Yamanishi, Fumihito Arai [pdf]
Best Cognitive Robotics Paper
- WINNER! Donut As I Do: Learning from Failed Demonstrations: Daniel Grollman, Aude Billard [pdf]
- A Discrete Computational Model of Sensorimotor Contingencies for Object Perception and Control of Behavior: Alexander Maye, Andreas Karl Engel
- Skill Learning and Task Outcome Prediction for Manipulation: Peter Pastor, Mrinal Kalakrishnan, Sachin Chitta, Evangelos Theodorou, Stefan Schaal
- Integrating Visual Exploration and Visual Search in Robotic Visual Attention: The Role of Human-Robot Interaction: Momotaz Begum, Fakhri Karray
Tuesday, May 03, 2011
Lab Meeting May 3rd (Andi): Face/Off: Live Facial Puppetry
Thibaut Weise, Hao Li, Luc Van Gool, Mark Pauly
We present a complete integrated system for live facial puppetry that enables high-resolution real-time facial expression tracking with transfer to another person's face. The system utilizes a real-time structured light scanner that provides dense 3D data and texture. A generic template mesh, fitted to a rigid reconstruction of the actor's face, is tracked offline in a training stage through a set of expression sequences. These sequences are used to build a person-specific linear face model that is subsequently used for online face tracking and expression transfer. Even with just a single rigid pose of the target face, convincing real-time facial animations are achievable. The actor becomes a puppeteer with complete and accurate control over a digital face.
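The core of the person-specific linear face model can be illustrated in a few lines; the sketch below is an assumption-laden simplification (plain least-squares blendshape fitting), not the paper's regularized real-time tracker.

```python
# A frame is approximated as the neutral mesh plus a weighted sum of expression
# basis shapes; expression transfer means applying the same weights to another
# person's basis.
import numpy as np

def fit_expression_weights(observed, neutral, basis):
    """observed, neutral: (3N,) stacked vertex coordinates of one frame;
    basis: (3N, K) expression blendshape directions learned offline."""
    w, *_ = np.linalg.lstsq(basis, observed - neutral, rcond=None)
    return w

def synthesize(neutral, basis, w):
    # use the target person's neutral mesh and basis to puppet their face
    return neutral + basis @ w
```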
Monday, May 02, 2011
Lab Meeting May 3 (KuenHan): Multiple Targets Tracking in World Coordinate with a Single, Minimally Calibrated Camera (ECCV 2010)
Author: Wongun Choi, Silvio Savarese.
Abstract:
Tracking multiple objects is important in many application domains. We propose a novel algorithm for multi-object tracking that is capable of working under very challenging conditions such as minimal hardware equipment, an uncalibrated monocular camera, occlusions and severe background clutter. To address this problem we propose a new method that jointly estimates object tracks, their corresponding 2D/3D temporal trajectories in the camera reference system, and the model parameters (pose, focal length, etc.) within a coherent probabilistic formulation. Since our goal is to estimate stable and robust tracks that can be univocally associated to the object IDs, we propose to include in our formulation an interaction (attraction and repulsion) model that is able to model multiple 2D/3D trajectories in space-time and handle situations where objects occlude each other. We use an MCMC particle filtering algorithm for parameter inference and propose a solution that enables accurate and efficient tracking and camera model estimation. Qualitative and quantitative experimental results obtained using our own dataset and the publicly available ETH dataset show very promising tracking and camera estimation results.
Link
Website
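The interaction (repulsion) part of the formulation can be illustrated with a toy potential; the sketch below is my own illustration of the idea, not the authors' model, and the `sigma` scale is an arbitrary assumption.

```python
# A pairwise repulsion log-potential: the joint likelihood of a particle (one
# hypothesis of all target states) is penalized when two targets sit almost on
# top of each other on the ground plane.
import numpy as np

def repulsion_log_potential(positions, sigma=0.5):
    """positions: (T, 2) ground-plane coordinates of the T current targets."""
    positions = np.asarray(positions, float)
    logp = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d2 = np.sum((positions[i] - positions[j]) ** 2)
            logp += -np.exp(-d2 / (2.0 * sigma ** 2))   # soft penalty for overlap
    return logp
```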
Wednesday, April 20, 2011
NTU PAL Thesis Defense: Mobile Robot Localization in Large-scale Dynamic Environments
Mobile Robot Localization in Large-scale Dynamic Environments
Shao-Wen Yang
Doctoral Dissertation Defense
Department of Computer Science and Information Engineering
National Taiwan University
Time: Thursday, 19 May, 2011 at 8:00AM +0800 (CST)
Location: R542, Der-Tian Hall
Advisor: Chieh-Chih Wang
Thesis Committee:
Li-Chen Fu
Jane Yung-Jen Hsu
Han-Pang Huang
Ta-Te Lin
Chu-Song Chen, Sinica
Jwu-Sheng Hu, NCTU
John J. Leonard, MIT
Abstract:
Localization is the most fundamental problem to providing a mobile robot with autonomous capabilities. Whilst simultaneous localization and mapping (SLAM) and moving object tracking (MOT) have attracted immense attention in the last decade, the focus of robotics continues to shift from stationary robots in factory automation environments to mobile robots operating in human-inhabited environments. State-of-the-art approaches that rely on the static-world assumption can fail in real environments, which are typically dynamic. Specifically, the real environment is challenging for mobile robots due to the variety of perceptual inconsistency over space and time. Development of situational awareness is particularly important so that mobile robots can adapt quickly to changes in the environment.
In this thesis, we explore the problem of mobile robot localization in the real world in theory and practice, and show that localization can benefit from both stationary and dynamic entities.
The performance of ego-motion estimation depends on the consistency between sensory information at successive time steps, whereas the performance of localization relies on the consistency between the sensory information and the a priori map. The inconsistencies make a robot unable to robustly determine its location in the environment. We show that mobile robot localization, as well as ego-motion estimation, and moving object detection are mutually beneficial. Most importantly, addressing the inconsistencies serves as the basis for mobile robot localization, and forms a solid bridge between SLAM and MOT.
Localization, as well as moving object detection, is not only challenging but also difficult to evaluate quantitatively due to the lack of a realistic ground truth. As the key competencies for mobile robotic systems are localization and semantic context interpretation, an annotated data set, as well as an interactive annotation tool, is released to facilitate the development, evaluation and comparison of algorithms for localization, mapping, moving object detection, moving object tracking, etc.
In summary, a unified stochastic framework is introduced to solve the problems of motion estimation and motion segmentation simultaneously in highly dynamic environments in real time. A dual-model localization framework that uses information from both the static scene and dynamic entities is proposed to improve the localization performance by explicitly incorporating, rather than filtering out, moving object information. In extensive experiments, sub-meter accuracy is achieved without the aid of GPS, which is adequate for autonomous navigation in crowded urban scenes. The empirical results suggest that the performance of localization can be improved when handling the changing environment explicitly.
Download:
- Thesis draft: http://any.csie.ntu.edu.tw/thesis/yang_thesis-v1_0.pdf
Sunday, April 17, 2011
Lab Meeting April 20, 2011 (fish60): Donut as I do: Learning from failed demonstrations
Title: Donut as I do: Learning from failed demonstrations
In: 2011 IEEE International Conference on Robotics and Automation
Authors: Grollman, Daniel (Ecole Polytechnique Federale de Lausanne), Billard, Aude (EPFL)
Abstract:
The canonical Robot Learning from Demonstration scenario has a robot observing human demonstrations of a task or behavior in a few situations, and then developing a generalized controller. ... However, the underlying assumption is that the demonstrations are successful, and are appropriate to reproduce. We, instead, consider the possibility that the human has failed in their attempt, and their demonstration is an example of what not to do. Thus, instead of maximizing the similarity of generated behaviors to those of the demonstrators, we examine two methods that deliberately avoid repeating the human's mistakes.
Link
Tuesday, April 12, 2011
Lab Meeting April 13, 2011 (Will): Hilbert Space Embeddings of Hidden Markov Models (ICML2010)
Title: Hilbert Space Embeddings of Hidden Markov Models
In: ICML 2010
Authors: Le Song, Byron Boots, Sajid Siddiqi, Geoffrey Gordon, Alex Smola
Abstract
Hidden Markov Models (HMMs) are important tools for modeling sequence data. However, they are restricted to discrete latent states, and are largely restricted to Gaussian and discrete observations. And, learning algorithms for HMMs have predominantly relied on local search heuristics, with the exception of spectral methods such as those described below. We propose a nonparametric HMM that extends traditional HMMs to structured and non-Gaussian continuous distributions. Furthermore, we derive a local-minimum-free kernel spectral algorithm for learning these HMMs. We apply our method to robot vision data, slot car inertial sensor data and audio event classification data, and show that in these applications, embedded HMMs exceed the previous state-of-the-art performance.
[pdf]
Lab Meeting April 13, 2011 (Jimmy): WiFi-SLAM Using Gaussian Process Latent Variable Models (IJCAI2007)
Title: WiFi-SLAM Using Gaussian Process Latent Variable Models
In: IJCAI 2007
Authors: Brian Ferris, Dieter Fox, and Neil Lawrence
Abstract
WiFi localization, the task of determining the physical location of a mobile device from wireless signal strengths, has been shown to be an accurate method of indoor and outdoor localization and a powerful building block for location-aware applications. However, most localization techniques require a training set of signal strength readings labeled against a ground truth location map, which is prohibitive to collect and maintain as maps grow large. In this paper we propose a novel technique for solving the WiFi SLAM problem using the Gaussian Process Latent Variable Model (GPLVM) to determine the latent-space locations of unlabeled signal strength data. We show how GPLVM, in combination with an appropriate motion dynamics model, can be used to reconstruct a topological connectivity graph from a signal strength sequence which, in combination with the learned Gaussian Process signal strength model, can be used to perform efficient localization.
[pdf]
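A full GPLVM is beyond a short sketch, so the code below only shows the shape of the pipeline under stated assumptions: PCA is used here as a stand-in for the latent-space optimization (the paper optimizes the latent locations with a GPLVM plus a motion-dynamics prior), and consecutive or nearby latent points are then linked into the topological connectivity graph.

```python
# Stand-in pipeline: embed WiFi signal-strength vectors into a 2D latent space,
# then build a connectivity graph from temporal adjacency plus latent proximity
# (candidate loop closures).
import numpy as np

def latent_init_pca(signal_strengths):
    """signal_strengths: (T, D) WiFi readings; returns (T, 2) latent locations."""
    X = np.asarray(signal_strengths, float)
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T

def connectivity_graph(latent, link_thresh=1.0):
    T = len(latent)
    edges = [(t, t + 1) for t in range(T - 1)]        # temporal (motion) edges
    for i in range(T):
        for j in range(i + 2, T):
            if np.linalg.norm(latent[i] - latent[j]) < link_thresh:
                edges.append((i, j))                  # latent-space proximity edge
    return edges
```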
Tuesday, March 29, 2011
Lab Meeting March 30, 2011 (Chih-Chung): Progress Report
I will show my recent work on moving target tracking and following, using a laser scanner and a PIONEER3 robot.
Lab Meeting March 30, 2011 (Chung-Han): Progress Report
I will show the updated ground-truth annotation system with the newly collected data set.
Tuesday, March 22, 2011
Lab Meeting March 23, 2011 (David): Object detection and tracking for autonomous navigation in dynamic environments (IJRR 2010)
Title: Object detection and tracking for autonomous navigation in dynamic environments (IJRR 2010)
Authors: Andreas Ess, Konrad Schindler, Bastian Leibe, Luc Van Gool
Abstract:
We address the problem of vision-based navigation in busy inner-city locations, using a stereo rig mounted on a mobile platform. In this scenario semantic information becomes important: rather than modeling moving objects as arbitrary obstacles, they should be categorized and tracked in order to predict their future behavior. To this end, we combine classical geometric world mapping with object category detection and tracking. Object-category-specific detectors serve to find instances of the most important object classes (in our case pedestrians and cars). Based on these detections, multi-object tracking recovers the objects' trajectories, thereby making it possible to predict their future locations, and to employ dynamic path planning. The approach is evaluated on challenging, realistic video sequences recorded at busy inner-city locations.
Link
Lab Meeting March 23, 2011 (Shao-Chen): A Comparison of Track-to-Track Fusion Algorithms for Automotive Sensor Fusion (MFI2008)
Title: A Comparison of Track-to-Track Fusion Algorithms for Automotive Sensor Fusion (MFI2008, Multisensor Fusion and Integration for Intelligent Systems)
Authors: Stephan Matzka and Richard Altendorfer
Abstract:
In exteroceptive automotive sensor fusion, sensor data are usually only available as processed, tracked object data and not as raw sensor data. Applying a Kalman filter to such data leads to additional delays and generally underestimates the fused objects' covariance due to temporal correlations of individual sensor data as well as inter-sensor correlations. We compare the performance of a standard asynchronous Kalman filter applied to tracked sensor data to several algorithms for the track-to-track fusion of sensor objects of unknown correlation, namely covariance union, covariance intersection, and use of cross-covariance. For the simulation setup used in this paper, covariance intersection and use of cross-covariance turn out to yield significantly lower errors than a Kalman filter at a comparable computational load.
Link
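For reference, covariance intersection itself is compact enough to sketch; the snippet below is a generic textbook version (with a grid search over the mixing weight), not the authors' implementation.

```python
# Covariance intersection: fuse two estimates with unknown cross-correlation by
# a convex combination of their information (inverse covariance) matrices.
import numpy as np

def covariance_intersection(x1, P1, x2, P2, n_grid=101):
    """Returns the fused mean and covariance; omega is chosen here by a simple
    grid search minimizing the trace of the fused covariance."""
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        P = np.linalg.inv(w * I1 + (1.0 - w) * I2)
        if best is None or np.trace(P) < best[0]:
            x = P @ (w * I1 @ x1 + (1.0 - w) * I2 @ x2)
            best = (np.trace(P), x, P)
    return best[1], best[2]
```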
Monday, March 14, 2011
Lab Meeting March 16th, 2011 (Andi): 3D Deformable Face Tracking with a Commodity Depth Camera
Qin Cai , David Gallup , Cha Zhang and Zhengyou Zhang
Abstract: Recently, there has been an increasing number of depth cameras available at commodity prices. These cameras can usually capture both color and depth images in real-time, with limited resolution and accuracy. In this paper, we study the problem of 3D deformable face tracking with such commodity depth cameras. A regularized maximum likelihood deformable model fitting (DMF) algorithm is developed, with special emphasis on handling the noisy input depth data. In particular, we present a maximum likelihood solution that can accommodate sensor noise represented by an arbitrary covariance matrix, which allows more elaborate modeling of the sensor's accuracy. Furthermore, an L1 regularization scheme is proposed based on the semantics of the deformable face model, which is shown to be very effective in improving the tracking results. To track facial movement in subsequent frames, feature points in the texture images are matched across frames and integrated into the DMF framework seamlessly. The effectiveness of the proposed method is demonstrated with multiple sequences with ground truth information.
Wednesday, March 09, 2011
Lab Meeting March 9th, 2011 (KuoHuei): Progress Report
I will present my progress on Neighboring Objects Interaction models and tracking system.
Tuesday, March 08, 2011
Lab Meeting March 9, 2011 (Wang Li): Real-time Identification and Localization of Body Parts from Depth Images (ICRA 2010)
Real-time Identification and Localization of Body Parts from Depth Images
Christian Plagemann
Varun Ganapathi
Daphne Koller
Sebastian Thrun
Abstract
We deal with the problem of detecting and identifying body parts in depth images at video frame rates. Our solution involves a novel interest point detector for mesh and range data that is particularly well suited for analyzing human shape. The interest points, which are based on identifying geodesic extrema on the surface mesh, coincide with salient points of the body, which can be classified using local shape descriptors. Our approach also provides a natural way of estimating a 3D orientation vector for a given interest point. This can be used to normalize the local shape descriptors to simplify the classification problem as well as to directly estimate the orientation of body parts in space.
Experiments show that our interest points in conjunction with a boosted patch classifier are significantly better in detecting body parts in depth images than state-of-the-art sliding-window based detectors.
Paper Link
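The geodesic-extrema idea behind the interest point detector can be sketched as follows; this is my own simplified illustration (a single Dijkstra pass from the body centroid rather than the paper's iterative extraction), with an assumed neighborhood radius.

```python
# Connect nearby 3D surface points into a graph, compute geodesic distances
# from the point closest to the centroid, and return the farthest points as
# candidate extremities (head, hands, feet).
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import dijkstra

def geodesic_extrema(points, radius=0.1, k_extrema=5):
    """points: (N, 3) body surface points from a depth image (metres)."""
    points = np.asarray(points, float)
    n = len(points)
    W = lil_matrix((n, n))
    for i in range(n):
        d = np.linalg.norm(points - points[i], axis=1)
        for j in np.where((d < radius) & (d > 0))[0]:
            W[i, j] = d[j]                              # edge weight = Euclidean step
    src = int(np.argmin(np.linalg.norm(points - points.mean(axis=0), axis=1)))
    dist = dijkstra(W.tocsr(), directed=False, indices=src)
    dist[~np.isfinite(dist)] = -1.0                     # ignore disconnected points
    return np.argsort(dist)[-k_extrema:]                # indices of the extrema
```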
Thursday, March 03, 2011
Article: Perception beyond the Here and Now
by Albrecht Schmidt, Marc Langheinrich, and Kristian Kersting
Computer, February 2011, pp. 86–88
A multitude of senses provide us with information about the here and now. What we see, hear, and feel in turn shape how we perceive our surroundings and understand the world. Our senses are extremely limited, however, and ever since humans began creating and using technology, they have tried to enhance their natural perception in various ways. (pdf)
Monday, February 28, 2011
Lab Meeting March 2nd, 2011 (Jeff): Observability-based Rules for Designing Consistent EKF SLAM Estimators
Title: Observability-based Rules for Designing Consistent EKF SLAM Estimators
Authors: Guoquan P. Huang, Anastasios Mourikis, and Stergios I. Roumeliotis
Abstract:
In this work, we study the inconsistency problem of extended Kalman filter (EKF)-based simultaneous localization and mapping (SLAM) from the perspective of observability. We analytically prove that when the Jacobians of the process and measurement models are evaluated at the latest state estimates during every time step, the linearized error-state system employed in the EKF has an observable subspace of dimension higher than that of the actual, non-linear, SLAM system. As a result, the covariance estimates of the EKF undergo reduction in directions of the state space where no information is available, which is a primary cause of the inconsistency. Based on these theoretical results, we propose a general framework for improving the consistency of EKF-based SLAM. In this framework, the EKF linearization points are selected in a way that ensures that the resulting linearized system model has an observable subspace of appropriate dimension. We describe two algorithms that are instances of this paradigm. In the first, termed observability constrained (OC)-EKF, the linearization points are selected so as to minimize their expected errors (i.e. the difference between the linearization point and the true state) under the observability constraints. In the second, the filter Jacobians are calculated using the first-ever available estimates for all state variables. This latter approach is termed first-estimates Jacobian (FEJ)-EKF. The proposed algorithms have been tested both in simulation and experimentally, and are shown to significantly outperform the standard EKF both in terms of accuracy and consistency.
Link:
The International Journal of Robotics Research (IJRR), Vol. 29, No. 5, April 2010
http://ijr.sagepub.com/content/29/5/502.full.pdf+html
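The first-estimates Jacobian idea is simple to illustrate; the sketch below is my own toy example for a 2D range-bearing landmark measurement, not the authors' code.

```python
# FEJ-EKF idea: the measurement Jacobian for a landmark is always evaluated at
# the estimate available when the landmark was first initialized, rather than
# at the latest (updated) estimate, so the linearized system keeps the correct
# unobservable directions.
import numpy as np

class FejLandmark:
    def __init__(self, first_estimate):
        self.first = np.asarray(first_estimate, float)  # frozen linearization point
        self.est = self.first.copy()                    # keeps being updated by the filter

def range_bearing_jacobian(robot_xy, landmark_xy):
    dx, dy = landmark_xy - robot_xy
    r2 = dx * dx + dy * dy
    r = np.sqrt(r2)
    # rows: range, bearing; columns: landmark x, y
    return np.array([[dx / r, dy / r],
                     [-dy / r2, dx / r2]])

def measurement_jacobian_fej(robot_xy, lm: FejLandmark):
    # a standard EKF would pass lm.est here; FEJ-EKF passes lm.first
    return range_bearing_jacobian(np.asarray(robot_xy, float), lm.first)
```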
Wednesday, February 09, 2011
Lab Meeting February 14, 2011 (fish60): Feature Construction for Inverse Reinforcement Learning
Title: Feature Construction for Inverse Reinforcement Learning
Sergey Levine, Zoran Popović, Vladlen Koltun
NIPS 2010
Abstract:
The goal of inverse reinforcement learning is to find a reward function for a Markov decision process, given example traces from its optimal policy. Current IRL techniques generally rely on user-supplied features that form a concise basis for the reward. We present an algorithm that instead constructs reward features from a large collection of component features, by building logical conjunctions of those component features that are relevant to the example policy. Given example traces, the algorithm returns a reward function as well as the constructed features.
Link
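A crude illustration of constructing conjunction features from binary component features is sketched below; the relevance test used here is a placeholder of my own, not the paper's criterion.

```python
# Enumerate ANDs of component features up to a small order and keep those that
# fire noticeably differently on states visited by the example policy than on
# the remaining states.
import numpy as np
from itertools import combinations

def build_conjunctions(component_feats, visited_mask, max_order=2, min_gap=0.1):
    """component_feats: (S, F) 0/1 matrix over states; visited_mask: (S,) bool.
    Assumes both visited and unvisited states exist. Returns index tuples, each
    describing one conjunction of component features."""
    component_feats = np.asarray(component_feats, bool)
    visited_mask = np.asarray(visited_mask, bool)
    S, F = component_feats.shape
    kept = []
    for order in range(1, max_order + 1):
        for idx in combinations(range(F), order):
            conj = component_feats[:, idx].all(axis=1)          # AND of components
            gap = abs(conj[visited_mask].mean() - conj[~visited_mask].mean())
            if gap > min_gap:                                   # crude relevance test
                kept.append(idx)
    return kept
```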
Lab Meeting February 14, 2011 (Alan): Multibody Structure-from-Motion in Practice (PAMI 2010)
Title: Multibody Structure-from-Motion in Practice (PAMI 2010)
Authors: Kemal Egemen Ozden, Konrad Schindler, and Luc Van Gool
Abstract:
Multibody structure from motion (SfM) is the extension of classical SfM to dynamic scenes with multiple rigidly moving objects. Recent research has unveiled some of the mathematical foundations of the problem, but a practical algorithm which can handle realistic sequences is still missing. In this paper, we discuss the requirements for such an algorithm, highlight theoretical issues and practical problems, and describe how a static structure-from-motion framework needs to be extended to handle real dynamic scenes. Theoretical issues include different situations in which the number of independently moving scene objects changes: Moving objects can enter or leave the field of view, merge into the static background (e.g., when a car is parked), or split off from the background and start moving independently. Practical issues arise due to small freely moving foreground objects with few and short feature tracks. We argue that all of these difficulties need to be handled online as structure-from-motion estimation progresses, and present an exemplary solution using the framework of probabilistic model-scoring.
Link
Monday, January 17, 2011
Lab Meeting January 17 (KuenHan): Moving Object Detection by Multi-View Geometric Techniques from a Single Camera Mounted Robot (IROS 2009)
Title: Moving Object Detection by Multi-View Geometric Techniques from a Single Camera Mounted Robot (IROS 2009)
Author: Abhijit Kundu, K Madhava Krishna and Jayanthi Sivaswamy
Abstract:
The ability to detect and track multiple moving objects, such as people and other robots, is an important prerequisite for mobile robots working in dynamic indoor environments. We approach this problem by detecting independently moving objects in image sequences from a monocular camera mounted on a robot. We use multi-view geometric constraints to classify a pixel as moving or static. The first constraint we use is the epipolar constraint, which requires images of static points to lie on the corresponding epipolar lines in subsequent images. In the second constraint, we use the knowledge of the robot motion to estimate a bound on the position of an image pixel along the epipolar line. This is capable of detecting moving objects followed by a moving camera in the same direction, a so-called degenerate configuration where the epipolar constraint fails. To classify the moving pixels robustly, a Bayesian framework is used to assign a probability that the pixel is stationary or dynamic based on the above geometric properties, and the probabilities are updated when the pixels are tracked in subsequent images. The same framework also accounts for the error in estimation of camera motion. Successful and repeatable detection and pursuit of people and other moving objects in real time with a monocular camera mounted on the Pioneer 3DX, in a cluttered environment, confirms the efficacy of the method.
Link
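The epipolar cue is easy to state in code; the sketch below is a generic implementation of the point-to-epipolar-line test with an assumed pixel threshold, not the authors' full Bayesian framework.

```python
# A static point imaged at x1 in frame t must lie on the epipolar line F @ x1
# in frame t+1; a large point-to-line distance flags the pixel as potentially
# moving.
import numpy as np

def epipolar_distance(F, x1, x2):
    """F: (3, 3) fundamental matrix; x1, x2: homogeneous pixel coordinates (3,)."""
    line = F @ x1                                  # epipolar line in the second image
    return abs(x2 @ line) / np.hypot(line[0], line[1])

def looks_moving(F, x1, x2, pixel_thresh=2.0):
    return epipolar_distance(F, x1, x2) > pixel_thresh
```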
Sunday, January 09, 2011
Lab Meeting January 10th, 2011 (Jimmy): Accurate Image Localization Based on Google Maps Street View (ECCV 2010)
Title: Accurate Image Localization Based on Google Maps Street View
Authors: Amir Roshan Zamir, Mubarak Shah
In ECCV 2010
Abstract
Finding an image's exact GPS location is a challenging computer vision problem that has many real-world applications. In this paper, we address the problem of finding the GPS location of images with an accuracy which is comparable to hand-held GPS devices. We leverage a structured data set of about 100,000 images built from Google Maps Street View as the reference images. We propose a localization method in which the SIFT descriptors of the detected SIFT interest points in the reference images are indexed using a tree. In order to localize a query image, the tree is queried using the detected SIFT descriptors in the query image. A novel GPS-tag-based pruning method removes the less reliable descriptors. Then, a smoothing step with an associated voting scheme is utilized; this allows each query descriptor to vote for the location its nearest neighbor belongs to, in order to accurately localize the query image. A parameter called Confidence of Localization, which is based on the Kurtosis of the distribution of votes, is defined to determine how reliable the localization of a particular image is. In addition, we propose a novel approach to localize groups of images accurately in a hierarchical manner. First, each image is localized individually; then, the rest of the images in the group are matched against images in the neighboring area of the found first match. The final location is determined based on the Confidence of Localization parameter. The proposed image group localization method can deal with very unclear queries which are not capable of being geolocated individually.
[pdf]
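The basic nearest-neighbour voting step can be sketched as follows; this is a minimal stand-in (KD-tree plus majority vote) and deliberately omits the paper's GPS-tag-based pruning, vote smoothing and Kurtosis-based confidence.

```python
# Index reference SIFT descriptors in a tree; each query descriptor votes for
# the GPS tag of its nearest reference descriptor; the most-voted tag wins.
import numpy as np
from scipy.spatial import cKDTree
from collections import Counter

def localize(query_desc, ref_desc, ref_gps_tags):
    """query_desc: (Q, 128); ref_desc: (R, 128); ref_gps_tags: length-R list of
    hashable GPS tags (e.g. rounded lat/lon tuples)."""
    tree = cKDTree(ref_desc)
    _, nn = tree.query(query_desc, k=1)
    votes = Counter(ref_gps_tags[i] for i in nn)
    location, n_votes = votes.most_common(1)[0]
    confidence = n_votes / len(query_desc)    # crude stand-in for the paper's measure
    return location, confidence
```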
Monday, January 03, 2011
Lab Meeting January 3rd, 2011 (Will): Neural Prosthesis & Realtime Bayes Tracking
Topic: Neural Prosthesis & Realtime Bayes Tracking
Neural prosthesis is a field that uses brain signals to control motors to help disabled people.
I'll report my survey on neural prosthesis decoding algorithms.
Po-Wei
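As background for the survey, a standard Kalman-filter decoder (the usual Bayesian tracking baseline in neural prosthesis work) is sketched below; this is a generic illustration, not a specific paper's decoder, and the model matrices are assumed to be fit beforehand, e.g. by linear regression on training trials.

```python
# Kalman-filter decoder: the latent state is the cursor/limb kinematics, binned
# firing rates are a linear, noisy observation of that state, and the filter
# tracks the state in real time.
import numpy as np

def kalman_decode(rates, A, C, Q, R, x0, P0):
    """rates: (T, N) binned firing rates; A, Q: state model; C, R: observation model."""
    x, P = x0.copy(), P0.copy()
    out = []
    for z in rates:
        x, P = A @ x, A @ P @ A.T + Q                      # predict kinematics
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)                     # Kalman gain
        x = x + K @ (z - C @ x)                            # correct with firing rates
        P = (np.eye(len(x)) - K @ C) @ P
        out.append(x.copy())
    return np.array(out)
```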