Sunday, October 26, 2008

Lab Meeting October 27th, 2008 (slyfox): Bearings-Only Tracking of Manoeuvring Targets Using Particle Filters


We investigate the problem of bearings-only tracking of manoeuvring targets using particle filters (PFs). Three different PFs are proposed for this problem, which is formulated as a multiple-model tracking problem in a jump Markov system (JMS) framework. The proposed filters are (i) the multiple model PF (MMPF), (ii) the auxiliary MMPF (AUX-MMPF), and (iii) the jump Markov system PF (JMS-PF). The performance of these filters is compared with that of standard interacting multiple model (IMM)-based trackers such as the IMM-EKF and IMM-UKF for three separate cases: (i) the single-sensor case, (ii) the multisensor case, and (iii) tracking with hard constraints. A conservative Cramér-Rao lower bound (CRLB) applicable to this problem is also derived and compared with the RMS error performance of the filters. The results confirm the superiority of the PFs for this difficult nonlinear tracking problem.
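To make the bearings-only setting concrete, here is a minimal bootstrap particle filter sketch, not the MMPF/AUX-MMPF/JMS-PF variants from the paper: a single constant-velocity motion model, a sensor at the origin, and a Gaussian bearing likelihood. All names and noise parameters (`q`, `sigma_b`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, z_bearing, dt=1.0,
                         q=0.1, sigma_b=0.02):
    """One predict/update/resample cycle of a bootstrap PF.

    particles: (N, 4) array of [x, y, vx, vy] target hypotheses.
    z_bearing: measured bearing (rad) from a sensor at the origin.
    """
    N = len(particles)
    # Predict: constant-velocity motion plus process noise on velocity.
    particles[:, 0] += particles[:, 2] * dt
    particles[:, 1] += particles[:, 3] * dt
    particles[:, 2:] += q * rng.standard_normal((N, 2))
    # Update: weight by the bearing likelihood (angle-wrapped innovation).
    predicted = np.arctan2(particles[:, 1], particles[:, 0])
    innov = np.angle(np.exp(1j * (z_bearing - predicted)))
    weights = weights * np.exp(-0.5 * (innov / sigma_b) ** 2)
    weights /= weights.sum()
    # Resample (multinomial) to combat weight degeneracy.
    idx = rng.choice(N, size=N, p=weights)
    return particles[idx], np.full(N, 1.0 / N)
```

The multiple-model filters in the paper essentially run this loop while also sampling a discrete manoeuvre mode per particle; the bearing-only update itself is unchanged.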

EURASIP Journal on Applied Signal Processing, 2004

Friday, October 24, 2008

Lab Meeting October 27th, 2008 (fish60): Smooth Nearness-Diagram Navigation


This paper presents a new method for reactive collision avoidance for mobile robots in complex and cluttered environments. Our technique is to adapt the "divide and conquer" approach of the Nearness-Diagram+ Navigation (ND+) method to generate a single motion law which applies for all navigational situations.
The resulting local path planner considers all the visible obstacles surrounding the robot, not just the closest two. With these changes our new navigation method generates smoother motion while avoiding obstacles. Results from comparisons with ND+ are presented, as are experiments using Erratic mobile robots.

2008 IROS Paper

Tuesday, October 21, 2008

Lab Meeting October 27th, 2008 (Jeff): Incremental vision-based topological SLAM

Title: Incremental vision-based topological SLAM

Authors: Adrien Angeli, Stephane Doncieux, Jean-Arcady Meyer, and David Filliat


In robotics, appearance-based topological map building consists of inferring the topology of the environment explored by a robot from its sensor measurements. In this paper, we propose a vision-based framework that considers this data association problem from a loop-closure detection perspective in order to correctly assign each measurement to its location. Our approach relies on the visual bag of words paradigm to represent the images and on a discrete Bayes filter to compute the probability of loop-closure. We demonstrate the efficiency of our solution by incremental and real-time consistent map building in an indoor environment and under strong perceptual aliasing conditions using a single monocular wide-angle camera.
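The discrete Bayes filter at the heart of this approach can be sketched in a few lines. This is a simplified stand-in for the paper's filter, not its exact transition model: the belief is a distribution over previously visited locations, the time update diffuses probability to temporal neighbors, and the measurement update multiplies in a per-location image-similarity score (e.g. a tf-idf bag-of-words score). The `p_stay` parameter is an illustrative assumption.

```python
import numpy as np

def loop_closure_update(belief, likelihood, p_stay=0.6):
    """One step of a discrete Bayes filter over candidate locations.

    belief: prior probability over n previously visited locations.
    likelihood: per-location score of the current image, e.g. a
                tf-idf similarity of its bag of visual words.
    """
    # Time update: mostly stay put, diffuse some mass to neighbors.
    spread = (1.0 - p_stay) / 2.0
    predicted = p_stay * belief.copy()
    predicted[:-1] += spread * belief[1:]
    predicted[1:] += spread * belief[:-1]
    # Reflect mass that would fall off the ends back in.
    predicted[0] += spread * belief[0]
    predicted[-1] += spread * belief[-1]
    # Measurement update and normalization.
    posterior = predicted * likelihood
    return posterior / posterior.sum()
```

A loop closure is declared when the posterior mass at some past location exceeds a threshold; otherwise the current image is added as a new place.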

IROS2008 Paper

Monday, October 20, 2008

Lab Meeting October 20, 2008 (Jimmy): Learning in Dynamic Environments with Ensemble Selection for Autonomous Outdoor Robot Navigation

Title: Learning in Dynamic Environments with Ensemble Selection for Autonomous Outdoor Robot Navigation (IROS2008)

Authors: Michael J. Procopio, Jane Mulligan, and Greg Grudic

Autonomous robot navigation in unstructured outdoor environments is a challenging area of active research. The navigation task requires identifying safe, traversable paths which allow the robot to progress toward a goal while avoiding obstacles. Machine learning techniques, in particular classifier ensembles, are well adapted to this task, accomplishing near-to-far learning by augmenting near-field stereo readings in order to identify safe terrain and obstacles in the far field. Composition of the ensemble and subsequent combination of model outputs in this dynamic problem domain remain open questions. Recently, Ensemble Selection has been proposed as a mechanism for selecting and combining models from an existing model library, and it has been shown to perform well in static domains. We propose adapting this technique to the time-evolving data associated with the outdoor robot navigation domain. Important research questions as to the composition of the model library, as well as how to combine selected models' outputs, are addressed in a two-factor experimental evaluation. We evaluate the performance of our technique on six fully labeled datasets, and show that our technique outperforms memoryless baseline techniques that do not leverage past experience.
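For readers unfamiliar with Ensemble Selection, the core procedure (in the style of Caruana et al., which this paper builds on) is a greedy forward search over a model library, repeatedly adding (with replacement) the model that most improves validation performance of the averaged ensemble. The sketch below is a generic illustration, not the paper's exact navigation-domain variant; all names are assumptions.

```python
import numpy as np

def ensemble_selection(library_preds, y_val, rounds=10):
    """Greedy ensemble selection sketch.

    library_preds: dict name -> (n_val,) array of class-1 probabilities
                   predicted on a shared validation set.
    y_val: true 0/1 labels for that validation set.
    Returns the multiset of selected model names (repeats allowed).
    """
    selected = []
    ensemble_sum = np.zeros(len(y_val))
    for _ in range(rounds):
        best_name, best_acc = None, -1.0
        for name, preds in library_preds.items():
            # Accuracy if this model were added to the running average.
            avg = (ensemble_sum + preds) / (len(selected) + 1)
            acc = np.mean((avg > 0.5) == y_val)
            if acc > best_acc:
                best_name, best_acc = name, acc
        selected.append(best_name)
        ensemble_sum += library_preds[best_name]
    return selected
```

In the navigation setting, the library holds terrain classifiers trained on past near-field data, and the validation set comes from recent labeled near-field stereo readings.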


Sunday, October 19, 2008

Lab Meeting October 20, 2008 (Casey): 3D Head Tracking and Pose-Robust 2D Texture Map-Based Face Recognition using a Simple Ellipsoid Model

Title: 3D Head tracking and pose-robust 2D Texture Map-based Face Recognition using a Simple Ellipsoid Model (IROS2008)

Authors: Kwang Ho An and Myung Jin Chung

A human face provides a variety of different communicative functions such as identification, the perception of emotional expression, and lip-reading. For these reasons, many applications in robotics require tracking and recognizing a human face. A novel face recognition system should be able to deal with various changes in face images, such as pose, illumination, and expression, among which pose variation is the most difficult one to deal with. Therefore, face registration (alignment) is the key to robust face recognition. If we can register face images into frontal views, the recognition task becomes much easier. To align a face image into a canonical frontal view, we need to know the pose information of the human head. Therefore, in this paper, we propose a novel method for modeling a human head as a simple 3D ellipsoid. We also present 3D head tracking and pose estimation methods using the proposed ellipsoidal model. After recovering the full motion of the head, we can register face images with pose variations into stabilized view images which are suitable for frontal face recognition. By doing so, simple and efficient frontal face recognition can be easily carried out in the stabilized texture map space instead of the original input image space. To evaluate the feasibility of the proposed approach using a simple ellipsoid model, 3D head tracking experiments are carried out on 45 image sequences with ground truth from Boston University, and several face recognition experiments are conducted on our laboratory database and the Yale Face Database B by using subspace-based face recognition methods such as PCA, PCA+LDA, and DCV.

Saturday, October 18, 2008

MIT CSAIL talk: Modeling Appearance via the Object Class Invariant

Modeling Appearance via the Object Class Invariant
Speaker: Matthew Toews, Harvard Medical School

Date: Friday, October 17 2008
Time: 2:00PM to 3:00PM
Host: Polina Golland, CSAIL

As humans, we are able to identify, localize, describe and classify a wide range of object classes, such as faces, cars or the human brain, by their appearance in images. Designing a general computational model of appearance with similar capabilities remains a long standing goal in the research community. A major challenge is effectively coping with the many sources of variability operative in determining image appearance: illumination, noise, unrelated clutter, occlusion, sensor geometry, natural intra-class variation and abnormal variation due to pathology to name a few. Explicitly modeling sources of variability can be computationally expensive, can lead to domain-specific solutions and may ultimately be unnecessary for the computational tasks at hand.

In this talk, I will show how appearance can instead be modeled in a manner invariant to nuisance variations, or sources of variability unrelated to the tasks at hand. This is done by relating spatially localized image features (e.g. SIFT) to an object class invariant (OCI), a reference frame which remains geometrically consistent with the underlying object class despite nuisance variations. The resulting OCI model is a probabilistic collage of local image patterns that can be automatically learned from sets of images and robustly fit to new images, with little or no manual supervision. Due to its general nature, the OCI model can be used to address a variety of difficult, open problems in the contexts of computer vision and medical image analysis. I will show how the model can be used both as a viewpoint-invariant model of 3D object classes in photographic imagery and as a robust anatomical atlas of the brain in magnetic resonance imagery.

Thursday, October 16, 2008

IROS 2008 Keynote speech: Understanding Human Faces

At IROS 2008, Takeo Kanade delivered a great speech summarizing what he has been working on in terms of understanding human faces. Below is the abstract:

A human face conveys important information: identity, emotion, and intention of the person. Technologies to process and understand human faces have many applications, ranging from biometrics to medical diagnosis, and from surveillance to human-robot interaction. This talk will give an overview of the recent progress that the CMU Face Group has made, in particular, robust face alignment, facial Action Unit (AU) recognition for emotion analysis, and facial video cloning for understanding human dyadic communication.

The video I took is available. As I selected a wrong/low resolution to record this one-hour talk, it is hard to see the slides. Fortunately, the audio is clear. Take a look at (or listen to) this excellent talk!


Tuesday, October 14, 2008

CMU RI Thesis Proposal: Pretty Observable Markov Decision Processes: Exploiting Approximate Structure for Efficient Planning under Uncertainty

Title: Pretty Observable Markov Decision Processes: Exploiting Approximate Structure for Efficient Planning under Uncertainty

Nicholas Armstrong-Crews
Robotics Institute
Carnegie Mellon University

NSH 1507

10:00 AM
20 Oct 2008

Planning under uncertainty is a challenging task. POMDP models have become a popular method for describing such domains. Unfortunately, solving a POMDP to find the optimal policy is computationally intractable in general. Recent advances in solving POMDPs include finding near-optimal policies and exploiting structured representations of the problems. We believe that by using these two tools together synergistically, we can tame the complexity of many POMDPs. In this thesis, we propose to further advance these approaches by analyzing new types of structure and new approximation techniques, as well as methods that combine the two.

Some of the research we have done to lay the groundwork for this thesis falls into these categories, with promising results. We introduced the Oracular POMDP framework, which takes advantage of an MDP-like structure by allowing direct observation of the state as a (costly) action by the agent; otherwise the agent receives no information from the environment, and between invocations of this "oracle" action it is again afflicted by uncertainty. We have given an anytime algorithm for solving Oracular POMDPs which we have proven is efficient (poly-time) in all but the number of actions. At any iteration of the anytime algorithm, we have a (provably) near-optimal policy, which we have achieved efficiently by exploiting the structured observation function.

Another vein of our past work addresses solving general POMDPs by approximating them as finite-state MDPs. It is a well-known result that POMDPs are equivalent to continuous MDPs whose state space is the belief simplex (the probability distribution over possible hidden states). We sample a finite number of these beliefs to create a finite-state MDP that approximates the original POMDP. We then solve this MDP for an optimal policy, improve our sample of belief states with this policy so that it better approximates the POMDP, and continue in this fashion.
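The belief-MDP approximation described above can be sketched concretely. This is a generic illustration (not the proposal's algorithm, whose belief-sampling refinement loop is omitted): given a fixed set of sampled beliefs, successor beliefs produced by the Bayes update are snapped to their nearest sampled neighbor, and value iteration runs over that finite set. All names and the nearest-neighbor rule are assumptions.

```python
import numpy as np

def solve_belief_mdp(beliefs, T, Z, R, gamma=0.95, iters=100):
    """Value-iterate over sampled beliefs, mapping each successor
    belief to its nearest sampled neighbor.

    beliefs: (N, S) sampled belief points on the simplex.
    T[a]: (S, S) transition matrix; Z[a]: (S, O) observation matrix.
    R: (A, S) expected immediate reward per action and state.
    """
    A, N = len(T), len(beliefs)
    V = np.zeros(N)
    Q = np.zeros((N, A))
    for _ in range(iters):
        for i, b in enumerate(beliefs):
            for a in range(A):
                q = R[a] @ b                      # expected reward
                bp = T[a].T @ b                   # predicted state dist.
                for o in range(Z[a].shape[1]):
                    p_o = bp @ Z[a][:, o]         # prob. of observation o
                    if p_o < 1e-12:
                        continue
                    nb = (bp * Z[a][:, o]) / p_o  # Bayes-updated belief
                    # Snap the successor belief to the nearest sample.
                    j = np.argmin(np.linalg.norm(beliefs - nb, axis=1))
                    q += gamma * p_o * V[j]
                Q[i, a] = q
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)
```

Refining the belief sample with the resulting policy, as the proposal describes, would then add the beliefs actually visited under that policy and re-solve.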

These prior works exhibit an important common methodology: anytime algorithms that give near-optimal policies at every iteration, and in the limit converge to the optimal policy. This property is paramount for tackling problems with approximate structure. We can focus early iterations on the structured portion of the problem, which we can solve quickly; and later iterations can handle the complex, unstructured portion of the problem. In this way, we can quickly reach a near-optimal solution, while guaranteeing convergence to an optimal solution in the limit. Our method of evaluating an algorithm's performance on a given problem, then, is the entire curve of policy quality versus algorithm runtime.

Although the AI literature is rich with attempts to exploit different types of structure, in this thesis we focus on a small subset. Our prior work includes Oracular POMDPs, an extremely structured observation function; and the finite-state MDP approximation to POMDPs, which takes advantage of a structured belief-state space that is learned as the algorithm progresses.

For the remainder of the thesis work, we propose to generalize the concept of Oracular POMDPs to include nearly perfect information from oracles, with nearly no information provided from the environment otherwise; we will also extend the oracle concept to factored state problems, where an oracle can reveal one state variable reliably but not the others. We will investigate automated techniques for learning structure from a given unstructured representation. Finally, we wish to examine in greater detail what can be proven about the optimality-runtime tradeoff of these approximately structured POMDPs.

To evaluate our methods, we will apply them to several types of problems. First, we will introduce new synthetic domains that exhibit the structure we wish to exploit. Second, we will use our structure learning methods on existing domains in the literature. Finally, we will attempt to apply the methods to a real-world robot problem, in order to address doubts (in the minds of the community and of the author) about the usefulness of POMDP methods on real robots.

Full text

Sunday, October 12, 2008

Lab Meeting October 13, 2008 (Tiffany): Graph Laplacian Based Transfer Learning in Reinforcement Learning

Yi-Ting Tsao, Ke-Ting Xiao, Von-Wun Soo

The aim of transfer learning is to accelerate learning in related domains. In reinforcement learning, many different features such as a value function and a policy can be transferred from a source domain to a related target domain. Much research has focused on transfer using hand-coded translation functions designed by experts a priori. However, this is not only costly but also problem-dependent. We propose to apply the graph Laplacian, based on spectral graph theory, to decompose the value functions of both a source domain and a target domain into sums of basis functions. Transfer learning can then be carried out by transferring the weights on the basis functions of the source domain to the target domain. We investigate two types of domain transfer, scaling and topological. The results demonstrate that the transferred policy is a better prior policy and reduces learning time.
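The decomposition the abstract describes can be sketched as follows: the smoothest eigenvectors of the graph Laplacian of the state-space connectivity graph form a basis (proto-value functions, in the sense of Mahadevan's work), the source value function is projected onto that basis, and the resulting weights are reused with the target domain's basis. This is a simplified illustration of the idea, not the paper's exact procedure.

```python
import numpy as np

def laplacian_basis(adjacency, k):
    """k smoothest eigenvectors of the combinatorial graph Laplacian
    L = D - W, computed from the state-space adjacency matrix."""
    W = np.asarray(adjacency, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)   # eigenvectors sorted by eigenvalue
    return vecs[:, :k]

def transfer_value(V_source, basis_source, basis_target):
    """Project the source value function onto the source basis, then
    reconstruct with the same weights in the target domain."""
    w = basis_source.T @ V_source   # least-squares weights (orthonormal basis)
    return basis_target @ w
```

The reconstructed target value function is then used only as a prior to initialize learning, not as the final answer, which is why an inexact transfer can still cut learning time.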


Saturday, October 11, 2008

Lab Meeting October 13 (Andi) Extrinsic Laser Scanner / Camera calibration

I will summarize three papers with different approaches for Laser/Camera calibration.

[1] An Algorithm for Extrinsic Parameters Calibration of a Camera and a Laser Range Finder Using Line Features
[2] An efficient extrinsic calibration of a multiple laser scanners and cameras’ sensor system on a mobile platform
[3] Extrinsic calibration of a camera and laser range finder (improves camera calibration)

[1] This paper presents an effective algorithm for calibrating the extrinsic parameters between a camera and a laser range finder whose trace is invisible. On the basis of an analysis of three possible features, we propose to design a right-angled triangular checkerboard and to employ the invisible intersection points of the laser range finder's slice plane with the edges of the checkerboard to set up the constraint equations. The extrinsic parameters are then calibrated by minimizing the algebraic errors between the measured intersection points and their corresponding projections on the image plane of the camera....
[2] ...In this research, we present a practical method for extrinsic calibration of multiple laser scanners and video cameras that are mounted on a vehicle platform. Referring to a fiducial coordinate system on the vehicle platform, a constraint between the data of a laser scanner and of a video camera is established. It is solved in an iterative way to find the best transformations from the laser scanner and from the video camera to the fiducial coordinate system. In addition, all laser scanners and video cameras are calibrated sequentially, for each laser scanner and video camera pair that has feature points in common....
[3] We describe theoretical and experimental results for the extrinsic calibration of a sensor platform consisting of a camera and a 2D laser range finder. The calibration is based on observing a planar checkerboard pattern and solving for constraints between the "views" of the planar checkerboard calibration pattern from the camera and the laser range finder. We give a direct solution that minimizes an algebraic error from this constraint, and subsequent nonlinear refinement minimizes a re-projection error....
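The plane constraint used in [3] is simple to write down: every laser point, once transformed into the camera frame, must lie on the checkerboard plane estimated by the camera. As a reduced illustration (the actual papers solve for rotation and translation jointly), the sketch below assumes the rotation R is already known and solves the resulting linear least-squares problem for the translation only. All names are assumptions.

```python
import numpy as np

def calibrate_translation(planes, laser_pts, R):
    """Solve for the laser-to-camera translation t given rotation R.

    Each view i gives a checkerboard plane (n_i, d_i) estimated by the
    camera (n_i . X = d_i in camera coordinates) and a list of laser
    hits laser_pts[i] (laser coordinates, lifted to 3D with z = 0).
    Every transformed laser point must lie on the camera plane:
        n_i . (R p + t) = d_i
    which is linear in t, so we stack one equation per point.
    """
    rows, rhs = [], []
    for (n, d), pts in zip(planes, laser_pts):
        for p in pts:
            rows.append(n)
            rhs.append(d - n @ (R @ p))
    t, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return t
```

At least three views with non-coplanar plane normals are needed for the translation to be uniquely determined; the same counting argument motivates the multi-pose protocols in all three papers.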

Thursday, October 09, 2008

IEEE News: Smart Phones May Detect Sleep Disorders

Technology for screening and diagnosing sleep disorders, and for waking users at the best times in their sleep cycles, has been developed by researchers at two Finnish universities, Tampere University of Technology and the University of Helsinki. They say the first application of the new technology, a smart alarm clock for mobile phones called HappyWakeUp, is now available. The researchers first noticed that a common microphone is very sensitive to sounds produced by movements in the bed during the night, and say they can adapt that technology to detect restless sleep and leg movements associated with restless leg syndrome, and to screen for snoring and sleep apnea. The technology makes it possible to perform several repeated all-night recordings and to diagnose sleep disorders in countries and areas with no previous sleep recording facilities, according to the researchers, who say the new technology is extremely cost-efficient compared to existing special medical recording devices. Read more

Research Scientist Position in Robotics at MIT CSAIL

Research Scientist
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology

RESEARCH SCIENTIST, Computer Science and Artificial Intelligence Laboratory (CSAIL), to perform research in the development of perception, planning, control, and human interface software and algorithms for autonomous robots; manage research in autonomous vehicles, including development and testing of techniques for vision, lidar, and radar data processing for mapping, localization, and autonomous path control; and development and field deployment of novel robotic systems for land, air, and sea environments.

REQUIREMENTS: a Ph.D. in robotics or computer vision; and five or more years' experience in perception algorithm and human-computer interface and robotic system programming for autonomous vehicles. Seek motivated, enthusiastic roboticist who demonstrates exceptional programming skills and the ability to perform independent research and manage complex research projects. Must be able to help mentor graduate students and postdocs. Position requires expert knowledge of Bayesian state estimation and computer vision algorithms such as Kalman filters, particle filters, and SIFT feature detection; and general experience in robot system integration, C/C++ network programming in Linux and Windows, CVS, SVN, OpenGL, Perl, and HTML. Must have experience with configuration and management of Linux computer systems using Ubuntu/Debian distributions; deployment and operation of mobile ad-hoc wireless networks; and code development for public-domain robot control software packages such as CARMEN and LCM. Should also have experience creating real-time interfaces to vision, laser, and radar sensors using serial, USB, CANbus, and TCP/IP connections; and in the configuration and operation of SICK laser range scanners.

Applicants may apply online at
(Search for position mit-00005935)

John Leonard (

Wednesday, October 08, 2008

HRI 2009 Evaluation Criteria

The criteria are good guidelines for doing human robot interaction research. -Bob

The Evaluation Criteria for papers are now available. All papers must:

a) Be relevant to the field of human-robot interaction. So, for example, a paper that describes a new face tracking algorithm needs to demonstrate how it is of direct use to human-robot interaction. A paper contributing a face recognition technology should use standard recognition metrics (e.g., ROC curve) as well as demonstrate or highlight a path to “feasibility” in human-robot domains with regard to interactive performance, sufficient accuracy, integration with closed loop control, etc. Similarly, a study of the elderly must show how insights from the study directly inform the design of robots for this population and a wizard-of-oz experiment should show how findings contribute to our understanding of how people might interact with robotic capabilities that are plausible (if not currently available).

b) Clearly articulate: 1) the contribution to HRI, 2) how the contribution advances the state-of-the art or knowledge in HRI, and 3) how the contribution relates to other work in HRI as well as the fields of study on which the paper draws (e.g. psychology, cognitive science, anthropology, computer vision, artificial intelligence, speech recognition, etc.).

c) Be technologically and methodologically sound based on the criteria generally used for that technology/method within a given field. For example, conventions used in psychology for conducting experiments with people and analyzing the data, and reporting the study (e.g. hypotheses, manipulation checks, the creation of scales, ANOVA analyses, correlation tables, etc.) should be applied. Authors should take care to use correct terminology for their methods to avoid being evaluated against the incorrect set of criteria. For example, a user study of 5 people should be referred to as a user study or evaluation and not an experiment.

d) For empirical papers, provide adequate detail for readers to understand what was done, how the data were collected, from how many people, what were the characteristics of these people, what questions were people asked, what type of robot was involved (if a robot was used), etc.

e) Be written to be accessible for a broad, interdisciplinary/multidisciplinary HRI audience.

We particularly encourage papers that bring together subfields and investigate problems that have not been explored and are novel to HRI.

Wednesday, October 01, 2008

CMU RI Thesis Proposal: Generalized Backpropagation

Title: Generalized Backpropagation

Robotics Institute
Carnegie Mellon University

Place: NSH 1507
Time: 2:00 PM
Date: 2 Oct 2008

Robotics problems are often ill-suited for the standard supervised learning paradigms. Consequently, complex tasks like autonomous driving through difficult off-road terrain are decomposed into a series of small subproblems which are each solved with independent modules through learning or hand-engineering. Unfortunately, this approach exacerbates the problem of data labeling, as training data sets must now be labeled for intermediate outputs where the "correct" label that will lead to optimal system performance may be hard to define. Additionally, even if each of the subproblems is solved reasonably well, the performance of the system as a whole may suffer because of the accumulation of small errors in each module. Figuring out which individual modules in a complex system are responsible for system-wide errors, and how to adjust them to improve performance, has consumed many man-hours. 
The gradient backpropagation algorithm has been integral for neural network training because it provides a way to translate errors in the final system performance into updates for each of the parameters in a complex network. Although it is commonly associated with neural networks, backpropagation can be used to train any system of differentiable modules. Backpropagation depends exclusively on labeled data, however, and fails on deep networks where the "blame" for errors becomes too diffuse. This thesis proposes a set of tools, collectively termed "generalized backpropagation", which are derived from recent research into learning in the online, semi-supervised, transductive, and multi-task settings, and address some key limitations of the backpropagation algorithm. These tools make it easier to learn networks of modules if labeled training data is limited and can be understood as ways to create informative priors for the parameters of each module from unlabeled and weakly labeled data. In the proposed work they will be demonstrated on a challenging mobile robot navigation problem.
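The point that "backpropagation can be used to train any system of differentiable modules" is worth making concrete. The sketch below is a generic illustration (not the proposal's generalized framework): each module exposes a `forward` and a `backward`, and the driver pushes the loss gradient back through the chain regardless of what the modules are. All class and function names are assumptions.

```python
import numpy as np

class Affine:
    """A differentiable module: y = W x + b."""
    def __init__(self, W, b):
        self.W, self.b = W, b
    def forward(self, x):
        self.x = x
        return self.W @ x + self.b
    def backward(self, grad_y):
        # Chain rule: store parameter gradients, pass input gradient back.
        self.gW = np.outer(grad_y, self.x)
        self.gb = grad_y
        return self.W.T @ grad_y

class Tanh:
    """A parameter-free nonlinearity module."""
    def forward(self, x):
        self.y = np.tanh(x)
        return self.y
    def backward(self, grad_y):
        return grad_y * (1.0 - self.y ** 2)

def backprop(modules, x, target):
    """Run the chain forward, then propagate the gradient of the
    squared-error loss 0.5 * ||output - target||^2 back through
    every module, whatever the modules happen to be."""
    for m in modules:
        x = m.forward(x)
    grad = x - target
    for m in reversed(modules):
        grad = m.backward(grad)
    return grad   # gradient of the loss w.r.t. the original input
```

A perception-planning pipeline decomposed into such modules could, in principle, be tuned end-to-end the same way; the thesis's concern is what to do when labeled end-to-end data is scarce and the gradient signal in deep chains becomes too diffuse.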