Tuesday, April 15, 2008

[CVPR 2008] Learning Patch Correspondences for Improved Viewpoint Invariant Face Recognition

Authors: Ahmed Bilal Ashraf, Simon Lucey, Tsuhan Chen

Abstract:
Variation due to viewpoint is one of the key challenges that stand in the way of a complete solution to the face recognition problem. It is easy to note that local regions of the face change differently in appearance as the viewpoint varies. Recently, patch-based approaches, such as those of Kanade and Yamada, have taken advantage of this effect, resulting in improved viewpoint invariant face recognition. In this paper we propose a data-driven extension to their approach, in which we not only model how a face patch varies in appearance, but also how it deforms spatially as the viewpoint varies. We propose a novel alignment strategy which we refer to as "stack flow" that discovers viewpoint induced spatial deformities undergone by a face at the patch level. One can then view the spatial deformation of a patch as the correspondence of that patch between two viewpoints. We present improved identification and verification results to demonstrate the utility of our technique.

[Link]

Monday, April 14, 2008

[Robotics Institute Thesis Proposal] Structured Prediction Techniques for Imitation Learning

Abstract:
Programming robots is hard. We can often easily demonstrate the behavior we desire, but mapping that intuition into the space of parameters governing the robot's decisions is difficult, time consuming, and ultimately expensive. Machine learning promises “programming by demonstration” paradigms to develop high-performance robotic systems. Unfortunately, many “classical” machine learning techniques, such as decision trees, neural networks, and support vector machines, do not fit the needs of modern robotics systems which are often built around sophisticated planning algorithms that efficiently reason about the future. Consequently, these learning systems often fall short of producing high-quality robot performance.

Rather than ignoring planning algorithms in lieu of pure learning systems, the algorithms I discuss in this proposal embrace optimal cost planning algorithms as a central component of robot behavior. I propose here a set of simple gradient-based algorithms for training cost-based planners from examples of decision sequences provided by an expert. These algorithms are simple, intuitive, easy to implement, and they enjoy both state-of-the-art empirical performance and strong theoretical guarantees. Collectively, we call our framework Maximum Margin Planning (MMP).

Our algorithms fall under the category of imitation learning. In this proposal, I first briefly survey the history of imitation learning and map the progression of algorithms that led to the development of MMP. I then discuss the MMP collection of algorithms at many levels of detail, starting from an intuitive and implementational perspective, and then proceeding to a more formal mathematical derivation. Throughout the discussion I demonstrate the techniques on a wide array of problems found in robotics, from navigational planning and heuristic learning to footstep prediction and grasp planning. Toward the end of the document I outline a set of open problems in imitation learning not solved by MMP and touch on recent progress we have made toward solving them.

Link

Saturday, April 12, 2008

[graphics] VASC Seminar: Yaser Yacoob (U Maryland), Monday April 14, 3:30pm, NSH 1507

VASC Seminar

Image Segmentation Using Meta-Texture Saliency
Yaser Yacoob
University of Maryland
3:30pm, Monday, April 14
NSH 1507

Appointments: Peggy Martin

Abstract:
The rapid increase in megapixel resolution of digital images provides a novel opportunity to capture and analyze information about scene surfaces and expand beyond the commonly used edge/color/texture attributes. The talk will address segmentation of an image into patches that have common underlying salient surface-roughness. Three intrinsic images are derived: reflectance, shading and meta-texture images. A constructive approach is proposed for computing a meta-texture image by preserving, equalizing and enhancing the underlying surface-roughness across color, brightness and illumination variations. We evaluate the performance on sample images and illustrate quantitatively that different patches of the same material, in an image, are normalized in their statistics despite variations in color, brightness and illumination. Image segmentation by line-based boundary-detection is proposed and results are provided and compared to known algorithms.

Biography:
Yaser Yacoob is a Research Faculty at the Computer Vision laboratory at the University of Maryland, College Park. His research is on image and video analysis with focus on topics that are relevant to interpretation of human appearance and motion.

Lab Meeting April 14th, 2008 (fish60): Efficient Motion Planning Algorithm for Stochastic Dynamic Systems with Constraints on Probability of Failure

I will talk about what I have read this week.

Abstract:
Focus on -- Bi-stage Robust Motion Planning algorithm:
Two-stage optimization approach, with the upper stage optimizing the risk allocation and the lower stage calculating the optimal control sequence that maximizes the reward.

Link

Friday, April 11, 2008

[ML Lunch] Brian Ziebart on Learning Driving Route Preferences

Speaker: Brian Ziebart
Title: Learning Driving Route Preferences
Venue: NSH 1507
Date: Monday April 14
Time: 12:00 noon

Abstract:
Personal Navigation Devices are useful for obtaining driving directions to new destinations, but they are not very intelligent -- they observe thousands of miles of preferred driving routes but never learn from those observations when planning routes to new destinations. Motivated by this deficiency, we present a novel approach for recovering from demonstrated behavior the preference weights that drivers place on different types of roads and intersections. The approach resolves ambiguities in inverse reinforcement learning (Abbeel and Ng 2004) using the principle of maximum entropy (Jaynes 1957), resulting in a probabilistic model for sequential actions. Using the approach, we model the context-dependent driving preferences of 25 Yellow Cab Pittsburgh taxi drivers from over 100,000 miles of GPS trace data. Unlike previous approaches to this modeling problem, which directly model distributions over actions at each intersection, our approach learns the reasons that make certain routes preferable. Our reason-based model is much more generalizable to new destinations and new contextual situations, yielding significant performance improvements on a number of driving-related prediction tasks.

This is joint work with Andrew Maas, Drew Bagnell, and Anind Dey.
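To make the maximum-entropy idea in the abstract a little more concrete, here is a minimal sketch, not the speaker's implementation. It assumes a handful of candidate routes between one origin and destination, each summarized by a feature vector, and assigns each route a probability proportional to exp(-cost), where cost is a weighted sum of the features. The feature choices, weights, and the name `route_probabilities` are all hypothetical.

```python
import math


def route_probabilities(route_features, weights):
    """Return P(route) proportional to exp(-cost), with cost = weights . features."""
    costs = [sum(w * f for w, f in zip(weights, feats)) for feats in route_features]
    # Shift by the minimum cost before exponentiating, for numerical stability.
    min_cost = min(costs)
    scores = [math.exp(-(c - min_cost)) for c in costs]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical features per route: (highway miles, local-road miles, left turns).
routes = [(5.0, 1.0, 1.0), (2.0, 4.0, 3.0), (0.0, 7.0, 6.0)]
# Hypothetical cost weights; in the talk's setting these would be learned from GPS traces.
weights = (0.2, 0.5, 0.3)

probs = route_probabilities(routes, weights)
for feats, p in zip(routes, probs):
    print(feats, round(p, 3))
```

Cheaper routes get more probability mass, but every route keeps some. That is what lets a model of this kind explain drivers who do not always take the single lowest-cost route.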

Lab Meeting April 14th, 2008 (Atwood): Loopy Belief Propagation

I will explain the principles behind the belief propagation (BP) algorithm, which is an efficient way to solve inference problems based on passing local messages, and one extension, Residual Belief Propagation, to arbitrary graphs possibly with loops.

References:

Understanding Belief Propagation and its Generalizations, Jonathan S. Yedidia, William T. Freeman, and Yair Weiss, MERL Technical Report, 2001

G. Elidan, I. McGraw, and D. Koller (2006). "Residual Belief Propagation: Informed Scheduling for Asynchronous Message Passing." Proceedings of the Twenty-second Conference on Uncertainty in AI (UAI).

Constructing Free-Energy Approximations and Generalized Belief Propagation Algorithms, Yedidia, J.S.; Freeman, W.T.; Weiss, Y., IEEE Transactions on Information Theory, 2005
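As a small illustration of the local message passing mentioned above, the sketch below runs sum-product belief propagation on a three-node chain of binary variables and checks the result against brute-force enumeration. On a chain (or any tree) the messages give exact marginals. Loopy BP iterates the same kind of update on graphs with cycles, where the result is generally only approximate. The potentials and function names are made up for the example.

```python
from itertools import product


def chain_marginals(unary, pairwise):
    """Sum-product on a chain; pairwise[i][x][y] couples node i (state x) and node i+1 (state y)."""
    n, k = len(unary), len(unary[0])
    # fwd[i][x]: message arriving at node i from its left neighbour.
    fwd = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        for y in range(k):
            fwd[i][y] = sum(
                fwd[i - 1][x] * unary[i - 1][x] * pairwise[i - 1][x][y] for x in range(k)
            )
    # bwd[i][x]: message arriving at node i from its right neighbour.
    bwd = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for x in range(k):
            bwd[i][x] = sum(
                bwd[i + 1][y] * unary[i + 1][y] * pairwise[i][x][y] for y in range(k)
            )
    marginals = []
    for i in range(n):
        belief = [fwd[i][x] * unary[i][x] * bwd[i][x] for x in range(k)]
        z = sum(belief)
        marginals.append([b / z for b in belief])
    return marginals


def brute_force_marginals(unary, pairwise):
    """Exact marginals by enumerating every joint state (only feasible for tiny models)."""
    n, k = len(unary), len(unary[0])
    totals = [[0.0] * k for _ in range(n)]
    for states in product(range(k), repeat=n):
        p = 1.0
        for i, s in enumerate(states):
            p *= unary[i][s]
        for i in range(n - 1):
            p *= pairwise[i][states[i]][states[i + 1]]
        for i, s in enumerate(states):
            totals[i][s] += p
    return [[t / sum(row) for t in row] for row in totals]


# Made-up potentials: neighbouring nodes prefer to agree.
unary = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]
agree = [[0.9, 0.1], [0.1, 0.9]]
pairwise = [agree, agree]

bp = chain_marginals(unary, pairwise)
exact = brute_force_marginals(unary, pairwise)
print(bp)
```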

Tuesday, April 08, 2008

Efficient Motion planning Algorithm for Stochastic Dynamic Systems with Constraints on Probability of Failure

Computer Science and Artificial Intelligence Laboratory Technical Report

Abstract:
When controlling dynamic systems such as mobile robots in uncertain environments, there is a trade off between risk and reward. ...
This paper proposes a new approach to planning a control sequence with a guaranteed risk bound.
...
We propose a two-stage optimization approach, with the upper stage optimizing the risk allocation and the lower stage calculating the optimal control sequence that maximizes the reward.

link

CVPR08 : Trajectory Analysis and Semantic Region Modeling Using A Nonparametric Bayesian Model

Title: Trajectory Analysis and Semantic Region Modeling Using A Nonparametric Bayesian Model

Authors:
Wang, Xiaogang
Ma, Keng Teck
Ng, Gee-Wah
Grimson, Eric

Abstract:
We propose a novel nonparametric Bayesian model, Dual Hierarchical Dirichlet Processes (Dual-HDP), for trajectory analysis and semantic region modeling in surveillance settings, in an unsupervised way. In our approach, trajectories are treated as documents and observations of an object on a trajectory are treated as words in a document. Trajectories are clustered into different activities. Abnormal trajectories are detected as samples with low likelihoods. The semantic regions, which are intersections of paths commonly taken by objects, related to activities in the scene are also modeled. Dual-HDP advances the existing Hierarchical Dirichlet Processes (HDP) language model. HDP only clusters co-occurring words from documents into topics and automatically decides the number of topics. Dual-HDP co-clusters both words and documents. It learns both the numbers of word topics and document clusters from data. Under our problem settings, HDP only clusters observations of objects, while Dual-HDP clusters both observations and trajectories. Experiments are evaluated on two data sets, radar tracks collected from a maritime port and visual tracks collected from a parking lot.

link

Monday, April 07, 2008

Lab Meeting April 14th, 2008 (Any): Probabilistic Terrain Analysis For High-Speed Desert Driving

Abstract--The ability to perceive and analyze terrain is a key problem in mobile robot navigation. Terrain perception problems arise in planetary robotics, agriculture, mining, and, of course, self-driving cars. Here, we introduce the PTA (probabilistic terrain analysis) algorithm for terrain classification with a fast-moving robot platform. The PTA algorithm uses probabilistic techniques to integrate range measurements over time, and relies on efficient statistical tests for distinguishing drivable from nondrivable terrain. By using probabilistic techniques, PTA is able to accommodate severe errors in sensing, and identify obstacles with nearly 100% accuracy at speeds of up to 35mph. The PTA algorithm was an essential component in the DARPA Grand Challenge, where it enabled our robot Stanley to traverse the entire course in record time.

S. Thrun, M. Montemerlo, and A. Aron. Probabilistic terrain analysis for high-speed desert driving. In G. Sukhatme, S. Schaal, W. Burgard, and D. Fox, editors, Proceedings of the Robotics Science and Systems Conference, Philadelphia, PA, 2006.

[Lab meeting] 7th April - video: earthmine inc.

3D mapping spin-off from Berkeley.

Earthmine.com

link to the demo video

Lab Meeting 2008,04,07(Li-Wei) Omni-directional binocular stereoscopic images from one omni-directional camera

Omni-directional binocular stereoscopic images from one omni-directional camera
Digital and Computational Video, 2002. DCV 2002.
http://w.csie.org/~b93026/01218739.pdf

abstract
An omni-directional binocular stereoscopic image pair consists of two
omni-directional panoramic images, where each image is for the left eye and
the right eye. The panoramic stereo pair provides stereo sensation in a full
360°. The omni-directional binocular stereoscopic image pair cannot be
photographed by two omni-directional cameras from two viewpoints, but can be
constructed by mosaicing together the omni-directional images from four
different positions around the user's position. We propose a technique for
producing and evaluating omni-directional binocular stereoscopic images from
one omni-directional lens attached to a digital still camera.

Sunday, April 06, 2008

News: MIT robotics group reveals emotional robot Nexi

The Personal Robots Group at the MIT Media Lab has announced its latest project, a small mobile humanoid robot called Nexi that shows emotion.

A video of the robot in action posted to YouTube has taken the blogosphere by storm.

The robot, which moves around on four wheels and has arms and hands to manipulate objects, was partially funded by the Office of Naval Research through the Defense University Research Instrumentation Program, as well as by a research grant from Microsoft Corp. Nexi and its three counterparts were developed in cooperation with the University of Massachusetts Amherst and private industry.

Cynthia Breazeal, the lead researcher on the project, calls the class of robots "MDS" for mobile/dexterous/social. The robots are targeted for completion sometime this fall. The purpose of the robots is to support research and education goals in human-robot interaction, teaming, and social learning.

Official site

Related articles:
MIT's Nexi bot wants to be your friend (engadget)
Nexi Robot from MIT (übergizmo)
Nexi, The Social Robot From MIT Goes For the Emo Look (gizmodo)
Nexi: MIT's emotive robot - Emotive or creepy? You decide (neoseeker)

Lab Meeting April 7th, 2008 (Andi)

Title: An Automated Method for Large-Scale, Ground-Based City Model Acquisition

Authors: CHRISTIAN FRUEH and AVIDEH ZAKHOR
Video and Image Processing Laboratory, University of California, Berkeley

from International Journal of Computer Vision 60(1), 5–24, 2004

Abstract: In this paper, we describe an automated method for fast, ground-based acquisition of large-scale 3D city models. Our experimental set up consists of a truck equipped with one camera and two fast, inexpensive 2D laser scanners, being driven on city streets under normal traffic conditions. One scanner is mounted vertically to capture building facades, and the other one is mounted horizontally. Successive horizontal scans are matched with each other in order to determine an estimate of the vehicle’s motion, and relative motion estimates are concatenated to form an initial path. Assuming that features such as buildings are visible from both ground-based and airborne view, this initial path is globally corrected by Monte-Carlo Localization techniques. Specifically, the final global pose is obtained by utilizing an aerial photograph or a Digital Surface Model as a global map, to which the ground-based horizontal laser scans are matched. A fairly accurate, textured 3D model of the downtown Berkeley area has been acquired in a matter of minutes, limited only by traffic conditions during the data acquisition phase. Subsequent automated processing time to accurately localize the acquisition vehicle is 235 minutes for a 37-minute, 10.2 km drive, i.e. 23 minutes per kilometer.
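The per-kilometer figure at the end of the abstract can be checked with a couple of lines of arithmetic. The three input numbers come from the abstract. The ratio of processing time to driving time is an extra quantity computed here for context, not something the authors report.

```python
processing_minutes = 235  # automated localization processing time
drive_minutes = 37        # duration of the data-acquisition drive
drive_km = 10.2           # length of the drive

minutes_per_km = processing_minutes / drive_km  # about 23.04
slowdown = processing_minutes / drive_minutes   # about 6.35x the driving time

print(f"{minutes_per_km:.2f} processing minutes per km")
print(f"{slowdown:.2f}x slower than real time")
```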

Link to the full paper

Lab Meeting April 7th, 2008 (ZhenYu)

I will show some results of my recent work.

Friday, April 04, 2008

VASC Seminar : Visual Analysis of Crowded Scenes

Speaker :
Saad Ali
University of Central Florida
Thursday, April 3, 3:30pm, NSH 1307

Abstract:
Automatic localization, tracking, and event detection in videos of crowded environments is an important visual surveillance problem. Despite the sophistication of current surveillance systems, they have not yet attained the desirable level of applicability and robustness required for handling crowded scenes like parades, concerts, football matches, train stations, airports, city centers, malls etc.
In this talk, I will first present a framework for segmenting scenes into dynamically distinct crowd regions using Lagrangian particle dynamics. For this purpose, the spatial extent of the video is treated as a phase space of a non-autonomous dynamical system where transport from one region of the phase space to the other is controlled by the optical flow. A grid of particles is advected through the phase space using the optical flow and a numerical integration scheme, and the amount by which neighboring particles diverge is quantified by using a Cauchy-Green deformation tensor. The maximum eigenvalue of this tensor is used to construct a Finite Time Lyapunov Exponent (FTLE) field, which reveals the time-dependent invariant manifolds of the non-autonomous dynamical system which are called Lagrangian Coherent Structures (LCS). The LCS in turn divide the crowd flow into regions of different dynamics, and are therefore used to segment the scene into distinct crowd regions. This segmentation is then used to detect any change in the behavior of the crowd over time. Next, I will present an algorithm for tracking individual targets in high density (hundreds of people) crowded scenes. The novelty of the algorithm lies in a scene structure based force model, which is used in conjunction with the available appearance information for tracking individuals in a complex crowded scene. The key ingredients of the scene structure force model are three fields namely, `Static Floor Field' (SFF), `Dynamic Floor Field' (DFF), and `Boundary Floor Field' (BFF). These fields determine the probability of a person moving from one location to another in a way that the object movement is more likely in the direction of higher fields.
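For readers unfamiliar with the FTLE step, here is a minimal sketch of the standard definition at a single point. It is not the speaker's code. It assumes the 2x2 gradient of the flow map (how a particle's final position changes with its start position) is already available. In practice that gradient would be estimated by finite differences between neighboring advected particles. The function name `ftle_2d` is hypothetical.

```python
import math


def ftle_2d(grad, duration):
    """FTLE at one point from the 2x2 flow-map gradient grad = [[a, b], [c, d]]."""
    (a, b), (c, d) = grad
    # Cauchy-Green deformation tensor C = grad^T * grad (symmetric 2x2).
    c11 = a * a + c * c
    c12 = a * b + c * d
    c22 = b * b + d * d
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    mean = 0.5 * (c11 + c22)
    radius = math.sqrt((0.5 * (c11 - c22)) ** 2 + c12 * c12)
    lam_max = mean + radius
    # FTLE = (1 / |T|) * ln(sqrt(lambda_max)).
    return math.log(math.sqrt(lam_max)) / abs(duration)


# Sanity check on a saddle flow dx/dt = x, dy/dt = -y integrated for T = 2:
# the flow-map gradient is diag(e^T, e^-T), so the stretching rate should be 1.
T = 2.0
saddle = [[math.exp(T), 0.0], [0.0, math.exp(-T)]]
print(ftle_2d(saddle, T))

# No deformation (identity gradient) gives zero.
print(ftle_2d([[1.0, 0.0], [0.0, 1.0]], T))
```

Evaluating this at every grid point gives the FTLE field. Ridges of high values mark where neighboring particles separate fastest, which is where the abstract places the boundaries between crowd regions.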

Bio: Saad Ali is currently a PhD candidate at the University of Central Florida, advised by Prof. Mubarak Shah. His research interests include surveillance in crowded and aerial scenes, action recognition, object recognition and dynamical systems. He is a student member of IEEE.

FRC Seminar: Stingray and Daredevil: High-Speed Teleoperation and All-Weather Perception for Small UGVs

Speaker: Brian Yamauchi, Lead Roboticist, iRobot Research

Abstract:

The mission of the iRobot Research Group is to conduct applied research to develop and integrate new technologies for iRobot products. In this talk, I will describe two ongoing research projects aimed at solving key problems in mobile robotics -- teleoperating UGVs at high speeds through urban environments (Stingray) -- and navigating autonomously in poor weather and detecting obstacles through foliage (Daredevil).

For Stingray, we have partnered with Chatten Associates to provide immersive telepresence for small UGVs using the Chatten Head-Aimed Remote Viewer (HARV). We have controlled the iRobot Warrior UGV and a high-speed 1/5-scale gas-powered radio-controlled car using the HARV. We will be adding driver assist behaviors to aid the operator in driving at high speeds.

For Daredevil, we are developing an all-weather perception payload for the PackBot that integrates ultra wideband (UWB) radar, LIDAR, and stereo vision. In initial experiments, we have demonstrated that UWB radar can detect obstacles through precipitation, smoke/fog, and sparse-to-moderate foliage. The payload will fuse the low-resolution UWB radar data with high-resolution range data from LIDAR and stereo vision. This will enable the PackBot to perform obstacle avoidance, waypoint navigation, path planning, and autonomous exploration in adverse weather and through foliage.

Speaker Bio:

Dr. Brian Yamauchi is a Lead Roboticist with iRobot's Research Group. He has been conducting robotics research and development for the last 19 years. He is the Principal Investigator for the Daredevil and Stingray Projects, both funded by the US Army Tank-Automotive Research, Development, and Engineering Center (TARDEC). At iRobot, he has conducted research in mobile robot navigation and mapping, autonomous vehicles, heterogeneous mobile robot teams, robotic casualty extraction, UAV/UGV collaboration, and hybrid UAV/UGVs. Prior to joining iRobot, he conducted robotics research at the Naval Research Laboratory, the Jet Propulsion Laboratory, Kennedy Space Center, and the Institute for the Study of Learning and Expertise. He earned his BS in Applied Math/Computer Science at Carnegie Mellon University, his MS in Computer Science at the University of Rochester, and his Ph.D. in Computer Science from Case Western Reserve University.

Tuesday, April 01, 2008

[Robotics Institute Thesis Proposal] Adaptive Model-Predictive Motion Planning for Navigation in Complex Environments

Author:
Thomas Howard
Robotics Institute
Carnegie Mellon University

Abstract:
Outdoor mobile robot motion planning and navigation is a challenging problem in robot autonomy because of the dimensionality of the search space, the complexity of the system dynamics and the environmental interaction, and the typically limited perceptual horizon. In general, it is intractable to generate a motion plan between arbitrary boundary states that consider sophisticated models of vehicle dynamics and the entire set of feasible actions for nontrivial systems. It is even more difficult to accomplish the aforementioned goals in real time, which is necessary due to dynamic environments and updated perceptual information.

In this proposal, complex environments are defined as worlds where locally optimal motion plans are numerous and where the sensitivity of the cost function is highly dependent on state and mobility model fidelity. Examples of these include domains where obstacles are prevalent, terrain shape is varied, and the consideration of terramechanical models is important. Sequential search processes provide globally optimal solutions but are constrained to search only edges that exist in the graph and satisfy state constraints in the discretized representation of the world. Optimization and relaxation techniques determine only locally optimal, possibly homotopically distinct trajectories and it can be difficult to provide good initial guesses of solutions. Such techniques are arguably more informed and efficient as they follow the gradients of the cost functions to optimize trajectories and can satisfy boundary state constraints in the continuum. A better solution is to leverage the benefits of each approach and to apply it in a hybrid optimization method, relaxing local and regional motion planning sequential search spaces to improve relative optimality of solutions. Relative optimality is defined as the relationship between the quality of a motion plan and the amount of effort (time, computational resources, etc.) required to produce it. In order to achieve this, real-time processes for informed action generation (production of trajectories that consider sophisticated models of motion, suspension, and interaction with the environment) at the regional motion planning level to initialize the optimization must be developed. Since the optimality of the executed path can be directly correlated to the fidelity of the motion model, a related issue is that of system identification, the adaptation of vehicle models using state and sensor data to model predictable disturbances.

In this thesis, I propose to develop techniques to generate feasible motion plans at the local and regional levels that consider sophisticated dynamics models, wheel-terrain interaction, and vehicle configuration to improve navigation capabilities of mobile robots operating in complex environments. The proposed work approaches this problem through developing, applying, and characterizing the benefits of four distinct extensions of work in model-predictive motion planning. The first is the development of a hybrid optimization technique that considers informed mobility models to improve the relative optimality of motion plans in complex environments. The second involves the optimization of search spaces through relaxation of edges and nodes. The third and fourth extensions involve the development of methods for real-time informed action generation that considers varying mobility models and simultaneous model identification and control to tune the predictive motion models. All of this work is in line with the greater goal of developing mobile robot motion planners that effectively navigate in complex environments while considering relative optimality of actions. The application of such techniques may resolve many undesirable behaviors of real systems, leading to mobile robots that are more efficient, robust, and effective at performing tasks in the real world.

Link

Sunday, March 30, 2008

[Robot Perception and Learning] Lab Meeting March 31st, 2008 (Hero)

Design and evaluation of a reactive and deliberative collision avoidance and escape architecture for autonomous robots

Abstract
We present the design and evaluation of an architecture for collision avoidance and escape of mobile autonomous robots operating in unstructured environments. The approach mixes both reactive and deliberative components. This provides the vehicle’s behavior designers with an explicit means to design-in avoidance strategies that match system requirements in concepts of operations and for robot certification. The now traditional three layer architecture is extended to include a fourth Scenario layer, where scripts describing specific responses are selected and parameterized on the fly. A local map is maintained using available sensor data, and adjacent objects are combined as they are observed. This has been observed to create safer trajectories. Objects have persistence and fade if not re-observed over time. In common with behavior based approaches, a reactive layer is maintained containing pre-defined knee jerk responses for extreme situations. The reactive layer can inhibit outputs from above. Path planning of updated goal point outputs from the Scenario layer is performed using a fast marching method made more efficient through lifelong planning techniques. The architecture is applied to applications with Autonomous Underwater Vehicles. Both simulated and open water tests are carried out to establish the performance and usefulness of the approach.

http://www.springerlink.com/content/87704333473236x3/

Lab Meeting March 31st, 2008 (Ekker)

Title: Visual Odometry System Using Multiple Stereo Cameras and Inertial Measurement Unit
Author: Taragay Oskiper, Zhiwei Zhu, Supun Samarasekera, Rakesh Kumar
From: Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference, Publication Date:-22 June 2007

Abstract
Over the past decade, tremendous amount of research
activity has focused around the problem of localization in
GPS denied environments. Challenges with localization are
highlighted in human wearable systems where the operator
can freely move through both indoors and outdoors. In
this paper, we present a robust method that addresses these
challenges using a human wearable system with two pairs
of backward and forward looking stereo cameras together
with an inertial measurement unit (IMU). This algorithm
can run in real-time with 15Hz update rate on a dual-core
2GHz laptop PC and it is designed to be a highly accurate
local (relative) pose estimation mechanism acting as
the front-end to a Simultaneous Localization and Mapping
(SLAM) type method capable of global corrections through
landmark matching. Extensive tests of our prototype system
so far, reveal that without any global landmark matching,
we achieve between 0.5% and 1% accuracy in localizing a
person over a 500 meter travel indoors and outdoors. To
our knowledge, such performance results with a real time
system have not been reported before.
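To put the accuracy figure in absolute terms, the quick calculation below converts the quoted percentages into meters. It assumes, as is common for odometry results but not stated explicitly in the abstract, that the percentage is the final position error relative to distance traveled.

```python
distance_m = 500.0              # travel distance quoted in the abstract
error_rates = (0.005, 0.01)     # 0.5% and 1%

# Assumed interpretation: position error as a fraction of distance traveled.
low_error_m, high_error_m = (distance_m * r for r in error_rates)

print(f"Expected end-point error: {low_error_m:.1f} m to {high_error_m:.1f} m")
```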

Saturday, March 29, 2008

Lab Meeting March 31st, 2008 (Stanley)

Title: User-adapted plan recognition and user-adapted shared control: A Bayesian approach to semi-autonomous wheelchair driving

Authors: Eric Demeester · Alexander Hüntemann · Dirk Vanhooydonck · Gerolf Vanacker · Hendrik Van Brussel · Marnix Nuttin

From: Autonomous Robots (2008) 24: 193–211

Abstract:
Many elderly and physically impaired people experience difficulties when maneuvering a powered wheelchair. In order to ease maneuvering, powered wheelchairs have been equipped with sensors, additional computing power and intelligence by various research groups.
This paper presents a Bayesian approach to maneuvering assistance for wheelchair driving, which can be adapted to a specific user. The proposed framework is able to model and estimate even complex user intents, i.e. wheelchair maneuvers that the driver has in mind. Furthermore, it explicitly takes the uncertainty on the user's intent into account. Besides during intent estimation, user-specific properties and uncertainty on the user's intent are incorporated when taking assistive actions, such that assistance is tailored to the user's driving skills. This decision making is modeled as a greedy Partially Observable Markov Decision Process (POMDP).
Benefits of this approach are shown using experimental results in simulation and on our wheelchair platform Sharioto.

Friday, March 28, 2008

News: All terrain Roomba/iRobot

Byron Lahey and other researchers at Arizona State University are working on this all terrain iRobot modification. He writes:

Robot Create robots are easily programmable and expandable in their functionality, but can only travel on a very limited varieties of terrain (carpet, hard flat floors and other typical interior domestic surfaces). As part of the Arts, Media and Engineering program we are working on systems related to the Mars exploration rovers. To teach about these systems and allow students to test and explore terrain navigation and mapping with robots, we needed robots with expanded terrain navigation capabilities. This video demonstrates an early prototype modification of the iRobot, comparing the performance of a modified and unmodified robot traveling on a rocky surface.

This video first shows how epicly the standard iRobot Create fails on a non-flat surface, then shows how his mods make it work better.

Check here for photos of the modifications.

Thursday, March 27, 2008

VASC Seminar: Free Space computation using Stochastic Occupancy Grids and Dynamic Programming

Hernán Badino
Goethe Frankfurt University (Germany)
Monday March 31 @ 3:30pm

Abstract--The computation of free space available in an environment is an essential task for many intelligent automotive and robotic applications. In this talk, I propose a new approach, which builds a stochastic occupancy grid to address the free space problem as a dynamic programming task. Stereo measurements are integrated over time reducing disparity uncertainty. These integrated measurements are entered into an occupancy grid, taking into account the noise properties of the measurements. In order to cope with real-time requirements of the application, three occupancy grid types are proposed. Their applicabilities and implementations are also discussed. Experimental results with real stereo sequences show the robustness and accuracy of the method. The current implementation of the method runs on off-the-shelf hardware at 20 Hz.
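The sketch below shows one generic way to pose free-space extraction on an occupancy grid as dynamic programming. It is not the speaker's formulation. It assumes each grid column is a viewing direction, each row a depth bin (near to far), and each cell holds an occupancy probability. The dynamic program picks one boundary row per column, trading off a per-column data cost against a smoothness penalty between neighboring columns. The cost terms, the `smoothness` weight, and the toy grid are all invented for illustration.

```python
def free_space_boundary(occupancy, smoothness=0.5):
    """Pick one boundary row per column by Viterbi-style dynamic programming."""
    n_cols = len(occupancy)
    n_rows = len(occupancy[0])

    def data_cost(col, row):
        # Low when this cell is likely occupied and all nearer cells are likely free.
        return (1.0 - occupancy[col][row]) + sum(occupancy[col][:row])

    cost = [[data_cost(0, r) for r in range(n_rows)]]
    back = []  # back[c - 1][r]: best row in column c - 1 given row r in column c
    for c in range(1, n_cols):
        col_cost, col_back = [], []
        for r in range(n_rows):
            prev = min(range(n_rows), key=lambda p: cost[-1][p] + smoothness * abs(r - p))
            col_cost.append(cost[-1][prev] + smoothness * abs(r - prev) + data_cost(c, r))
            col_back.append(prev)
        cost.append(col_cost)
        back.append(col_back)

    # Backtrack from the cheapest row in the last column.
    r = min(range(n_rows), key=lambda p: cost[-1][p])
    path = [r]
    for col_back in reversed(back):
        r = col_back[r]
        path.append(r)
    path.reverse()
    return path


# Toy grid: an obstacle at depth bin 2 in every column, except column 2 where
# the measurement is weak and a spurious far response appears at bin 4.
grid = [
    [0.1, 0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.2, 0.1, 0.8],
    [0.1, 0.1, 0.9, 0.1, 0.1],
]

print(free_space_boundary(grid, smoothness=0.5))  # [2, 2, 2, 2]
print(free_space_boundary(grid, smoothness=0.0))  # [2, 2, 4, 2]
```

With the smoothness term switched off, each column is decided independently and the noisy column jumps to the far response. With it on, the boundary stays consistent across columns. The free space is then every cell nearer than the chosen boundary row.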

Bio--Hernán Badino received his degree of Engineer from the National Technological University, Córdoba, Argentina, in 2002. He is at the moment presenting his doctoral thesis at the J. W. Goethe Frankfurt University, Germany. Mr. Badino has worked during his Ph.D. with the Image Based Environment Perception Group at Daimler AG, in Stuttgart, Germany. He is currently a member of the Visual Sensorics and Information Processing Group, at the Frankfurt University, engaged in a project of camera-based urban traffic sensing for driver assistance systems. Mr. Badino's particular research interests in the area of computer vision include the computation of ego-motion from sequences of stereo images and the development of stereo vision algorithms for the real-time detection and tracking of static and moving objects for automotive applications.

Tuesday, March 25, 2008

CMU RI Thesis: Studies in using image segmentation to improve object recognition

Caroline Pantofaru
Robotics Institute
Carnegie Mellon University

Abstract:

Recognizing object classes is a central problem in computer vision, and recently there has been renewed interest in also precisely localizing objects with pixel-accurate masks. Since classes of deformable objects can take a very large number of shapes in any given image, a requirement for recognizing and generating masks for such objects is a method for reducing the number of pixel sets which need to be examined. One method for proposing accurate spatial support for objects and features is data-driven pixel grouping through unsupervised image segmentation. The goals of this thesis are to define and address the issues associated with incorporating image segmentation into an object recognition framework.

The first part of this thesis examines the nature of image segmentation and the implications for an object recognition system. We develop a scheme for comparing and evaluating image segmentation algorithms which includes the definition of criteria that an algorithm must satisfy to be a useful black box, experiments for evaluating these criteria, and a measure of automatic segmentation correctness versus human image labeling. This evaluation scheme is used to perform experiments with popular segmentation algorithms, the results of which motivate our work in the remainder of this thesis.

The second part of this thesis explores approaches to incorporating the regions generated by unsupervised image segmentation into an object recognition framework. Influenced by our experiments with segmentation, we propose principled methods for describing such regions. Given the instability inherent in image segmentation, we experiment with increasing robustness by integrating the information from multiple segmentations. Finally, we examine the possibility of learning explicit spatial relationships between regions. The efficacy of these techniques is demonstrated on a number of challenging data sets.

--

Further details : here

CMU RI Thesis: Statistical Approaches to Multi-scale Point Cloud Processing

Ranjith Unnikrishnan
Robotics Institute
Carnegie Mellon University

Abstract:

In recent years, 3D geometry has gained increasing popularity as the new form of digital media content. Due to advances in sensor technology, it is now feasible to acquire highly detailed 3D scans of complex scenes to obtain millions of data points at high sampling rates over large spatial extents. This ability to acquire high-resolution depth information brings with it the possibility of using 3D geometric data to construct detailed shape models and of perhaps combining 3D depth with visual appearance from images to address challenging problems in computer vision.

However, geometric information represented as a 3D point cloud presents challenges uniquely different from other data modalities such as images or audio. Due to a combination of reasons such as the spatial irregularity of the data and the implicit nature of 3D observations, an easy substitution of traditional signal processing operators from images for processing unorganized 3D points is not possible. Furthermore, traditional estimators from classical statistics are not suitable for processing data in this domain, and new algorithms as well as different criteria for evaluating these algorithms are necessary.

This dissertation contributes towards the development of two fundamental building blocks for processing point clouds. The first is of geometric model fitting, where we present a class of locally semi-parametric estimators that allows finite-sample analysis of accuracy and also explicitly addresses the problem of support-radius selection in local fitting. The second is of multi-scale filtering operators for point clouds that allow detection of interest regions whose locations as well as spatial extent are completely data-driven. The proposed approaches are distinguished from related work by operating directly in the input 3D space on unorganized points without assuming an available mesh or resorting to an intermediate global 2D parameterization.

Results are presented for several applications including surface reconstruction, accurate shape descriptor computation and repeatable interest region detection, on synthetic data, as well as outdoor aerial and ground-based data obtained with a laser scanner.

--
Check here for further details.

Monday, March 24, 2008

Lab Meeting March 24th, 2008 (Dwayne Yu): 3D Reconstruction of Environments for Planetary Exploration

Abstract
In this paper we present our approach to 3D surface reconstruction from large sparse range data sets. In space robotics, constructing an accurate model of the environment is very important for a variety of reasons. In particular, the constructed model can be used for safe tele-operation, path planning, planetary exploration and mapping of points of interest. Our approach is based on acquiring range scans from different viewpoints with overlapping regions, merging them into a single data set, and fitting a triangular mesh to the merged data points. We demonstrate the effectiveness of our approach in a path planning scenario and also by creating the accessibility map for a portion of the Mars Yard located at the Canadian Space Agency.

Link

Sunday, March 23, 2008

Lab Meeting March 24th, 2008 (Kuo-Hwei Lin): Recent work

I will present recent results from my work. I used the SCRIM algorithm to determine the possible movement of segment pairs, and then built the full set of hypotheses about whether each segment is static. After taking a weighted average of the SCRIM results under each hypothesis, I will show the sorted hypotheses applied to one step (two scans).

VASC Seminar: Toward a Perceptual Space for Reflectance

Title: Toward a Perceptual Space for Reflectance
Speaker: Sameer Agarwal, Univ. of Washington

Location: NSH 1507
Time: 3:30pm Monday, 24 March

Abstract -- As we make progress in measuring and modeling reflectance, it is also important that we develop a better understanding of how the human visual system perceives the reflection of light. Such a development not only has implications for efficient image synthesis, but also for computer vision where an understanding of reflectance perception will give us insight into the priors and constraints used by humans to solve various shading related problems, e.g., shape from shading and object recognition over variable and unknown lighting.

In this talk I will present a study of the perception of reflectance. I will argue that our methodology based on paired comparisons is better suited for capturing human perception and is less susceptible to experimental errors than previously used methods. The analysis of paired comparisons required the development of a new data analysis tool. In the second part of the talk I will present a new multidimensional scaling algorithm for analyzing paired comparisons. Based on semi-definite programming, this algorithm is a more general and efficient replacement for the widely used Non-metric MDS algorithm.

Using this algorithm we obtain a perceptual embedding of BRDFs from the MIT/MERL Database. This embedding, constructed purely from psychophysical data, exhibits some striking correlations with the material appearance standards that have been developed independently in the paper and paint industries. Finally, I will describe a novel perceptual interpolation scheme that uses this embedding to provide the user with an intuitive interface for navigating the space of reflectances and constructing new ones.

Friday, March 21, 2008

News: People prefer robots that do small talk

* NewScientist.com
* 20 March 2008

ROBOTS of the future may have to learn to make small talk if humans are to accept them.

To find out how quickly domestic robots should respond to their owners' requests, Toshiyuki Shiwa and colleagues at the ATR laboratories in Kyoto, Japan, asked 38 students to give orders such as "take out the trash" to a robot, which took between zero and 5 seconds to respond.

The students liked delays of no more than 1 second best, with 2 seconds being their limit. However, when the robot took longer, impatient students were assuaged if it filled the time with words such as "well" or "er". "When the robot used conversational fillers to buy time until it could respond, people didn't notice the delay," Shiwa says. He presented the study last week at Human-Robot Interaction 2008 in Amsterdam, the Netherlands.

link
related paper

Monday, March 17, 2008

Effective Computer Agents for Interacting With People

CSAIL Event Calendar

Speaker: Ya'akov (Kobi) Gal, MIT CSAIL and Harvard School of Applied Sciences
Date: Wednesday, March 5 2008
Time: 2:00PM to 3:30PM

This talk will present work that addresses challenges by synthesizing techniques from computer science with insights from the behavioral and social sciences.
...
A language that makes explicit the different mental models agents use to make their decisions, described as nodes in a graphical network. The language defines an equilibrium that makes a distinction between agents' optimal strategies and the way they actually behave in reality.
...
For best performance, computers participating in mixed human-computer settings must model human behavior in a way that reflects the contextual setting in which the decision is presented to people.

Link to the speaker:
http://www.eecs.harvard.edu/~gal

Monday, March 10, 2008

Lab Meeting March 10th, 2008 (Leo): Recent work

I will show some initial results of my recent work.

Lab Meeting March 10th, 2008 (Yu-Hsiang) : Interacting Object Tracking in Crowded Urban Areas

Authors: Chieh-Chih Wang, Tzu-Chien Lo and Shao-Wen Yang

Abstract:

Tracking in crowded urban areas is a daunting task. High crowdedness causes challenging data association problems. Different motion patterns from a wide variety of moving objects make motion modeling difficult. Along with traditional motion modeling techniques, this paper introduces a scene interaction model and a neighboring object interaction model to take into account, respectively, long-term and short-term interactions between the tracked objects and their surroundings. With the use of the interaction models, anomalous activity recognition is accomplished easily. In addition, move-stop hypothesis tracking is applied to deal with move-stop-move maneuvers. All these approaches are seamlessly integrated under the variable-structure multiple-model estimation framework. The proposed approaches have been demonstrated using data from a laser scanner mounted on the PAL1 robot at a crowded intersection. Interacting pedestrians, bicycles, motorcycles, cars and trucks are successfully tracked in difficult situations with occlusion.

link

Sunday, March 09, 2008

Lab Meeting March 10th, 2008 (Jeff): Progress report

I will try to give a live demo, if possible.

Or, I will show some previous data set results.

Saturday, March 08, 2008

Snake robot uses obstacles for propulsion

NewScientist.com news service:

A new snake-like robot can replicate a trick of real snakes, pushing off obstacles it encounters to move forwards.

A virtual double of the robot that accurately predicts its real life behaviour has also been developed, something not achieved for a realistic snake robot before.

Researchers have been working on snake-inspired robots for decades, but they usually have wheels or treads on their body to help them move. These make it easier for a snake robot to slither forward, by converting its writhing motion into a forward slide.

But that approach works best on smooth surfaces. "In a collapsed building where there's a lot of rubble, for example after an earthquake, a wheeled snake would probably get stuck," says Aksel Transeth of Norwegian research organization SINTEF in Trondheim.
Look, no wheels

A more versatile snake robot would move in a truly snaky way, pushing off of obstacles, such as rocks, that it encounters, Transeth argues. Along with colleagues at the Norwegian University of Science and Technology, also in Trondheim, he developed a wheelless snake robot that can do just that...

Link to the full article: NewScientist

Link to the related IEEE Paper

... and finally a video

MIT Brains & Machines Seminar Series: Online Learning with Limited Feedback: An Efficient and Optimal Algorithm

Speaker: Alexander Rakhlin, Berkeley, Dept. of Computer Science
Date: Tuesday, March 11 2008
Host: Tomaso Poggio, CSAIL, BCS
Relevant URL: http://cbcl.mit.edu/

Title: "Online Learning with Limited Feedback: An Efficient and Optimal Algorithm"

Abstract:
One's ability to learn and make decisions rests heavily on the availability of feedback. In sequential decision-making problems such feedback is often limited. A gambler, for example, can observe entirely the outcome of a horse race regardless of where he placed his bet; however, when the same gambler chooses his route to travel to the race track, perhaps at a busy hour, he will likely never learn the outcome of possible alternatives. The latter limited-feedback problem is the focus of this talk.

The problem can be phrased as an Online Linear Optimization game with "bandit" feedback. The existence of a low-regret algorithm has been an open question since the work of Awerbuch and Kleinberg in 2004. We present the first known efficient algorithm for bandit Online Linear Optimization over arbitrary convex decision sets. We show how the difficulties encountered by previous approaches are overcome by employing Regularization -- a method well-known in statistical learning, but under-appreciated in online learning. Furthermore, our solution reveals surprising connections between online learning and Interior Point methods in Optimization.

In particular, our method solves the Online Shortest Path problem: at each round, a path from source to sink is chosen and only the total length (delay) of this path is revealed. Our method has numerous applications in network routing, resource allocation, dynamic treatment of patients, and many more. The worst-case guarantees imply robustness with respect to noise and a malicious adversary.

Joint work with Jacob Abernethy and Elad Hazan.

MIT CSAIL Theory Colloquium: Conditional Computational Entropy, or Towards Separating Pseudoentropy from Compressibility

Theory Colloquium: Conditional Computational Entropy, or Towards Separating Pseudoentropy from Compressibility

Speaker: Leonid Reyzin, Boston University
Date: Tuesday, March 11 2008

Computational entropy measures the amount of randomness a distribution appears to have to a computationally bounded observer. It is an open question whether two definitions of this entropy -- the so-called "HILL entropy" (based on indistinguishability from truly random distributions) and "Yao entropy" (based on incompressibility) are equivalent, as they are in the information-theoretic setting. We observe that most of the time the observer has some correlated information, and thus define and study _conditional_ computational entropy. By considering conditional versions of HILL and Yao entropies, we obtain:
-- a separation between conditional HILL and Yao entropies;
-- the first demonstration of a distribution from which extraction techniques based on Yao entropy produce more pseudorandom bits than appears possible by the traditional HILL-entropy-based techniques;
-- a new, natural notion of unpredictability entropy, which, in particular, can handle entropy of singleton distributions, and allows for known extraction and hardcore bit results to be stated and used more generally.

Joint work with Chun-Yuan Hsiao and Chi-Jen Lu.

Friday, March 07, 2008

MIT talk: Translating reactive tasks to reactive controllers

SPECIAL SEMINAR
Hadas Kress-Gazit
University of Pennsylvania
March 5, 2008, 11:00 a.m. 32-G449 (Stata Center)

Translating reactive tasks to reactive controllers

Abstract: How can we automatically create controllers for our system, be it a robot or a team of UAVs, that are guaranteed to satisfy reactive high level tasks such as "Search for Nemo and if you find him transmit his location"? How do we transition from systems that require each behavior or task to be hand coded, tested and verified to systems that allow anyone to just specify an abstract mission and then the system takes care of the rest while taking into account dynamically changing environments? As systems become more sophisticated and mechanically complex, providing theory and tools that answer these questions is crucial for creating truly autonomous systems. In this talk, Hadas Kress-Gazit will present a formal approach to creating robot controllers that ensure the robot satisfies a given high level task. She will describe a framework in which a user specifies a complex and reactive task in structured English. This task is then automatically translated, using logic and tools from the formal methods world, into a hybrid controller. This controller is guaranteed to control the robot such that its motion and actions satisfy the intended task, under some assumptions, in a variety of different environments. As an example, she will show how tasks related to DARPA's Urban Challenge can be handled using this framework.

http://www.csail.mit.edu/events/eventcalendar/calendar.php?show=event&id=1777

MIT talk : From Detection to Tracking

Speaker: Dr. Fatih Porikli, Mitsubishi Electric Research Laboratory (MERL)
Date: Wednesday, March 5 2008
Time: 3:00PM to 4:00PM Refreshments: 2:45PM
Location: Star Seminar Room (32-D463)
Host: C. Mario Christoudias, Gerald Dalley, MIT CSAIL
Contact: C. Mario Christoudias, Gerald Dalley, 3-4278, 3-6095, cmch@csail.mit.edu , dalleyg@mit.edu

Object detection and tracking, which our eyes perform so innately, are among the most challenging tasks in computer vision. In general, natural objects belonging to the same class exhibit a large variance in their appearance. Besides, varying imaging conditions, partial occlusions, non-rigid shape deformations, multifaceted profiles and insufficient image resolutions make the detection more difficult. Similarly, a tracked object may undergo severe appearance transformations, suddenly change its motion, become fully occluded, congregate into a group of identical objects, etc. Traditional approaches tend to address these issues separately, often out of context by aiming for fixed generic solutions. Recently, there has been a push towards making use of any useful bit of information embedded in a priori and contextual cues. More systems seek to provide online adaptation to local conditions. In this talk, various aspects of the conventional and contextual detection and tracking methods will be dissected and a unifying statistical descriptor will be examined.

link

Thursday, March 06, 2008

CMU Intelligence Seminar: Hearing the Shape of a State Space: New Frontiers in Representation Discovery

Hearing the Shape of a State Space: New Frontiers in Representation Discovery
Sridhar Mahadevan
University of Massachusetts
Tuesday 3/18

In this talk, we will explore new frontiers in representation discovery, where agents construct a basis for approximation of functions on a space well-adapted to its nonlinear geometry. For example, most spatial environments contain significant bottlenecks (e.g. doors, elevators, exits) that factor into our daily decision-making. Similarly, discovery of latent structure in collections of images or text documents is also facilitated by a deeper understanding of the geometry of particular document or image spaces.

We will describe an approach to representation discovery where agents construct novel bases by "hearing the shape" of the underlying state space. Formally, the proposed framework builds on recent advances in harmonic analysis, specifically Fourier and wavelet analysis on graphs, which transform spatial and temporal structure to frequency-oriented representations. Efficient algorithms for basis construction involve computational challenges, which will be addressed by sampling, matrix compression, and domain knowledge.

A range of case studies will be presented, including a novel paradigm for solving Markov decision processes where representation and control are learned simultaneously; a novel multiscale wavelet method for clustering of text documents where the topic hierarchy is automatically constructed; and a new compression method for computer graphics based on multiscale analysis of object geometry.
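As a rough illustration of the "Fourier analysis on graphs" idea mentioned in the abstract, the sketch below builds a basis from the eigenvectors of a graph Laplacian and uses it to approximate a function over the states. This is only a minimal sketch under stated assumptions, not the speaker's algorithm: the corridor (path graph) state space, the unnormalized Laplacian, and the names `chain_graph` and `laplacian_basis` are all hypothetical choices made for this example.

```python
import numpy as np


def chain_graph(n):
    """Adjacency matrix of an n-state corridor (path graph). Hypothetical toy state space."""
    adjacency = np.zeros((n, n))
    idx = np.arange(n - 1)
    adjacency[idx, idx + 1] = 1.0
    adjacency[idx + 1, idx] = 1.0
    return adjacency


def laplacian_basis(adjacency, k):
    """Return the k smallest eigenvalues and eigenvectors of L = D - A.

    The eigenvectors with the smallest eigenvalues vary most slowly over
    the graph, so they play the role of low-frequency Fourier modes.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues[:k], eigenvectors[:, :k]


if __name__ == "__main__":
    n = 20
    values, basis = laplacian_basis(chain_graph(n), k=5)
    # Project a smooth function of the state onto the basis.
    target = np.linspace(0.0, 1.0, n) ** 2
    coeffs = basis.T @ target  # columns are orthonormal, so this is a least-squares fit
    approx = basis @ coeffs
    print("eigenvalues:", np.round(values, 4))
    print("max abs error:", np.abs(target - approx).max())
```

Increasing `k` adds higher-frequency modes and reduces the approximation error, at the cost of a larger basis.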

Speaker Bio
Professor Sridhar Mahadevan is Co-Director of the Autonomous Learning Laboratory at the Department of Computer Science, University of Massachusetts, Amherst. His research interests are in artificial intelligence and machine learning. He is an associate editor of the Journal of Machine Learning Research, and was a tutorial speaker at AAAI 2007, IJCAI 2007, and ICML 2006.

Wednesday, March 05, 2008

MIT Thesis Defense: Learning Coupled Conditional Random Field for Image Decomposition: Theory and Application in Object Categorization

Speaker: Xiaoxu Ma, MIT CSAIL
Date: Wednesday, March 5 2008

The goal of this thesis is to build a computational system that is able to identify object categories within images. To this end, this thesis proposes a computational model of "recognition-through-decomposition-and-fusion" based on the psychophysical theories of information dissociation and integration in human visual perception. At the lowest level, contour and texture processes are measured. In the mid-level, a coupled Conditional Random Field model is proposed to model and decompose the contour and texture processes in natural images. Various matching schemes are introduced to match the decomposed contour and texture channels in a dissociative manner. As a counterpart to the integrative process in the human visual system, adaptive combination is applied to fuse the perception in the decomposed contour and texture channels.

The proposed coupled Conditional Random Field model is shown to be an important extension of popular single-layer Random Field models for modeling image processes, by dedicating a separate layer of random field grid to each individual image process and explicitly capturing the distinct properties of multiple visual processes. The decomposition enables the system to fully leverage each decomposed visual stimulus to its full potential in discriminating different object classes. Adaptive combination of multiple visual cues mirrors the fact that different visual cues play different roles in distinguishing various object classes. Experimental results demonstrate that the proposed computational model of "recognition-through-decomposition-and-fusion" achieves better performance than most of the state-of-the-art methods in recognizing the objects in Caltech-101, especially when only a limited number of training samples are available, which conforms with the capability of learning to recognize a class of objects from a few sample images in the human visual system.

Tuesday, March 04, 2008

News: Which robot would you take home with you?

Do you find any one of the three robots more likable than the others? Your choice might just reveal something about your personality type, or about your current mood.
Kerstin Dautenhahn and colleagues at the University of Hertfordshire asked volunteers to watch videos of the robots as they answered the door for an owner who had missed the bell. The volunteers were then asked to fill out a questionnaire rating the likeability of each robot, and to complete a simple personality test.
Volunteers tended to prefer the humanoid robot, while "introverts and participants with lower emotional stability" warmed to the two more mechanical robots slightly more.

full text
related paper - Avoiding the uncanny valley: robot appearance, personality and consistency of behavior in an attention-seeking home scenario for a robot companion (link)

[Thesis] Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

Title: Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

Joan Solà Ortega

February 2007

Docteur de l’Institut National Polytechnique de Toulouse

Full text: http://ethesis.inp-toulouse.fr/archive/00000528/01/sola.pdf

Saturday, March 01, 2008

[Lab meeting] Mar. 3, 2008 (Jim Yu): PAL2's arm progress

The new arms are now mounted on PAL2, and I can control them from the command line.
I will either give a demo or just show a video (if I can fix some basic parameters in the next few days).

CMU RI PhD Thesis Proposal: Pairwise Constraints for Matching, Perceptual Grouping and Recognition

Pairwise Constraints for Matching, Perceptual Grouping and Recognition

Marius Leordeanu
Robotics Institute
Carnegie Mellon University

10 March 2008

Abstract:
Object category recognition is a challenging problem in computer vision, which currently receives a growing interest in the field. This problem is almost ill-posed, because there is no formal definition of what constitutes an object category. While people largely agree on common, useful categories, it is still not clear which are the objects' properties that help us group them into such categories. In this thesis we represent the object category models as graphs of features, and focus mainly on the second order relationships between them: pairwise category-dependent (e.g. shape) as well as pairwise perceptual grouping constraints (e.g. geometrical and color based). The main theme of this thesis is that higher order relationships between model features are more important for category recognition than local, first order features. We present several novel algorithms that take full advantage of such pairwise constraints. Firstly, we present our spectral matching algorithm for the Quadratic Assignment Problem (also known as Graph Matching), along with a novel, efficient method for learning the pairwise parameters. Secondly, we present a novel optimization method which can handle nonlinear, complex functions, and present some of its applications in the context of our work. Thirdly, we discuss our object category recognition approach based on shape alone, which uses pairwise geometric constraints only. Next, we explore ways (based on both color and geometry) to establish perceptual grouping relationships between pairs of features, which are category independent. And finally, we talk about how we plan on combining both the category dependent and the perceptual relationships in order to perform object category recognition.

Further Details

A copy of the thesis proposal document can be found at http://www.cs.cmu.edu/~manudanu/marius_proposal.pdf.

Thesis Committee

* Martial Hebert, Chair
* Rahul Sukthankar
* Fernando De la Torre
* David Lowe, University of British Columbia

Friday, February 29, 2008

[Lab meeting] Mar. 3, 2008 (Atwood): Hidden-state Conditional Random Field

I will present my experiments on hand postures using a Hidden Conditional Random Field, along with a related paper from IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.

paper link

Paper Abstract:

We present a discriminative latent variable model for classification problems in structured domains where inputs can be represented by a graph of local observations. A hidden-state Conditional Random Field framework learns a set of latent variables conditioned on local features. Observations need not be independent and may overlap in space and time. We evaluate our model on object detection and gesture recognition tasks.

[Paper] Enhanced Sound Localization

Title: Enhanced Sound Localization
IEEE Transactions on Systems, Man and Cybernetics, Part B
June 2004

Abstract:

A new approach to sound localization, known as enhanced sound localization, is introduced, offering two major benefits over state-of-the-art algorithms. First, higher localization accuracy can be achieved compared to existing methods. Second, an estimate of the source orientation is obtained jointly, as a consequence of the proposed sound localization technique. The orientation estimates and improved localizations are a result of explicitly modeling the various factors that affect a microphone's level of access to different spatial positions and orientations in an acoustic environment. Three primary factors are accounted for, namely the source directivity, microphone directivity, and source-microphone distances. Using this model of the acoustic environment, several different enhanced sound localization algorithms are derived. Experiments are carried out in a real environment whose reverberation time is 0.1 seconds, with the average microphone SNR ranging between 10-20 dB. Using a 24-element microphone array, a weighted version of the SRP-PHAT algorithm is found to give an average localization error of 13.7 cm with 3.7% anomalies, compared to 14.7 cm and 7.8% anomalies with the standard SRP-PHAT technique.
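For readers unfamiliar with the SRP-PHAT baseline named in the abstract, the building block is the phase-transform-weighted cross-correlation (GCC-PHAT) between a pair of microphones; SRP-PHAT accumulates such correlations over microphone pairs for each candidate source position. The sketch below shows only the pairwise delay estimate. It is a minimal illustration, not the paper's enhanced or weighted algorithm, and the function name `gcc_phat` and the synthetic test signal are assumptions made for this example.

```python
import numpy as np


def gcc_phat(sig, ref, fs):
    """Estimate the delay of `sig` relative to `ref`, in seconds, with GCC-PHAT.

    A positive result means `sig` arrives later than `ref`.
    """
    n = len(sig) + len(ref)  # zero-pad so the circular correlation does not wrap
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)
    cross = sig_f * np.conj(ref_f)
    # PHAT weighting: discard magnitude and keep only the phase.
    cross = cross / (np.abs(cross) + 1e-12)
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs


if __name__ == "__main__":
    fs = 16000
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(1024)
    delay_samples = 5
    sig = np.concatenate((np.zeros(delay_samples), ref))  # delayed copy of ref
    print("estimated delay (samples):", gcc_phat(sig, ref, fs) * fs)
```

Given the microphone geometry and the speed of sound, pairwise delays like this one can be mapped to candidate source positions; that mapping is outside the scope of this sketch.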

[Lab meeting] Mar. 3rd, 2008 (Yi-liu Chao): Progress Report on Monocular Visual SLAMMOT

I'm going to present my progress in monocular visual SLAMMOT focussing primarily on the moving object detection method.

Thursday, February 28, 2008

MIT news: Learning about brains from computers, and vice versa

Learning about brains from computers, and vice versa

David Chandler, MIT News Office
February 16, 2008

For many years, Tomaso Poggio's lab at MIT ran two parallel lines of research. Some projects were aimed at understanding how the brain works, using complex computational models. Others were aimed at improving the abilities of computers to perform tasks that our brains do with ease, such as making sense of complex visual images.

But recently Poggio has found that the work has progressed so far, and the two tasks have begun to overlap to such a degree, that it's now time to combine the two lines of research.

See the full article.

Sunday, February 24, 2008

[Lab meeting] Feb. 25th, 2008 (Der-Yeuan Yu): Indoor 3D Mapping Progress Report

I will give a brief report on my progress in constructing an indoor 3D environment map. The hardware setup will be two orthogonally placed SICK S200 laser scanners mounted on PAL5. One of the main challenges is matching the maps of different floors together. Below are some interesting references.

C. Früh and A. Zakhor, "Fast 3D model generation in urban environments", in International Conference on Multisensor Fusion and Integration for Intelligent Systems 2001, Baden-Baden, Germany, August 2001, p. 165-170.
http://www-video.eecs.berkeley.edu/papers/frueh/mfi2001.pdf

Zhao, H. and Shibasaki, R., "Reconstructing Urban 3D Model using Vehicle-borne Laser Range Scanners", 3-D Digital Imaging and Modeling, 2001. Proceedings, p. 349-356.
http://shiba.iis.u-tokyo.ac.jp/member/current/zhao/pub/3dim2001.pdf

Saturday, February 23, 2008

[Lab meeting] 02/25/08 (Ekker): 3D-Odometry for rough terrain-Towards real 3D navigation

Pierre Lamon and Roland Siegwart
Swiss Federal Institute of Technology, Lausanne (EPFL)
Pierre.Lamon@epfl.ch, Roland.Siegwart@epfl.ch
From: Robotics and Automation, 2003. Proceedings. ICRA '03. IEEE International
Abstract
Up to recently autonomous mobile robots were mostly designed to run within an indoor, yet partly structured and flat, environment. In rough terrain many problems arise and position tracking becomes more difficult. The robot has to deal with wheel slippage and large orientation changes. In this paper we will first present the recent developments on the off-road rover Shrimp. Then a new method, called 3D-Odometry, which extends the standard 2D odometry to the 3D space will be developed. Since it accounts for transitions, the 3D-Odometry provides better position estimates. It will certainly help to go towards real 3D navigation for outdoor robots.
fulltext

Friday, February 22, 2008

[ML Lunch] Shuheng Zhou on High Dimensional Sparse Regression and Structure Estimation

Speaker: Shuheng Zhou
Title: High Dimensional Sparse Regression and Structure Estimation
Venue: NSH 1507
Date: Monday February 25
Time: 12:00 noon

Abstract:
Recent research has demonstrated that sparsity is a powerful technique in signal reconstruction and in statistical inference. Recent work shows that $\ell_1$-regularized least squares regression can accurately estimate a sparse model from n noisy samples in $p$ dimensions, even if p is much larger than n. My talk focuses on studying the role of sparsity in high dimensional regression when the original noisy samples are compressed, and on structure estimation in Gaussian graphical models when the graphs evolve over time.

In high-dimensional regression, the sparse object is a vector $\beta$ in $Y = X \beta + \epsilon$, where X is an n by p matrix such that $n << p$. The system is underdetermined when $p > n$, even for the case when $\epsilon = 0$. However, when the vector $\beta$ is sparse, one can recover an empirical $\hat \beta$ that is consistent in terms of its support with the true $\beta$. In joint work with John Lafferty and Larry Wasserman, we studied the regression problem under the setting that the original n input variables are compressed by a random Gaussian ensemble to m examples in $p$ dimensions, where m << n or p. A primary motivation for this compression procedure is to anonymize the data and preserve privacy by revealing little information about the original data. We established sufficient mutual incoherence conditions on X, under which a sparse linear model can be successfully recovered from the compressed data. We characterized the number of random projections that are required for $\ell_1$-regularized compressed regression to identify the nonzero coefficients in the true model with probability approaching one. In addition, we showed that $\ell_1$-regularized compressed regression asymptotically predicts as well as an oracle linear model, a property called ``persistence''. Finally, we established upper bounds on the mutual information between the compressed and uncompressed data that decay to zero.
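
As a rough illustration of the compressed regression setting described above, the following sketch compresses a synthetic design matrix and response with a random Gaussian matrix and then fits an $\ell_1$-regularized least squares model to the compressed data. Everything here is an assumption made for illustration: the problem sizes, the noise level, the penalty `lam`, the helper names, and the use of plain iterative soft-thresholding (ISTA) as the solver. It is not the procedure or the analysis from the talk.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(Z, w, lam, n_iter=2000):
    """Minimize (1 / (2m)) * ||w - Z b||^2 + lam * ||b||_1 with ISTA."""
    m, p = Z.shape
    lipschitz = np.linalg.norm(Z, 2) ** 2 / m  # largest eigenvalue of Z^T Z / m
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = Z.T @ (Z @ b - w) / m
        b = soft_threshold(b - grad / lipschitz, lam / lipschitz)
    return b


def compressed_lasso_demo(seed=0, n=400, p=100, m=80, lam=0.2):
    """Hypothetical demo: recover a 3-sparse beta from m compressed examples."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    support = np.array([3, 17, 42])            # illustrative nonzero positions
    beta[support] = [5.0, -4.0, 6.0]
    y = X @ beta + 0.1 * rng.standard_normal(n)
    # Random Gaussian compression: only (Phi X, Phi y) is handed to the solver.
    Phi = rng.standard_normal((m, n)) / np.sqrt(n)
    beta_hat = lasso_ista(Phi @ X, Phi @ y, lam)
    return support, beta, beta_hat
```

The solver never sees X or y directly, only their m-row projections, which is the sense in which the original data stay hidden in this toy setup.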

Undirected graphs are often used to describe high dimensional distributions. Under sparsity conditions, the graph can be estimated using $L_1$ penalization methods. However, current methods assume that the data are independent and identically distributed. If the distribution---and hence the graph---evolves over time then the data are no longer identically distributed. In the second part of the talk, I show how to estimate the sequence of graphs for non-identically distributed data and establish some theoretical results on convergence rate in the predictive risks and the Frobenius norm of the inverse covariance matrix. This is joint work with John Lafferty and Larry Wasserman.

Thursday, February 21, 2008

Robotics Institute Thesis Proposal : Mapping Large Urban Environments with GPS-Aided SLAM

Mapping Large Urban Environments with GPS-Aided SLAM
Justin Carlson
Robotics Institute
Carnegie Mellon University

Place and time
NSH 3305
9:00 AM 27 Feb 2008

Abstract
Simultaneous Localization and Mapping (SLAM) has been an active area of research for several decades, and has become a foundation of indoor mobile robotics. Although the scale and quality of results have improved markedly in that time period, no current technique can effectively handle city-sized urban areas.
The Global Positioning System (GPS) is an extraordinarily useful source of localization information. Unfortunately, the noise characteristics of the system are complex, arising from a large number of sources, some of which have large autocorrelation. Incorporation of GPS signals into SLAM algorithms requires using low-level system information and explicit models of the underlying system to appropriately make use of the information. The potential benefits of combining GPS and SLAM include increased robustness, increased scalability, and improved accuracy of localization.
This proposal will present a theoretical background for GPS-SLAM fusion, initial results in simulation, and initial results using data gathered near the Carnegie Mellon Qatar Campus. Future work will look into specific GPS-SLAM pairings, and demonstrate the ability to generate large-scale maps of urban areas.

Further Details
A copy of the thesis proposal document can be found at http://www.cs.cmu.edu/~justinca/jcarlson_proposal.pdf.

Thesis Committee
Charles Thorpe, Chair
Brett Browning
Martial Hebert
Frank Dellaert, Georgia Institute of Technology

Friday, February 01, 2008

CMU Intelligence Seminar: Understanding Shape using Probabilistic Correspondence

Title: Understanding Shape using Probabilistic Correspondence

Daphne Koller
Stanford University

Faculty Host: Carlos Guestrin

Physical objects in a given class often have a characteristic shape: we can all recognize a giraffe or a coffee mug even from a simple line drawing. This talk describes a characterization of object shape, both in 3D and in 2D, as a probabilistic graphical model, and demonstrates its application to problems in both vision and graphics. Our shape modeling framework encompasses significant variation both of general object shape and of object pose. We show how to learn this model from a collection of unlabeled instances of object shape. A key building block in this approach is the correspondence task, where we map points in the shape of one object to the points in another. We describe a probabilistic formulation of this task and solutions for addressing it. We also present a method for automatically decomposing a shape into its articulated parts, and for learning a probabilistic model for its shape variation. Finally, we present applications of this framework to a variety of tasks. In the context of graphics, we show applications to shape completion and to shape synthesis from motion capture data. In the context of vision, we show how shape models can be used to precisely outline objects in a cluttered image. We also show how a semantically consistent shape model for an object class, learned from an unlabeled set of object shapes, can be used, with only a handful of labeled instances, to accurately answer semantic queries such as whether a cheetah is running or whether an airplane is taking off. Thus, a more detailed model of object shape can be used as a building block in semantic interpretation of the physical world.

Speaker Bio
Daphne Koller received her BSc and MSc degrees from the Hebrew University of Jerusalem, Israel, and her PhD from Stanford University in 1993. After a two-year postdoc at Berkeley, she returned to Stanford, where she is now a Professor in the Computer Science Department. Her main research interest is in developing and using machine learning and probabilistic methods to model and analyze complex domains. Her current research projects include models in computational biology and in reasoning about the physical world. Daphne Koller is the author of over 100 refereed publications, which have appeared in venues spanning Science, Nature Genetics, the Journal of Games and Economic Behavior, and a variety of conferences and journals in AI and Computer Science. She was the program co-chair of the NIPS 2007 and UAI 2001 conferences, and has served on numerous program committees and as associate editor of the Journal of Artificial Intelligence Research and of the Machine Learning Journal. She was awarded the Arthur Samuel Thesis Award in 1994, the Sloan Foundation Faculty Fellowship in 1996, the ONR Young Investigator Award in 1998, the Presidential Early Career Award for Scientists and Engineers (PECASE) in 1999, the IJCAI Computers and Thought Award in 2001, the Cox Medal for excellence in fostering undergraduate research at Stanford in 2003, and the MacArthur Foundation Fellowship in 2004.

Tuesday, January 22, 2008

CMU RI seminar: Robust Sensor Placements, Active Learning and Submodular Functions

Title: Robust Sensor Placements, Active Learning and Submodular Functions
Speaker: Carlos Guestrin, Carnegie Mellon University
Date: Jan 25, 2008

In this talk, we tackle a fundamental problem that arises when using sensors to monitor the ecological condition of rivers and lakes, the network of pipes that bring water to our taps, or the activities of an elderly individual when sitting on a chair: Where should we place the sensors in order to make effective and robust predictions?

Optimizing the informativeness of the observations collected by the sensors is an NP-hard problem, even in the simplest settings. We will first identify a fundamental property of sensing tasks, submodularity, an intuitive diminishing returns property. By exploiting submodularity, we develop effective approximation algorithms for the placement problem which have strong theoretical guarantees in terms of the quality of the solution. These algorithms address settings where, in addition to sensing, nodes must maintain effective wireless connectivity, the data may be collected by mobile robots, or we seek to have solutions that are robust to adversaries.
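
To make the diminishing-returns idea concrete, here is a minimal sketch of greedy selection for a simple submodular objective. It is an illustration under stated assumptions, not the algorithms from the talk: the objective is plain set coverage (how many regions at least one chosen sensor observes) rather than an informativeness measure, and the function name and the `coverage` dictionary are hypothetical. For monotone submodular objectives under a cardinality constraint, greedy selection of this kind is classically known to achieve at least a (1 - 1/e) fraction of the optimal value.

```python
def greedy_placement(coverage, k):
    """Pick up to k locations, each time taking the largest marginal gain.

    coverage maps a candidate location to the set of regions it observes.
    Returns the chosen locations (in order) and the regions they cover.
    """
    chosen, covered = [], set()
    for _ in range(k):
        best, best_gain = None, 0
        for loc in sorted(coverage):            # sorted for deterministic ties
            if loc in chosen:
                continue
            gain = len(coverage[loc] - covered)  # marginal gain shrinks as covered grows
            if gain > best_gain:
                best, best_gain = loc, gain
        if best is None:                         # nothing adds new coverage
            break
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered


# Hypothetical candidate locations and the regions each one would observe.
example_coverage = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5},
    "C": {5, 6},
    "D": {1, 6},
}
```

With `k=2` on the example data, the first pick is the location covering the most regions, and the second pick is whichever location adds the most regions not already covered, which need not be the second-largest set.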

We demonstrate our approach on several real-world settings, including data from real deployments, from a built activity recognition chair, from stories propagating through blogs, and from a sensor placement competition.

This talk is primarily based on joint work with Andreas Krause.

Speaker Biography: Carlos Guestrin's current research spans the areas of planning, reasoning and learning in uncertain dynamic environments, focusing on applications in sensor networks. He is an assistant professor in the Machine Learning and in the Computer Science Departments at Carnegie Mellon University. Previously, he was a senior researcher at the Intel Research Lab in Berkeley. Carlos received his MSc and PhD in Computer Science from Stanford University in 2000 and 2003, respectively, and a Mechatronics Engineer degree from the Polytechnic School of the University of Sao Paulo, Brazil, in 1998. Carlos Guestrin's work has received awards at a number of conferences and a journal: KDD 2007, IPSN 2005 and 2006, VLDB 2004, NIPS 2003 and 2007, UAI 2005, ICML 2005, and JAIR in 2007. He is also a recipient of the NSF Career Award, Alfred P. Sloan Fellowship, IBM Faculty Fellowship, the Siebel Scholarship and the Stanford Centennial Teaching Assistant Award.

[Lab meeting] Jan. 22nd, 2008 (Kuo-Hwei Lin): Mathematical Model Derivation of SLAMMOT

I will present the mathematical model of SLAMMOT, without the assumption "measurement can be decomposed into stationary and moving objects".

[Lab meeting] 22.01.2008(Andi) Extrinsic self calibration of a camera and a 3D laser range finder from natural scenes

Authors: Scaramuzza, Davide; Harati, Ahad; Siegwart, Roland

Autonomous Systems Laboratory (ASL) at the Swiss Federal Institute of Technology Zurich (ETH), Switzerland;

From: Intelligent Robots and Systems, 2007. IROS

Abstract: In this paper, we describe a new approach for the extrinsic calibration of a camera with a 3D laser range finder, that can be done on the fly. This approach does not require any calibration object. Only few point correspondences are used, which are manually selected by the user from a scene viewed by the two sensors. The proposed method relies on a novel technique to visualize the range information obtained from a 3D laser scanner. This technique converts the visually ambiguous 3D range information into a 2D map where natural features of a scene are highlighted. We show that by enhancing the features the user can easily find the corresponding points of the camera image points. Therefore, visually identifying laser camera correspondences becomes as easy as image pairing. Once point correspondences are given, extrinsic calibration is done using the well-known PnP algorithm followed by a non linear refinement process. We show the performance of our approach through experimental results. In these experiments, we will use an omnidirectional camera. The implication of this method is important because it brings 3D computer vision systems out of the laboratory and into practical use.

LINK

Monday, January 21, 2008

[Lab meeting] Jan. 22nd, 2008 (Stanley): Apprenticeship Learning via Inverse Reinforcement Learning

Author: Pieter Abbeel, Andrew Y. Ng
From: Proceedings of the 21st International Conference on Machine Learning, Banff, Canada, 2004.
Link

Abstract: We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we want to learn to perform. This setting is useful in applications (such as the task of driving) where it may be difficult to write down an explicit reward function specifying exactly how different desiderata should be traded off. We think of the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and give an algorithm for learning the task demonstrated by the expert. Our algorithm is based on using "inverse reinforcement learning" to try to recover the unknown reward function. We show that our algorithm terminates in a small number of iterations, and that even though we may never recover the expert's reward function, the policy output by the algorithm will attain performance close to that of the expert, where here performance is measured with respect to the expert's unknown reward function.
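
The claim that the learned policy can perform close to the expert without recovering the reward can be seen from a short calculation. The notation below is assumed for illustration and does not appear in the abstract: $\phi$ is the known feature map, $w$ the unknown weight vector with $\|w\|_2 \le 1$, $\gamma$ a discount factor, $\mu(\pi)$ the discounted feature expectations of a policy $\pi$, and $\mu_E$ those of the expert.

```latex
\begin{align*}
R(s) &= w^{\top} \phi(s), \\
\mu(\pi) &= \mathbb{E}\Big[ \sum_{t=0}^{\infty} \gamma^{t} \phi(s_t) \,\Big|\, \pi \Big], \\
\mathbb{E}\Big[ \sum_{t=0}^{\infty} \gamma^{t} R(s_t) \,\Big|\, \pi \Big] &= w^{\top} \mu(\pi), \\
\big| w^{\top} \mu(\pi) - w^{\top} \mu_E \big|
  &\le \|w\|_2 \, \|\mu(\pi) - \mu_E\|_2
  \le \|\mu(\pi) - \mu_E\|_2 .
\end{align*}
```

Under these assumptions, any policy whose feature expectations are within $\epsilon$ of the expert's has value within $\epsilon$ of the expert's under the true reward, whatever $w$ is; the last step is the Cauchy-Schwarz inequality.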

Saturday, January 12, 2008

CMU Intelligence Seminar: Steps towards human-level AI

Link: http://www.cs.cmu.edu/~iseminar/

Tuesday 1/15, 3:30pm in Wean 5409
Speaker: Kenneth D. Forbus, Northwestern University
Title: Steps towards human-level AI

Faculty Host: Scott Fahlman
Appointments: Barbara Grandillo <bag+@cs.cmu.edu>

Abstract
A confluence of three factors is changing the kinds of AI experiments that can be done: (1) increasing computational power, (2) off-the-shelf representational resources, and (3) steady scientific progress, both in AI and in other areas of Cognitive Science. Consequently, I believe it is time for the field to spend more of its energy experimenting with larger-scale systems, and attempting to capture larger constellations of human cognitive abilities. This talk will summarize experiments with two larger-scale systems we have built at Northwestern: (1) Learning to solve AP Physics problems, in the Companions cognitive architecture. In an evaluation conducted by the Educational Testing Service, a Companion showed it was able to transfer knowledge across multiple types of variant problems. (2) Learning by reading, using the Learning Reader prototype. Learning Reader includes a novel process, rumination, where the system improves its learning by asking itself questions about material it has read.

Speaker Bio
Kenneth D. Forbus is the Walter P. Murphy Professor of Computer Science and Professor of Education at Northwestern University. His research interests include qualitative reasoning, analogy and similarity, sketch understanding, spatial reasoning, cognitive simulation, reasoning system design, articulate educational software, and the use of AI in computer gaming. He received his degrees from MIT (Ph.D. in 1984). He is a Fellow of the American Association for Artificial Intelligence, the Cognitive Science Society, and the Association for Computing Machinery. He serves on the editorial boards of Cognitive Science, the AAAI Press, and on the Advisory Board of the Journal of Game Development.

[VASC Seminar] Fast IKSVM and other Generalizations of Linear SVMs

Title: Fast IKSVM and other Generalizations of Linear SVMs
Speaker: Alexander C. Berg, Yahoo! Research

Date: Monday, Jan 14

Abstract:

We show that one can build histogram intersection kernel SVMs (IKSVMs)
with runtime complexity of the classifier logarithmic in the number of
support vectors as opposed to linear for the standard approach. We
further show that by pre-computing auxiliary tables we can construct an
approximate classifier with constant runtime and space requirements,
independent of the number of support vectors, with negligible loss in
classification accuracy on various tasks.

This result is based on noticing that the IKSVM decision function is a sum
of piece-wise linear functions of each coordinate. We generalize this
notion and show that the resulting classifiers can be learned efficiently.

The practical results are classifiers strictly more general than linear
SVMs that in practice provide better classification performance for a
range of tasks all at reasonable computational cost.

We also introduce novel features based on multi-level histograms of
oriented edge energy and present experiments on various detection
datasets. On the INRIA pedestrian dataset an approximate IKSVM classifier
based on these features has a miss rate 13% lower at 10^-6 False Positive
Per Window than the linear SVM detector of Dalal and Triggs while being
only twice as slow for classification. On the Daimler Chrysler pedestrian
dataset IKSVM gives comparable accuracy to the best results (based on
quadratic SVMs), while being 15x faster. In these experiments our
approximate IKSVM is up to 2000x faster than a standard implementation and
requires 200x less memory. Finally we show that a 50x speed up is possible
using approximate IKSVM based on spatial pyramid features on the Caltech
101 dataset with negligible loss of accuracy.
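
The observation that the IKSVM decision function is a sum of piecewise-linear functions of each coordinate can be sketched as follows. With the intersection kernel K(x, z) = sum_i min(x_i, z_i), each coordinate contributes h_i(t) = sum_j c_j min(t, s_ji), which splits into a sum over support values at most t and t times a sum of coefficients over the rest. Sorting the support values per dimension and precomputing those partial sums leaves one binary search per coordinate. The class name, argument names, and random test data are assumptions for illustration; this is only the exact logarithmic-time evaluation, not the approximate constant-time tables or the speaker's implementation.

```python
import numpy as np


class FastIKSVM:
    """Exact intersection-kernel decision function with O(log n_sv) work per dimension."""

    def __init__(self, support_vectors, coef, bias=0.0):
        S = np.asarray(support_vectors, dtype=float)    # shape (n_sv, d)
        c = np.asarray(coef, dtype=float)               # shape (n_sv,), e.g. alpha_j * y_j
        order = np.argsort(S, axis=0)                   # sort each dimension independently
        self.sorted_vals = np.take_along_axis(S, order, axis=0)
        c_sorted = c[order]                             # coefficients in per-dimension order
        zeros = np.zeros((1, S.shape[1]))
        # prefix[r, i]: sum of c_j * s_ji over the r smallest support values in dimension i
        self.prefix = np.vstack([zeros, np.cumsum(c_sorted * self.sorted_vals, axis=0)])
        # suffix[r, i]: sum of c_j over all but the r smallest support values in dimension i
        total = c_sorted.sum(axis=0, keepdims=True)
        self.suffix = total - np.vstack([zeros, np.cumsum(c_sorted, axis=0)])
        self.bias = bias

    def decision_function(self, x):
        x = np.asarray(x, dtype=float)
        out = self.bias
        for i, t in enumerate(x):
            # r = number of support values <= t in dimension i (binary search)
            r = np.searchsorted(self.sorted_vals[:, i], t, side="right")
            out += self.prefix[r, i] + t * self.suffix[r, i]
        return out


def naive_decision(support_vectors, coef, bias, x):
    """Reference evaluation, linear in the number of support vectors."""
    S = np.asarray(support_vectors, dtype=float)
    kernel_values = np.minimum(S, np.asarray(x, dtype=float)).sum(axis=1)
    return bias + float(np.dot(coef, kernel_values))
```

Comparing `FastIKSVM(...).decision_function(x)` with `naive_decision(...)` on the same inputs is a quick way to check that the two evaluations agree up to floating-point error.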

Related Papers:
Histogram intersection kernel for image classification
Generalized Histogram Intersection Kernel for Image Recognition

Biography:
Alex Berg's research concerns computational visual recognition. He is a
research scientist at Yahoo! Research and a visiting scholar at U.C.
Berkeley. He has worked on general object recognition in images, action
recognition in video, human pose identification in images, image parsing,
face recognition, image search, and machine learning for computer vision.
His Ph.D. at U.C. Berkeley developed a novel approach to deformable
template matching. He earned a BA and MA in Mathematics from Johns
Hopkins University.

Wednesday, January 09, 2008

CMU ML/Google Seminar: RECENT DEVELOPMENTS IN MARKOV LOGIC

January 14, 2008
Speaker: Pedro Domingos Associate Professor, University of Washington
Title: RECENT DEVELOPMENTS IN MARKOV LOGIC

Abstract:
Intelligent agents must be able to handle the complexity and uncertainty of the real world. Logical AI has focused mainly on the former, and statistical AI on the latter. Markov logic combines the two by attaching weights to first-order formulas and viewing them as templates for features of Markov networks. Learning and inference algorithms for Markov logic are available in the open-source Alchemy system. In this talk I will discuss some of the latest developments in Markov logic, including lifted first-order probabilistic inference, relational decision theory, statistical predicate invention, efficient second-order algorithms for weight learning, extending the representation to continuous features, and transferring learned knowledge across domains. I will give an overview of recent and ongoing applications in natural language processing, robot mapping, social network analysis, and computational biology, and conclude with a discussion of open problems and exciting research directions. (Joint work with Jesse Davis, Stanley Kok, Daniel Lowd, Hoifung Poon, Aniruddh Nath, Matt Richardson, Parag Singla, Marc Sumner, and Jue Wang.)
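
A toy example may help show what attaching a weight to a first-order formula means. In the standard formulation of Markov logic, a possible world x has probability proportional to exp(sum_i w_i n_i(x)), where n_i(x) counts the true groundings of formula i in x. The sketch below enumerates every world for a single formula, Smokes(x) => Cancer(x), over a small set of constants. The predicates, constants, weight, and function names are all hypothetical, and brute-force enumeration is used only because the domain is tiny; it does not reflect how the Alchemy system performs inference.

```python
import itertools
import math


def count_true_groundings(world, constants):
    """Number of constants c for which Smokes(c) => Cancer(c) holds in this world."""
    return sum(
        1 for c in constants
        if (not world[("Smokes", c)]) or world[("Cancer", c)]
    )


def world_distribution(constants, weight):
    """Enumerate all truth assignments and weight each by exp(weight * count)."""
    atoms = [(pred, c) for pred in ("Smokes", "Cancer") for c in constants]
    worlds, scores = [], []
    for values in itertools.product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        worlds.append(world)
        scores.append(math.exp(weight * count_true_groundings(world, constants)))
    z = sum(scores)                                   # partition function
    return [(world, score / z) for world, score in zip(worlds, scores)]


def conditional(dist, query, evidence):
    """P(query atom is true | evidence atom is true) under the enumerated distribution."""
    joint = sum(p for world, p in dist if world[query] and world[evidence])
    marginal = sum(p for world, p in dist if world[evidence])
    return joint / marginal
```

In this toy model, P(Cancer(Anna) | Smokes(Anna)) works out to 1 / (1 + exp(-w)): a world that violates the formula is less probable rather than impossible, and the weight controls how much less.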

Speaker Bio: I received an undergraduate degree (1988) and M.S. in Electrical Engineering and Computer Science (1992) from IST, in Lisbon. I received an M.S. (1994) and Ph.D. (1997) in Information and Computer Science from the University of California at Irvine. I spent two years as an assistant professor at IST, before joining the faculty of the University of Washington in 1999. I'm the author or co-author of over 100 technical publications in machine learning, data mining, and other areas. I'm a member of the editorial board of the Machine Learning journal and the advisory board of JAIR, and a co-founder of the International Machine Learning Society. I was program co-chair of KDD-2003, and I've served on the program committees of AAAI, ICML, IJCAI, KDD, SIGMOD, WWW, and others. I've received a Sloan Fellowship, an NSF CAREER Award, a Fulbright Scholarship, an IBM Faculty Award, two KDD best paper awards, and other distinctions.

CMU talk: The Maximum Entropy Principle

The Maximum Entropy Principle
Miroslav Dudik, Postdoctoral Researcher, MLD, CMU
The link

Abstract
The maximum entropy principle (maxent) has been applied to solve density estimation problems in physics (since 1871), statistics and information theory (since 1957), as well as machine learning (since 1993). According to this principle, we should represent available information as constraints and among all the distributions satisfying the constraints choose the one of maximum entropy. In this overview I will contrast various motivations of maxent with the main focus on applications in statistical inference. I will discuss the equivalence between robust Bayes, maximum entropy, and regularized maximum likelihood estimation, and the implications for principled statistical inference. Finally, I will describe how maxent has been applied to model natural languages and geographic distributions of species.
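
As a small worked example of choosing the maximum entropy distribution among those satisfying a constraint, consider a six-sided die whose average roll is known. Among all distributions on {1, ..., 6} with a given mean, the maximum entropy one has the exponential form p_k proportional to exp(lambda * k), and lambda can be found by bisection because the mean increases with lambda. The die setting, the function name, and the bisection bounds are assumptions chosen for illustration and are not taken from the talk.

```python
import math


def maxent_given_mean(values, target_mean, tol=1e-12):
    """Maximum entropy distribution over `values` with the given mean.

    Uses the exponential-family form p(v) proportional to exp(lam * v)
    and solves for lam by bisection on the resulting mean.
    """
    values = list(values)

    def dist(lam):
        weights = [math.exp(lam * v) for v in values]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(v * p for v, p in zip(values, dist(lam)))

    lo, hi = -50.0, 50.0        # assumed bracket; wide enough for small integer values
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))
```

With a target mean of 3.5 the result is the uniform distribution, since no information beyond normalization has been imposed; with a larger target mean the probabilities tilt smoothly toward the higher faces.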

Monday, January 07, 2008

Lab Meeting January 8th, 2008 (Yu-Hsiang): CRF-Matching: Conditional Random Fields for Feature-Based Scan Matching

Author:
Fabio Ramos, Dieter Fox, Hugh Durrant-Whyte

Abstract :
Matching laser range scans observed at different points in time is a crucial component of many robotics tasks, including mobile robot localization and mapping. While existing techniques such as the Iterative Closest Point (ICP) algorithm perform well under many circumstances, they often fail when the initial estimate of the offset between scans is highly uncertain. This paper presents a novel approach to 2D laser scan matching. CRF-Matching generates a Conditional Random Field (CRF) to reason about the joint association between the measurements of the two scans. The approach is able to consider arbitrary shape and appearance features in order to match laser scans. The model parameters are learned from labeled training data. Inference is performed efficiently using loopy belief propagation. Experiments using data collected by a car navigating through urban environments show that CRF-Matching is able to reliably and efficiently match laser scans even when no a priori knowledge about their offset is given. They additionally demonstrate that our approach can seamlessly integrate camera information, thereby further improving performance.

link

New Laboratory Robot Can Lift The Burden Of Boring Work

We have been hearing and reading for a long time about assistant robots that silently and carefully zip around humans to liberate them from burdensome work. Nevertheless, a truly convincing high-tech assistant with a gripper arm is not yet commercially available. LISA – short for life science assistant – is intended to change that.
...
LISA is equipped with a sensing gripper arm designed to hold plastic dishes but not injure human beings. Its “artificial skin” consists of conductive foam and textiles and intelligent signal processing electronics. This skin immediately senses and cushions inadvertent jostling. A thermographic camera additionally registers body heat and indicates for instance if a human colleague’s hand is in the way.
...
Link

Lab Meeting January 8th, 2008 (Yu-chun): Socially Distributed Perception: GRACE plays Social Tag at AAAI 2005

M.P. Michalowski, S. Sabanovic, C.F. DiSalvo, D. Busquets Font, L.M. Hiatt, N. Melchior, and R. Simmons

Autonomous Robots, 2007

This paper presents a robot search task (social tag) that uses social interaction, in the form of asking for help, as an integral component of task completion. Socially distributed perception is defined as a robot's ability to augment its limited sensory capacities through social interaction. We describe the task of social tag and its implementation on the robot GRACE for the AAAI 2005 Mobile Robot Competition & Exhibition. We then discuss our observations and analyses of GRACE's performance as a situated interaction with conference participants. Our results suggest we were successful in promoting a form of social interaction that allowed people to help the robot achieve its goal. Furthermore, we found that different social uses of the physical space had an effect on the nature of the interaction. Finally, we discuss the implications of this design approach for effective and compelling human-robot interaction, considering its relationship to concepts such as dependency, mixed initiative, and socially distributed cognition.

link

Tuesday, January 01, 2008

[CMU RI Thesis] On the Multi-View Fitting and Construction of Dense Deformable Face Models

Title: On the Multi-View Fitting and Construction of Dense Deformable Face Models

Author: K. Ramnath

Abstract:
Active Appearance Models (AAMs) are generative, parametric models that have been successfully used in the past to model deformable objects such as human faces. Fitting an AAM to an image consists of minimizing the error between the input image and the closest model instance; i.e. solving a nonlinear optimization problem. In this thesis we study three important topics related to deformable face models such as AAMs: (1) multi-view 3D face model fitting, (2) multi-view 3D face model construction, and (3) automatic dense deformable face model construction.

The original AAMs formulation was 2D, but they have recently been extended to include a 3D shape model. A variety of single-view algorithms exist for fitting and constructing 3D AAMs but one area that has not been studied is multi-view algorithms. In the first part of this thesis we describe an algorithm for fitting a single AAM to multiple images, captured simultaneously by cameras with arbitrary locations, rotations, and response functions. This algorithm uses the scaled orthographic imaging model used by previous authors, and in the process of fitting computes, or calibrates, the scaled orthographic camera matrices. We also describe an extension of this algorithm to calibrate weak perspective (or full perspective) camera models for each of the cameras. In essence, we use the human face as a (nonrigid) calibration grid. We demonstrate that the performance of this algorithm is roughly comparable to a standard algorithm using a calibration grid. We then show how camera calibration improves the performance of AAM fitting.

A variety of non-rigid structure-from-motion algorithms, both single-view and multiview, have been proposed that can be used to construct the corresponding 3D non-rigid shape models of a 2D AAM. In the second part of this thesis we show that constructing a 3D face model using non-rigid structure-from-motion suffers from the Bas-Relief ambiguity and may result in a "scaled" (stretched/compressed) model. We outline a robust non-rigid motion-stereo algorithm for calibrated multi-view 3D AAM construction and show how using calibrated multi-view motion-stereo can eliminate the Bas-Relief ambiguity and yield face models with higher 3D fidelity.

An important step in computing dense deformable face models such as 3D Morphable Models (3DMMs) is to register the input texture maps using optical flow. However, optical flow algorithms perform poorly on images of faces because of the appearance and disappearance of structure such as teeth and wrinkles, and because of the non-Lambertian, textureless cheek regions. In the final part of this thesis we propose a different approach to building dense face models. Our algorithm iteratively builds a face model, fits the model to the input image data, and then refines the model. The refinement consists of three steps: (1) the addition of more mesh points to increase the density, (2) image consistent re-triangulation of the mesh, and (3) refinement of the shape modes. Using a carefully collected dataset containing hidden marker ground-truth, we show that our algorithm generates dense models that are quantitatively better than those obtained using off the shelf optical flow algorithms. We also show how our algorithm can be used to construct dense deformable models automatically, starting with a rigid planar model of the face that is subsequently refined to model the non-planarity and the non-rigid components.

The full text can be found here.