Thursday, July 30, 2009

MIT Talk: Biomedical Imaging and Analysis Seminar

Biomedical Imaging and Analysis Seminar

Speaker: Arno Klein, Columbia University
Date: Friday, July 31, 2009
Time: 2:00PM to 3:00PM
Location: 32-D507
Host: Thomas Yeo, CSAIL
Contact: Thomas Yeo, 617-253-4143

Establishing correspondences across brains for the purposes of comparison and group analysis is almost universally done by registering images to one another, either directly or via a template. However, there are many registration algorithms to choose from. The first part of this talk will give an overview of a recent evaluation study comparing fully automated nonlinear deformation methods applied to brain image registration (Klein et al. 2009). That study was restricted to volume-based methods; an ongoing extension is, to the authors' knowledge, the first to directly compare some of the most accurate of these methods with surface-based registration methods, and the first to compare registrations of whole-head and de-skulled brain images. More than 6,000 registrations between 40 manually labeled brain images have been performed so far by the volume-based algorithms ART and SyN and the surface-based algorithms FreeSurfer and Spherical Demons. We used permutation tests and indifference-zone ranking to compare overlap performance for eight scenarios: ART and SyN on brain images with and without skulls; SyN, FreeSurfer, and Spherical Demons via custom templates; and FreeSurfer via its default atlas.
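To make the evaluation methodology concrete, here is a minimal sketch (not the study's actual pipeline) of the two ingredients named above: label-overlap scoring, using the Dice coefficient as one common overlap measure, and a paired sign-flip permutation test comparing two registration methods on the same image pairs. All variable names and the toy scores are illustrative assumptions.

```python
import numpy as np

def dice_overlap(labels_a, labels_b, label):
    """Dice coefficient for one anatomical label between two label images."""
    a = (labels_a == label)
    b = (labels_b == label)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 0.0

def permutation_test(scores_x, scores_y, n_perm=10000, seed=0):
    """Two-sided paired sign-flip permutation test on the mean overlap
    difference between two methods evaluated on the same image pairs."""
    rng = np.random.default_rng(seed)
    diffs = np.asarray(scores_x) - np.asarray(scores_y)
    observed = abs(diffs.mean())
    count = 0
    for _ in range(n_perm):
        signs = rng.choice([-1, 1], size=diffs.size)
        if abs((signs * diffs).mean()) >= observed:
            count += 1
    return count / n_perm

# Toy example: two methods' per-pair mean Dice scores on 10 image pairs.
method_a = [0.78, 0.81, 0.79, 0.83, 0.80, 0.77, 0.82, 0.79, 0.81, 0.80]
method_b = [0.74, 0.76, 0.75, 0.78, 0.77, 0.73, 0.79, 0.75, 0.76, 0.74]
p = permutation_test(method_a, method_b)
print(f"p-value: {p:.4f}")
```

A consistent per-pair advantage like the one in the toy data yields a small p-value, which is the kind of evidence the ranking procedure builds on.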


Wednesday, July 29, 2009

PhD Oral Exam: Large Scale Scene Matching for Graphics and Vision

July 29, 2009
James H. Hays
11:00 AM, 7220 Wean Hall
Thesis Oral
Title: Large Scale Scene Matching for Graphics and Vision

Our visual experience is extraordinarily varied and complex. The diversity of the visual world makes it difficult for computer vision to understand images and for computer graphics to synthesize visual content. But for all its richness, it turns out that the space of "scenes" might not be astronomically large. With access to imagery on an Internet scale, regularities start to emerge - for most images, there exist numerous examples of semantically and structurally similar scenes. Is it possible to sample the space of scenes so densely that one can use similar scenes to "brute force" otherwise difficult image understanding and manipulation tasks? This thesis is focused on exploiting and refining large scale scene matching to short circuit the typical computer vision and graphics pipelines for image understanding and manipulation.

First, in "Scene Completion" we patch up holes in images by copying content from matching scenes. We find scenes so similar that the manipulations are undetectable to naive viewers, and we quantify our success rate with a perceptual study. Second, in "im2gps" we estimate geographic properties and global geolocation for photos using scene matching with a database of 6 million geo-tagged Internet images. We geolocate sequences of photos four times as accurately as the single image case by modelling the global spatiotemporal statistics of photographers. We introduce a range of features for scene matching and use them, together with lazy SVM learning, to dramatically improve scene matching -- doubling the performance of single image geolocation over our baseline method. Third, we study human photo geolocation to gain insights into the geolocation problem, our algorithms, and human scene understanding. This study shows that our algorithms significantly exceed human geolocation performance. Finally, we use our geography estimates, as well as Internet text annotations, to provide context for deeper image understanding, such as object detection.
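The core operation behind all of this is nearest-neighbor retrieval in a scene-descriptor space. The sketch below is a crude stand-in for the thesis's actual features (GIST, color, texton histograms, etc.): a normalized "tiny image" descriptor with brute-force nearest-neighbor search. Function names and the toy data are assumptions for illustration.

```python
import numpy as np

def tiny_descriptor(image, size=8):
    """Downsample a grayscale image to a size x size descriptor and
    L2-normalize it -- a crude stand-in for GIST-style scene features."""
    h, w = image.shape
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    d = image[np.ix_(ys, xs)].astype(float).ravel()
    n = np.linalg.norm(d)
    return d / n if n else d

def nearest_scenes(query, database, k=3):
    """Return indices of the k database scenes closest to the query
    in descriptor space (Euclidean distance, brute force)."""
    q = tiny_descriptor(query)
    dists = [np.linalg.norm(q - tiny_descriptor(img)) for img in database]
    return np.argsort(dists)[:k]

# Toy example: the query is a noisy copy of database scene 2,
# so scene 2 should come back as the best match.
rng = np.random.default_rng(0)
database = [rng.uniform(0, 255, (64, 64)) for _ in range(5)]
query = database[2] + rng.normal(0, 5, (64, 64))
print(nearest_scenes(query, database, k=1))
```

At Internet scale the brute-force loop would of course be replaced by an approximate nearest-neighbor index; the descriptor-plus-distance structure is the part that carries over.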

Thesis Committee:
Alexei A. Efros (Chair)
Martial Hebert
Jessica K. Hodgins
Takeo Kanade
Richard Szeliski (Microsoft Research)

Saturday, July 18, 2009

Lab Meeting August 3, 2009 (Shao-Chen): Time-bounded Lattice for Efficient Planning in Dynamic Environments (ICRA09)

Title: Time-bounded Lattice for Efficient Planning in Dynamic Environments (ICRA09)
Authors: Aleksandr Kushleyev, Maxim Likhachev

For vehicles navigating initially unknown cluttered environments, current state-of-the-art planning algorithms are able to plan and re-plan dynamically-feasible paths efficiently and robustly. It is still a challenge, however, to deal well with surroundings that are both cluttered and highly dynamic. Planning under these conditions is more difficult for two reasons. First, tracking and predicting the trajectories of moving objects (e.g., cars, humans) is very noisy. Second, the planning process is computationally more expensive because of the increased dimensionality of the state-space, with time as an additional variable. Moreover, re-planning needs to be invoked more often since the trajectories of moving obstacles need to be constantly re-estimated.
In this paper, we develop a path planning algorithm that addresses these challenges. First, we choose a representation of dynamic obstacles that efficiently models their predicted trajectories and the uncertainty associated with the predictions. Second, to provide real-time guarantees on the performance of planning with dynamic obstacles, we propose a novel data structure for planning, a time-bounded lattice, that merges short-term planning in time with long-term planning without time. We demonstrate the effectiveness of the approach in both simulations with up to 30 dynamic obstacles and on real robots.
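The "merge short-term planning in time with long-term planning without time" idea can be sketched as an A* search whose state keeps a time coordinate only up to a bound T: inside the horizon, moves are checked against predicted obstacle positions; beyond it, the time coordinate is frozen and only the static map matters. This toy grid version is an illustrative assumption, not the paper's lattice of dynamically-feasible motion primitives.

```python
import heapq

def plan_time_bounded(grid, start, goal, dynamic, T):
    """A* over (x, y, t): for t < T, moves are checked against predicted
    dynamic-obstacle cells; at t >= T the state collapses to (x, y, T) and
    only the static grid is used -- a toy time-bounded lattice.
    grid: 2D list, 1 = static obstacle; dynamic: {t: set of (x, y)}.
    Returns the cost of a cheapest path, or None if none exists."""
    rows, cols = len(grid), len(grid[0])
    h = lambda x, y: abs(x - goal[0]) + abs(y - goal[1])  # admissible heuristic
    open_set = [(h(*start), 0, start[0], start[1], 0)]
    seen = set()
    while open_set:
        f, g, x, y, t = heapq.heappop(open_set)
        if (x, y) == goal:
            return g
        if (x, y, t) in seen:
            continue
        seen.add((x, y, t))
        # 4-connected moves plus waiting in place; each step costs 1.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1), (0, 0)):
            nx, ny, nt = x + dx, y + dy, min(t + 1, T)
            if not (0 <= nx < rows and 0 <= ny < cols) or grid[nx][ny]:
                continue
            if nt < T and (nx, ny) in dynamic.get(nt, set()):
                continue  # predicted collision inside the time horizon
            heapq.heappush(open_set, (g + 1 + h(nx, ny), g + 1, nx, ny, nt))
    return None

# Toy example: a dynamic obstacle occupies (0, 1) at t = 1, forcing the
# planner around it early on; beyond T = 3 predictions are ignored.
grid = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
cost = plan_time_bounded(grid, (0, 0), (2, 2), {1: {(0, 1)}}, T=3)
print(cost)
```

Capping `t` at T is what bounds the state-space blow-up: the timed portion of the search is finite regardless of path length, which is the efficiency argument the paper makes for the real lattice.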

Saturday, July 11, 2009

Lab Meeting July 13, 2009 (Alan): An Embodied Cognition Approach to Mindreading Skills for Socially Intelligent Robots (IJRR 2009)

Title: An Embodied Cognition Approach to Mindreading Skills for Socially Intelligent Robots (IJRR 2009)
Authors: Cynthia Breazeal, Jesse Gray, Matt Berlin

Future applications for personal robots motivate research into developing robots that are intelligent in their interactions with people. Toward this goal, in this paper we present an integrated socio-cognitive architecture to endow an anthropomorphic robot with the ability to infer mental states such as beliefs, intents, and desires from the observable behavior of its human partner. The design of our architecture is informed by recent findings from neuroscience and embodied cognition that reveal how living systems leverage their physical and cognitive embodiment through simulation-theoretic mechanisms to infer the mental states of others. We assess the robot's mindreading skills on a suite of benchmark tasks where the robot interacts with a human partner in a cooperative scenario and a learning scenario. In addition, we have conducted human subjects experiments using the same task scenarios to assess human performance on these tasks and to compare the robot's performance with that of people. In the process, our human subject studies also reveal some interesting insights into human behavior.


Friday, July 10, 2009

Lab Meeting 7/13, 2009 (Casey): A Closed-Form Solution to Non-Rigid Shape and Motion Recovery

Title: A Closed-Form Solution to Non-Rigid Shape and Motion Recovery (IJCV 2006)
Authors: J. Xiao, J. Chai, and T. Kanade

Abstract. Recovery of three-dimensional (3D) shape and motion of non-static scenes from a monocular video sequence is important for applications like robot navigation and human computer interaction. If every point in the scene randomly moves, it is impossible to recover the non-rigid shapes. In practice, many non-rigid objects, e.g. the human face under various expressions, deform with certain structures. Their shapes can be regarded as a weighted combination of certain shape bases. Shape and motion recovery under such situations has attracted much interest. Previous work on this problem [6, 4, 14] utilized only orthonormality constraints on the camera rotations (rotation constraints). This paper proves that using only the rotation constraints results in ambiguous and invalid solutions. The ambiguity arises from the fact that the shape bases are not unique. An arbitrary linear transformation of the bases produces another set of eligible bases. To eliminate the ambiguity, we propose a set of novel constraints, basis constraints, which uniquely determine the shape bases. We prove that, under the weak-perspective projection model, enforcing both the basis and the rotation constraints leads to a closed-form solution to the problem of non-rigid shape and motion recovery. The accuracy and robustness of our closed-form solution is evaluated quantitatively on synthetic data and qualitatively on real video sequences.
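The shape-basis model behind this line of work has a simple algebraic consequence worth seeing once: if each frame's shape is a weighted sum of K bases and projection is weak-perspective, the 2F x P matrix of tracked 2D points factors into a 2F x 3K motion-weight matrix times a 3K x P basis matrix, so its rank is at most 3K. The sketch below only verifies that rank property on synthetic data; it is not the paper's closed-form recovery algorithm, and all dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
F, P, K = 20, 30, 2          # frames, tracked points, shape bases

# Random shape bases (each 3 x P) and per-frame combination weights.
bases = rng.normal(size=(K, 3, P))
weights = rng.normal(size=(F, K))

def camera(rng):
    """Random weak-perspective camera: first two rows of a rotation."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q[:2]

W = np.zeros((2 * F, P))     # stacked 2D measurements, one 2 x P slab per frame
for f in range(F):
    R = camera(rng)
    shape = sum(weights[f, k] * bases[k] for k in range(K))  # 3 x P shape
    W[2 * f:2 * f + 2] = R @ shape

# The model predicts rank(W) <= 3K even though W is 40 x 30 here.
svals = np.linalg.svd(W, compute_uv=False)
rank = int((svals > 1e-8 * svals[0]).sum())
print(rank)
```

Factoring W by SVD recovers the motion and basis matrices only up to an invertible 3K x 3K transform; pinning that transform down is exactly what the paper's rotation and basis constraints are for.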

Lab Meeting 7/13, 2009 (fish60): Autonomous Robot Navigation in Outdoor Cluttered Pedestrian Walkways

Autonomous Robot Navigation in Outdoor Cluttered Pedestrian Walkways
Yoichi Morales, Alexander Carballo, Eijiro Takeuchi, Atsushi Aburadani, and Takashi Tsubouchi
Journal of Field Robotics 26(8), 609–635 (2009)

This paper describes an implementation of a mobile robot system for autonomous navigation in outdoor crowded walkways. The task was to navigate through unmodified pedestrian paths with people and bicycles passing by. The proposed approach proved to be robust for outdoor navigation in cluttered and crowded walkways, first on campus paths and then running the challenge course multiple times between trials and the challenge final. The paper reports experimental results and overall performance of the system. Finally, the lessons learned are discussed. The main contribution of this work is the report of a system integration approach for autonomous outdoor navigation and its evaluation.