Robot Perception and Learning

Saturday, September 06, 2008

CfP: Autonomous Robots - Special Issue on Robot Learning

===================================================

Call For Papers: Autonomous Robots - Special Issue on Robot Learning
===================================================
Quick Facts
=========
Editors: Jan Peters, Max Planck Institute for Biological Cybernetics,
Andrew Y. Ng, Stanford University
Journal: Autonomous Robots
Submission Deadline: November 8, 2008
Author Notification: March 1, 2009
Revised Manuscripts: June 1, 2009
Approximate Publication Date: 4th Quarter, 2009

Abstract
======
Creating autonomous robots that can learn to act in unpredictable environments has been a long-standing goal of robotics, artificial intelligence, and the cognitive sciences. In contrast, current commercially available industrial and service robots mostly execute fixed tasks and exhibit little adaptability. To bridge this gap, machine learning offers a myriad of methods, some of which have already been applied with great success to robotics problems. Machine learning is also likely to play an increasingly important role in robotics as we take robots out of research labs and factory floors, into the unstructured environments inhabited by humans and into other natural environments.

To carry out increasingly difficult and diverse sets of tasks, future robots will need to make proper use of perceptual stimuli such as vision, lidar, proprioceptive sensing and tactile feedback, and translate these into appropriate motor commands. In order to close this complex loop from perception to action, machine learning will be needed in various stages such as scene understanding, sensory-based action generation, high-level plan generation, and torque level motor control. Among the important problems hidden in these steps are robotic perception, perceptuo-action coupling, imitation learning, movement decomposition, probabilistic planning, motor primitive learning, reinforcement learning, model learning, motor control, and many others.

Driven by high-profile competitions such as RoboCup and the DARPA Challenges, as well as the growing number of robot learning research programs funded by governments around the world (e.g., FP7-ICT, the euCognition initiative, DARPA Legged Locomotion and LAGR programs), interest in robot learning has reached an unprecedented high point. The interest in machine learning and statistics within robotics has increased substantially, and robot applications have also become important for motivating new algorithms and formalisms in the machine learning community.

In this Autonomous Robots Special Issue on Robot Learning, we intend to outline recent successes in the application of domain-driven machine learning methods to robotics. Examples of topics of interest include, but are not limited to:

• learning models of robots, tasks or environments
• learning deep hierarchies or levels of representations from sensor & motor representations to task abstractions
• learning plans and control policies by imitation, apprenticeship and reinforcement learning
• finding low-dimensional embeddings of movement as implicit generative models
• integrating learning with control architectures
• methods for probabilistic inference from multi-modal sensory information (e.g., proprioceptive, tactile, vision)
• structured spatio-temporal representations designed for robot learning
• probabilistic inference in non-linear, non-Gaussian stochastic systems (e.g., for planning as well as for optimal or adaptive control)

From several recent workshops, it has become apparent that there is a significant body of novel work on these topics. The special issue will only focus on high quality articles based on sound theoretical development as well as evaluations on real robot systems.

Lab Meeting September 8th, 2008 (slyfox): Improving Localization Robustness in Monocular SLAM Using a High-Speed Camera

Title: Improving Localization Robustness in Monocular SLAM Using a High-Speed Camera

Authors: Peter Gemeiner, Andrew J. Davison, and Markus Vincze

Abstract:

In the robotics community localization and mapping of an unknown environment is a well-studied problem. To solve this problem in real-time using visual input, a standard monocular Simultaneous Localization and Mapping (SLAM) algorithm can be used. This algorithm is very stable when smooth motion is expected, but in case of erratic or sudden movements, the camera pose typically gets lost. To improve robustness in Monocular SLAM (MonoSLAM) we propose to use a camera with faster readout speed to obtain a frame rate of 200Hz. We further present an extended MonoSLAM motion model, which can handle movements with significant jitter. In this work the improved localization and mapping have been evaluated against ground truth, which is reconstructed from off-line vision. To explain the benefits of using a high frame rate vision input in the MonoSLAM framework, we performed repeatable experiments with a high-speed camera mounted onto a robotic arm. Due to the dense visual information MonoSLAM can shrink localization and mapping uncertainties faster and can operate under fast, erratic, or sudden movements. The extended motion model can provide additional robustness against significant handheld jitter when throwing or shaking the camera.

Link:
RSS2008 Paper
http://www.doc.ic.ac.uk/~ajd/Publications/gemeiner_etal_rss2008.pdf
http://www.doc.ic.ac.uk/~ajd/Movies/RSS2008_video_0071_high.avi

CMU RI Seminar: Engineering Self-Organizing Systems

Joint Intelligence Seminar (http://www.cs.cmu.edu/~iseminar) /
Robotics Institute Seminar

September 12, 2008

Title: Engineering Self-Organizing Systems

Radhika Nagpal, Harvard University

Biological systems, from embryos to ant colonies, achieve tremendous mileage by using vast numbers of cheap and unreliable components to achieve complex goals reliably. We are rapidly building embedded systems with similar characteristics, from self-assembling modular robots to vast sensor networks. How do we engineer robust collective behavior?

In this talk, I will describe two projects from my group where we have used inspiration from nature, both cells and social insects, to design decentralized algorithms for programmable self-assembly. In the first project, we use insights from social insects to design algorithms for collective construction by simple mobile robots. In the second project we use insights from multicellular tissues to design a modular robot that can form complex environmentally-adaptive shapes. In both cases we can achieve "global-to-local compilation": the agents rely on simple and local interactions that provably self-organize a wide class of user-specified global goals. Finally, time permitting, I will show an example of "local-to-global" phenomena that happens in real tissue self-assembly.

Bio:
Radhika Nagpal has been an Assistant Professor of Computer Science at Harvard University since 2004. She received her PhD degree in Computer Science from MIT, and spent a year as a research fellow at Harvard Medical School. She is a recipient of the 2005 Microsoft New Faculty Fellowship award and the 2007 NSF Career award. Her research interests are biologically-inspired engineering principles for multi-agent systems and modelling multicellular biology.

Her student, Chih-Han Yu, gave a talk here on July 7, 2008. -Bob

CMU VASC Seminar: Metric Learning for Image Alignment and Classification

VASC Seminar
Monday, September 8, 2008

Metric Learning for Image Alignment and Classification
Minh Hoai Nguyen
Robotics Institute, Carnegie Mellon University

Abstract:

What constitutes good metrics to encode and compare? This talk will address this fundamental question that concerns computer vision scientists. We will show how to learn metrics that are optimal for image alignment with Active Appearance Models (AAMs), and image classification using Support Vector Machines (SVMs). Traditionally, feature extraction/selection and metric learning methods have been inferred independently of model estimation (e.g. SVM, AAM). Independently learning features and model parameters may result in the loss of information that is relevant to the alignment or classification process. Rather, we propose a convex framework for jointly learning image metrics and model parameters. To illustrate the benefits of our approach, this talk is divided in two parts. In the first part, we will discuss the problem of learning image metrics to avoid local minima in template alignment and AAMs. We learn a cost function that explicitly optimizes the occurrence of local minima at and only at the places corresponding to the correct alignment parameters. In the second part of the talk, we will consider the problem of building a fast classifier for facial feature detection. We will show how to jointly learn SVM parameters together with a subset of the pixels that are relevant for classification. This work is done in collaboration with Joan Perez and Fernando De la Torre.

Bio:

Minh Hoai Nguyen received his B.E. in Software Engineering from University of New South Wales, Australia in 2005. He has been a Ph.D. student in Carnegie Mellon University's Robotics Institute since 2006 and is advised by Fernando de la Torre. His research interests are in the area of computer vision and machine learning, especially at the intersection of the two. He is particularly interested in using data-driven techniques to learn representations of images (e.g. pixel selection, non-linear pixel combination) that are optimal for classification, clustering, visual tracking, and modeling.

Swem: add this to your reading list. -Bob

Friday, September 05, 2008

Lab Meeting September 8th, 2008 (fish60): High Performance Outdoor Navigation from Overhead Data using Imitation Learning

David Silver, J. Andrew Bagnell, Anthony Stentz
Robotics Institute, Carnegie Mellon University
Pittsburgh, Pennsylvania USA
Robotics: Science and Systems, June 2008

High performance, long-distance autonomous navigation is a central problem for field robotics. Recently, a class of machine learning techniques has been developed that rely upon expert human demonstration to develop a function mapping overhead data to traversal cost. In this work, we extend these methods to automate interpretation of overhead data. We address key challenges, including interpolation-based planners, non-linear approximation techniques, and imperfect expert demonstration, necessary to apply these methods for learning to search for effective terrain interpretations.

Link

Thursday, September 04, 2008

CFP Abstract for "Experimental Design for Real-World Systems"

CALL FOR PAPERS

Experimental Design for Real-World Systems

AAAI Spring 2009 Symposium, March 23-25, Palo Alto, CA

Submission deadline: October 3, 2008

As more artificial intelligence (AI) research is fielded in real-world applications, the evaluation of systems designed for human-machine interaction becomes critical. AI research often intersects with other areas of study, including human-robot interaction, human-computer interaction, assistive technology, and ethics. Designing experiments to test hypotheses at the intersections of multiple research fields can be incredibly challenging. Many commonalities and differences already exist in experimental design for real-world systems. For example, the fields of human-robot interaction and human-computer interaction are two fields that have both joint and discrete goals. They look to evaluate very different aspects of design, interface, and interaction. In some instances, these two fields can share aspects of experimental design, while, in others, the experimental design must be fundamentally different.

We will provide a forum for researchers from many disciplines to discuss experiment design and the evaluation of real-world systems. We invite researchers from all applicable fields of human-machine interaction. We also invite researchers from allied fields, such as psychology, anthropology, design, human-computer interaction, human-robot interaction, rehabilitation and clinical care, assistive technology, and other related disciplines.

This symposium will focus on a wide variety of topics that address the challenges of experiment design for real-world systems including:
* the design of system evaluations,
* successes and failures in system evaluations,
* survey design for user studies,
* understanding the role technology plays in society,
* ethics of human subject studies,
* evaluating the use of machines as interventions,
* the uses of quantitative and qualitative data,
* and other related topics.

Format and Submissions

We will have a mix of plenary speakers, short presentations, and break-out groups. We will also have a poster session. Authors of short presentations and posters are invited to submit an abstract (< 3 pages) on experiments conducted during their research, focused on the experimental methodology, especially unusual and effective methodologies. Submission formatting details are at http://robotics.usc.edu/~dfseifer/aaai-expdesign/. Email submissions to aaai-sss-2009@cs.uml.edu.

Important Dates

* Call for Papers Due: October 3, 2008
* Authors Notified: November 2008
* Camera Ready Due: January 9, 2009

Organizing Committee

David Feil-Seifer (USC), Heidy Maldonado (Stanford), Bilge Mutlu (CMU), Leila Takayama (Stanford), Katherine Tsui (UMass Lowell)

Program Committee

Jenny Burke (USF), Kerstin Dautenhahn (Hertfordshire), Gert Jan Gelderbloom (VILANS), Maja Mataric (USC), Aaron Steinfeld (CMU), Holly Yanco (UMass Lowell)

Lab Meeting September 8th, 2008 (Jeff): On Handling Uncertainty in the Fundamental Matrix for Scene and Motion Adaptive Pose Recovery

Title: On Handling Uncertainty in the Fundamental Matrix for Scene and Motion Adaptive Pose Recovery

Authors: Sreenivas R. Sukumar, Hamparsum Bozdogan, David L. Page, Andreas F. Koschan and Mongi A. Abidi

Abstract:

The estimation of the fundamental matrix is the key step in feature-based camera ego-motion estimation for applications in scene modeling and vehicle navigation. In this paper, we present a new method of analyzing and further reducing the risk in the fundamental matrix due to the choice of a particular feature detector, the choice of the matching algorithm, the motion model, iterative hypothesis generation and verification paradigms. Our scheme makes use of model-selection theory to guide the switch to optimal methods for fundamental matrix estimation within the hypothesis-and-test architecture. We demonstrate our proposed method for vision-based robot localization in large-scale environments where the environment is constantly changing and navigation within the environment is unpredictable.

Link:
CVPR2008 Paper
http://imaging.utk.edu/publications/papers/2008/CVPR08_ss.pdf
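
For readers new to the topic, the constraint that a fundamental matrix encodes can be checked in a few lines. This is a minimal sketch and not the authors' method: it assumes identity camera intrinsics (so the fundamental matrix coincides with the essential matrix), no rotation between views, and a made-up translation t = (1, 0, 0). All function names are illustrative.

```python
def skew(t):
    """Cross-product matrix [t]x, so that skew(t) applied to v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def epipolar_residual(F, x1, x2):
    """Return x2^T F x1 for homogeneous image points; zero for a true match."""
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Fx1[i] for i in range(3))

def project(X):
    """Pinhole projection with identity intrinsics (an assumption)."""
    return (X[0] / X[2], X[1] / X[2], 1.0)

# Assumed geometry: second camera frame is X2 = X1 + t (pure translation).
t = (1.0, 0.0, 0.0)
F = skew(t)

X1 = (0.5, 0.2, 2.0)                       # a 3D point in the first camera frame
X2 = tuple(a + b for a, b in zip(X1, t))   # same point in the second frame
x1, x2 = project(X1), project(X2)

good = epipolar_residual(F, x1, x2)                         # correct match
bad = epipolar_residual(F, x1, (x2[0], x2[1] + 0.2, 1.0))   # shifted off the epipolar line
```

A hypothesis-and-test estimator scores candidate matrices by how many feature matches give a small residual like `good`; the abstract above is about choosing between estimation methods inside that loop.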

CMU Intelligence Seminar: Predicting Neural Representations of Word Meanings

Intelligence Seminar (http://www.cs.cmu.edu/~iseminar)

September 9, 2008

Title: Predicting Neural Representations of Word Meanings

Tom M. Mitchell
E. Fredkin Professor
Machine Learning Department
Carnegie Mellon University

How does the human brain represent meanings of words and pictures in terms of the neural activity observable through fMRI brain imaging? This talk will present our research using machine learning methods to study this question. One line of our research has involved training classifiers that identify which word a person is thinking about, based on the image of their fMRI brain activity. A more recent line involves developing a generative computational model capable of predicting the neural activity associated with arbitrary English words, including words for which we do not yet have brain image data. This computational model was trained using a combination of fMRI data associated with several dozen concrete nouns, together with statistics gathered from a trillion-word text corpus. Once trained, the model predicts fMRI activation for any other concrete noun appearing in the tera-word text corpus, with highly significant accuracies over the 60 nouns for which we currently have fMRI data.

This work is based on a collaboration with a number of researchers, including my primary collaborator Marcel Just.

Bio:
Tom M. Mitchell is the E. Fredkin Professor and head of the Machine Learning Department at Carnegie Mellon University. Mitchell is a past President of the American Association of Artificial Intelligence (AAAI), past Chair of the American Association for the Advancement of Science (AAAS) section on Information, Computing and Communication, and a recent member of the US National Research Council's Computer Science and Telecommunications Board. His general research interests lie in machine learning, artificial intelligence, and cognitive neuroscience. Mitchell believes the field of machine learning will be the fastest growing branch of computer science during the 21st century.

Wednesday, September 03, 2008

CMU thesis defense: Dynamics of Large Networks

Title: Dynamics of Large Networks
Speaker: Jure Leskovec
Date: September 3, 2008

Abstract:
A basic premise behind the study of large networks is that interaction leads to complex collective behavior. In our work we found interesting and counterintuitive patterns for time evolving networks, which change some of the basic assumptions that were made in the past. We then develop network models, fit such models to real networks, and use them to generate realistic graphs or give formal explanations about their properties.

Another important aspect of our research is the study of information diffusion and the spread of influence in a large person-to-person product recommendation network and its effect on purchases. We also model the propagation of information on the blogosphere, and propose algorithms to efficiently find influential nodes in the network.

A central topic of our thesis is also the analysis of large datasets as certain network properties only emerge and thus become visible when dealing with lots of data. We analyze the world's social and communication network of Microsoft Instant Messenger with 240 million people and 255 billion conversations. We also made interesting and counterintuitive observations about network community structure that suggest that only small network clusters exist, and that they merge and vanish as they grow.

To view a draft of the thesis see:
http://www.cs.cmu.edu/~jure/pubs/thesis/jure-thesis.pdf

COMMITTEE:
Christos Faloutsos
Avrim Blum
John Lafferty
Jon Kleinberg

Monday, September 01, 2008

Lab Meeting September 1, 2008 (Casey): Prior Data and Kernel Conditional Random Fields for Obstacle Detection

Authors: Carlos Vallespi, Anthony (Tony) Stentz
Robotics: Science and Systems 2008

Abstract: We consider the task of training an obstacle detection (OD) system based on a monocular color camera using minimal supervision. We train it to match the performance of a system that uses a laser rangefinder to estimate the presence of obstacles by size and shape. However, the lack of range data in the image cannot be compensated by the extraction of local features alone. Thus, we investigate contextual techniques based on Conditional Random Fields (CRFs) that can exploit the global context of the image, and we compare them to a conventional learning approach. Furthermore, we describe a procedure for introducing prior data in the OD system to increase its performance in “familiar” terrains. Finally, we perform experiments using sequences of images taken from a vehicle for autonomous vehicle navigation applications.

[Paper Link]

Sunday, August 31, 2008

Lab Meeting September 1, 2008 (Jimmy): Recovering Surface Layout from an Image

Title: Recovering Surface Layout from an Image
Authors: Derek Hoiem, Alexei A. Efros, and Martial Hebert
IJCV, Vol. 75, No. 1, October 2007

Abstract:
Many computer vision algorithms limit their performance by ignoring the underlying 3D geometric structure in the image. We show that we can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes. Geometric classes describe the 3D orientation of an image region with respect to the camera. We provide a multiple hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label. These confidences can then be used to improve the performance of many other applications. We provide a thorough quantitative evaluation of our algorithm on a set of outdoor images and demonstrate its usefulness in two applications: object detection and automatic single-view reconstruction.

Full text: pdf
Journal version: pdf

Sunday, August 17, 2008

Lab Meeting August 18, 2008 (Andi): Multi-Sensor Lane Finding in Urban Road Networks

Title: Multi-Sensor Lane Finding in Urban Road Networks

Authors: Albert Huang, David Moore, Matthew Antone, Edwin Olson, Seth Teller

Abstract: This paper describes a perception-based system for detecting and estimating the properties of multiple travel lanes in an urban road network from calibrated video imagery and laser range data acquired by a moving vehicle. The system operates in several stages on multiple processors, fusing detected road markings, obstacles, and curbs into a stable non-parametric estimate of nearby travel lanes. The system incorporates elements of a provided piecewise-linear road network as a weak prior. Our method is notable in several respects: it fuses asynchronous, heterogeneous sensor streams; it distributes naturally across several CPUs communicating only through message-passing; it handles high-curvature roads; and it makes no assumption about the position or orientation of the vehicle with respect to the travel lane. We analyze the system's performance in the context of the 2007 DARPA Urban Challenge, where with five cameras and thirteen lidars it was incorporated into a closed-loop controller to successfully guide an autonomous vehicle through a 90 km urban course at speeds up to 40 km/h.

read the full paper here

Lab Meeting August 18, 2008 (Atwood): Laser and Vision Based Outdoor Object Mapping

Title: Laser and Vision Based Outdoor Object Mapping

Authors: Bertrand Douillard, Dieter Fox, Fabio Ramos

Abstract:
Generating rich representations of environments is a fundamental task in mobile robotics. In this paper we introduce a novel approach to building object type maps of outdoor environments. Our approach uses conditional random fields (CRF) to jointly classify the laser returns in a 2D scan map into seven object types (car, wall, tree trunk, foliage, person, grass, and other). The spatial connectivity of the CRF is determined via Delaunay triangulation of the laser map. Our model incorporates laser shape features, visual appearance features, visual object detectors trained on existing image data sets and structural information extracted from clusters of laser returns. The parameters of the CRF are trained from partially labeled laser and camera data collected by a car moving through an urban environment. Our approach achieves 77% accuracy in classifying the object types observed along a 750-meter-long test trajectory.

fulltext

Lab Meeting August 18, 2008 (Chuan-Heng): Using Recognition to Guide a Robot's Attention

Title: Using Recognition to Guide a Robot's Attention

Authors: Alexander Thomas, Vittorio Ferrari, Bastian Leibe, Tinne Tuytelaars and Luc Van Gool

Abstract: In the transition from industrial to service robotics,
robots will have to deal with increasingly unpredictable and
variable environments. We present a system that is able to
recognize objects of a certain class in an image and to identify
their parts for potential interactions. This is demonstrated for
object instances that have never been observed during training,
and under partial occlusion and against cluttered backgrounds.
Our approach builds on the Implicit Shape Model of Leibe and
Schiele, and extends it to couple recognition to the provision of
meta-data useful for a task. Meta-data can for example consist of
part labels or depth estimates. We present experimental results
on wheelchairs and cars.

RSS Online Proceedings: here
Abstract: here
PDF: here

Saturday, August 16, 2008

Lab Meeting August 17, 2008 (Bob): CVPR 2008 Summary (II)

I will continue summarizing CVPR 2008. If time permits, the CMU motion planning system in urban environments will be discussed.

-Bob

Friday, August 15, 2008

Robot News

Rise of the rat-brained robots. [link]

Rubbery conductor promises robots a stretchy skin. [Link] [Journal reference: Science (DOI:10.1126/science.1160309)]

Saturday, August 09, 2008

Lab Meeting August 11, 2008 (Bob): CVPR 2008 Summary (I)

I will summarize the CVPR 2008 papers at the lab meeting.

Cheers,

-Bob

Wednesday, August 06, 2008

Learning Obstacle Avoidance Parameters from Operator Behavior

Bradley Hamner, Sanjiv Singh, and Sebastian Scherer

This paper concerns an outdoor mobile robot that learns to avoid collisions by observing a human driver operate a vehicle equipped with sensors that continuously produce a map of the local environment.

Here we present the formulation for this control system and its independent parameters and then show how these parameters can be automatically estimated by observing a human driver. We also present results from operation on an autonomous robot as well as in simulation, and compare the results from our method to another commonly used learning method.

Link

Tuesday, August 05, 2008

Lab Meeting August 11, 2008 (Any): Model Based Vehicle Tracking for Autonomous Driving in Urban Environments

Title: Model Based Vehicle Tracking for Autonomous Driving in Urban Environments

Authors: Anna Petrovskaya and Sebastian Thrun

Abstract: Situational awareness is crucial for autonomous driving in urban environments. This paper describes the moving vehicle tracking module that we developed for our successful entry in the Urban Grand Challenge, an autonomous driving race organized by the U.S. Government in 2007. The module provides reliable tracking of moving vehicles from a high-speed moving platform using laser range finders. Our approach models both dynamic and geometric properties of the tracked vehicles and estimates them using a single Bayes filter. We also show how to build efficient 2D representations out of 3D range data and how to detect poorly visible black vehicles.

In contrast to prior art, we propose a model based approach which encompasses both geometric and dynamic properties of the tracked vehicle in a single Bayes filter. The approach naturally handles data segmentation and association, so that these pre-processing steps are not required.

RSS Online Proceedings: here
Abstract: here
PDF: here

Monday, August 04, 2008

Lab Meeting August 11, 2008 (ZhenYu): Variable Baseline/Resolution Stereo

Title: Variable Baseline/Resolution Stereo (CVPR08)

Authors: David Gallup, Jan-Michael Frahm, Philippos Mordohai, Marc Pollefeys

Abstract:
We present a novel multi-baseline, multi-resolution stereo method, which varies the baseline and resolution proportionally to depth to obtain a reconstruction in which the depth error is constant. This is in contrast to traditional stereo, in which the error grows quadratically with depth, which means that the accuracy in the near range far exceeds that of the far range. This accuracy in the near range is unnecessarily high and comes at significant computational cost. It is, however, non-trivial to reduce this without also reducing the accuracy in the far range. Many datasets, such as video captured from a moving camera, allow the baseline to be selected with significant flexibility. By selecting an appropriate baseline and resolution (realized using an image pyramid), our algorithm computes a depthmap which has these properties: 1) the depth accuracy is constant over the reconstructed volume, 2) the computational effort is spread evenly over the volume, 3) the angle of triangulation is held constant w.r.t. depth. Our approach achieves a given target accuracy with minimal computational effort, and is orders of magnitude faster than traditional stereo.

[Link]
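
The "error grows quadratically with depth" claim follows from the standard rectified-stereo relation z = f·b/d, whose first-order error is δz ≈ z²/(f·b)·δd. The sketch below only illustrates that textbook relation, not the paper's algorithm. The focal length, baseline, and depths are made-up numbers, and `scaled_error` is a hypothetical helper that scales both baseline and resolution (focal length in pixels) in proportion to depth.

```python
def depth_error(z, focal_px, baseline, disparity_err=1.0):
    """First-order depth error of a rectified stereo pair: z^2 / (f * b) * dd."""
    return z * z / (focal_px * baseline) * disparity_err

def scaled_error(z, z_ref=5.0, f_ref=1000.0, b_ref=0.5):
    """Depth error when both f and b are scaled by z / z_ref (illustrative values)."""
    s = z / z_ref
    return depth_error(z, f_ref * s, b_ref * s)

depths = (5.0, 10.0, 20.0)

# Fixed rig: error quadruples each time the depth doubles.
fixed = [depth_error(z, 1000.0, 0.5) for z in depths]

# Baseline and resolution proportional to depth: error stays constant.
scaled = [scaled_error(z) for z in depths]

print(fixed)
print(scaled)
```

Because f·b then grows as z², it cancels the z² in the numerator, which is why the depth accuracy can be held constant over the reconstructed volume.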

Sunday, August 03, 2008

Lab Meeting August 4th, 2008 (Yu-Chun): Robots in Organizations: The Role of Workflow, Social, and Environmental Factors in Human-Robot Interaction

Robots in Organizations: The Role of Workflow, Social, and Environmental Factors in Human-Robot Interaction

Authors: Bilge Mutlu and Jodi Forlizzi

HRI 2008 Best Conference Paper [PDF]

Abstract:
Robots are becoming increasingly integrated into the workplace, impacting organizational structures and processes, and affecting products and services created by these organizations. While robots promise significant benefits to organizations, their introduction poses a variety of design challenges. In this paper, we use ethnographic data collected at a hospital using an autonomous delivery robot to examine how organizational factors affect the way its members respond to robots and the changes engendered by their use. Our analysis uncovered dramatic differences between the medical and post-partum units in how people integrated the robot into their workflow and their perceptions of and interactions with it. Different patient profiles in these units led to differences in workflow, goals, social dynamics, and the use of the physical environment. In medical units, low tolerance for interruptions, a discrepancy between the perceived cost and benefits of using the robot, and breakdowns due to high traffic and clutter in the robot's path caused the robot to have a negative impact on the workflow and staff resistance. On the contrary, post-partum units integrated the robot into their workflow and social context. Based on our findings, we provide design guidelines for the development of robots for organizations.

Thursday, July 31, 2008

Lab Meeting August 4th, 2008 (Szu-Wei): Calibration of Ground Truth Labeling System

In this meeting, I will introduce the structure of Leo's "Ground Truth Labeling System" and show how to improve the accuracy of camera calibration in this system.

Lab Meeting (2008/8/4) (Chung-Han): Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography

Title : Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography.

Authors : Martin A. Fischler and Robert C. Bolles, SRI International (June, 1981)

Abstract :

A new paradigm, Random Sample Consensus
(RANSAC), for fitting a model to experimental data is
introduced. RANSAC is capable of interpreting/smoothing
data containing a significant percentage of
gross errors, and is thus ideally suited for applications
in automated image analysis where interpretation is
based on the data provided by error-prone feature
detectors. A major portion of this paper describes the
application of RANSAC to the Location Determination
Problem (LDP): Given an image depicting a set of
landmarks with known locations, determine that point
in space from which the image was obtained. In
response to a RANSAC requirement, new results are
derived on the minimum number of landmarks needed
to obtain a solution, and algorithms are presented for
computing these minimum-landmark solutions in closed
form. These results provide the basis for an automatic
system that can solve the LDP under difficult viewing
and analysis conditions. Implementation details and
computational examples are also presented.
Key Words and Phrases: model fitting, scene
analysis, camera calibration, image matching, location
determination, automated cartography.

Link : Full text
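
As a warm-up for the meeting, here is a minimal sketch of the sample-and-score loop on a toy 2D line-fitting problem. It is not the paper's location determination algorithm. The data, tolerance, iteration count, and function names are all assumptions chosen for illustration, and the final refit on the consensus set is omitted.

```python
import random

def fit_line(p, q):
    """Line y = a*x + b through two points; None if the pair is vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    a = (y2 - y1) / (x2 - x1)
    return a, y1 - a * x1

def ransac_line(points, n_iters=200, tol=0.5, seed=0):
    """Keep the two-point model with the largest consensus (inlier) set."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        model = fit_line(*rng.sample(points, 2))   # minimal random sample
        if model is None:
            continue
        a, b = model
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Synthetic data: 20 points exactly on y = 2x + 1 plus 5 gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points += [(3.0, 40.0), (7.0, -15.0), (12.0, 90.0), (15.0, 0.0), (18.0, -30.0)]

model, inliers = ransac_line(points)
print(model, len(inliers))
```

A least-squares fit over all 25 points would be pulled toward the outliers, whereas the consensus set here contains only the 20 points on the line.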

Lab Meeting July 31st, 2008 (Jimmy): Goal-Directed Pedestrian Model with Application to Robot Motion Planning

I will talk about the work done in my M.S. thesis.

Lab Meeting July 31st, 2008 (Yu-Hsiang): Abnormal Activity Recognition by Learning and Inferring Scene Interaction

I will present my idea for abnormal activity recognition and some related works.

Wednesday, July 30, 2008

Lab Meeting July 31st, 2008 (swem): Method of determining hand waving signal

In the lab meeting, I will present a method for determining hand-waving signals.

The presentation will focus on:
-Motion History Image
-Sobel gradient (Convolution Matrix filter)
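
As a rough illustration of how the two items above can fit together (this is not the method from the talk, whose details are not given here), the sketch below uses the commonly described Motion History Image update rule: moving pixels are stamped with the current timestamp, and pixels older than a chosen duration are cleared. A horizontal Sobel kernel applied to the MHI then indicates the direction in which the timestamps increase. The function names, grid size, and the synthetic left-to-right motion are assumptions.

```python
def update_mhi(mhi, motion_mask, timestamp, duration):
    """Stamp moving pixels with `timestamp`; clear pixels older than `duration`."""
    out = []
    for mhi_row, mask_row in zip(mhi, motion_mask):
        row = []
        for value, moving in zip(mhi_row, mask_row):
            if moving:
                row.append(timestamp)
            elif value < timestamp - duration:
                row.append(0.0)
            else:
                row.append(value)
        out.append(row)
    return out

# Horizontal Sobel kernel, applied as a sliding window (no kernel flip).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

def sobel_x(image, r, c):
    """Horizontal Sobel response at interior pixel (r, c)."""
    return sum(SOBEL_X[i][j] * image[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))

# Synthetic example (assumed): a vertical bar moves one column right per frame.
mhi = [[0.0] * 5 for _ in range(3)]
for t in (1.0, 2.0, 3.0):
    col = int(t)
    mask = [[c == col for c in range(5)] for _ in range(3)]
    mhi = update_mhi(mhi, mask, t, duration=5.0)

gradient = sobel_x(mhi, 1, 2)  # positive: newer timestamps lie to the right
```

For a waving hand, the sign of this gradient would be expected to flip back and forth over time, which is one possible cue for detecting the gesture.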

Tuesday, July 22, 2008

Lab Meeting July 22nd, 2008 (Wei-Chun): Bearings-Only Tracking Problem

Outline:

Bearings-only tracking.
Comparisons between regular and inverse-velocity representation form.
Modified gain extended Kalman filter

Please check your mailbox for the slides.

Monday, July 21, 2008

Lab Meeting July 22nd, 2008 (Jeff): Single Camera Vision-Only SLAM on a Suburban Road Network

Title: Single Camera Vision-Only SLAM on a Suburban Road Network

Authors: Michael J. Milford and Gordon F. Wyeth

Abstract:

Simultaneous Localization And Mapping (SLAM) is one of the major challenges in mobile robotics. Probabilistic techniques using high-end range finding devices are well established in the field, but recent work has investigated vision-only approaches. This paper presents a method for generating approximate rotational and translation velocity information from a single vehicle-mounted consumer camera, without the computationally expensive process of tracking landmarks. The method is tested by employing it to provide the odometric and visual information for the RatSLAM system while mapping a complex suburban road network. RatSLAM generates a coherent map of the environment during an 18 km long trip through suburban traffic at speeds of up to 60 km/hr. This result demonstrates the potential of ground-based vision-only SLAM using low cost sensing and computational hardware.

Link:
ICRA 2008 Paper
Please see the ICRA disk: 0596.pdf

Sunday, July 20, 2008

Lab Meeting July 22nd, 2008 (fish60): Planning Long Dynamically-Feasible Maneuvers for Autonomous Vehicles

Maxim Likhachev and Dave Ferguson
Proceedings of the Robotics: Science and Systems Conference (RSS), 2008

Abstract:
In this paper, we present an algorithm for generating complex dynamically-feasible maneuvers for autonomous vehicles traveling at high speeds over large distances. Our approach is based on performing anytime incremental search on a multi-resolution, dynamically-feasible lattice state space. The resulting planner provides real-time performance and guarantees on and control of the sub-optimality of its solution.

link

Saturday, July 19, 2008

Robot PAL Master Thesis Oral July 29 2008

WEAKLY INTERACTING OBJECT TRACKING IN INDOOR ENVIRONMENTS

Kao-Wei Wan

Place: CSIE R524
Time: 9:00 AM

Thesis Committee:
Chieh-Chih Wang (Chair)
Li-Chen Fu
Han-Pang Huang
Jenhwa Guo
Chu-Song Chen (Academia Sinica)

Robot PAL Master Thesis Oral July 21 2008

HAND POSTURE RECOGNITION USING HIDDEN CONDITIONAL RANDOM FIELDS

Te-Cheng Liu

Place: CSIE R524
Time: 4:30 PM

Thesis Committee:
Chieh-Chih Wang (Chair)
Ruey-Feng Chang
Yung-Yu Chuang
Tyng-Luh Liu (Academia Sinica)

Robot PAL Master Thesis Oral July 18 2008

ENVIRONMENT AND HUMAN BEHAVIOR LEARNING FOR ROBOT MOTION CONTROL

Yueh-chi Yu

Place: CSIE R524
Time: 3:00 PM

Thesis Committee:
Chieh-Chih Wang (Chair)
Cheng-yuan Liou
Feng-Li Lian
Pei-Chun Lin
Tsai-Yen Li (National Chengchi University)

Wednesday, July 16, 2008

CVPR 2008 Best Papers

Best Paper
Beyond Sliding Windows: Object Localization by Efficient Subwindow Search, Christoph H. Lampert, Matthew B. Blaschko, Thomas Hofmann

Best Paper
Global Stereo Reconstruction under Second Order Smoothness Priors, Oliver Woodford, Ian Reid, Philip Torr, Andrew Fitzgibbon

Best Student Paper
Fast Image Search for Learned Metrics, Prateek Jain, Brian Kulis, Kristen Grauman

Best Poster
The Patch Transform and its Applications to Image Editing, Taeg Sang Cho, Moshe Butman, Shai Avidan, William Freeman

Best Student Poster
Robust Dual Motion Deblurring, Jia Chen, Lu Yuan, Chi-Keung Tang, Long Quan

Sunday, July 13, 2008

Robotics Institute Thesis Oral 17 Jul 2008

Integrating Perception & Planning for Humanoid Autonomy

Philipp Michel
Robotics Institute
Carnegie Mellon University

Abstract:
This thesis explores appropriate approaches to perception on humanoids and ways of coupling sensing and planning to generate navigation and manipulation strategies that can be executed reliably.
...
We examine how predictive information about the future state of the world gathered from observation enables navigation in the presence of challenging moving obstacles. We show how programmable graphics hardware can be exploited to create a novel, model-based 3D tracking system able to robustly address the difficulties of real-time sensing specifically encountered on a locomoting humanoid. This thesis argues furthermore that reliability of autonomous operation can be improved by reasoning about perception during the planning process, rather than maintaining the traditional separation of the sensing and planning stages.

Monday, July 07, 2008

Lab Meeting July 7th, 2008 (Casey): A Fast Local Descriptor for Dense Matching

Title: A Fast Local Descriptor for Dense Matching
Authors: Engin Tola, Vincent Lepetit, Pascal Fua
Abstract:
We introduce a novel local image descriptor designed for dense wide-baseline matching purposes. We feed our descriptors to a graph-cuts based dense depth map estimation algorithm and this yields better wide-baseline performance than the commonly used correlation windows for which the size is hard to tune. As a result, unlike competing techniques that require many high-resolution images to produce good reconstructions, our descriptor can compute them from pairs of low-quality images such as the ones captured by video streams. Our descriptor is inspired from earlier ones such as SIFT and GLOH but can be computed much faster for our purposes. Unlike SURF which can also be computed efficiently at every pixel, it does not introduce artifacts that degrade the matching performance. Our approach was tested with ground truth laser scanned depth maps as well as on a wide variety of image pairs of different resolutions and we show that good reconstructions are achieved even with only two low quality images.

[Link]

Lab Meeting July 7th, 2008 (Atwood): Progress Report

I will talk about my recent experiments on hand posture.

Sunday, July 06, 2008

Lab Meeting July 7th, 2008 (Any): Classifying Dynamic Objects: An Unsupervised Learning Approach

Title: Classifying Dynamic Objects: An Unsupervised Learning Approach
Authors: Matthias Luber, Kai O. Arras, Christian Plagemann, and Wolfram Burgard

Abstract: For robots operating in real-world environments, the ability to deal with dynamic entities such as humans, animals, vehicles, or other robots is of fundamental importance. The variability of dynamic objects, however, is large in general, which makes it hard to manually design suitable models for their appearance and dynamics. In this paper, we present an unsupervised learning approach to this model-building problem. We describe an exemplar-based model for representing the time-varying appearance of objects in planar laser scans as well as a clustering procedure that builds a set of object classes from given training sequences. Extensive experiments in real environments demonstrate that our system is able to autonomously learn useful models for, e.g., pedestrians, skaters, or cyclists without being provided with external class information.

PDF via Robotics: Science and Systems IV

Monday, June 30, 2008

[Lab Meeting] June 30th, 2008 (ZhenYu)

I will give a progress report.

Sunday, June 29, 2008

[Lab Meeting] June 30th, 2008 (Ekker): Intelligent Shoes for Abnormal Gait Detection

Intelligent Shoes for Abnormal Gait Detection
Meng Chen, Bufu Huang, and Yangsheng Xu
2008 IEEE International Conference on
Robotics and Automation
Pasadena, CA, USA, May 19-23, 2008

Abstract—In this paper we introduce a shoe-integrated
system for human abnormal gait detection. This intelligent
system focuses on detecting the following patterns: normal gait,
toe in, toe out, oversupination, and heel walking gait abnormalities.
An inertial measurement unit (IMU) consisting of
three-dimensional gyroscopes and accelerometers is employed
to measure angular velocities and accelerations of the foot. Four
force sensing resistors (FSRs) and one bend sensor are installed
on the insole of each foot for force and flexion information
acquisition. The proposed detection method is mainly based
on Principal Component Analysis (PCA) for feature generation
and Support Vector Machine (SVM) for multi-pattern
classification. In the present study, four subjects tested the
shoe-integrated device in outdoor environments. Experimental
results demonstrate that the proposed approach is robust and
efficient in detecting abnormal gait patterns. Our goal is to
provide a cost-effective system for detecting gait abnormalities
in order to assist persons with abnormal gaits in the developing
of a normal walking pattern in their daily life.

Lab Meeting June 30th, 2008 (Hero): Path and Trajectory Diversity: Theory and Algorithms

Path and Trajectory Diversity: Theory and Algorithms

Michael S. Branicky, Ross A. Knepper, James J. Kuffner

2008 IEEE International Conference on
Robotics and Automation
Pasadena, CA, USA, May 19-23, 2008

Abstract:
We present heuristic algorithms for pruning large sets of candidate paths or trajectories down to smaller subsets that maintain desirable characteristics in terms of overall reachability and path length. Consider the example of a set of candidate paths in an environment that is the result of a forward search tree built over a set of actions or behaviors. The tree is precomputed and stored in memory to be used online to compute collision-free paths from the root of the tree to a particular goal node. In general, such a set of paths may be quite large, growing exponentially in the depth of the search tree. In practice, however, many of these paths may be close together and could be pruned without a loss to the overall problem of path-finding. The best such pruning for a given resulting tree size is the one that maximizes path diversity, which is quantified as the probability of the survival of paths, averaged over all possible obstacle environments. We formalize this notion and provide formulas for computing it exactly. We also present experimental results for two approximate algorithms for path set reduction that are efficient and yield desirable properties in terms of overall path diversity. The exact formulas and approximate algorithms generalize to the computation and maximization of spatio-temporal diversity for trajectories.
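
To give a feel for path-set pruning, here is a small sketch of a greedy max-min dispersion heuristic. It is not one of the paper's algorithms and it does not compute the survival-probability measure of diversity defined there. It simply keeps paths that are far apart from each other, on the assumption that spread-out paths are less likely to all be blocked by the same obstacle. The distance function, function names, and toy path set are all hypothetical.

```python
def path_distance(p, q):
    """Mean Euclidean distance between corresponding waypoints of two paths."""
    total = sum(((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
                for (x1, y1), (x2, y2) in zip(p, q))
    return total / len(p)

def prune_paths(paths, k):
    """Greedily keep k path indices, each maximizing its distance to those already kept."""
    chosen = [0]  # arbitrary seed: start from the first path
    while len(chosen) < k:
        candidates = [i for i in range(len(paths)) if i not in chosen]
        best = max(candidates,
                   key=lambda i: min(path_distance(paths[i], paths[j]) for j in chosen))
        chosen.append(best)
    return sorted(chosen)

# Toy fan of straight paths from the origin; two pairs are nearly identical.
headings = [-2.0, -1.9, 0.0, 1.9, 2.0]
paths = [[(0.0, 0.0), (1.0, h / 2), (2.0, h)] for h in headings]
kept = prune_paths(paths, 3)
```

On this toy set the heuristic drops one path from each near-duplicate pair and keeps the two extremes plus the center path, which matches the intuition in the abstract that paths lying close together can be pruned with little loss.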

http://www.csie.ntu.edu.tw/~b91501097/lab_meeting/0463.pdf

Monday, June 23, 2008

Lab Meeting June 23rd, 2008 (Der-Yeuan Yu)

I will show some progress on matching 3D laser scans based on image features.

Lab Meeting June 23rd, 2008 (KuoHwei Lin)

I will report my progress on SLAMMOT and show some results.

Sunday, June 22, 2008

Lab Meeting June 23rd, 2008 (Stanley)

I will give a (short?) progress report.

Friday, June 20, 2008

News: Google is doing 3D city modeling?

These two pictures were taken in Italy and San Francisco, respectively. As you can see, several SICK laser scanners are mounted. There are two side-facing vertical scanners, and another forward-facing horizontal scanner. So, what is Google doing with 3D laser data? The obvious application is 3D reconstruction for Google Earth.

--
Educating Silicon - Google Street View - Soon in 3D?
Engadget - Google survey vehicle gets a ticket

New Algorithms for Feature-Based 2d and 3d Registration

VASC Seminar (Monday, June 23, 3:30pm-4:30pm, NSH 1507)

Charles V. Stewart
Rensselaer Polytechnic Institute and DualAlign, LLC

Abstract
This talk presents a series of algorithms for feature-based registration. The Dual-Bootstrap approach to registration "grows" inter-image transformations by starting with single-keypoint matching in small image regions. It has been used to develop highly successful algorithms for 2d-to-2d image registration, 3d-to-3d LiDAR scan registration, and the 3d-to-2d problem of determining the location of a camera with respect to a 3d model. The Dual-Bootstrap is now undergoing commercial development for a wide variety of applications. More recently, the Location Registration and Recognition (LRR) algorithm has been developed as an aid to longitudinal diagnosis and treatment monitoring, particularly for lung cancer. Rather than applying deformable registration, clinical regions of interest in one CT scan (such as small volumes surrounding nodules) are automatically recognized and aligned in a second CT scan. Like the Dual-Bootstrap, LRR uses a combination of keypoint indexing, (local) feature-based refinement, and learned decision criteria. LRR works at near interactive speeds and is (slightly) more accurate than the best current deformable registration technique.

Speaker
Charles V. Stewart is a professor in the Department of Computer Science, Rensselaer Polytechnic Institute, Troy, New York. He has done sabbaticals at the GE Center for Research and Development in Niskayuna, New York, and at the Johns Hopkins University. In 1999, together with Ali Can and Badrinath Roysam, he received the Best Paper Award at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). In 2007, he founded DualAlign LLC, where he is currently working as chief scientist while on leave from Rensselaer.

RPI's Computer Vision group

News: Audio processing is on the move

There is a quiet revolution underway in audio processing - although it's quiet only in that the developments are mainly aimed at eliminating troublesome noise and echoes. To find out more, we talked to a few of the companies leading the charge.

It may seem counter-intuitive, but as audio headsets get more and more clever, their users are unlikely to hear the improvements. That's because the work is mostly aimed at improving the outbound sound quality via techniques such as noise cancellation - although headset users should see benefits in areas such as voice dialling or speech recognition.

To make those improvements, modern headsets increasingly use multiple microphones, plus powerful digital signal processors (DSPs) to re-work the sounds thus collected.

For example, Blue Ant's Z9 Bluetooth headset has two microphones, plus a DSP which uses the signals to measure the distance to the sound source and thereby triangulate on the mouth, says Taisen Maddern, the company's CEO.

Maddern notes that as devices such as this are essentially software-driven, they can also be upgraded as new and better algorithms come along. He points for example to the emerging wideband speech profile for Bluetooth, which will allow headsets to support 3G's broader audio spectrum.

He adds that as developers look to make Bluetooth easier to use, voice control is an obvious possibility - but it's one that relies upon a clear audio signal.

"We have a headset coming that will be the first voice-command headset that talks you through the process of pairing," he says. "You can use voice commands to request help, check the battery level, make calls."

It's not just audio that can be collected and made use of, adds Alex Affely, the CTO of Aliph, the company behind the Jawbone headset.

As its name suggests, the Jawbone not only picks up exterior and spoken sound, it also has a third microphone which "taps into your jaw vibration," says Affely.

Wireless has really given audio technology a big boost, he reckons. He says that while some of the work behind Jawbone dates back to the early 1990s and the First Gulf War, when it was realised that the wired headsets used by soldiers needed better noise cancellation, it wasn't until Bluetooth came into focus a few years ago that it really took off.

As well as noise and echo cancellation, developers are also taking advantage of research into areas such as pattern recognition, says Jennifer Stagnaro, the marketing VP at Audience, which develops voice processing technology for integration into mobile phones.

"We have reverse-engineered the human auditory system, using an optimised DSP chip and software. Most past technologies could cancel stationary noise, we can cancel non-stationary noise," she claims.

Audience's voice processor includes algorithms based on research into auditory scene analysis - a complex technique for taking a mixture of sounds and sorting them into packages, each of which probably has the same source. Other useful techniques for signal selectivity include beamforming and blind signal separation, she says.

One advantage of the DSP-and-algorithm approach is that it can be position-neutral, so it can be used in handsets as well as headsets, she argues.

"The challenge in the handset is bigger," Stagnaro says. "With a headset you can do [bone] conduction pick-up. Ours still uses two microphones, but is position-neutral so it works in speakerphone mode, for example."

But whether the technology's in the headset or the handset, it still only cleans up the outgoing audio, notes Alex Affely.

He adds, provocatively: "What about incoming sound? That's one possibility for the future, it's one of the interesting things out there."

link

Thursday, June 12, 2008

IEEE news: Intelligent Computers See Your Human Traits

Intelligent Computers See Your Human Traits
In order to make human-computer interaction more natural and friendly, computer engineers are currently working on a way to give computers a more personal touch. By combining audio and visual data, Yongjin Wang, University of Toronto, Ontario, Canada and Ling Guan, Ryerson University, Toronto, Ontario, Canada, have developed a system that recognizes six human emotional states: happiness, sadness, anger, fear, surprise and disgust. "Human-centered computing focuses on understanding humans, including recognition of face, emotions, gestures, speech, body movements, etc.," said Wang. "Emotion recognition systems help the computer to understand the affective state of the user, and hence the computer can respond accordingly based on that perception." Their system can recognize emotions in people from different cultures and who speak different languages with a success rate of 82%.

IEEE Spectrum Special Issue: The Singularity

Hi Folks, you have to check out this issue of IEEE Spectrum. -Bob

Spectrum Special Issue: The Singularity
Be sure to read the special June issue of IEEE Spectrum, which focuses on the singularity and how continuous technological innovation in artificial intelligence could one day outstrip human brain power, changing life as we know it. The special report touches on a wide variety of singularity arguments, including how to create consciousness if we do not know what it really is; thoughts on reversing the human brain; and whether or not it is possible to escape death by uploading our minds into machines. The issue also includes a "Who's Who in The Singularity" feature, outlining all the inventors, researchers, academics and authors who have had something to say about the controversial topic.

Monday, June 09, 2008

The meeting tomorrow afternoon

As Bob has a meeting at noon, all afternoon meetings will be postponed by one hour.

Lab Meeting June 9th, 2008 (Yu-chun): GUMSAWS: A Generic User Modeling Server for Adaptive Web Systems

Communication Networks and Services Research, 2007

Authors: Jie Zhang and Ali A. Ghorbani

Abstract:
In this paper we focus on the architecture, design and implementation of a generic user modeling server for adaptive web systems (GUMSAWS), reaching the goals of generality, extendability and replaceability. GUMSAWS acts as a centralized user modeling server to assist several adaptive web systems (possibly in different domains) concurrently. It incrementally builds up user models, provides functions of storing, updating and deleting entries in user profiles, and maintains consistency of user models. Our system is also able to infer missing entries in user profiles from different information sources, including direct information, groups information, association rules and general facts. We further evaluate its inference performance within the context of e-commerce. Experimental results show that the average accuracy of inferring user missing property values from different information resources is found to be almost 70%. We also use a personalized electronic news system to demonstrate the example of our system in use.

link

Sunday, June 08, 2008

News: Scientists Figure Out Human Mobility Patterns from Cell Phone Signals

Researchers in Boston tracked 100,000 people by their cellphone signals for six months to characterize how we get around and determine what underlies our travel patterns. The results show that we are a very predictable bunch, and that should be good news to urban planners, epidemiologists, and traffic analysts.

link

Lab Meeting June 9th, 2008 (Yu-Hsiang): Trajectory Analysis and Semantic Region Modeling Using A Nonparametric Bayesian Model

Computer Vision and Pattern Recognition 2008

Authors:

Xiaogang Wang, Keng Teck Ma, Gee-Wah Ng, Eric Grimson

Abstract:

We propose a novel nonparametric Bayesian model, Dual Hierarchical Dirichlet Processes (Dual-HDP), for trajectory analysis and semantic region modeling in surveillance settings, in an unsupervised way. In our approach, trajectories are treated as documents and observations of an object on a trajectory are treated as words in a document. Trajectories are clustered into different activities. Abnormal trajectories are detected as samples with low likelihoods. The semantic regions, which are intersections of paths commonly taken by objects, related to activities in the scene are also modeled. Dual-HDP advances the existing Hierarchical Dirichlet Processes (HDP) language model. HDP only clusters co-occurring words from documents into topics and automatically decides the number of topics. Dual-HDP co-clusters both words and documents. It learns both the numbers of word topics and document clusters from data. Under our problem settings, HDP only clusters observations of objects, while Dual-HDP clusters both observations and trajectories. Experiments are evaluated on two datasets, radar tracks collected from a maritime port and visual tracks collected from a parking lot.

link

Lab Meeting June 9th, 2008 (Jeff): Progress Report

I will show the dataset I collected from PAL3 for SLAM and try to describe the PAL3 motion model.

Monday, June 02, 2008

[Lab Meeting] June 2nd, 2008 (Leo): Tracking Interacting Targets with Laser Scanner via On-line Supervised Learning

Authors: Xuan Song, Jinshi Cui, Xulei Wang, Huijing Zhao and Hongbin Zha

From: ICRA 2008

Abstract: Successful multi-target tracking requires
locating the targets and labeling their identities. For the laser
based tracking system, the latter becomes significantly more
challenging when the targets frequently interact with each
other. This paper presents a novel on-line supervised learning
based method for tracking interacting targets with laser
scanner. When the targets do not interact with each other, we
collect samples and train a classifier for each target. When the
targets are in close proximity, we use these classifiers to assist
in tracking. Different evaluations demonstrate that this
method has a better tracking performance than previous
methods when interactions occur, and can maintain correct
tracking under various complex tracking situations.

The pdf version can be found on our FTP server.

Monday, May 19, 2008

[Lab Meeting] May 19th, 2008 (Atwood): Progress Report

I will talk about my idea on semi-supervised Parts Conditional Random Field.

Outline

1. Introduction
2. Related work
2.1 Hidden Conditional Random Field
2.2 Link analysis of feature matching
2.3 Spectral clustering
3. Semi-supervised Parts Conditional Random Field
3.1 Clustering for part-structure
3.2 Parts correspondence
4. Experiments

Sunday, May 18, 2008

[Lab Meeting] May 19th, 2008 (fish60): Learning Grasp Strategies with Partial Shape Information

Abstract:
We consider the problem of grasping novel objects in cluttered environments.
...
In this paper, we propose an approach to grasping that estimates the stability of different grasps, given only noisy estimates of the shape of visible portions of an object, such as that obtained from a depth sensor. By combining this with a kinematic description of a robot arm and hand, our algorithm is able to compute a specific positioning of the robot’s fingers so as to grasp an object.

Link

Saturday, May 17, 2008

News: Microsoft Research Explores - Robots Among Us - Human-robot Interaction

The link.

Moving the technology toward these so-called “social robots” are researchers in a variety of disciplines engaged in the growing field of human-robot interaction (HRI). To explore some of the challenges in realizing the potential of HRI, Microsoft Research launched the “Robots Among Us” request for proposals (RFP) last October with the bold declaration, “The robots are coming!”

Eight winners will receive a share of more than US$500,000 awarded under the program. Winning research proposals were selected from 74 submissions from academic researchers from 24 countries. The research projects explore a broad range of devices, technologies and functions as robots begin to work with and alongside human beings.

“Snackbot: A Service Robot,” Jodi Forlizzi and Sara Kiesler, Carnegie Mellon University. Snackbot will roam the halls of two large office buildings at Carnegie Mellon University, selling (or in some cases, giving away) snacks and performing other services. Microsoft’s grant will help the team link its current robot prototype to the Web, e-mail, instant messaging and mobile services. The group will also deploy the robot in a field study to understand the uptake of robotic products and services.
“Human-Robot-Human Interface for an autonomous vehicle in challenging environments,” Ioannis Rekleitis and Gregory Dudek, McGill University, Canada. Utilizing Microsoft Robotics Studio, this group will work to provide an interface for controlling a robot operating on land and underwater, as well as a visualization tool for interpreting the visual feedback. The work will also create a new method for communicating with AQUA when a direct link to a controlling console is not available.
“Personal Digital Interfaces for Intelligent Wheelchairs,” Nicholas Roy, Massachusetts Institute of Technology. Using a Windows Mobile PDA outfitted with a remote microphone and speech processor, this group will create a single, flexible point of interaction to control wheelchairs. The project will address human-robot interaction challenges in how the spatial context of the interaction varies depending on the location of the wheelchair, the location of the hand-held device and the location of the resident. This project is part of an ongoing collaboration with a specialized care residence in Boston.
Human-Robot Interaction to Monitor Climate Change via Networked Robotic Observatories, Dezhen Song, Texas A&M University, and Ken Goldberg, University of California, Berkeley. This team will develop a new Human-TeleRobot system to engage the public in documenting climate change effects on natural environments and wildlife, and provide a testbed for study of Human Robot Interaction. To facilitate this, a new type of human-robot system will be built to allow anyone via a browser to participate in viewing and collecting data via the Internet. The Human Robot Interface will combine telerobotic cameras and sensors with a competitive game where “players” score points by taking photos and classifying the photos of others.
FaceBots: Robots utilizing and publishing social information in FaceBook, Nikolaos Mavridis and Tamer Rabie, United Arab Emirates University. The system to be developed by Mavridis and Rabie is expected to achieve two significant novelties: arguably being the first robot that is truly embedded in a social web, and being the first robot that can purposefully exploit and create social information available online. Furthermore, it is expected to provide empirical support for their main hypothesis - that the formation of shared episodic memories within a social web can lead to more meaningful long-term human-robot relationships.
Multi-Touch Human-Robot Interaction for Disaster Response, Holly Yanco, University of Massachusetts. This group wants to create a common computing platform that can interact with many different information systems, personnel from different backgrounds and expertise, and robots deployed for a variety of tasks in the event of a disaster. The proposed research intends to bridge the technological gaps through the use of collaborative tabletop multi-touch displays such as the Microsoft Surface. The group will develop an interface between the multi-touch display and Microsoft Robotics Studio to create a multi-robot interface for command staff to monitor and interact with all of the robots deployed at a disaster response.
Survivor Buddy: A Web-Enabled Robot as a Social Medium for Trapped Victims, Robin Murphy, University of South Florida. The main focus of this group is the assistance of humans who will be dependent on a robot for long periods of time. One function is to provide two-way audio communication between the survivor and the emergency response personnel. Other ideas are being studied, such as playing therapeutic music with a beat designed to regulate heartbeats or breathing. The idea is that a web-enabled, multi-media robot allows: 1) the survivor to take some control over the situation and find a soothing activity while waiting for extrication; and 2) responders to support and influence the state of mind of the victim.
Prosody Recognition for Human-Robot Interaction, Brian Scassellati, Yale University. This group will work to build a novel prosody recognition algorithm for release as a component for Microsoft Robotics Studio. Vocal prosody is the information contained in your tone of voice that conveys affect, and is a critical aspect to human-human interactions. In order to move beyond direct control of robots toward autonomous social interaction between humans and robots, the robots must be able to construct models of human affect by indirect, social means.

ICRA 2008 Conference and RAS Awards Finalists:

Below is the ICRA08 award finalist list. Some of you are assigned to read the papers and will lead discussions in the lab meetings. Of course all of you are encouraged to study these papers.

Best,

-Bob

Best Conference Paper Finalists:

Employing Wave Variables for Coordinated Control of Robots with Distributed Control Architecture by Christian Ott and Yoshihiko Nakamura (Hero)
Trajectory Generation for Dynamic Bipedal Walking through Qualitative Model Based Manifold Learning by Subramanian Ramamoorthy and Benjamin Kuipers (Stanley, Jim Yu)
Consensus Learning for Distributed Coverage Control by Mac Schwager, Jean-Jacques E. Slotine and Daniela Rus (Hero Chen)
Planning in Information Space for a Quadrotor Helicopter in a GPS-Denied Environment by Ruijie He, Sam Prentice and Nicholas Roy (Stanley, Jim Yu)

Best Student Paper Finalists (The name(s) of nominated student(s) are in bold font):

Hybrid Simulation of a Dual-Arm Space Robot Colliding with a Floating Object by Ryohei Takahashi, Hiroto Ise, Daisuke Sato, Atsushi Konno and Masaru Uchiyama
Partial barrier coverage: Using game theory to optimize probability of undetected intrusion in polygonal environments by Stephen Kloder and Seth Hutchinson (Der-Yeuan)
Decentralized Feedback Controllers for Multi-Agent Teams in Environments with Obstacles by Nora Ayanian and Vijay Kumar (Hero, Jim Yu)
Gecko-Inspired Climbing Behaviors on Vertical and Overhanging Surfaces by Daniel Santos, Barrett Heyneman, Sangbae Kim, Noe Esparza and Mark Cutkosky
High Quality 3D LIDAR from Low Cost 2D Ranging Under General Vehicle Motion by Alastair Harrison and Paul Newman (Andi)

Best Automation Paper Finalists

Dynamic Analysis of a High-Bandwidth, Large-Strain, PZT Cellular Muscle Actuator with Layered Strain Amplification by Thomas Secord, Jun Ueda and Harry Asada
Event-Based Two Degree-Of-Freedom Control for Micro-/Nanoscale Systems Based on Differential Flatness by Ruoting Yang, T. J. Tarn and Mingjun Zhang
On the Design of Traps for Feeding 3D Parts on Vibratory Tracks by Onno Goemans and Frank van der Stappen
Fabrication of Functional Gel-Microbead for Local Environment Measurement in Microchip by Hisataka Maruyama, Fumihito Arai and Toshio Fukuda

Best Manipulation Paper (sponsored by Ben Wegbreit) Finalists:

Skilled-Motion Plannings of Multi-Body Systems Based upon Riemannian Distance by Masahiro Sekimoto, Suguru Arimoto, Ji-Hun Bae and Sadao Kawamura
Transportation of Hard Disk Media using Electrostatic Levitation and Tilt Control by Ewoud Frank van West, Akio Yamamoto and Toshiro Higuchi
Adaptive Grasping by Multi Fingered Hand with Tactile Sensor Based on Robust Force and Position Control by Taro Takahashi, Toshimitsu Tsuboi, Takeo Kishida, Yasunori Kawanami, Satoru Shimizu, Masatsugu Iribe, Tetsuharu Fukushima and Masahiro Fujita
Manipulating Articulated Objects With Interactive Perception by Dov Katz and Oliver Brock (Jim Yu)
Synergistic Design of a Humanoid Hand with Hybrid DC Motor - SMA Array Actuators Embedded in the Palm by Josiah Rosmarin and Harry Asada

Best Vision Paper (sponsored by Ben Wegbreit) Finalists

Accurate Calibration of Intrinsic Camera Parameters by Observing Parallel Light Pairs by Ryusuke Sagawa and Yasushi Yagi (54ways)
Information-Optimal Selective Data Return for Autonomous Rover Traverse Science and Survey by David R. Thompson, Trey Smith and David Wettergreen (Yu-Hsiang)
Image Moments-based Ultrasound Visual Servoing by Rafik Mebarki, Alexandre Krupa and Francois Chaumette (Jeff)
Robust and Efficient Stereo Feature Tracking for Visual Odometry by Andrew E. Johnson, Steven B. Goldberg, Yang Cheng and Larry H. Matthies (Jeff, Yu-Hsiang)
Accelerated Appearance-Only SLAM by Mark Cummins and Paul Newman (Jeff, Yu-Hsiang)

Best Video Finalists

Magmites - Wireless Resonant Magnetic Microrobots by Dominic R. Frutiger, Bradley Kratochvil, Karl Vollmers and Bradley J. Nelson
The OmniTread OT-4 Serpentine Robot by Johann Borenstein and Adam Borrell
Preliminary Report: Rescue Robot at Crandall Canyon, Utah, Mine Disaster by Robin R. Murphy, Jeffery Kravitz, Ken Peligren, James Milward and Jeff Stanway

KUKA Service Robotics Best Paper Finalists

Efficient Airport Snow Shoveling by Applying Autonomous Multi-Vehicle Formations by Martin Saska, Martin Hess and Klaus Schilling
Hybrid Laser and Vision Based Object Search and Localization by Dorian Galvez Lopez, Kristoffer Sjo, Chandana Paul and Patric Jensfelt (Andi)
Towards a Personal Robotics Development Platform: Rationale and Design of an Intrinsically Safe Personal Robot by Keenan A. Wyrobek, Eric H. Berger, H.F. Machiel Van der Loos, and J. Kenneth Salisbury (Jeff)
VSA-II: A Novel Prototype of Variable Stiffness Actuator for Safe and Performing Robots Interacting with Humans by Riccardo Schiavi, Giorgio Grioli, Soumen Sen and Antonio Bicchi (Yu-Chun)

Sunday, May 11, 2008

[Lab Meeting] May 12th, 2008 (Andi) Progress Report on 3D mapping

I will show some early results, and discuss several issues that still have to be addressed.

[Lab Meeting] May 12th, 2008 (Ekker): On foot navigation : continuous step calibration using both complementary recursive prediction and adaptive Kalm

Title:
On foot navigation : continuous step calibration
using both complementary recursive prediction
and adaptive Kalman filtering
From : ION 2000
Abstract:
Dead reckoning for on-foot navigation applications cannot
be computed by double integration of the antero-posterior
acceleration. The main reasons are the alignment problem
and the important sensor systematic errors in comparison
to human walking speed. However, raw accelerometer
signal can furnish helpful information on steps length as a
function of the walk dynamics. As stride length naturally
varies, a continuous adaptation is necessary. In the
absence of satellite observable, a recursive prediction
process is used. When GPS signal is available, adaptive
Kalman filtering is processed to update both the stride
length and the recursive prediction parameters. This paper
shows the different necessary stages for individual stride
calibration as basis of global on-foot dead reckoning
applications. This study lies within the framework of a
project that aims at analyzing the daily activity of people.
Precise continuous positioning, but not necessarily in real-
time conditions, appears of evident interest. The global
procedure and several test results are presented.
Link
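
As a rough illustration of the GPS-aided update described in the abstract above, here is a minimal sketch of a scalar Kalman filter that refines a stride-length estimate whenever a GPS distance is available. The measurement model (GPS distance equals step count times stride plus noise), the function name update_stride, and all numeric values are assumptions made for illustration; the paper's actual filter and its recursive prediction stage are not reproduced here.

```python
def update_stride(stride, var, gps_distance, n_steps, gps_var, process_var=1e-4):
    """One scalar Kalman update of a stride-length estimate (illustrative only)."""
    # Predict: stride length is assumed to drift slowly, so only uncertainty grows.
    var = var + process_var
    # Assumed measurement model: gps_distance = n_steps * stride + noise.
    h = float(n_steps)
    innovation = gps_distance - h * stride
    innovation_var = h * h * var + gps_var
    gain = var * h / innovation_var
    stride = stride + gain * innovation
    var = (1.0 - gain * h) * var
    return stride, var


# Hypothetical data: the walker covers 8 m every 10 steps (true stride 0.8 m),
# while the initial guess is 0.7 m.
stride, var = 0.7, 0.01
for _ in range(20):
    stride, var = update_stride(stride, var, gps_distance=8.0, n_steps=10, gps_var=1.0)
```

When GPS is unavailable, the latest stride estimate would simply be multiplied by the detected step count to advance the dead-reckoned position.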

Saturday, May 10, 2008

Modeling and Visualizing the World from Internet Photo Collections

Noah Snavely
University of Washington

Tuesday May 13
10:00am NSH 3305

Abstract:
The Internet has become a massive source of photographic imagery. Billions of photos are available from sources ranging from Google Maps to Flickr, and myriad views of virtually every famous location on Earth are readily available. For instance, a Google Image search for "Eiffel Tower" returns almost half a million images, and a search for "Grand Canyon" returns nearly three million photos, representing many different photographers, viewpoints, times of day, weather conditions, and seasons. While extremely rich, these vast, unorganized photo collections are difficult to explore and search through using traditional photo browsing tools.

In this talk, I will present my work on new computer vision techniques for recovering the 3D structure of scenes from very large, diverse photo collections, and on new visualization techniques for exploring these reconstructed scenes in 3D. I will first describe Photo Tourism, an approach for navigating through photos using geometric controls. I will then discuss more recent work in creating simple, intuitive navigation interfaces by analyzing patterns in how people take photographs, and using these patterns to derive optimized 3D controls for each scene.

Bio:
Noah Snavely is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of Washington, advised by Professor Steven Seitz and Dr. Richard Szeliski. His research interests span computer vision, computer graphics, and interactive techniques. He is particularly interested in developing new computer vision algorithms for the analysis of large, diverse photo collections, and on leveraging these algorithms to produce effective visualizations of scenes. He is the recipient of a National Science Foundation fellowship (2003) and a Microsoft Live Labs fellowship (2007).

Here is a link to a live demo of this work (and some videos)

Thursday, May 08, 2008

Robotics Institute Thesis Proposal 12 May 2008

Online Adaptive Modeling for Outdoor Mobile Robots in Rough Terrain

Abstract:
Autonomous navigation by Unmanned Ground Vehicles (UGVs) in rough terrain is currently a problem of much interest and with many applications. ...
In this thesis, we propose a system for automatically identifying a vehicle model using conventional, on board state-estimation and terrain perception sensors. Such a system should be able to automatically adapt to new or changing environments, vehicle damage or wear and tear. Research areas to be addressed include model structure, convergence and observability.

A copy of the thesis proposal document can be found at http://www.cs.cmu.edu/~dranders/fileserv/papers/ThesisProposal.pdf

Tuesday, May 06, 2008

[SCS Faculty Candidate Talk] Analyzing Dynamic Scenes from Moving Cameras: A Spacetime

SCS Faculty Candidate Yaser Sheikh
Wednesday, May 7th, 2008
10:00 a.m. NSH 3305
Host: Srinivasa Narasimhan, The Robotics Institute

Title: Analyzing Dynamic Scenes from Moving Cameras: A Spacetime Perspective

Abstract:
With the proliferation of camera-enabled cell phones, domestic robots, and wearable computers, moving cameras are being introduced en masse into society. The confluence of camera motion and the motion of objects in the scene complicates the task of understanding the scene from video. In this talk, I discuss how and when it is possible to disambiguate these two sources of motion, towards the goal of analyzing dynamic scenes from moving cameras.

I begin by considering a single camera viewing a dynamic scene. Unlike contemporary approaches to this problem, which try to model the variation in the shape of objects, I show that modeling the variation of points along time is better motivated physically and produces more stable reconstructions. This model also intuitively characterizes the inherent reconstruction ambiguity for a single camera and motivates the study of dynamic scenes from multiple moving cameras. I present the case for conducting this analysis in spacetime, where a dynamic scene is considered a body in spacetime, and each video a spacetime image of this body. Through this representation, I demonstrate that classic algorithms in multiview geometry that deal with static scenes can be lifted to spacetime, and applied directly for dynamic scene analysis. The analogues of factorization approaches and the fundamental matrix are described, leading to new, intuitive, relationships between the epipolar geometries of perspective images, linear pushbroom images and epipolar plane images.

Monday, May 05, 2008

[Lab Meeting] May 5th, 2008 (Kuo-Hwei), Progress Report on Laser-Based SLAMMOT

I will show the improvements to the multi-hypothesis SLAMMOT and explain some details of my implementation.

Sunday, May 04, 2008

[Lab Meeting] May 5th, 2008 (Hero): An Application of Reinforcement Learning to Aerobatic Helicopter Flight

Pieter Abbeel, Adam Coates, Morgan Quigley, Andrew Y. Ng
Computer Science Dept.
Stanford University
Stanford, CA 94305

Abstract
Autonomous helicopter flight is widely regarded to be a highly challenging control
problem. This paper presents the first successful autonomous completion on a
real RC helicopter of the following four aerobatic maneuvers: forward flip and
sideways roll at low speed, tail-in funnel, and nose-in funnel. Our experimental
results significantly extend the state of the art in autonomous helicopter flight.
We used the following approach: First we had a pilot fly the helicopter to help
us find a helicopter dynamics model and a reward (cost) function. Then we used
a reinforcement learning (optimal control) algorithm to find a controller that is
optimized for the resulting model and reward function. More specifically, we used
differential dynamic programming (DDP), an extension of the linear quadratic
regulator (LQR).

link: http://www.cs.stanford.edu/%7Epabbeel/pubs/AbbeelCoatesQuigleyNg_aaorltahf_nips2006.pdf
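
The abstract above describes DDP as an extension of the linear quadratic regulator. As background only, here is a minimal sketch of the scalar discrete-time LQR solved by iterating the Riccati recursion. The system x' = a*x + b*u, the cost q*x^2 + r*u^2, and the function name scalar_lqr are illustrative assumptions; this is not the helicopter controller from the paper.

```python
def scalar_lqr(a, b, q, r, iterations=200):
    """Iterate the scalar discrete-time Riccati recursion to a fixed point.

    Returns the feedback gain k (control law u = -k * x) and the cost-to-go
    coefficient p, so that the optimal cost from state x is p * x**2.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return k, p


# Hypothetical integrator-like system with unit costs.
a, b = 1.0, 1.0
k, p = scalar_lqr(a, b, q=1.0, r=1.0)
closed_loop = a - b * k  # magnitude below 1 means the regulated system is stable
```

For these unit values the recursion settles at the golden ratio, p = (1 + sqrt(5)) / 2, which makes a convenient sanity check. DDP repeatedly builds this kind of local linear-quadratic problem around a nominal trajectory of a nonlinear system.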

[Lab Meeting] May 5th, 2008 (Stanley): Boosting Structured Prediction for Imitation Learning

Author:
Nathan Ratliff, David Bradley, J. Andrew Bagnell, Joel Chestnutt
Robotics Institute, Carnegie Mellon University

From:
Advances in Neural Information Processing Systems 19, MIT Press, Cambridge, MA, 2007

Abstract:
The Maximum Margin Planning (MMP) (Ratliff et al., 2006) algorithm solves imitation learning problems by learning linear mappings from features to cost functions in a planning domain. The learned policy is the result of minimum-cost planning using these cost functions. These mappings are chosen so that example policies (or trajectories) given by a teacher appear to be lower cost (with a loss-scaled margin) than any other policy for a given planning domain. We provide a novel approach, MMPBOOST, based on the functional gradient descent view of boosting (Mason et al., 1999; Friedman, 1999a) that extends MMP by “boosting” in new features. This approach uses simple binary classification or regression to improve performance of MMP imitation learning, and naturally extends to the class of structured maximum margin prediction problems (Taskar et al., 2005). Our technique is applied to navigation and planning problems for outdoor mobile robots and robotic legged locomotion.

Link

Saturday, May 03, 2008

MIT CSAIL talk : Predicting Listener Backchannel: A Probabilistic Multimodal Approach

Speaker: Louis-Philippe Morency, Research Scientist, Institute for Creative Technologies, University of Southern California
Date: Monday, May 5 2008
Time: 2:00PM to 3:00PM
Refreshments: 1:45PM
Location: 32-D507
Host: C. Mario Christoudias, Gerald Dalley, MIT CSAIL
Contact: C. Mario Christoudias, Gerald Dalley, 3-4278, 3-6095, cmch@csail.mit.edu , dalleyg@mit.edu

During face-to-face interactions, listeners use backchannel feedback such as head nods as a signal to the speaker that the communication is working and that they should continue speaking. Predicting these backchannel opportunities is an important milestone for building engaging and natural virtual humans. In this talk I will show how sequential probabilistic models (e.g., Hidden Markov Models (HMMs) or Conditional Random Fields (CRFs)) can automatically learn from a database of human-to-human interactions to predict listener backchannels using the speaker multimodal output features (e.g., prosody, spoken words and eye gaze). The main challenges addressed in this talk are (1) automatic selection of the relevant features and (2) optimal feature representation for probabilistic models. For prediction of visual backchannel cues (i.e., head nods), our prediction model shows a statistically significant improvement over a previously published approach based on hand-crafted rules.

link

Friday, May 02, 2008

NPR Science Friday talks robots

NPR's Science Friday last week had a show called "Building a More Sociable Robot." Guests include Helen Greiner (chair and co-founder of iRobot), Peter McOwen (Queen Mary, University of London), Dean Kamen (inventor of the iBot, Segway, and founder of FIRST), and Grant Cox (member of FIRST champion team The Thunder Chickens). Greiner and McOwen talk about what average people expect out of robots in terms of interaction, the relationship between interactive technology, price, and consumer demand, and what the state of technology is to get robots interacting with the environment and with us in a "natural" way. Kamen and Cox, meanwhile, talk about the FIRST program, how it's encouraging people to follow science, engineering, and technology as careers, and why robotics is so effective in doing this.

The podcast and a link to the article

Robotic wheelchair docks like a spaceship

Link

A laser-guided robot wheelchair that automatically docks with the user's vehicle and loads itself into the back could give disabled drivers more freedom.

Using the new system, the user opens the door of their van and presses a button to lower the front seat so they can climb in. A remote control is then used to drive the chair round to the back of the van.

From here on, a computer inside the vehicle takes over. Using radio signals and laser guidance, it positions the chair onto the forks of a lift that hauls the wheelchair on board, and closes the door (see video, right).

The process is reversed once the driver reaches their destination.

Reliable docking
Researchers from Lehigh University in Bethlehem, Pennsylvania, US, working with a company called Freedom Sciences of Philadelphia, have demonstrated the system using a retrofitted, commercially available motorised wheelchair and a standard Chrysler minivan.

The researchers had originally planned to let users dock the empty wheelchair onto the forklift themselves, using the remote control and a camera mounted on the van. But it proved too difficult to position the chair accurately on the lift.

"The real challenge is to dock with 100% reliability. That is something you can't do with remote control," says John Spletzer, a roboticist at Lehigh who helped develop the system.

Instead they developed an on-board computer that uses a LIDAR (light detecting and ranging) system to position the chair. It bounces laser light off two reflectors on the armrests of the chair to track its position and align it with the forklift.
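
The article does not say how the chair's position is computed from the two reflectors, so the following is only a geometric sketch under stated assumptions: the LIDAR reports each reflector as an (x, y) point in its own frame, the reflectors sit symmetrically on the left and right armrests, and the chair faces perpendicular to the line joining them. The function name chair_pose is hypothetical.

```python
import math


def chair_pose(left, right):
    """Estimate the chair centre and heading from two reflector positions.

    left, right: (x, y) of the left and right armrest reflectors in the
    sensor frame. Returns (centre_x, centre_y, heading_in_radians).
    """
    centre_x = (left[0] + right[0]) / 2.0
    centre_y = (left[1] + right[1]) / 2.0
    # Vector from the right reflector to the left one.
    bx = left[0] - right[0]
    by = left[1] - right[1]
    # Rotating that vector by -90 degrees gives the assumed forward direction.
    heading = math.atan2(-bx, by)
    return centre_x, centre_y, heading


# Chair 2 m ahead of the sensor, facing along +x.
pose_ahead = chair_pose(left=(2.0, 0.3), right=(2.0, -0.3))
# Chair 1 m to the side, facing along +y.
pose_side = chair_pose(left=(-0.3, 1.0), right=(0.3, 1.0))
```

A docking controller could then steer to drive the offset between this pose and the lift's pose toward zero.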

Space docking
Similar laser ranging was used by the uncrewed cargo spacecraft Jules Verne when it first docked with the International Space Station last month.

In tests, the system achieved a 97.5% success rate in docking the chair, even when facing complications such as rain, headlights, visible exhaust fumes, or loose gravel under the wheels.

If a docking attempt fails, the operator repositions the chair and tries again. If all else fails, they can take over, and keep trying until docking is successful.

Freedom Sciences expects to begin selling the system later this year in the US for around $30,000 each.

FDA approval
The price is comparable to having a vehicle modified so that a wheelchair can roll into the driver's area, and has the benefit that the equipment can be transferred when a new car is purchased, says Thomas Panzarella, Freedom Sciences' chief technology officer.

Because the product involves modifying a medical device – a wheelchair – approval is needed from the US Food and Drug Administration, but Panzarella expects this to take no more than a month or so.

The system tackles an interesting problem, says Joelle Pineau, a computer scientist at McGill University in Montreal. "It is well known that navigating in very constrained spaces and conditions is a major challenge for wheelchair users."

Journal reference: Journal of Field Robotics (DOI: 10.1002/rob.20236)

Wednesday, April 30, 2008

Disney enters the programmable robot market with its own kids robot

Robots aren’t just for hobbyists anymore. The programmable gadgets have taken off thanks to the efforts of tech-oriented companies such as WowWee, Sony, Ugobe and LEGO. But the market may be ready for a whole new level as Disney enters the market tomorrow.
LINK

Tuesday, April 29, 2008

[CSAIL Seminar] Maximum Entropy and Species Distribution Modeling

Speaker: Robert Schapire, Princeton University

Abstract:
Modeling the geographic distribution of a plant or animal species is a critical problem in conservation biology: to save a threatened species, one first needs to know where it prefers to live, and what its requirements are for survival. From a machine-learning perspective, this is an especially challenging problem in which the learner is presented with no negative examples and often only a tiny number of positive examples. In this talk, I will describe the application of maximum-entropy methods to this problem, a set of decades-old techniques that happen to fit the problem very cleanly and effectively. I will describe a version of maxent that we have shown enjoys strong theoretical performance guarantees that enable it to perform effectively even with a very large number of features. I will also describe some extensive experimental tests of the method, as well as some surprising applications.

This talk includes joint work with Miroslav Dudík and Steven Phillips.

[Relevant Link]
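
To make the presence-only setting in the abstract above concrete, here is a minimal sketch of an unregularized maximum-entropy fit over a handful of map cells with a single environmental feature. It finds the Gibbs distribution whose expected feature value matches the mean observed at presence sites. The feature values, presence records, and function names are invented for illustration, and the regularization that the talk credits for the method's guarantees is omitted.

```python
import math


def gibbs(lam, features):
    """Distribution over cells with probability proportional to exp(lam * feature)."""
    weights = [math.exp(lam * f) for f in features]
    total = sum(weights)
    return [w / total for w in weights]


def fit_maxent(features, presence_cells, learning_rate=1.0, iterations=5000):
    """Gradient ascent on the log-likelihood of the presence records."""
    target = sum(features[i] for i in presence_cells) / len(presence_cells)
    lam = 0.0
    for _ in range(iterations):
        probs = gibbs(lam, features)
        expected = sum(p * f for p, f in zip(probs, features))
        # Gradient = empirical feature mean minus model feature mean.
        lam += learning_rate * (target - expected)
    return lam, gibbs(lam, features)


# Hypothetical data: five cells with a normalized feature (say, rainfall),
# and three presence records in the two wettest cells.
features = [0.0, 0.25, 0.5, 0.75, 1.0]
presence_cells = [3, 4, 4]
lam, probs = fit_maxent(features, presence_cells)
model_mean = sum(p * f for p, f in zip(probs, features))
```

No absence records are needed: the fit only requires the feature values over the whole region and at the presence sites.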

Monday, April 28, 2008

Lab Meeting April 28th, 2008 (Yu-Hsiang): Progress on abnormal object detection for the ICRA Challenge

I'll present how I cluster SURF features in an unsupervised way to detect abnormal objects, and I will show some of my current results.

Sunday, April 27, 2008

Lab Meeting April 28th, 2008 (Der-Yeuan): Progress on Feature-Based SLAM for the ICRA Challenge

I will present some of my ideas and current results in mapping a 3D environment using the SwissRanger camera and the MotionNode IMU.

Saturday, April 26, 2008

News: Academic Leaders in Robotics Research Announce Effort To Create National Strategy for Robotics Growth

PITTSBURGH—Citing the critical importance of the continued growth of robotics to U.S. competitiveness, 11 universities are taking the lead in developing an integrated national strategy for robotics research. The United States is the only nation engaged in advanced robotics research that does not have such a research roadmap.

The Computing Community Consortium (CCC), a program of the National Science Foundation, is providing support for developing the roadmap, which will be a unified research agenda for robotics across federal agencies, industry and the universities.

The effort began last year and includes representatives from the Georgia Institute of Technology, Carnegie Mellon University and the universities of Massachusetts, Pennsylvania, California-Berkeley, Southern California, Utah and Illinois, as well as Rensselaer Polytechnic Institute, Stanford University and Massachusetts Institute of Technology.

See the full article. The roadmapping effort is detailed at www.us-robotics.us.

Wednesday, April 23, 2008

[CVPR2008] Fusion of Time-of-Flight Depth and Stereo for High Accuracy Depth Maps

Authors: Jiejie Zhu, Liang Wang, Ruigang Yang, James Davis
CVPR 2008 Oral

Abstract:
Time-of-flight range sensors have error characteristics which are complementary to passive stereo. They provide real time depth estimates in conditions where passive stereo does not work well, such as on white walls. In contrast, these sensors are noisy and often perform poorly on the textured scenes for which stereo excels. We introduce a method for combining the results from both methods that performs better than either alone. A depth probability distribution function from each method is calculated and then merged. In addition, stereo methods have long used global methods such as belief propagation and graph cuts to improve results, and we apply these methods to this sensor. Since time-of-flight devices have primarily been used as individual sensors, they are typically poorly calibrated. We introduce a method that substantially improves upon the manufacturer’s calibration. We show that these techniques lead to improved accuracy and robustness.

fulltext
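
The abstract above says a depth probability distribution from each sensor is calculated and then merged, without stating the merge rule. A minimal sketch of one plausible rule follows, assuming the two estimates are independent so that the per-pixel distributions over candidate depths can be multiplied and renormalized. The candidate depths, probabilities, and function name fuse_depth are illustrative assumptions, not the paper's method.

```python
def fuse_depth(p_tof, p_stereo):
    """Multiply two per-pixel depth distributions and renormalize."""
    product = [a * b for a, b in zip(p_tof, p_stereo)]
    total = sum(product)
    if total == 0.0:
        raise ValueError("the two distributions share no support")
    return [v / total for v in product]


# Hypothetical single pixel with four candidate depths in metres.
depths = [1.0, 1.5, 2.0, 2.5]
p_tof = [0.10, 0.40, 0.40, 0.10]     # time-of-flight: ambiguous between 1.5 and 2.0
p_stereo = [0.05, 0.15, 0.70, 0.10]  # stereo: confident on this textured surface
fused = fuse_depth(p_tof, p_stereo)
best_depth = depths[fused.index(max(fused))]
```

Here the stereo evidence resolves the tie in the time-of-flight estimate; on a textureless wall the roles would be reversed.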

[CVPR2008] Unsupervised Modeling of Object Categories Using Link Analysis Techniques

Title: Unsupervised Modeling of Object Categories Using Link Analysis Techniques

Author: Gunhee Kim, Christos Faloutsos, Martial Hebert, Carnegie Mellon University

Abstract:
We propose an approach for learning visual models of object categories in an unsupervised manner in which we first build a large-scale complex network which captures the interactions of all unit visual features across the entire training set and we infer information, such as which features are in which categories, directly from the graph by using link analysis techniques. The link analysis techniques are based on well-established graph mining techniques used in diverse applications such as WWW, bioinformatics, and social networks. The techniques operate directly on the patterns of connections between features in the graph rather than on statistical properties, e.g., from clustering in feature space. We argue that the resulting techniques are simpler, and we show that they perform similarly or better compared to state of the art techniques on common data sets. We also show results on more challenging data sets than those that have been used in prior work on unsupervised modeling.

fulltext
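
PageRank is one well-known link analysis technique from the web-search setting the abstract above alludes to. The sketch below ranks nodes of a tiny directed graph by power iteration. Treating nodes as visual features and edges as feature matches is an assumption made for illustration; the paper's actual graph construction and ranking procedure are not reproduced here.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a graph given as lists of out-neighbours."""
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = [(1.0 - damping) / n] * n
        for node, neighbours in enumerate(out_links):
            if neighbours:
                share = damping * rank[node] / len(neighbours)
                for target in neighbours:
                    new_rank[target] += share
            else:
                # A node with no out-links spreads its rank uniformly.
                for target in range(n):
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical feature graph: features 1, 2 and 3 all match feature 0,
# and feature 0 matches feature 1.
out_links = [[1], [0], [0], [0]]
rank = pagerank(out_links)
hub = rank.index(max(rank))
```

A feature that many others point to ends up with a high score, which is the kind of structural signal that can be read off the graph without clustering in feature space.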

[CVPR2008] Model-Based Hand Tracking with Texture, Shading and Self-occlusions

Authors: Martin de La Gorce, Nikos Paragios, David J. Fleet
CVPR 2008 oral

Abstract:
A novel model-based approach to 3D hand tracking from monocular video is presented. The 3D hand pose, the hand texture and the illuminant are dynamically estimated through minimization of an objective function. Derived from an inverse problem formulation, the objective function enables explicit use of texture temporal continuity and shading information, while handling important self-occlusions and time-varying illumination. The minimization is done efficiently using a quasi-Newton method, for which we propose a rigorous derivation of the objective function gradient. Particular attention is given to terms related to the change of visibility near self-occlusion boundaries that are neglected in existing formulations. In doing so we introduce new occlusion forces and show that using all gradient terms greatly improves the performance of the method. Experimental results demonstrate the potential of the formulation.

[Link]

[CVPR 2008] A Mobile Vision System for Robust Multi-Person Tracking

Authors: Andreas Ess, Bastian Leibe, Konrad Schindler and Luc Van Gool
ETH Zurich, Switzerland, KU Leuven, Belgium
CVPR 2008 oral

Full text

Abstract:
We present a mobile vision system for multi-person tracking in busy environments. Specifically, the system integrates continuous visual odometry computation with tracking-by-detection in order to track pedestrians in spite of frequent occlusions and egomotion of the camera rig. To achieve reliable performance under real-world conditions, it has long been advocated to extract and combine as much visual information as possible. We propose a way to closely integrate the vision modules for visual odometry, pedestrian detection, depth estimation, and tracking. The integration naturally leads to several cognitive feedback loops between the modules. Among others, we propose a novel feedback connection from the object detector to visual odometry which utilizes the semantic knowledge of detection to stabilize localization. Feedback loops always carry the danger that erroneous feedback from one module is amplified and causes the entire system to become instable. We therefore incorporate automatic failure detection and recovery, allowing the system to continue when a module becomes unreliable. The approach is experimentally evaluated on several long and difficult video sequences from busy inner-city locations. Our results show that the proposed integration makes it possible to deliver stable tracking performance in scenes of previously infeasible complexity.

Intel Seminar : Unsupervised Analysis of Human Activities in Everyday Environments

Title: Structure from Statistics: Unsupervised Analysis of Human Activities in Everyday Environments
Speaker: Raffay Hamid
Monday, April 21st, 2008, 10:30am - 12:00pm

Abstract:
In order to make computers proactive and assistive, we must enable them to perceive, learn, and predict what is happening in their surroundings. This presents us with the challenge of formalizing computational models of everyday human activities. These models must perform well in the face of data uncertainty and complex activity dynamics. Traditional approaches to this end assume prior knowledge about the structure of human activities, using which explicitly defined activity-models are learned in a supervised manner. However, for a majority of everyday environments such activity structure is generally not known a priori. In this talk, I will discuss knowledge representations and manipulation techniques that facilitate minimally supervised learning of activity structure. In particular, I will present n-grams and Suffix Tree based sequence representations for human activity analysis. I will discuss how such a data-driven approach towards activity modeling can help discover and characterize human activities, and learn typical behaviors crucial for detecting irregular occurrences in an environment. I will provide experimental validation of my proposed approach for activity analysis in environments such as a residential house, a loading dock area, and a household kitchen.
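
As a small illustration of the n-gram representation mentioned in the abstract above, the sketch below counts n-grams of discrete events in observed activity sequences and scores a new sequence by the fraction of its n-grams never seen before. The event names, the scoring rule, and the function names are assumptions for illustration; the speaker's actual representation and irregularity measure may differ.

```python
from collections import Counter


def ngram_counts(events, n):
    """Count every length-n window of consecutive events."""
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))


def unseen_fraction(events, known, n):
    """Fraction of n-grams in `events` that never occur in `known`."""
    grams = ngram_counts(events, n)
    total = sum(grams.values())
    if total == 0:
        return 0.0
    unseen = sum(count for gram, count in grams.items() if gram not in known)
    return unseen / total


# Hypothetical kitchen routine observed twice in a row.
typical = ["enter", "fridge", "counter", "stove", "sink", "exit"]
known = ngram_counts(typical + typical, 2)
normal_score = unseen_fraction(["fridge", "counter", "stove"], known, 2)
odd_score = unseen_fraction(["enter", "stove", "exit"], known, 2)
```

A score near 0 means the sequence is built from familiar transitions, while a score near 1 flags it as a candidate irregular occurrence.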

Bio:
Raffay Hamid is a Ph.D. candidate in Computer Science in the School of Interactive Computing at the Georgia Institute of Technology, where he is a member of the Computational Perception Lab., and the Aware Home Research Initiative. His research interests lie at the intersection of Statistical Learning, Computer Vision and Ubiquitous Computing.
During his graduate years, Raffay has worked as a Research Intern at Intel Research Lab., Mitsubishi Electronic Research Lab., and Microsoft Research. From 2001 to 2002, he was a Signal Processing Engineer at Techlogix Inc., working on a joint project with General Motors and Eaton Corporation. During this time he also served as an adjunct lecturer at the University of Engineering and Technology Lahore, Pakistan. He has been awarded the National Merit Scholarship from the Government of Pakistan from 1994 to 2001. More information about his curricular and co-curricular interests can be found at: www.cc.gatech.edu/~raffay .

Monday, April 21, 2008

Lab Meeting April 21st, 2008 (Leo): Ground truth system

I will show some improvements to the ground truth system.

Sunday, April 20, 2008

Lab Meeting April 21st, 2008 (Yi-Liu): Progress report

I'll show some results of Monocular DATMO.

Lab Meeting April 21st, 2008 (fish60): Efficient Motion Planning Algorithm for Stochastic Dynamic Systems with Constraints on Probability of Failure

I will talk about the simple idea that I wanted to present last week.

Abstract: Focus on the Bi-stage Robust Motion Planning algorithm: a two-stage optimization approach, with the upper stage optimizing the risk allocation and the lower stage calculating the optimal control sequence that maximizes the reward.

Link

Lab Meeting April 21st, 2008 (Jeff): Progress report

I will show two results using the current dataset. I will also point out some problems with the current sensor model and try to propose a method to solve them.

Saturday, April 19, 2008

Off-Road Obstacle Avoidance through End-to-End Learning

We describe a vision-based obstacle avoidance system for off-road mobile robots. The system is trained from end to end to map raw input images to steering angles. It is trained in supervised mode to predict the steering angles provided by a human driver during training runs collected in a wide variety of terrains, weather conditions, lighting conditions, and obstacle types. The robot is a 50cm off-road truck, with two forward pointing wireless color cameras. A remote computer processes the video and controls the robot via radio. The learning system is a large 6-layer convolutional network whose input is a single left/right pair of unprocessed low-resolution images. The robot exhibits an excellent ability to detect obstacles and navigate around them in real time at speeds of 2 m/s.

Link

Tuesday, April 15, 2008

CMU RI Report: Cost-based Registration using A Priori Data for Mobile Robot Localization

L. Xu and A. Stentz
tech. report TR-08-05
Robotics Institute
Carnegie Mellon University

Abstract--A major challenge facing outdoor navigation is the localization of a mobile robot as it traverses a particular terrain. Inaccuracies in dead-reckoning and the loss of global positioning information (GPS) often lead to unacceptable uncertainty in vehicle position. We propose a localization algorithm that utilizes cost-based registration and particle filtering techniques to localize a robot in the absence of GPS. We use vehicle sensor data to provide terrain information similar to that stored in an overhead satellite map. This raw sensor data is converted to mobility costs to normalize for perspective disparities and then matched against overhead cost maps. Cost-based registration is particularly suited for localization in the navigation domain because these normalized costs are directly used for path selection. To improve the robustness of the algorithm, we use particle filtering to handle multi-modal distributions. Results of our algorithm applied to real field data from a mobile robot show higher localization certainty compared to that of dead-reckoning alone.

Check here for the full text.
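
The registration step described in the abstract above can be pictured with a one-dimensional toy problem: a short strip of mobility costs derived from onboard sensors is slid along an overhead cost map, and the offset with the smallest squared difference is taken as the position estimate. The cost values and function names below are invented for illustration, and the exhaustive search stands in for the particle filter the report uses to handle multi-modal cases.

```python
def registration_error(local_costs, overhead_costs, offset):
    """Sum of squared differences between the local strip and the map at an offset."""
    return sum(
        (cost - overhead_costs[offset + i]) ** 2
        for i, cost in enumerate(local_costs)
    )


def best_offset(local_costs, overhead_costs):
    """Offset along the overhead map where the local costs match best."""
    candidates = range(len(overhead_costs) - len(local_costs) + 1)
    return min(
        candidates,
        key=lambda offset: registration_error(local_costs, overhead_costs, offset),
    )


# Hypothetical overhead cost map and a noisy local observation of cells 4 to 6.
overhead_costs = [1, 1, 5, 9, 2, 1, 7, 3, 1, 1]
local_costs = [2.2, 0.9, 6.8]
estimate = best_offset(local_costs, overhead_costs)
residual = registration_error(local_costs, overhead_costs, estimate)
```

In a particle filter, the same error could be turned into a particle weight, for example through exp(-error / (2 * sigma**2)) with an assumed noise scale sigma, so that several plausible offsets can be tracked at once.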