Wednesday, January 28, 2009

CMU talk: Perception on an Offroad Robot: Shallow and Deep Learning Architectures

VASC Seminar
Monday, February 2, 2009

Perception on an Offroad Robot: Shallow and Deep Learning Architectures
Raia Hadsell
Robotics Institute, CMU

Perception for offroad mobile robots is very difficult. Roads and paths aren't visually consistent, nor are they guaranteed to exist at all; obstacles are diverse and often visually complex. In addition, long range perception is more important when path planning needs to be done in the absence of clear roads or corridors. I will describe 2 learning-based approaches to perception in offroad environments: first, a kernel-based method for rough terrain reconstruction, and second, a self-supervised online vision system that can detect paths and obstacles at very long range and that quickly adapts to new environments.

Raia Hadsell is currently a postdoc at the Robotics Institute, working with Martial Hebert, Drew Bagnell, and Daniel Huber. She completed her doctorate at New York University in 2008, under the advisement of Yann LeCun, with research interests that lie in the intersection of machine learning, vision, and robotics. She has also enjoyed internships at Google NYC and Net-Scale Technologies. Dr. Hadsell, if pressed, will own up to her bachelor's degree in religion and philosophy and may discuss Nietzsche on occasion.

Friday, January 16, 2009

Lab Meeting January 19, 2009 (Tiffany): Algorithms for Inverse Reinforcement Learning

ICML 2000

Title: Algorithms for Inverse Reinforcement Learning

Authors: Andrew Y. Ng and Stuart Russell

This paper addresses the problem of inverse reinforcement learning (IRL) in Markov decision processes, that is, the problem of extracting a reward function given observed, optimal behaviour. IRL may be useful for apprenticeship learning to acquire skilled behaviour, and for ascertaining the reward function being optimized by a natural system. We first characterize the set of all reward functions for which a given policy is optimal. We then derive three algorithms for IRL. The first two deal with the case where the entire policy is known; we handle tabulated reward functions on a finite state space and linear functional approximation of the reward function over a potentially infinite state space. The third algorithm deals with the more realistic case in which the policy is known only through a finite set of observed trajectories. In all cases, a key issue is degeneracy: the existence of a large set of reward functions for which the observed policy is optimal. To remove degeneracy, we suggest some natural heuristics that attempt to pick a reward function that maximally differentiates the observed policy from other, suboptimal policies. This results in an efficiently solvable linear programming formulation of the IRL problem. We demonstrate our algorithms on simple discrete/finite and continuous/infinite state problems.
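The degeneracy mentioned in the abstract can be seen on a toy problem. The sketch below is not the paper's linear programming algorithm; it only tests whether a candidate reward vector makes a given policy optimal, by evaluating the policy and checking that no single action change improves on it. The two-state MDP, the function names (`policy_value`, `is_optimal`) and the discount factor are all assumptions made for illustration.

```python
def policy_value(P, policy, R, gamma, sweeps=2000):
    """Iteratively evaluate V^pi for state rewards R.

    P[a][s][t] is the probability of moving from state s to t under action a.
    """
    n = len(R)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [R[s] + gamma * sum(P[policy[s]][s][t] * V[t] for t in range(n))
             for s in range(n)]
    return V


def is_optimal(P, policy, R, gamma, tol=1e-9):
    """True if no one-step deviation from `policy` beats its own value."""
    n = len(R)
    V = policy_value(P, policy, R, gamma)
    for s in range(n):
        for a in range(len(P)):
            q = R[s] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
            if q > V[s] + tol:
                return False
    return True


# Hypothetical MDP: action 0 stays in place, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
go_to_state_1 = [1, 0]  # switch out of state 0, stay in state 1

for R in ([0.0, 1.0], [0.0, 0.0], [5.0, 5.0], [1.0, 0.0]):
    print(R, is_optimal(P, go_to_state_1, R, gamma=0.9))
```

In this toy case the all-zero and constant reward vectors make every policy optimal, which is why some extra criterion is needed to pick one reward function out of the feasible set.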

Lab Meeting January 19, 2009 (Yu-chun): Interaction with a Zoomorphic Robot that Exhibits Canid Mechanisms of Behaviour

ICRA 2008

Title: Interaction with a Zoomorphic Robot that Exhibits Canid Mechanisms of Behaviour

Authors: Trevor Jones, Shaun Lawson, and Daniel Mills

Despite parallels between the cooperative use of domestic dogs in human society today, the predicted similar deployment of robots in the future, and the plethora of superficially dog-like robotic entertainment devices, very little effort has been directed at exploiting any understanding of social cognition between dogs and humans when designing interactive robotic systems. This paper describes an experiment in which we gave interactive robots zoomorphic appearances and dog-like behavioural properties. We analysed human reactions to robots exhibiting differing levels of zoomorphism and dog-like behaviour during an interaction task; we were particularly interested to determine whether behaviour and/or appearance that mimicked that of dogs facilitated increased satisfaction in robot performance and a willingness to persevere with a robot that made mistakes. Our findings show that neither the appearance nor the behaviour of a robot had an impact on the participants’ rating of robot performance, and there was also no significant difference in the self-reported categories of frustration, excitement and desire to persist with an interaction. However, our findings suggest that differences in individual preferences are revealed when people are asked to interact with robots that exhibit dog-like behaviours and other zoomorphic characteristics, and that further research is required in order to better understand these differences.

Thursday, January 15, 2009

CMU RI Thesis Proposal: Distributed Algorithms for Probabilistic Inference and Learning

Date: 16 January 2009
Time: 12:00 p.m.
Place: Newell Simon Hall 1507
Type: Thesis Proposal
Topic: Distributed Algorithms for Probabilistic Inference and Learning


Probabilistic inference and learning problems arise naturally in distributed systems such as sensor networks, teams of mobile robots, and recommendation systems. In these systems, the data resides at multiple distributed locations, and the network nodes need to collaborate, in order to perform the inference or learning task.

This thesis has three thrusts. First, we propose distributed implementations of several state-of-the-art centralized inference algorithms. Our solutions address challenges, such as effective MAP estimation, scheduling of messages in loopy belief propagation, and assumed density filtering.

Many algorithms for probabilistic inference are described by graphical models, such as region graphs or junction trees. These graphical models, together with the update schedule, entirely determine the behavior of the inference algorithm in a centralized setting. Yet, in distributed settings, the graphical model crucially interacts with the physical network and determines properties such as robustness or communication complexity. In this thesis, we propose a unified view where the graphical model and its placement are optimized jointly to match both the network and the probabilistic model. In this manner, our distributed algorithms will not only attain accurate solutions, but will also have a low message complexity.

Recent advances in peer-to-peer networks offer interesting opportunities for learning latent variable models for collaborative filtering. Peer-to-peer networks simplify many aspects of distributed learning, but open an interesting challenge of supporting recommendation queries with stale local models. We propose a pull-based approach that updates the model parameters, in order to minimize its regret with respect to the optimal set of recommendations.

We demonstrate our algorithms on real-world applications in large-scale modular robot localization, camera networks and movie recommendation systems. We demonstrate that our algorithms scale to large networks and provide improved robustness and convergence properties.

Friday, January 09, 2009

NTU talk (this Saturday): Non-Chronological Video Editing and Video Synopsis

Speaker: Prof. Shmuel Peleg, School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel

Time: 02:20pm, January 10 (Saturday), 2009

Place: Room 101, CSIE building

Title: Non-Chronological Video Editing and Video Synopsis

Powerful effects in video editing can be obtained when relaxing the chronological constraints: activities that occurred at different times can be shown simultaneously and vice versa. This talk will start with a description of non-chronological video editing effects and the simple methods used to perform them.

The non-chronological approach to video is also powerful in creating video summaries. In particular, a full day recorded by a video surveillance camera can be summarized in a few minutes without loss of any activity. It is estimated that 40 million surveillance cameras are being installed annually. But none of the video they record is ever watched: it is too time consuming. The presented video synopsis approach can provide access to the untapped resource of recorded surveillance cameras.

Short Biography:
Shmuel Peleg received his Ph.D. from the University of Maryland in 1979 under the guidance of Professor Azriel Rosenfeld. In 1981 he became a faculty member at the Hebrew University of Jerusalem where he is still a Professor of Computer Science. Shmuel served as chairman of the Institute of Computer Science at Hebrew University from 1990 to 1993.

Shmuel's research covers pyramid representation, image enhancement, motion analysis, panoramic mosaicing, and video surveillance. He has several patents which provided the technical foundations to four start-up companies. The most recent company, BriefCam Ltd., provides indexing into video surveillance, and uses a technology covered in this presentation.

Wednesday, January 07, 2009

NTU talk: Lecture Series on Brain Theory and Neural Network

Speaker: Dr. Michael A. Arbib,
Professor of Biomedical Engineering, Computer Science, Electrical Engineering, Neurobiology, and Psychology, University of Southern California

Lecture Series on Brain Theory and Neural Network (I)
Time:2009/01/10 (Saturday) 9:00~12:00
Title: An introduction of Brain theory and artificial intelligence:
(1)Brief overview of AI & BT networks of leaky integrator neurons
(2)Winner-Take-All(a frog model);Didday-Arbib & Itti-Koch on visual attention

Lecture Series on Brain Theory and Neural Network (II)
Time:2009/01/11 (Sunday) 9:00~12:00
Topics:(1)Dominey Arbib model of perception and attention:Working Memory & Dynamic Remapping
(2)Object Recognition & Scene Perception

Lecture Series on Brain Theory and Neural Network(III)
Time: 2009/01/12(Monday) 9:00~12:00
Topics: Adaptive networks; reinforcement learning
(1)Introduction to Hebbian, supervised and reinforcement learning in neural networks
(2)Augmented Competitive Queuing: Opportunistic scheduling with mirror neurons and reinforcement learning

NTU talk: Toward Robust Online Visual Tracking

Speaker: Prof. Ming-Hsuan Yang, UC Merced

Time: 02:20pm, January 9 (Fri), 2009
Place: Room 102, CSIE building

Title: Toward Robust Online Visual Tracking


Human beings are capable of tracking objects in dynamic scenes effortlessly, and yet visual tracking remains a challenging problem in computer vision. The main reason can be attributed to the difficulty in handling appearance variation of a target object. Intrinsic appearance changes include out-of-plane motion and shape deformation of a target object, whereas extrinsic illumination change, camera motion, camera viewpoint, and occlusions inevitably cause large appearance variation.

Visual tracking is a fundamental problem in computer vision that has important applications in a variety of areas, including recovering 3D structure from moving scenes, camera calibration, estimating the underlying motion of the scene, and object recognition. It also has other applications in autonomous robotics and vehicles, medical imaging, as well as entertainment. While existing algorithms are able to track objects in controlled environments, they usually fail in the presence of significant image variations caused by changes in illumination, pose and occlusions. In addition, most of them require significant efforts in offline training prior to tracking. In the first part of this talk, I will present an efficient online learning algorithm for simultaneously tracking objects and learning compact appearance models. Numerous experiments show that this method is able to learn compact generative models for tracking target objects undergoing large pose and illumination changes. I will then discuss discriminative algorithms that track objects by separating foreground targets from backgrounds in an online manner. Experimental validation demonstrates that these algorithms are robust for tracking fast moving objects undergoing illumination change, occlusion, and articulated motion in real time with better results than existing systems.

Short Biography:

Ming-Hsuan Yang is an assistant professor in Electrical Engineering and Computer Science at the University of California at Merced. After receiving his PhD degree in Computer Science from the University of Illinois at Urbana-Champaign (UIUC), he worked as a senior researcher at the Honda Research Institute in Mountain View, California, and was an assistant professor with Computer Science and Information Engineering at National Taiwan University.

His research interests include computer vision, pattern recognition, robotics, cognitive science, and machine learning. While at UIUC, he was awarded the Ray Ozzie Fellowship given to outstanding graduate students in Computer Science. He has co-authored the book Face Detection and Gesture Recognition for Human-Computer Interaction (Kluwer Academic Publishers), and co-edited a special issue on face recognition of Computer Vision and Image Understanding. He serves as an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence, and an Area Chair of the IEEE Computer Vision and Pattern Recognition in 2008 and 2009. He is a senior member of the IEEE and the ACM.

CMU talk: Shape Constrained Figure-Ground Segmentation and Tracking

Special VASC Seminar
Thursday, January 8, 2009

Shape Constrained Figure-Ground Segmentation and Tracking
Zhaozheng Yin
Penn State University

To avoid drift problems during adaptive tracking, we must raise the level of abstraction at which the tracker represents its target. The goal must be tracking "objects", not a box of pixels or a color distribution. If we can explicitly segment the foreground from background, it is possible to keep the adaptive model anchored on just the foreground pixels. In this talk, we discuss a shape constrained segmentation approach for tracking. Global object shape information is embedded into local graph links in a Conditional Random Field framework, so that the graph cut is attracted to occur around the figure-ground boundary. When treating tracking as a figure-ground segmentation problem, the precise foreground matte can help reduce pixel classification error during model adaptation. Meanwhile, the collected shape templates are useful to search for and recognize the same object after occlusion or tracking failure.

Zhaozheng Yin is currently a PhD candidate in Robert Collins's vision group at Penn State, where his research interests include object segmentation, tracking, motion detection and feature selection/fusion. He received his BS degree from Tsinghua University, China, and his MS degree from the University of Wisconsin at Madison.

Monday, January 05, 2009

[NIPS 2008]Nonrigid Structure from Motion in Trajectory Space

Title: Nonrigid Structure from Motion in Trajectory Space

Authors: Ijaz Akhter, Yaser Ajmal Sheikh, Sohaib Khan, and Takeo Kanade
Published in: Neural Information Processing Systems (NIPS), December 2008

Existing approaches to nonrigid structure from motion assume that the instantaneous 3D shape of a deforming object is a linear combination of basis shapes, which have to be estimated anew for each video sequence. In contrast, we propose that the evolving 3D structure be described by a linear combination of basis trajectories. The principal advantage of this approach is that we do not need to estimate any basis vectors during computation. We show that generic bases over trajectories, such as the Discrete Cosine Transform (DCT) basis, can be used to compactly describe most real motions. This results in a significant reduction in unknowns, and corresponding stability in estimation. We report empirical performance, quantitatively using motion capture data, and qualitatively on several video sequences exhibiting nonrigid motions including piece-wise rigid motion, partially nonrigid motion (such as a facial expression), and highly nonrigid motion (such as a person dancing).
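As a rough illustration of why a fixed DCT basis needs no per-sequence estimation, the sketch below builds the first K orthonormal DCT-II vectors for an F-frame trajectory, projects one coordinate of a point's trajectory onto them, and reconstructs it from the K coefficients. It covers only the trajectory representation, not the paper's structure-from-motion estimation; the function names and the example trajectory are assumptions made here.

```python
import math


def dct_basis(num_frames, num_vectors):
    """First `num_vectors` orthonormal DCT-II basis vectors of length `num_frames`."""
    basis = []
    for k in range(num_vectors):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / num_frames)
        basis.append([
            scale * math.cos(math.pi * (2 * t + 1) * k / (2 * num_frames))
            for t in range(num_frames)
        ])
    return basis


def project(trajectory, basis):
    """Coefficients of `trajectory` in the (orthonormal) basis."""
    return [sum(b * x for b, x in zip(vec, trajectory)) for vec in basis]


def reconstruct(coeffs, basis):
    """Linear combination of the basis vectors weighted by `coeffs`."""
    num_frames = len(basis[0])
    return [sum(c * vec[t] for c, vec in zip(coeffs, basis))
            for t in range(num_frames)]


# Hypothetical smooth trajectory: one coordinate of one point over 50 frames.
frames = 50
x = [math.sin(2 * math.pi * t / frames) + 0.02 * t for t in range(frames)]

for k in (2, 5, 10, frames):
    basis = dct_basis(frames, k)
    approx = reconstruct(project(x, basis), basis)
    err = max(abs(a - b) for a, b in zip(x, approx))
    print(f"K={k:2d}  max abs error={err:.6f}")
```

Because the basis is fixed in advance, the only unknowns per coordinate are the K coefficients rather than F positions, which is where the reduction in unknowns comes from.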

[Link to the paper and the datasets]

[MIT technical report]Organic Indoor Location Discovery

Title: Organic Indoor Location Discovery

Authors: Seth Teller, Jonathan Battat, Ben Charrow, Dorothy Curtis, Russell Ryan, Jonathan Ledlie, and Jamey Hicks

We describe an indoor, room-level location discovery method based on spatial variations in “wifi signatures,” i.e., MAC addresses and signal strengths of existing wireless access points. The principal novelty of our system is its organic nature; it builds signal strength maps from the natural mobility and lightweight contributions of ordinary users, rather than dedicated effort by a team of site surveyors. Whenever a user’s personal device observes an unrecognized signature, a GUI solicits the user’s location. The resulting location-tagged signature or “bind” is then shared with other clients through a common database, enabling devices subsequently arriving there to discover location with no further user contribution. Realizing a working system deployment required three novel elements: (1) a human-computer interface for indicating location over intervals of varying duration; (2) a client-server protocol for pre-fetching signature data for use in localization; and (3) a location-estimation algorithm incorporating highly variable signature data. We describe an experimental deployment of our method in a nine-story building with more than 1,400 distinct spaces served by more than 200 wireless access points. At the conclusion of the deployment, users could correctly localize to within 10 meters 92% of the time.
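The bind-and-lookup loop described above can be sketched as follows. This is a minimal illustration, not the report's location-estimation algorithm: it assumes a signature is a mapping from MAC address to signal strength in dBm, compares signatures with a root-mean-square difference, and treats an unheard access point as a fixed floor value. The names `signature_distance` and `localize`, the floor of -100 dBm and the distance threshold are all hypothetical.

```python
MISSING_DBM = -100.0  # assumed strength for an access point that was not heard


def signature_distance(sig_a, sig_b):
    """RMS difference in dBm over every access point seen in either signature."""
    macs = set(sig_a) | set(sig_b)
    if not macs:
        return float("inf")
    total = sum(
        (sig_a.get(mac, MISSING_DBM) - sig_b.get(mac, MISSING_DBM)) ** 2
        for mac in macs
    )
    return (total / len(macs)) ** 0.5


def localize(observed, binds, max_distance=10.0):
    """Return the room of the closest bind, or None if nothing is close enough.

    A None result corresponds to an unrecognized signature, the case in
    which the user would be asked to supply a location.
    """
    best_room, best_dist = None, float("inf")
    for room, signature in binds:
        dist = signature_distance(observed, signature)
        if dist < best_dist:
            best_room, best_dist = room, dist
    return best_room if best_dist <= max_distance else None


# Hypothetical shared database of location-tagged signatures ("binds").
binds = [
    ("room_a", {"00:11:22:33:44:01": -40.0, "00:11:22:33:44:02": -70.0}),
    ("room_b", {"00:11:22:33:44:01": -75.0, "00:11:22:33:44:02": -45.0}),
]

scan = {"00:11:22:33:44:01": -42.0, "00:11:22:33:44:02": -68.0}
room = localize(scan, binds)
if room is None:
    # Unrecognized signature: a real client would prompt the user here and
    # append the new (room, scan) bind to the shared database.
    print("unknown location, ask the user")
else:
    print("estimated room:", room)
```

A nearest-signature rule is used here only because it is the simplest thing that exercises the data flow; the report's own estimator is designed to cope with highly variable signature data and may differ substantially.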