Monday, April 30, 2007

[VASC Seminar Series] Image representations beyond histograms of gradients: The role of Gestalt descriptors

Speaker: Stanley Bileschi


Histograms of orientations and the statistics derived from them have
proven to be effective image representations for various recognition
tasks. In this work we attempt to improve the accuracy of object detection
systems by including new features that explicitly capture mid-level
gestalt concepts. Four new image features are proposed, inspired by the
gestalt principles of continuity, symmetry, closure and repetition. The
resulting image representations are used jointly with existing
state-of-the-art features and together enable better detectors for
challenging real-world data sets. As baseline features, we use Riesenhuber
and Poggio's C1 features [15] and Dalal and Triggs' Histogram of Oriented
Gradients feature [6]. Given that both of these baseline features have
already shown state-of-the-art performance in multiple object detection
benchmarks, the fact that our new mid-level representations can further
improve detection results warrants special consideration. We evaluate the
performance of these detection systems on the publicly available
StreetScenes [25] and Caltech101 [11] databases among others.

Related links:

Gestalt psychology

[News] ITRI Adopts Evolution Robotics Software Platform For Robotics Development


Identifying robotics as a key industry for growth, the Taiwan Ministry of Economic Affairs (MOEA) commissioned ITRI to create a national robotic SDK (software development kit) to provide an industry-standard platform for robotics. This software development kit will offer Taiwan’s corporations, labs and universities best-in-class tools for developing a broad range of new robotic products and applications, as well as provide a path for incorporating and sharing new technologies over time.

In deciding upon one of the most critical components for the robotics SDK, ITRI chose the Evolution Robotics’ ERSP Architecture to provide the standard foundation for all application development. In this role, the ERSP Architecture will essentially serve as the robot operating system, providing the key infrastructure and functions for integrating all of the hardware and software components.

full article

Thursday, April 26, 2007

Lab Meeting 26 April 2007 (Chihao): Audio Source localization and tracking: Property of Time delay of arrival & proposed method

  • compare the effect of different frame sizes
  • problem and proposed method
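For reference, the core of time-delay-of-arrival estimation is a cross-correlation peak search; a minimal sketch, assuming integer-sample delays and clean signals (practical systems typically use generalized cross-correlation such as GCC-PHAT):

```python
def estimate_tdoa(x, y, max_lag):
    """Estimate the delay of y relative to x (in samples) by
    maximizing the cross-correlation over candidate lags."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(x[i] * y[i + lag]
                    for i in range(len(x)) if 0 <= i + lag < len(y))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# A toy pulse delayed by 5 samples.  Frame size matters here: a frame
# that is too short may miss the pulse entirely, while a longer frame
# gives the correlation more evidence to average over.
x = [0.0] * 64
x[20] = 1.0
y = [0.0] * 64
y[25] = 1.0          # the same pulse, 5 samples later
print(estimate_tdoa(x, y, max_lag=10))  # -> 5
```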

Lab Meeting 26 April 2007 (Jim): Probabilistic Appearance Based Navigation and Loop Closing

Probabilistic Appearance Based Navigation and Loop Closing
by Mark Cummins and Paul Newman, ICRA 2007

pdf, website

This paper describes a probabilistic framework for navigation using only appearance data. By learning a generative model of appearance, we can compute not only the similarity of two observations, but also the probability that they originate from the same location, and hence compute a pdf over observer location. We do not limit ourselves to the kidnapped robot problem (localizing in a known map), but admit the possibility that observations may come from previously unvisited places. The principled probabilistic approach we develop allows us to explicitly account for the perceptual aliasing in the environment – identical but indistinctive observations receive a low probability of having come from the same place. Our algorithm complexity is linear in the number of places, and is particularly suitable for online loop closure detection in mobile robotics.
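The key step, computing the probability that two observations originate from the same location rather than merely their similarity, can be caricatured with a naive per-word Bayes computation. The authors' generative model of appearance is far richer than this; the observation probabilities below are assumed toy values:

```python
def same_place_posterior(z1, z2, p_same_prior=0.5,
                         p_agree_same=0.9, p_agree_diff=0.3):
    """Posterior that two binary appearance word vectors come from the
    same place, under a naive per-word agreement model.
    p_agree_same / p_agree_diff: assumed probabilities that a word
    agrees between two observations of the same / different places."""
    like_same = like_diff = 1.0
    for a, b in zip(z1, z2):
        agree = (a == b)
        like_same *= p_agree_same if agree else 1.0 - p_agree_same
        like_diff *= p_agree_diff if agree else 1.0 - p_agree_diff
    num = like_same * p_same_prior
    return num / (num + like_diff * (1.0 - p_same_prior))

z1 = [1, 0, 1, 1, 0, 1]   # words seen in observation 1
z2 = [1, 0, 1, 1, 0, 0]   # mostly the same words in observation 2
print(round(same_place_posterior(z1, z2), 3))  # -> 0.972
```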

Wednesday, April 25, 2007

Lab Meeting 26 April 2007 (Yu-Chun): Natural Emotion Expression of a Robot Based on Reinforcer Intensity

2007 IEEE International Conference on Robotics and Automation

Seung-Ik Lee, Gunn-Yong Park, and Joong-Bae Kim

An emotional robot is regarded as being able to express diverse emotions in response to internal or external events. This paper presents a robot affective system that is able to express life-like emotions. To that end, the overall architecture of our affective system is based on neuroscience, from which we obtained the natural emotional processing routines. Based on that architecture, we apply reinforcer effects, expecting that these would lead the affective system to be more similar to real-life emotion expression. The robot affective system is responsible for gathering environmental information and evaluating which environmental stimuli are rewarding or punishing. The emotion processing involves appraisal of the external and internal stimuli, such as homeostasis, and generates the affective states of the robot. Emotions are therefore associated with the presentation, omission, and termination of the expected rewards or punishers (reinforcers). The experimental results show that our affective system can express several emotions simultaneously, and that emotions decrease, increase, or change into other emotions seamlessly as time passes.

Monday, April 23, 2007

CMU news: Fotowoosh technology

Freewebs licenses Fotowoosh technology, developed by Alexei Efros, Martial Hebert, and PhD student Derek Hoiem, that converts single, two-dimensional images into 3-D images.

Sunday, April 22, 2007

CMU Intelligence Seminar: An efficient way to learn deep generative models

An efficient way to learn deep generative models

Geoff Hinton, University of Toronto

I will describe an efficient, unsupervised learning procedure for deep generative models that contain millions of parameters and many layers of hidden features. The features are learned one layer at a time without any information about the final goal of the system. After the layer-by-layer learning, a subsequent fine-tuning process can be used to significantly improve the generative or discriminative performance of the multilayer network by making very slight changes to the features.

I will demonstrate this approach to learning deep networks on a variety of tasks including: Creating generative models of handwritten digits and human motion; finding non-linear, low-dimensional representations of very large datasets; and predicting the next word in a sentence. I will also show how to create hash functions that map similar objects to similar addresses, thus allowing hash functions to be used for finding similar objects in a time that is independent of the size of the database.
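Hinton's hash functions are learned by deep networks; the property he describes, similar objects mapping to similar addresses, can be illustrated with a much simpler unlearned scheme, random-hyperplane locality-sensitive hashing:

```python
import random

def lsh_code(v, planes):
    """Binary address: one bit per random hyperplane (sign of dot product)."""
    return tuple(1 if sum(a * b for a, b in zip(p, v)) >= 0 else 0
                 for p in planes)

def hamming(c1, c2):
    return sum(x != y for x, y in zip(c1, c2))

random.seed(1)
dim, bits = 8, 16
planes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

v = [random.gauss(0, 1) for _ in range(dim)]
v_near = [x + random.gauss(0, 0.01) for x in v]    # a very similar object
v_far = [random.gauss(0, 1) for _ in range(dim)]   # an unrelated object

d_near = hamming(lsh_code(v, planes), lsh_code(v_near, planes))
d_far = hamming(lsh_code(v, planes), lsh_code(v_far, planes))
print(d_near, d_far)   # similar objects share most address bits
```

Looking up nearby addresses then retrieves similar objects in time independent of database size, which is the property the talk exploits with learned codes.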

Speaker Bio

Geoffrey Hinton received his BA in experimental psychology from Cambridge in 1970 and his PhD in Artificial Intelligence from Edinburgh in 1978. He did postdoctoral work at Sussex University and the University of California San Diego and spent five years as a faculty member in the Computer Science department at Carnegie-Mellon University. He then became a fellow of the Canadian Institute for Advanced Research and moved to the Department of Computer Science at the University of Toronto. He spent three years from 1998 until 2001 setting up the Gatsby Computational Neuroscience Unit at University College London and then returned to the University of Toronto where he is a University Professor. He holds a Canada Research Chair in Machine Learning. He is the director of the program on "Neural Computation and Adaptive Perception" which is funded by the Canadian Institute for Advanced Research.

Geoffrey Hinton is a fellow of the Royal Society, the Royal Society of Canada, and the Association for the Advancement of Artificial Intelligence. He is an honorary foreign member of the American Academy of Arts and Sciences, and a former president of the Cognitive Science Society. He received an honorary doctorate from the University of Edinburgh in 2001. He was awarded the first David E. Rumelhart prize (2001), the IJCAI award for research excellence (2005), the IEEE Neural Network Pioneer award (1998) and the ITAC/NSERC award for contributions to information technology (1992).

Friday, April 20, 2007

[CMU RI Defense] Exploiting Space-Time Statistics of Videos for Face "Hallucination"

Title : Exploiting Space-Time Statistics of Videos for Face "Hallucination"

Author :
Goksel Dedeoglu

Abstract :

Face "Hallucination" aims to recover high-quality, high-resolution images of human faces from low-resolution, blurred, and degraded images or video. This thesis presents person-specific solutions to this problem through careful exploitation of space (image) and space-time (video) models. The results demonstrate accurate restoration of facial details, with resolution enhancements up to a scaling factor of 16.

The algorithms proposed in this thesis follow the analysis-by-synthesis paradigm; they explain the observed (low-resolution) data by fitting a (high-resolution) model. In this context, the first contribution is the discovery of a scaling-induced bias that plagues most model-to-image (or image-to-image) fitting algorithms. It was found that models and observations should be treated asymmetrically, both to formulate an unbiased objective function and to derive an accurate optimization algorithm. This asymmetry is most relevant to Face Hallucination: when applied to the popular Active Appearance Model, it leads to a novel face tracking and reconstruction algorithm that is significantly more accurate than state-of-the-art methods. The analysis also reveals the inherent trade-off between computational efficiency and estimation accuracy in low-resolution regimes.

The second contribution is a statistical generative model of face videos. By treating a video as a composition of space-time patches, this model efficiently encodes the temporal dynamics of complex visual phenomena such as eye-blinks and the occlusion or appearance of teeth. The same representation is also used to define a data-driven prior on a three-dimensional Markov Random Field in space and time. Experimental results demonstrate that temporal representation and reasoning about facial expressions improves robustness by regularizing the Face Hallucination problem.

The final contribution is an approximate compensation scheme against illumination effects. It is observed that distinct illumination subspaces of a face (each coming from a different pose and expression) still exhibit similar variation with respect to illumination. This motivates augmenting the video model with a low-dimensional illumination subspace, whose parameters are estimated jointly with high-resolution face details. Successful Face Hallucinations beyond the lighting conditions of the training videos are reported.

Full text of the thesis can be found here.
You can go here for more information.

Lab Meeting 26 April 2007 (AShin): (ICRA 07) Outdoor Navigation of a Mobile Robot Using Differential GPS and Curb Detection

Title: Outdoor Navigation of a Mobile Robot Using Differential GPS and Curb Detection

Author: Seung-Hun Kim, Chi-Won Roh, Sung-Chul Kang and Min-Yong Park

Abstract— This paper demonstrates reliable navigation of a mobile robot in an outdoor environment. We fuse differential GPS and odometry data using the framework of an extended Kalman filter to localize the mobile robot. We also propose an algorithm to detect curbs with a laser range finder; an important feature of road environments is the existence of curbs. The mobile robot builds a map of the curbs of roads, and the map is used for tracking and localization. The navigation system consists of a mobile robot and a control station. The mobile robot sends image data from a camera to the control station; the control station receives and displays the image data, and the teleoperator commands the mobile robot based on it. Since the image data does not contain enough information for reliable navigation, a hybrid strategy for reliable navigation in outdoor environments is suggested. When the mobile robot is faced with unexpected obstacles, or a situation in which following the command would lead to a collision, it sends a warning message to the teleoperator and changes from teleoperated to autonomous mode to avoid the obstacles by itself. After avoiding the obstacles or the collision situation, the robot returns to teleoperated mode. Experiments on the road confirm that the appropriate change of navigation mode helps the teleoperator perform reliable navigation in outdoor environments.
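The DGPS/odometry fusion at the heart of the paper can be sketched in one dimension with a linear Kalman filter; the paper's filter is an EKF because it also tracks heading, and the noise values below are assumed:

```python
def kf_fuse(odo_steps, gps_fixes, q=0.04, r=1.0):
    """Scalar Kalman filter: integrate noisy odometry increments
    (variance q per step) and correct with absolute GPS fixes
    (variance r) whenever one is available."""
    x, p = 0.0, 0.0                      # state estimate and its variance
    for u, z in zip(odo_steps, gps_fixes):
        x, p = x + u, p + q              # predict with odometry
        if z is not None:                # GPS update (fixes may drop out)
            k = p / (p + r)              # Kalman gain
            x, p = x + k * (z - x), (1.0 - k) * p
    return x, p

# Robot moves ~1 m per step; odometry drifts, GPS is noisy but absolute.
odo = [1.1, 0.9, 1.2, 1.0, 0.8, 1.1]     # drifting odometry increments
gps = [None, 2.2, None, 3.9, None, 6.1]  # intermittent DGPS fixes
x, p = kf_fuse(odo, gps)
print(round(x, 2), round(p, 3))
```

Without the GPS updates the variance p grows without bound; each fix pulls the estimate back toward an absolute reference, which is exactly why the fusion is worthwhile.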

CMU ML lunch: Learning without the loss function

Speaker: John Langford, Yahoo! Research

Title: Learning without the loss function

Abstract: When learning a classifier, we use knowledge of the loss of different choices on training examples to guide the choice of a classifier. An often incorrect assumption is embedded in this paradigm: that we know the loss of different choices. The talk is about the feasibility of (and algorithms for) fixing this.

One example where the assumption is incorrect is the ad placement problem. Your job is to choose relevant ads for a user given various sorts of context information. You can test success by displaying an ad and checking if the user is interested in it. However, this test does _not_ reveal what would have happened if a different ad was displayed. Restated, the "time always goes forward" nature of reality does not allow us to answer "What would have happened if I had instead made a different choice in the past?"

Somewhat surprisingly, this is _not_ a fundamental obstacle to application of machine learning. I'll describe what we know about learning without the loss function, and some new better algorithms for this setting.
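One standard tool for this partial-feedback setting (a textbook estimator, not necessarily the talk's algorithms) is inverse-propensity scoring: the observed reward of a displayed ad is weighted by the probability the logging policy showed it, giving an unbiased value estimate for any other policy. A sketch:

```python
def ips_value(logged, policy):
    """Inverse-propensity estimate of a policy's expected reward from
    logged bandit data.  Each entry is (context, action, reward,
    propensity), where propensity is the probability the logging policy
    showed that action; only the shown action's reward is observed."""
    total = 0.0
    for context, action, reward, propensity in logged:
        if policy(context) == action:
            total += reward / propensity   # importance-weight the match
    return total / len(logged)

# Two ads, two contexts; the logging policy chose uniformly (propensity 0.5).
logged = [
    (0, "A", 1.0, 0.5), (0, "B", 0.0, 0.5),
    (1, "A", 0.0, 0.5), (1, "B", 1.0, 0.5),
]
match_context = lambda c: "A" if c == 0 else "B"
always_a = lambda c: "A"
print(ips_value(logged, match_context))  # -> 1.0
print(ips_value(logged, always_a))       # -> 0.5
```

The estimator correctly ranks the context-matching policy above the constant one even though each log entry reveals the reward of only one ad.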

Thursday, April 19, 2007

Lab Meeting 19 April 2007 (Nelson): Statistical Segment-RANSAC Localization and Moving Object Detection in Crowded Urban Area

  • detailed algorithms
  • work in progress

Lab Meeting 19 April 2007 (ZhenYu): Stereovision with a Single Camera and Multiple Mirrors

Title : Stereovision with a Single Camera and Multiple Mirrors (ICRA 2005)
Author : Mouaddib, E.M., Sagawa, R., Echigo, T., and Yagi, Y.
You can create catadioptric omnidirectional stereovision using several mirrors with a single camera. These systems have interesting advantages, for instance in the case of mobile robot navigation and environment reconstruction. Our paper aims at estimating the "quality" of such a stereovision system. What happens when the number of mirrors increases? Is it better to increase the baseline or to increase the number of mirrors? We propose some criteria and a methodology to compare seven significant configurations: three already existing systems and four new designs that we propose. We also study and propose a global comparison between the best configurations.
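The baseline question can be grounded in the standard stereo triangulation error model: depth is z = f*b/d, so a disparity error d_err produces a depth error of roughly z**2 * d_err / (f*b), which shrinks linearly as the baseline b grows. A quick numeric check (focal length and baselines are assumed values):

```python
def depth(f, b, d):
    """Pinhole stereo triangulation: z = f*b/d
    (f: focal length in pixels, b: baseline, d: disparity in pixels)."""
    return f * b / d

def depth_error(f, b, z, d_err=0.5):
    """Depth error caused by a disparity error d_err: |dz| ~ z**2 * d_err / (f*b)."""
    return z * z * d_err / (f * b)

f = 400.0                       # assumed focal length in pixels
for b in (0.1, 0.3):            # two candidate baselines (metres)
    print(b, depth_error(f, b, 3.0))
```

Tripling the baseline cuts the depth error threefold, which is one side of the trade-off the paper's criteria quantify against adding more mirrors.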


SLAM product: a Web-based mapping system based on SRI International's SLAM technology

A message from SRI. -Bob


We are pleased to announce the availability of a Web-based mapping system based on SRI International's SLAM technology called Karto™.

As part of the first step toward providing a complete suite of robotic navigation and exploration algorithms, we are making the Karto Logger Plug-In available as a free download. The Karto Logger Plug-In allows robot developers to collect and log odometry and range data in a form that our mapping software can interpret to create maps of indoor environments. This can be done either in simulation or on a real-world robotic platform. We currently support both the Player and Microsoft Robotics Studio platforms.

We welcome any and all feedback on our algorithms. Please feel free to contact us at to tell us about your experiences with our mapping algorithm.


The Karto™ Team
Software for Robots on the Move



Forget passwords: Bioscrypt Inc. has recently introduced a USB-pluggable, 3-inch 3-D face recognition camera that can be used to authenticate users into computer systems. The new technology combines Bioscrypt's background as a provider of fingerprint-based biometric access controls with the advanced face imaging and recognition technology it acquired with A4Vision on March 14. Bioscrypt’s system will be called the VisionAccess 3D DeskCam and will work by casting a 40,000-point infrared mesh grid over the user’s face, so that workers can log onto their computers, networks, and applications with just a glance. For more information, visit: the link.



Once a pure figment of the Hollywood imagination, smart cars that operate autonomously are set to come to life in 15 years. Researchers at the University of Essex, Eastern England, are building a car using a standard remote control model, which will serve as a prototype for other researchers to develop their own smart cars. The cars will use a special type of computer software that will enable the car to recognize obstacles and make decisions. To read more, visit: the link.

Wednesday, April 18, 2007

[Lab meeting] 19 April 2007(Stanley)

I'll show the potential field method result.
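For context, the classic attractive/repulsive potential field can be sketched as normalized gradient descent; the gains and obstacle layout below are toy values:

```python
import math

def potential_step(pos, goal, obstacles, step=0.05,
                   k_att=1.0, k_rep=0.1, influence=1.0):
    """One gradient-descent step on U = U_att + U_rep (the classic
    attractive/repulsive potential field formulation)."""
    fx = k_att * (goal[0] - pos[0])            # attractive force
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < influence:                   # repulsion only nearby
            mag = k_rep * (1.0 / d - 1.0 / influence) / d ** 2
            fx += mag * dx / d
            fy += mag * dy / d
    norm = math.hypot(fx, fy) or 1.0
    return pos[0] + step * fx / norm, pos[1] + step * fy / norm

pos, goal = (0.0, 0.0), (2.0, 0.0)
obstacles = [(1.0, 0.1)]                        # just off the straight path
for _ in range(400):
    pos = potential_step(pos, goal, obstacles)
    if math.hypot(pos[0] - goal[0], pos[1] - goal[1]) < 0.05:
        break
print(round(pos[0], 2), round(pos[1], 2))
```

The robot deflects around the off-axis obstacle and settles near the goal; the well-known failure mode (local minima when an obstacle sits directly between robot and goal) is what makes the real results worth discussing.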

Tuesday, April 17, 2007

[Machine Learning Lunchtime Chats] April 16, Pradeep Ravikumar: Sparsity recovery and structure learning

Title: Sparsity recovery and structure learning
Speaker: Pradeep Ravikumar

The sparsity pattern of a vector, the number and location of its non-zeros, is of importance in settings as varied as structure learning in graphical models, subset selection in regression, signal denoising, and compressed sensing. In particular, a key task in these settings is to estimate the sparsity pattern of an unknown "parameter" vector from a set of n noisy observations.

Suppose a sparse "signal" vector (edges in graphical models, covariates in regression) enters into linear combinations, and we have observations which are functions of these linear combinations with added noise. The task is to identify the set of relevant signals from n such noisy observations. In graphical models, for instance, the task in structure learning is to identify the set of edges from the noisy samples.

This is an intractable problem under general settings, but there has been a flurry of recent work on using convex relaxations and L1 penalties to recover the underlying sparse signal, and in particular, the sparsity pattern of the signal.

The tutorial will cover the basic problem abstraction, the applications in various settings, and some general conditions under which the tractable methods "work". It will also focus in particular on the application setting of structure learning in graphical models.
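The basic L1 machinery the tutorial builds on can be shown with coordinate-descent Lasso, where a soft-thresholding update zeroes out irrelevant coefficients and thereby recovers the sparsity pattern:

```python
import random

def soft_threshold(a, lam):
    """The L1 proximal operator: shrink toward zero, clip to zero."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0

def lasso_cd(X, y, lam, sweeps=200):
    """Coordinate-descent Lasso: min_w 0.5*||y - Xw||^2 + lam*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # correlation of feature j with the partial residual
            rho = sum(X[i][j] * (y[i]
                      - sum(X[i][k] * w[k] for k in range(p) if k != j))
                      for i in range(n))
            z = sum(X[i][j] ** 2 for i in range(n))
            w[j] = soft_threshold(rho, lam) / z
    return w

# y depends only on the first of four features; L1 should zero the rest.
random.seed(2)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(30)]
y = [3.0 * row[0] + random.gauss(0, 0.1) for row in X]
w = lasso_cd(X, y, lam=2.0)
print([round(v, 2) for v in w])
```

The conditions under which this recovery provably succeeds (incoherence, sample-size scaling) are exactly what the tutorial's "general conditions" refer to.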

Monday, April 16, 2007

ICRA '07 : Stereo-based Markerless Human Motion Capture for Humanoid Robot Systems

Title :
Stereo-based Markerless Human Motion Capture for Humanoid Robot Systems

Author :
Pedram Azad, Aleš Ude, Tamim Asfour, and Rüdiger Dillmann

Abstract :
In this paper, we present an image-based markerless human motion capture system, intended for humanoid robot systems. The restrictions set by this ambitious goal are numerous. The input of the system is a sequence of stereo image pairs only, captured by cameras positioned at approximately eye distance. No artificial markers can be used to simplify the estimation problem. Furthermore, the complexity of all algorithms incorporated must be suitable for real-time application, which is perhaps the biggest problem given the high dimensionality of the search space. Finally, the system must not depend on a static camera setup and has to find the initial configuration automatically. We present a system which tackles these problems by combining multiple cues within a particle filter framework, allowing the system to recover from wrong estimations in a natural way. We make extensive use of the benefit of having a calibrated stereo setup. To reduce the search space implicitly, we use the 3D positions of the hands and the head, computed by a separate hand and head tracker using a linear motion model for each entity to be tracked. With stereo input image sequences at a resolution of 320×240 pixels, the processing rate of our system is 15 Hz on a 3 GHz CPU. Experimental results documenting the performance of our system are available in the form of several videos.

ICRA 07: Rao-Blackwellized Particle Filtering for Mapping Dynamic Environments

Rao-Blackwellized Particle Filtering for Mapping Dynamic Environments

Isaac Miller and Mark Campbell


A general method for mapping dynamic environments
using a Rao-Blackwellized particle filter is presented.
The algorithm rigorously addresses both data association and
target tracking in a single unified estimator. The algorithm
relies on a Bayesian factorization to separate the posterior into
1) a data association problem solved via particle filter and
2) a tracking problem with known data associations solved
by Kalman filters developed specifically for the ground robot
environment. The algorithm is demonstrated in simulation and
validated in the real world with laser range data, showing
its practical applicability in simultaneously resolving data
association ambiguities and tracking moving objects.
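The factorization can be caricatured in one dimension: particles sample the discrete data association, and each particle carries a Kalman filter for the continuous tracking problem given its association history. This toy version (one target plus clutter, assumed noise values) is far simpler than the paper's estimator:

```python
import math, random

def kf_update(mean, var, z, r):
    """Kalman measurement update for a scalar state."""
    k = var / (var + r)
    return mean + k * (z - mean), (1 - k) * var

def rbpf_track(scans, n_particles=100, r=0.25):
    """Rao-Blackwellized PF: each particle samples a discrete data
    association (which return in the scan is the target) and carries a
    Kalman filter over the continuous target position given that choice."""
    random.seed(3)
    parts = [{"m": 0.0, "v": 4.0, "w": 1.0} for _ in range(n_particles)]
    for scan in scans:
        for p in parts:
            z = scan[random.randrange(len(scan))]   # sampled association
            s = p["v"] + r                          # innovation variance
            p["w"] *= math.exp(-(z - p["m"]) ** 2 / (2 * s)) \
                      / math.sqrt(2 * math.pi * s)
            p["m"], p["v"] = kf_update(p["m"], p["v"], z, r)
        weights = [p["w"] for p in parts]           # resample by weight
        parts = [dict(q) for q in random.choices(parts, weights, k=n_particles)]
        for p in parts:
            p["w"] = 1.0
    return sum(p["m"] for p in parts) / n_particles

# A stationary target near 1.0; every scan also contains one clutter return.
scans = [[1.1, 7.3], [5.9, 0.9], [1.0, -4.2], [8.8, 1.2]]
print(round(rbpf_track(scans), 2))
```

Particles that sample a clutter return receive a vanishing likelihood weight and are resampled away, so the surviving Kalman filters converge on the true target despite the association ambiguity.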


ICRA 07: Probabilistic Sonar Scan Matching for Robust Localization.

Probabilistic Sonar Scan Matching for Robust Localization.

Antoni Burguera, Yolanda González and Gabriel Oliver


This paper presents a probabilistic framework to perform scan matching localization using standard time-of-flight ultrasonic sensors. Probabilistic models of the sensors as well as techniques to propagate the errors through the models are also presented and discussed. A method to estimate the most probable trajectory followed by the robot according to the scan matching and odometry estimations is also presented. Thanks to that, accurate robot localization can be performed without the need of geometric constraints. The experiments demonstrate the robustness of our method even in the presence of large amounts of noisy readings and odometric errors.


Friday, April 13, 2007

ICRA 07: An accurate closed-form estimate of ICP’s covariance

An accurate closed-form estimate of ICP’s covariance

Andrea Censi

Existing methods for estimating the covariance of the ICP (Iterative Closest/Corresponding Point) algorithm are either inaccurate or are computationally too expensive to be used online. This paper proposes a new method, based on the analysis of the error function being minimized. It considers that the correspondences are not independent (the same measurement being used in more than one correspondence), and explicitly utilizes the covariance matrix of the measurements, which are not assumed to be independent either. The validity of the approach is verified through extensive simulations: it is more accurate than previous methods and its computational load is negligible. The ill-posedness of the surface matching problem is explicitly tackled for under-constrained situations by performing an observability analysis; in the analyzed cases the method still provides a good estimate of the error projected on the observable manifold.
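The structure of such closed-form covariance estimates is easiest to see in a toy 1-D, translation-only version of ICP with known correspondences, where J(x) = sum_i (a_i + x - b_i)^2 and the closed-form prediction sigma^2/n can be checked against Monte-Carlo trials (all values below are assumed; the paper handles the full 2-D case with unknown correspondences):

```python
import random

def icp_1d(a, b):
    """Minimizer of J(x) = sum_i (a_i + x - b_i)^2: x* = mean(b_i - a_i)."""
    return sum(bi - ai for ai, bi in zip(a, b)) / len(a)

def closed_form_var(n, sigma2):
    """For this J, the general machinery (inverse Hessian of J sandwiching
    the measurement covariance) collapses to sigma2 / n."""
    return sigma2 / n

random.seed(4)
n, sigma, true_shift = 50, 0.1, 2.0
a = [random.uniform(0, 10) for _ in range(n)]

trials = []
for _ in range(500):                           # Monte-Carlo check
    b = [ai + true_shift + random.gauss(0, sigma) for ai in a]
    trials.append(icp_1d(a, b))
mean = sum(trials) / len(trials)
sample_var = sum((t - mean) ** 2 for t in trials) / len(trials)
print(closed_form_var(n, sigma ** 2), sample_var)
```

The sample variance of the estimated shift agrees with the closed-form value at negligible computational cost, which is the property the paper establishes for the much harder general case.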


ICRA 07: Improved Data Association for ICP-based Scan Matching in Noisy and Dynamic Environments

Improved Data Association for ICP-based Scan Matching in Noisy and Dynamic Environments

Diego Rodriguez-Losada and Javier Minguez


This paper presents a technique to improve the data association in Iterative Closest Point based scan matching. The method is based on a distance filter constructed from an analysis of the set of solutions produced by the associations in the sensor configuration space. This leads to a robust strategy to filter out all the associations that do not explain the principal motion of the scan (due, for example, to sensor noise, large odometry errors, spurious readings, occlusions, or dynamic features). The experimental results suggest that the improved data association leads to more robust and faster methods in the presence of wrong correspondences.
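A 1-D, translation-only caricature of such a distance filter: estimate the dominant motion from the median per-correspondence shift, then reject correspondences that disagree with it (the tolerance is an assumed value; the paper works in the full sensor configuration space):

```python
def filter_correspondences(pairs, tol=0.3):
    """Keep only correspondences whose implied shift agrees with the
    dominant (median) motion of the scan; pairs caused by dynamic
    objects or spurious readings are rejected."""
    shifts = sorted(q - p for p, q in pairs)
    dominant = shifts[len(shifts) // 2]
    return [(p, q) for p, q in pairs if abs((q - p) - dominant) <= tol]

# Static points all shifted by ~1.0; the last pair comes from a moving object.
pairs = [(0.0, 1.02), (2.0, 2.98), (4.0, 5.01), (6.0, 9.5)]
kept = filter_correspondences(pairs)
print(kept)   # the dynamic pair (6.0, 9.5) is dropped
```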


Wednesday, April 11, 2007

Lab Meeting 12 April 2007 (fish60)

Just talk a little about the progress of the hand.

Lab Meeting 12 April 2007 (Vincent): Face Recognition Using Probabilistic Eigenface

In this talk, I will present a face recognition approach that integrates eigenfaces with a probabilistic approach. I'll also demonstrate some results using this approach.

This work is based on the following paper.

Title : Beyond eigenfaces: probabilistic matching for face recognition

Author :
Moghaddam, B., Wahid, W., and Pentland, A.

Abstract :
We propose a technique for direct visual matching for face recognition and database search, using a probabilistic measure of similarity which is based on a Bayesian analysis of image differences. Specifically, we model two mutually exclusive classes of variation between facial images: intra-personal (variations in appearance of the same individual, due to different expressions or lighting) and extra-personal (variations in appearance due to a difference in identity). The likelihoods for each respective class are learned from training data using eigenspace density estimation and used to compute similarity based on the a posteriori probability of membership in the intra-personal class, and ultimately used to rank matches in the database. The performance advantage of this probabilistic technique over nearest-neighbor eigenface matching is demonstrated using results from ARPA's 1996 “FERET” face recognition competition, in which this algorithm was found to be the top performer.
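The Bayesian similarity measure reduces to a two-hypothesis posterior; a toy version with a scalar difference feature and assumed (not learned) Gaussian parameters:

```python
import math

def gauss_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def intra_posterior(delta, intra=(0.0, 1.0), extra=(0.0, 25.0), prior=0.5):
    """P(same person | difference feature delta) by Bayes' rule, with
    Gaussian likelihoods for intra- and extra-personal variation.
    The means/variances here are assumed toy values, not learned ones."""
    li = gauss_pdf(delta, *intra)
    le = gauss_pdf(delta, *extra)
    return li * prior / (li * prior + le * (1 - prior))

print(round(intra_posterior(0.5), 3))  # small difference: likely same person
print(round(intra_posterior(8.0), 3))  # large difference: likely different
```

Ranking database entries by this posterior, rather than by raw nearest-neighbor distance, is the core of the method's advantage.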

Here is the full text.

MERTZ: An active vision head robot for exploring social learning

MERTZ is an active vision head robot, designed for exploring scalable learning in a social context. Inspired by how human infants learn by observing and imitating other people, we plan to have MERTZ placed in a public venue for long periods of time, continuously interacting with people and incrementally learning about various correlations. For example, the robot may learn to correlate objects and people with frequently uttered phoneme sequences, differentiate among people and their interaction habits, learn to dislike some people who tend to annoy the robot, etc. MERTZ has recently gone through a series of experiments in which it interacted with many people at different public spaces in the Stata Center.

related links:
MERTZ projects , Lijin Aryananda ,
Research Abstracts ,
PhD thesis: A Few Days of A Robot's Life in the Human's World: Toward Incremental Individual Recognition

VASC Seminar Series

Piotr Dollar
UC San Diego
Monday, April 9, 3:30pm, NSH 1507

Supervised Learning of Edges and Object Boundaries

Edge detection is one of the most studied problems in computer vision, yet it remains a very challenging task. It is difficult since often the decision for an edge cannot be made purely based on low-level cues such as gradient; instead we need to engage all levels of information, low, middle, and high, in order to decide where to put edges. In this paper we propose a novel supervised learning algorithm for edge and object boundary detection which we refer to as Boosted Edge Learning, or BEL for short. A decision of an edge point is made independently at each location in the image; a very large aperture is used, providing significant context for each decision. In the learning stage, the algorithm selects and combines a large number of features across different scales in order to learn a discriminative model using an extended version of the Probabilistic Boosting Tree classification algorithm. The learning-based framework is highly adaptive and there are no parameters to tune. We show applications for edge detection in a number of specific image domains as well as on natural images. We test on various datasets including the Berkeley dataset, and the results obtained are very good.

Short Bio:
Piotr is currently pursuing his PhD in Computer Science at UCSD, under the guidance of Prof. Serge Belongie (expected Sept 2007). He obtained his B.A. and M.S. degrees from Harvard University. His interests lie in machine learning and pattern recognition, and their application to computer vision. His research is supported by the NSF IGERT fellowship.

Tuesday, April 10, 2007

[MIT technical report] A Few Days of A Robot’s Life in the Human's World: Toward Incremental Individual Recognition

Title : A Few Days of A Robot’s Life in the Human’s World: Toward Incremental Individual Recognition

Author : Lijin Aryananda

Abstract :

This thesis presents an integrated framework and implementation for Mertz, an expressive robotic creature for exploring the task of face recognition through natural interaction in an incremental and unsupervised fashion. The goal of this thesis is to advance toward a framework which would allow robots to incrementally “get to know” a set of familiar individuals in a natural and extendable way. This thesis is motivated by the increasingly popular goal of integrating robots in the home. In order to be effective in human-centric tasks, the robots must be able to not only recognize each family member, but also to learn about the roles of various people in the household. In this thesis, we focus on two particular limitations of the current technology. Firstly, most face recognition research concentrates on the supervised classification problem. Currently, one of the biggest problems in face recognition is how to generalize the system to be able to recognize new test data that vary from the training data. Thus, until this problem is solved completely, the existing supervised approaches may require multiple manual introduction and labelling sessions to include training data with enough variations. Secondly, there is typically a large gap between research prototypes and commercial products, largely due to lack of robustness and scalability to different environmental settings. In this thesis, we propose an unsupervised approach which would allow for a more adaptive system which can incrementally update the training set with more recent data or new individuals over time. Moreover, it gives the robots a more natural social recognition mechanism to learn not only to recognize each person’s appearance, but also to remember some relevant contextual information that the robot observed during previous interaction sessions.
Therefore, this thesis focuses on integrating an unsupervised and incremental face recognition system within a physical robot which interfaces directly with humans through natural social interaction. The robot autonomously detects, tracks, and segments face images during these interactions and automatically generates a training set for its face recognition system. Moreover, in order to motivate robust solutions and address scalability issues, we chose to put the robot, Mertz, in unstructured public environments to interact with naive passersby, instead of only the researchers within the laboratory environment. While an unsupervised and incremental face recognition system is a crucial element toward our target goal, it is only a part of the story. A face recognition system typically receives either pre-recorded face images or a streaming video from a static camera. As illustrated by an ACLU review of a commercial face recognition installation, a security application which interfaces with the latter is already very challenging. In this case, our target goal is a robot that can recognize people in a home setting. The interface between robots and humans is even more dynamic. Both the robots and the humans move around. We present the robot implementation and its unsupervised incremental face recognition framework. We describe an algorithm for clustering local features extracted from a large set of automatically generated face data. We demonstrate the robot’s capabilities and limitations in a series of experiments at a public lobby. In a final experiment, the robot interacted with a few hundred individuals in an eight-day period and generated a training set of over a hundred thousand face images. We evaluate the clustering algorithm performance across a range of parameters on this automatically generated training data and also the Honda-UCSD video face database. Lastly, we present some recognition results using the self-labelled clusters.

Here is the full text

Sunday, April 08, 2007

Stanford Close To A.I. Robots For Home

Stanford is getting close to creating a robot that can help you around the house. Japan is so close that government officials have released a set of ethical guidelines to protect people from robots.

"STAIR" is Stanford's artificial intelligence robot project: given a simple task, such as emptying the dishwasher, the robot analyzes its surroundings and then figures out the best way to complete the task on its own.

Link (with video for full content)

Saturday, April 07, 2007

News: Aircraft Swarm Around Single Airborne Controller
New Scientist news service, Paul Marks
17:40, 02 April 2007

Fighter pilots will one day be able to control entire squadrons of uncrewed combat aircraft as well as their own plane, following successful flight demonstrations of a multi-aircraft remote control system in UK airspace.
In addition to cutting the number of pilots risked in military operations, the remote control system could one day also be used to auto-land hijacked planes. Or they might allow lone pilots to orchestrate complex search and rescue operations.
UK defence firm Qinetiq demonstrated the system on 30 March. The pilot of a modified Tornado fighter plane assumed remote control of a BAC 1-11 airliner carrying members of the press, including New Scientist, and flying at an altitude of 4500 metres (15000 feet). The Tornado pilot was also in control of three simulated Unmanned Combat Air Vehicles (UCAVs).
Currently, UCAVs and their unarmed cousins, UAVs, are controlled remotely by pilots on the ground, who may be thousands of miles away. Autonomy could allow an airborne fighter pilot closer to the action to take control, with each UCAV sending images and other data back to the control jet.
Autonomy has more control than a normal autopilot system and is more sophisticated, coordinating the movement of several different airplanes simultaneously.
It does this by assigning a software agent to look after each UCAV or UAV and automatically drawing up flight patterns around the likely targets. The uncrewed craft follow these patterns until the fighter pilot, who examines images of possible targets, decides they should investigate, attack, or go home to refuel.

Original Link, Video

Friday, April 06, 2007

CMU ML talk: Mining Large Time-evolving Data Using Matrix and Tensor Tools

Christos Faloutsos, CMU
Tamara G. Kolda, Sandia National Labs
Jimeng Sun, CMU

How can we find patterns in sensor streams (e.g., a sequence of temperatures, water-pollutant measurements, or machine-room measurements)? How can we mine an Internet traffic graph over time? Further, how can we make the process incremental? We review the state of the art in four related fields: (a) numerical analysis and linear algebra, (b) multi-linear/tensor analysis, (c) graph mining, and (d) stream mining. We present both theoretical results and algorithms, as well as case studies on several real applications. Our emphasis is on the intuition behind each method and on guidelines for the practitioner.
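The talk stresses making stream mining *incremental*. As a hedged, minimal illustration of that idea (not the speakers' actual matrix or tensor methods), Welford's one-pass algorithm updates the mean and variance of a sensor stream per reading, with no need to store or re-scan history:

```python
class StreamStats:
    """Incrementally maintained mean and variance of a numeric stream."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.n if self.n else 0.0

# A short toy temperature stream: statistics stay current after each reading.
temps = [20.0, 21.0, 19.0, 22.0, 20.0]
s = StreamStats()
for t in temps:
    s.update(t)
```

The same update-in-place pattern generalizes to the matrix setting (e.g., incrementally updated low-rank decompositions), which is the regime the talk addresses.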

The link

Thursday, April 05, 2007


Volume 5 No. 4
April 4 2007

Welcome to RASeNews, the IEEE RAS email newsletter. RASeNews is limited to short announcements of RAS Conferences, RAS society 'action items', breaking news, and time-sensitive issues. Send comments, queries and news items to If you received this newsletter from a colleague and are interested in joining the IEEE Robotics and Automation Society, go to the link.


1. ICRA Travel grants for female graduate students
2. RAS Membership Forum at ICRA 2007
3. Petitions for RAS AdCom
4. IROS Deadline Extended to April 9
5. Other Imminent Deadlines

Last Minute ICRA Travel Grants

On April 3 the ICRA committee was informed that Microsoft Research will provide $5000 for travel grants for female grad students to attend ICRA and the Women's Lunch/BOF. Microsoft Research wants the awards to go to students that would not otherwise be able to attend. For more information, contact Nancy Amato, amato @

RAS Open Forum at ICRA 2007

At their October 2006 meeting, the AdCom voted to have the Open Forum be comprised of reports from the major committees and boards that would normally be given to the AdCom at their formal meeting after ICRA. This will give members a chance to see in detail what the AdCom does and to provide input regarding decisions and initiatives which the AdCom will consider on Sunday, April 15. (All RAS members who wish to attend the AdCom meeting are welcome.)
This year the Open Forum will feature the President's report, reports on our Publications, the Conference Editorial Board and the Electronic Products and Services Board. The Forum will be held beginning at 6:20 p.m. (18:20, immediately after the last technical session) on Wednesday, April 11 in Angelicum A, Aula Minor. Please plan to attend.

Petition for IEEE-RAS AdCom Election

The slate of candidates for the RAS Administrative Committee is normally chosen by the AdCom nominations committee, currently chaired by Art Sanderson (, Toshio Fukuda (fukuda @ and Kazuo Tanie (. However, a candidate who submits a petition with the signatures of at least 2 percent of RAS voting members (Graduate Student, Member, Affiliate, Senior and Fellow grades) is automatically included on the ballot. Petitions must be received by August 15 to be guaranteed a place on the ballot. To obtain a petition form and more information, contact the RAS Society Administrator (. Also, according to the RAS Standing Rules, the Nominations Committee will seriously consider any candidate who obtains at least 25 signatures on his/her petition. The nominations committee welcomes nominations, including self-nominations.

IROS 2007 Deadline Extended
Due to a number of requests, the Program committee of the Intelligent Robots and Systems Conference has extended the submissions deadline to 9 April.
IROS 2007, San Diego. 29 Oct- 2 Nov

Other Imminent Deadlines
IEEE CASE 2007 22-25 Sept. DEADLINE 30 April 2007.

MMAR 27-30 August, 2007. Szczecin, Poland. DEADLINE 16 April, 2007.

IEEE-TASE Special Issue on Automation and Engineering for Ambient Intelligence. Submissions Deadline 1 May 2007.
IEEE Transactions on Automation Science and Engineering.

IEEE RAM Special Issue on Design, Control, and Applications of Real-World Multi-Robot Systems.
Submissions Deadline 1 June 2007.
IEEE Robotics and Automation Magazine



Resembling the system Tom Cruise used in Steven Spielberg’s film Minority Report, Gesture Studios has developed a system it calls GoodPoint, built on motion capture, a technology that film studios and video game makers have used for years to make computer-animated characters appear more realistic. Gesture Studios is selling this Minority Report-style interface, which uses cameras to track hand movements and translate them into computer instructions, to companies giving presentations at the 14,000 trade shows and conferences held in the U.S. each year. The technology is now bursting out of Hollywood and changing the way consumers interact with home electronics; advertisers have also adopted it to create interactive displays. To read more, visit: the link.


Alice, a Java-based interactive program that allows users to produce 3-D computer animations without advanced programming skills, was created by a lab director at Carnegie Mellon University (USA) in an effort to draw middle-school-age girls to the computer science field. Using only a mouse, young users can create stories, outline the actions of characters or objects, and build graphics programs through simple drag-and-drop commands. CMU officials claim Alice has been successful in teaching programming skills to female students from middle school to college age. The link.


Il Village, a technology company in northern Italy, has developed the Easy Walk service, which will give blind people greater independence and mobility. Easy Walk uses a mobile phone, a small Bluetooth GPS receiver, text-to-speech software, and a call center that will operate seven days a week. Easy Walk is scheduled to launch in the fall. Read more, the link.

Monday, April 02, 2007

Unsupervised Learning of Boosted Tree Classifier using Graph Cuts for Hand Pose Recognition

Title: Unsupervised Learning of Boosted Tree Classifier using Graph Cuts for Hand Pose Recognition
Authors: Toufiq Parag, Ahmed Elgammal
(BMVC 2006)

This study proposes an unsupervised learning approach for the task of hand pose recognition. Considering the large variation in hand poses, classification using a decision tree seems highly suitable for this purpose. Various research works have used boosted decision trees and have shown encouraging results for pose recognition. This work also employs a boosted classifier tree, learned in an unsupervised manner, for hand pose recognition. We use a recursive two-way spectral clustering method, namely the Normalized Cut method (NCut), to generate the decision tree. A binary boosting classifier is then learned at each node of the tree generated by the clustering algorithm. Since the output of the clustering algorithm may contain outliers in practice, the variant of boosting applied at each node is the Soft Margin version of AdaBoost, which was developed to maximize the classifier margin in a noisy environment. We propose a novel approach to learning the weak classifiers of the boosting process using the partitioning vector given by the NCut algorithm: the algorithm linearly regresses feature responses against the partitioning vector and uses the boosting sample weights to learn the weak hypotheses. Initial results show satisfactory performance in recognizing complex hand poses with large variations in background and illumination. This tree-classifier framework can also be applied to general multi-class object recognition.
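To make the boosting step concrete, here is a sketch of plain discrete AdaBoost with decision stumps (not the paper's Soft Margin variant, nor its NCut-regression weak learners): each round picks the threshold classifier with the lowest weighted error, then re-weights the training points so later rounds focus on the mistakes. The 1-D toy data are an assumption for illustration.

```python
import math

def stump(t, s):
    # Weak classifier: returns s if x < t else -s.
    return lambda x: s if x < t else -s

def train_stump(X, y, w):
    """Exhaustively pick the (threshold, polarity) stump minimizing weighted error."""
    thresholds = [x - 0.5 for x in sorted(set(X))] + [max(X) + 0.5]
    best, best_err = None, float("inf")
    for t in thresholds:
        for s in (+1, -1):
            h = stump(t, s)
            err = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)
            if err < best_err:
                best, best_err = h, err
    return best, best_err

def adaboost(X, y, rounds=3):
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        h, err = train_stump(X, y, w)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Re-weight: misclassified points gain weight, correct ones lose it.
        w = [wi * math.exp(-alpha * yi * h(xi)) for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1

# Toy 1-D "pose feature": the positive class occupies two disjoint regions,
# so no single stump separates it, but three boosted stumps do.
X = [0, 1, 2, 3, 4, 5]
y = [1, 1, -1, -1, 1, 1]
clf = adaboost(X, y, rounds=3)
```

The Soft Margin variant used in the paper additionally caps the influence of hard (possibly mislabelled) examples, which matters when, as here, the labels come from an unsupervised clustering step rather than a human annotator.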

paper download:[link]

CMU vasc talk: The role of Manifold learning in Human Motion Analysis

Ahmed Elgammal
Rutgers, State University of New Jersey
Monday, April 2, 3:30PM, NSH 1507

The role of Manifold learning in Human Motion Analysis

The human body is an articulated object with high degrees of freedom. Despite the high dimensionality of the configuration space, many human motion activities lie intrinsically on low-dimensional manifolds. Although the intrinsic body-configuration manifolds might be very low-dimensional, the resulting appearance manifolds are challenging to model, given the various aspects that affect appearance, such as the shape and appearance of the person performing the motion, variation in viewpoint, and illumination. Our objective is to learn representations for the shape and appearance of moving (dynamic) objects that support tasks such as synthesis, pose recovery, reconstruction, and tracking. We studied various approaches for representing global deformation manifolds that preserve their geometric structure. Given such representations, we can learn generative models for dynamic shape and appearance. We also address the fundamental question of separating style and content on nonlinear manifolds representing dynamic objects. We learn factorized generative models that explicitly decompose the intrinsic body configuration (content), as a function of time, from the appearance/shape (style factors) of the person performing the action, as time-invariant parameters. We show results on pose recovery, body tracking, gait recognition, as well as facial expression tracking and recognition.
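The talk concerns *nonlinear* manifold learning, but the core premise, high-dimensional data with low intrinsic dimensionality, can be illustrated with a hedged toy: power-iteration PCA recovering the single direction of variation of 3-D points that lie on a 1-D (here linear) "manifold". Real appearance manifolds require nonlinear methods such as LLE or kernel maps, which this sketch does not implement.

```python
def principal_direction(points, iters=100):
    """Top principal direction of a point cloud via power iteration."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - mean[j] for j in range(dim)] for p in points]
    # Covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
           for i in range(dim)]
    # Power iteration: repeatedly apply cov and renormalize; the iterate
    # converges to the eigenvector with the largest eigenvalue.
    v = [1.0] + [0.0] * (dim - 1)
    for _ in range(iters):
        v = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v

# Points on a 1-D line through the origin in direction (1, 2, 2):
# three coordinates, but only one intrinsic degree of freedom.
pts = [(t, 2 * t, 2 * t) for t in (-2, -1, 0, 1, 2)]
v = principal_direction(pts)
```

Where the manifold curves, as body-configuration and appearance manifolds do, a single linear direction no longer suffices, which is exactly the gap the nonlinear representations in the talk are designed to fill.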