Thursday, May 27, 2010

CMU PhD Thesis Defense: Search in the Physical World

Search in the Physical World

Geoffrey A. Hollinger
Carnegie Mellon University

June 01, 2010, 10:00 a.m., NSH 3305
This thesis examines search in the physical world, which differs significantly from the searches in the digital world that we perform every day on our computers. When searching the internet, for instance, success is a matter of informed indexing that allows the information to be retrieved quickly. In these cases, there is no consideration of the physical nature of the world, and the search is not cognizant of space, time, or traversal distance. In contrast, search in the physical world must consider a target that could be continuously moving, possibly even trying to evade being found. The environment may be partially known, and the search proceeds with information gathered during the search itself. In many cases, such as guaranteeing capture of an adversarial target, the problem cannot be solved with a single searcher, and all group members must coordinate their actions with others on the team. Prior work has explored limited instances of such problems, but existing techniques either scale poorly or do not have performance guarantees.

Two of the main variations of search in the physical world are considered: efficient search and guaranteed search. During efficient search, robots move to optimize the average-case performance of the search given a model of the target’s motion. During guaranteed search, robots coordinate to provide worst-case guarantees on search time if the target is adversarial. This thesis unifies these search problems and shows them to be NP-hard, which suggests that a scalable and optimal algorithm is unlikely. Despite these hardness results, algorithms using implicit coordination can provide scalable and high-performing approximate solutions to many real-world search problems. Implicit coordination arises when robots share their locations, measurements, and/or actions to improve the plans of their teammates. In accord with this design strategy, a linearly scalable efficient search algorithm is presented that utilizes implicit coordination to achieve bounded performance. In addition, this thesis contributes a novel approach that augments the coordination with a pre-search spanning tree generation step, which leads to an anytime algorithm for guaranteed search.

With a focus on decentralized and online operation, the proposed search algorithms are extended to take into account team constraints, limited communication, and partially known environments. The techniques are illustrated using a scenario in the literature that incorporates both efficient and guaranteed search, and they are validated both in simulation and on human-robot search teams operating in the physical world. The developed framework enables teams of autonomous agents to search environments outside the scope of previous techniques, and the analysis provides insight into the complexity of multi-robot coordination problems.

[Thesis PDF]

Thesis Committee
Sanjiv Singh, Chair
Geoff Gordon
Reid Simmons
Athanasios Kehagias, Aristotle University of Thessaloniki

Monday, May 24, 2010

Department machine learning talks: Interactively Building Mashups by Demonstration

Title: Interactively Building Mashups by Demonstration
Speaker: Dr. Craig A. Knoblock, University of Southern California
Time: 10:30am, May 25 (Tue), 2010
Place: Room 210, CSIE Building


There are a number of tools and services available now for building mashups on the Web. However, many of the tools for constructing mashups reply on a widget paradigm, where users must select, customize, and connect widgets to build the desired application. While this approach does not require programming, the users must still understand programming concepts to successfully create a mashup. In this talk I describe our programming-by-demonstration approach to building mashups by example. Instead of requiring a user to select and customize a set of widgets, the user simply demonstrates the integration task by example. I will describe how this approach addresses the problems of extracting data from various sources, cleaning and modeling the extracted data, integrating the data across sources, and visualizing the integrated results in a geospatial context. We implemented these ideas in a system called Karma and evaluated Karma on a set of 20 users and showed that compared to other mashup construction tools, Karma allowed more of the users to successfully build mashups and made it possible to build these mashups significantly faster compared to using a widget-based approach.

This research is joint work with Shubham Gupta, Pedro Szekely, and Rattapoom Tuchinda.

Short Biography:

Dr. Craig Knoblock is a Research Professor in Computer Science and a Senior Project Leader in the Information Sciences Institute at the University of Southern California (USC). He received both his M.S. and Ph.D. in Computer Science from Carnegie Mellon and his B.S. from Syracuse University. His current research interests include information integration, information extraction, machine learning, users interfaces, constraint reasoning, geospatial data fusion, and bioinformatics. He has published one book and over 200 articles, book chapters, and conference papers on his research. He has served on the Senior Program Committees of the National Artificial Intelligence Conference, the International Joint Conference on AI, the International Semantic Web Conference, and the International Conference on Intelligent User Interfaces. He was program co-chair for the 2008 AAAI track on AI and the Web and he is conference chair for the 2011 International Joint Conference on AI (IJCAI). He is on the editorial board of Artificial Intelligence, AAAI Press, Computational Intelligence, and the Journal on Foundations and Trends in Web Science. He is a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), a Distinguished Scientist of the Association of Computing Machinery (ACM), a Trustee of the International Joint Conference on Artificial Intelligence (IJCAI), and past President of the International Conference on Automated Planning and Scheduling (ICAPS). He has started two companies, Fetch Technologies and Geosemble Technologies, based on his research at USC.

Department machine learning talks: Transfer Learning with Applications

Title: Transfer Learning with Applications
Speaker: Prof. Qiang Yang, Hong Kong University of Science and Technology
Time: 11:15am, May 25 (Tue), 2010
Place: Room 210, CSIE Building


Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. We can find many novel applications of machine learning and data mining where transfer learning is necessary. In this talk, I will give an introduction to transfer learning and then highlight some important applications such as text and image classification, sensor network data mining and activity recognition, collaborative filtering and bioinformatics. I will also discuss some potential future directions of transfer learning.

Short Biography:

Qiang Yang is a professor of the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology. He is also an adjunct professor at Peking University, Beijing, and at Zhongshan University in Guangzhou, China. He received his PhD degree from the University of Maryland, College Park. His research interests include AI planning and sensor-based activity recognition, machine learning and case-based reasoning, and data mining. He is a senior member of the IEEE, the AAAI, and the ACM, and an associate editor for the IEEE Transactions on Knowledge and Data Engineering and IEEE Intelligent Systems, as well as the International Journal of Knowledge and Information Systems. More information about him can be found at

Sunday, May 23, 2010

Lab Meeting June 8th, 2010 (fish60): Learning to Navigate Through Crowded Environments

Peter Henry, Christian Vollmer, Brian Ferris, and Dieter Fox,
Learning to Navigate Through Crowded Environments,
in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA2010), Anchorage, Alaska, May 2010

Abstract—The goal of this research is to enable mobile robots to navigate through crowded environments such as indoor shopping malls, airports, or downtown side walks. The key research question addressed in this paper is how to learn planners that generate human-like motion behavior. Our approach uses inverse reinforcement learning (IRL) to learn human-like navigation behavior based on example paths. Since robots have only limited sensing, we extend existing IRL methods to the case of partially observable environments. We demonstrate the capabilities of our approach using a realistic crowd flow simulator in which we modeled multiple scenarios in crowded environments. We show that our planner learned to guide the robot along the flow of people when the environment is crowded, and along the shortest path if no people are around.


Thursday, May 20, 2010

News: Innovation: Teaching robots some manners

13:07 17 May 2010 by Colin Barras

Where PCs are concerned, faster is invariably better. But things aren't so clear-cut in human society. The next generation of social robots will be better loved if they adopt more human-like behaviour – even if that means losing some of their raw efficiency.

Norihiro Hagita and colleagues at the ATR laboratories in Kyoto, Japan, asked 38 volunteers to click on a PC mouse to enlarge an image. The response was programmed to be delayed by 1 to 3 seconds. As expected, an immediate response was most favoured, and participants expressed more and more dissatisfaction as the delay lengthened.

But a version of the experiment that involved a humanoid robot threw up a surprising result. The volunteers were asked to tell the robot to take out the rubbish, and the robot verbally acknowledged the request. This time an immediate response – beginning the moment the volunteer finished talking – was considered less welcome than one that was delayed by a second.


See the full article here.

News: Software that Learns by Watching

KarDo learns how to perform common IT support tests by observing what the experts do.
By Duncan Graham-Rowe

Overworked and much in demand, IT support staff can't be in two places at once. But software designed to watch and learn as they carry out common tasks could soon help--by automatically performing the same jobs across different computers.

The new software system, called KarDo, was developed by researchers at MIT. It can automatically configure an e-mail account, install a virus scanner, or set up access to a virtual private network, says MIT's Dina Katabi, an associate professor at MIT.

Crucially, the software just needs to watch an administrator perform this task once before being able to carry out the same job on computers running different software. Businesses spend billions of dollars each year on simple and repetitive IT tasks, according to reports from the analyst groups Forrester and Gartner. KarDo could reduce these costs by as much as 20 percent, Katabi says.

See the full article here.

Wednesday, May 19, 2010

Lab Meeting June 1st, 2010 (Kuen-Han): Multiframe Motion Segmentation with Missing Data Using PowerFactorization and GPCA(IJCV 2008)

Title: Multiframe Motion Segmentation with Missing Data Using
PowerFactorization and GPCA
Authors: René Vidal · Roberto Tron · Richard Hartley

Abstract: We consider the problem of segmenting multiple
rigid-body motions from point correspondences in multiple
affine views. We cast this problem as a subspace clustering
problem in which point trajectories associated with each
motion live in a linear subspace of dimension two, three or
four. Our algorithm involves projecting all point trajectories
onto a 5-dimensional subspace using the SVD, the Power-
Factorization method, or RANSAC, and fitting multiple linear
subspaces representing different rigid-body motions to
the points in R5 using GPCA. Unlike previous work, our
approach does not restrict the motion subspaces to be
fourdimensional and independent. Instead, it deals gracefully
with all the spectrum of possible affine motions: from twodimensional
and partially dependent to four-dimensional and fully independent.

Our algorithm can handle the case of missing data, meaning
that point tracks do not have to be visible in all images, by
using the PowerFactorization method to project the data. In
addition, our method can handle outlying trajectories by using
RANSAC to perform the projection.

We compare our approach to other methods on a database of
167 motion sequences with full motions, independent motions,
degenerate motions, partially dependent motions, missing data,
outliers, etc. On motion sequences with complete data our
method achieves a misclassification error of less that 5% for
two motions and 29% for three motions.

paper link

Tuesday, May 18, 2010

Lab Meeting June 1, 2010 (Jimmy): Object Recognition in 3D Point Clouds Using Web Data and Domain Adaptation

Title: Object Recognition in 3D Point Clouds Using Web Data and Domain Adaptation
In: IJRR2010
Authors: Kevin Lai and Dieter Fox

In recent years, object detection has become an increasingly active field of research in robotics. An important problem in object detection is the availability of a sufficient amount of labeled training data to learn good classifiers. In this paper we show how to significantly reduce the need for manually labeled training data by leveraging data sets available on the World Wide Web. Specifically, we show how to use objects from Google’s 3D Warehouse to train an object detection system for 3D point clouds collected by robots navigating through both urban and indoor environments. In order to deal with the different characteristics of the web data and the real robot data, we additionally use a small set of labeled point clouds and perform domain adaptation. Our experiments demonstrate that additional data taken from the 3D Warehouse along with our domain adaptation greatly improves the classification accuracy on real-world environments.


Sunday, May 16, 2010

Lab Meeting May, 18 (Gary) : "2D vs. 3D Deformable Face Models: Representational Power, Construction, and Real-Time Fitting "( IJCV 2007)

2D vs. 3D Deformable Face Models: Representational Power, Construction, and Real-Time Fitting

Authors: Iain Matthews, Jing Xiao, Simon Baker


Model-based face analysis is a general paradigm with applications that include face recognition, expression recognition, lip-reading, head pose estimation, and gaze estimation. A face model is first constructed from a collection of training data, either 2D images or 3D range scans. The face model is then fit to the input image(s) and the model parameters used in whatever the application is. Most existing face models can be classified as either 2D (e.g. Active Appearance Models) or 3D (e.g. Morphable Models). In this paper we compare 2D and 3D face models along three axes: (1) representational power, (2) construction, and (3) real-time fitting. For each axis in turn, we outline the differences that result from using a 2D or a 3D face model.


Wednesday, May 12, 2010

ICRA 2010 Awards - Best Cognitive Robotics Paper

ICRA 2010 Awards - KUKA Service Robotics Best Paper

ICRA 2010 Awards - Best Conference Paper

ICRA 2010 Awards - Best Student Paper

Tuesday, May 11, 2010

ICRA 2010 Awards - Best Medical Robotics Paper

ICRA 2010 Awards - Best Automation Paper

ICRA 2010 Awards - Best Vision Paper

ICRA 2010 Awards - Best Manipulation Paper

ICRA 2010 Awards - Best Video

This post, along with the following posts, will provide links to the best paper awards and finalists for each category in ICRA 2010, with the best papers listed first.

Monday, May 10, 2010

Lab Meeting May 11 (Nicole): Active Audition Using the Parameter-less Self-organising Map (Auton Robot 2008)

Title: Active Audition Using the Parameter-less Self-organising Map
Authors: Erik Berglund · Joaquin Sitte · Gordon Wyeth

Autonomous Robots Volume 24, Number 4, 2008/5

This paper presents a novel method for enabling a robot to determine the position of a sound source in three dimensions using just two microphones and interaction with its environment. The method uses the Parameter-Less Self-Organising Map (PLSOM) algorithm and Reinforcement Learning (RL) to achieve rapid, accurate response. We also introduce a method for directional filtering using the PLSOM. The presented system is compared to a similar system to evaluate its performance.


Sunday, May 09, 2010

Lab Meeting May 11 (Shao-Chen): Decentralized Localization of Sparsely-Communicating Robot Networks: A Centralized-Equivalent Approach(IEEE T-RO 2010)

Title: Decentralized Localization of Sparsely-Communicating Robot Networks: A Centralized-Equivalent Approach
Authors: Leung, K.Y.K.;   Barfoot, T.D.;   Liu, H.;   Inst. for Aerosp. Studies, Univ. of Toronto, Toronto, ON, Canada


Finite-range sensing and communication are factors in the connectivity of a dynamic mobile-robot network. State estimation becomes a difficult problem when communication connections allowing information exchange between all robots are not guaranteed. This paper presents a decentralized state-estimation algorithm guaranteed to work in dynamic robot networks without connectivity requirements. We prove that a robot only needs to consider its own knowledge of network topology in order to produce an estimate equivalent to the centralized state estimate whenever possible while ensuring that the same can be performed by all other robots in the network. We prove certain properties of our technique and then it is validated through simulations. We present a comprehensive set of results, indicating the performance benefit in different network connectivity settings, as well as the scalability of our approach.


Monday, May 03, 2010

ICRA 2010 Day 1 Report:

I will attend the Best Practice in 3D Perception and Modeling for Mobile Manipulation workshop on May 3rd. The web page of this workshop is here. The papers of this workshop are available at the lab server.