Wednesday, October 28, 2009

(IROS 2009) Video: RF Vision: RFID Receive Signal Strength Indicator (RSSI) Images for Sensor Fusion and Mobile Manipulation


Title: RF Vision: RFID Receive Signal Strength Indicator (RSSI) Images for Sensor Fusion and Mobile Manipulation


In this work we present a set of integrated methods that enable an RFID-enabled mobile manipulator to approach and grasp an object to which a self-adhesive passive (battery-free) UHF RFID tag has been affixed.


I will find the pdf file later.

Tuesday, October 27, 2009

NTU talk:

Title: Localization and Mapping of Surveillance Cameras in City Map
Speaker: Prof. Leow Wee Kheng, National University of Singapore
Time: 2:20pm, Oct 30 (Fri), 2009
Place: Room 103, CSIE building


Many large cities have installed surveillance cameras to monitor human activities for security purposes. An important surveillance application is to track the motion of an object of interest, e.g., a car or a human, using one or more cameras, and plot the motion path in a city map. To achieve this goal, it is necessary to localize the cameras in the city map and to determine the correspondence mappings between the positions in the city map and the camera views. Since the view of the city map is roughly orthogonal to the camera views, there are very few common features between the two views for a computer vision algorithm to correctly identify corresponding points automatically. We propose a method for camera localization and position mapping that requires minimum user inputs. Given approximate corresponding points between the city map and a camera view identified by a user, the method computes the orientation and position of the camera in the city map, and determines the mapping between the positions in the city map and the camera view. The performance of the method is assessed in both quantitative tests and practical application. Quantitative test results show that the method is accurate and robust in camera localization and position mapping. Application test results are very encouraging, showing the usefulness of the method in real applications.
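As a rough illustration of the position-mapping step, the sketch below fits a planar homography between ground positions in the map and pixels in one camera view from four user-supplied correspondences. This is an assumption made for illustration: the abstract does not say which model the speaker uses, the function names are hypothetical, and the sketch expects exactly four exact (not approximate) correspondences with no three points collinear.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_homography(map_pts, img_pts):
    """Fit a 3x3 homography H (with H[2][2] fixed to 1) from 4 point pairs."""
    if len(map_pts) != 4 or len(img_pts) != 4:
        raise ValueError("this sketch needs exactly four correspondences")
    A, b = [], []
    for (x, y), (u, v) in zip(map_pts, img_pts):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, pt):
    """Map a map position (x, y) to image coordinates (u, v)."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

With noisy, user-clicked points, a least-squares fit over more than four correspondences would be the more robust choice. Recovering the camera's orientation and position, as the talk describes, needs further steps that are not sketched here.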

Short Biography: Dr. Leow Wee Kheng obtained his B.Sc. and M.Sc. in Computer Science from National University of Singapore in 1985 and 1989 respectively. He pursued Ph.D. study at The University of Texas at Austin and obtained his Ph.D. in Computer Science in 1994. His current research interests include computer vision, medical image analysis, and protein docking. He has published more than 80 technical papers in journals, conferences, and books. He has also been awarded two U.S. patents and has published another patent under PCT. He has served in the Program Committees and Organizing Committees of various conferences. He has collaborated widely with a large number of local and overseas institutions. His current local collaborators include I2R of A*STAR, Singapore General Hospital, National University Hospital, and National Skin Centre, and overseas collaborators include CNRS in France and National Taiwan University and National Taiwan University Hospital.

Saturday, October 24, 2009

CMU talk: Set estimation for statistical inference in brain imaging and active sensing

Machine Learning Lunch
Speaker: Prof. Aarti Singh
Venue: GHC 6115
Date: Monday, October 26, 2009

Set estimation for statistical inference in brain imaging and active sensing

Inferring spatially co-located regions of interest is an important problem in several applications, such as identifying activation regions in the brain or contamination regions in environmental monitoring. In this talk, I will present multi-resolution methods for passive and active learning of sets that aggregate data at appropriate resolutions, to achieve optimal bias and variance tradeoffs for set estimation. In the passive setting, we observe some data such as a noisy fMRI image of the brain and then extract the regions with statistically significant brain activity. The active setting, on the other hand, involves feedback where the location of an observation is decided based on the data observed in the past. This can be used for rapid extraction of set estimates, such as a contamination region in environmental monitoring, by designing data-adaptive spatial survey paths for a mobile sensor. I will describe a method that uses information gleaned from coarse surveys to focus sampling around informative regions (boundaries), thus generating successively refined multi-resolution set estimates.

I will also discuss some current research directions which aim at efficient extraction of spatially distributed sets of interest by exploiting non-local dependencies in the data.

Friday, October 23, 2009

Lab Meeting 10/28, 2009 (Kuen-Han): Coupled Object Detection and Tracking from Static Cameras and Moving Vehicles (PAMI 2008)

Title: Coupled Object Detection and Tracking from Static Cameras and Moving Vehicles (PAMI 2008)
Authors: B. Leibe, K. Schindler, N. Cornelis, and L. Van Gool.


Abstract—We present a novel approach for multi-object tracking
which considers object detection and spacetime trajectory
estimation as a coupled optimization problem. Our approach
is formulated in a Minimum Description Length hypothesis
selection framework, which allows our system to recover from
mismatches and temporarily lost tracks. Building upon a
state-of-the-art object detector, it performs multi-view/multi-category
object recognition to detect cars and pedestrians in the input
images. The 2D object detections are checked for their consistency
with (automatically estimated) scene geometry and are converted
to 3D observations, which are accumulated in a world coordinate
frame. A subsequent trajectory estimation module analyzes the
resulting 3D observations to find physically plausible spacetime
trajectories. Tracking is achieved by performing model selection
after every frame. At each time instant, our approach searches for
the globally optimal set of spacetime trajectories which provides
the best explanation for the current image and for all evidence
collected so far, while satisfying the constraints that no two
objects may occupy the same physical space, nor explain the
same image pixels at any point in time. Successful trajectory
hypotheses are then fed back to guide object detection in future
frames. The optimization procedure is kept efficient through
incremental computation and conservative hypothesis pruning.
We evaluate our approach on several challenging video sequences
and demonstrate its performance on both a surveillance-type
scenario and a scenario where the input videos are taken from
inside a moving vehicle passing through crowded city areas.



Thursday, October 22, 2009


Next-Generation Research and Breakthrough Innovation: Indicators from US Academic Research

by Thomas C. McMail

When searching for breakthrough advantages, the key innovations are those that surpass the present state so significantly that they might well lead to the next generation of technical advances. Over a period of a year and a half, I interviewed more than 100 educators, researchers, and deans in many disciplines with the overall goal of encouraging innovative collaborations between Microsoft Research and academics as part of my responsibilities as a university-relations specialist. The survey reveals connections among a broad range of topics and various recurrent themes that might be of interest to scientists in various fields.

[Full Article]

It is good to see that robotics is one of the Next-Generation Research and Breakthrough Innovations.


Wednesday, October 21, 2009

Lab Meeting 10/28 (Any): GroupSAC

Kai Ni, Hailin Jin, and Frank Dellaert. GroupSAC: Efficient Consensus in the Presence of Groupings. In International Conference on Computer Vision (ICCV), September 2009.

Abstract--We present a novel variant of the RANSAC algorithm that is much more efficient, in particular when dealing with problems with low inlier ratios. Our algorithm assumes that there exists some grouping in the data, based on which we introduce a new binomial mixture model rather than the simple binomial model as used in RANSAC. We prove that in the new model it is more efficient to sample data from a smaller number of groups and groups with more tentative correspondences, which leads to a new sampling procedure that uses progressive numbers of groups. We demonstrate our algorithm on two classical geometric vision problems: wide-baseline matching and camera resectioning. The experiments show that the algorithm serves as a general framework that works well with three possible grouping strategies investigated in this paper, including a novel optical flow based clustering approach. The results show that our algorithm is able to achieve a significant performance gain compared to the standard RANSAC and PROSAC.
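To see why low inlier ratios are costly for standard RANSAC, the sketch below evaluates the usual iteration-count bound under the simple binomial model: the number of random minimal samples needed so that, with a chosen confidence, at least one sample contains only inliers. It covers plain RANSAC only, not the GroupSAC sampling procedure, and the function and parameter names are illustrative assumptions.

```python
import math


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Samples needed so that, with probability `confidence`, at least one
    minimal sample of `sample_size` points is made up entirely of inliers."""
    if not 0.0 < inlier_ratio < 1.0:
        raise ValueError("inlier_ratio must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Probability that one random minimal sample is all inliers
    p_good_sample = inlier_ratio ** sample_size
    # Solve (1 - p_good_sample)^N <= 1 - confidence for N
    n = math.log(1.0 - confidence) / math.log1p(-p_good_sample)
    return max(1, math.ceil(n))


# The required number of samples grows quickly as the inlier ratio drops
for w in (0.8, 0.5, 0.2):
    print(w, ransac_iterations(w, sample_size=4))
```

Because the bound depends on the inlier ratio raised to the power of the sample size, drawing samples from subsets of the data with a higher inlier ratio sharply reduces the number of iterations needed.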

Tuesday, October 20, 2009

Lab Meeting 10/21 (Nicole): Discovery of sound sources by an autonomous mobile robot (Autonomous Robots Vol 27 No.3 Oct.2009)

Title: Discovery of sound sources by an autonomous mobile robot

Authors: Eric Martinson and Alan Schultz


In this work, we describe an autonomous mobile robotic system for finding, investigating, and modeling ambient noise sources in the environment. The system has been fully implemented in two different environments, using two different robotic platforms and a variety of sound source types. Making use of a two-step approach to autonomous exploration of the auditory scene, the robot first quickly moves through the environment to find and roughly localize unknown sound sources using the auditory evidence grid algorithm. Then, using the knowledge gained from the initial exploration, the robot investigates each source in more depth, improving upon the initial localization accuracy, identifying volume and directivity, and, finally, building a classification vector useful for detecting the sound source in the future.


Saturday, October 17, 2009

CMU talk: Cross-Modal Localization Through Mutual Information

FRC Seminar: Cross-Modal Localization Through Mutual Information

Speaker: Dr. Alen Alempijevic
ARC Centre for Autonomous Systems
Mechatronics and Intelligent Systems Group
University of Technology Sydney, Australia

Friday October 16, 2009
NSH 1109

Abstract: Relating information originating from disparate sensors observing a given scene is a challenging task, particularly when an appropriate model of the environment or the behaviour of any particular object within it is not available. One possible strategy to address this task is to examine whether the sensor outputs contain information which can be attributed to a common cause. I will present an approach to localise this embedded common information through an indirect method of estimating mutual information between all signal sources. The ability of L1 regularization to enforce sparseness of the solution is exploited to identify a subset of signals that are related to each other, from among a large number of sensor outputs. As opposed to the conventional L2 regularization, the proposed method leads to faster convergence with reduced spurious associations.
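The sparsity effect of L1 regularization mentioned in the abstract can be seen in a one-dimensional toy problem. The sketch below compares the closed-form minimizers of 0.5*(x - v)^2 + lam*|x| (L1) and 0.5*(x - v)^2 + 0.5*lam*x^2 (L2). It is a generic textbook illustration, not the speaker's mutual-information estimator, and the function names are assumptions.

```python
def soft_threshold(v, lam):
    """Minimizer of 0.5*(x - v)**2 + lam*abs(x): the L1 penalty
    sets small values exactly to zero."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def ridge_shrink(v, lam):
    """Minimizer of 0.5*(x - v)**2 + 0.5*lam*x**2: the L2 penalty
    scales values toward zero but never makes them exactly zero."""
    return v / (1.0 + lam)


weights = [2.0, 0.3, -0.1, -1.2]
lam = 0.5
l1 = [soft_threshold(w, lam) for w in weights]
l2 = [ridge_shrink(w, lam) for w in weights]
print("L1:", l1)  # weak entries become exactly 0.0
print("L2:", l2)  # every entry stays non-zero
```

Under L1, weak associations are removed outright, which is why it suits picking a small subset of related signals out of many sensor outputs.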

Speaker Bio: Dr. Alen Alempijevic is a Research Fellow within the ARC Centre for Autonomous Systems (Mechatronics and Intelligent Systems Group) at the University of Technology Sydney. He earned his BE in Computer Systems and PhD degrees in Mechatronics Engineering from the University of Technology Sydney in 2003 and 2009 respectively. He has been a guest researcher at UC Berkeley as part of the 2007 entry into the DARPA Grand Challenge and is currently working on 6DOF localization of an underground miner as part of an ARC Linkage Grant. His research interests are in perception for long term autonomy, distributed sensing in self-reconfiguring modular robots and SLAM for urban search and rescue vehicles.

CMU talk: Applied machine learning in human-computer interaction research

Machine Learning Lunch
Speaker: Moira Burke
Venue: GHC 6115
Date: Monday, October 19, 2009

Applied machine learning in human-computer interaction research

Human-computer interaction researchers use diverse methods—from eye-tracking to ethnography—to understand human activity, and machine learning is growing in popularity as a method within the community. This talk will survey projects from HCI researchers at CMU that combine machine learning with other techniques to address problems such as adapting interfaces to individuals with motor impairments, predicting routines in dual-income families, classifying controversial Wikipedia articles, and identifying rhetorical strategies newcomers use in online support groups that elicit responses. Researchers without strong computational backgrounds can face practical challenges as consumers of machine learning tools; this talk will highlight opportunities for tool design and collaboration across communities.

Thursday, October 15, 2009

Lab Meeting 10/21 (Gary): 3D Alignment of Face in a Single Image (CVPR 06)

Authors: Lie Gu, Takeo Kanade


We present an approach for aligning a 3D deformable model to a single face image. The model consists of a set of sparse 3D points and the view-based patches associated with every point. Assuming a weak perspective projection model, our algorithm iteratively deforms the model and adjusts the 3D pose to fit the image. As opposed to previous approaches, our algorithm starts the fitting without resorting to manual labeling of key facial points. And it makes no assumptions about global illumination or surface properties, so it can be applied to a wide range of imaging conditions. Experiments demonstrate that our approach can effectively handle unseen faces with a variety of pose and illumination variations.
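The weak perspective projection assumed in the abstract can be written as x = s * R[0:2] * X + t: rotate the 3D point, drop the depth coordinate, then apply a uniform scale and a 2D translation. The sketch below shows only this projection step, not the paper's iterative fitting. The function name and argument layout are assumptions.

```python
def weak_perspective_project(points_3d, rotation, scale, translation):
    """Project 3D points to 2D under a weak perspective camera.

    points_3d:   list of (X, Y, Z) tuples
    rotation:    3x3 rotation matrix as nested lists (the 3D pose)
    scale:       one scalar standing in for focal length / average depth
    translation: (tx, ty) offset in the image plane
    """
    projected = []
    for X in points_3d:
        # Only the first two rows of the rotation are used, so the
        # rotated depth coordinate is discarded.
        u = sum(rotation[0][k] * X[k] for k in range(3))
        v = sum(rotation[1][k] * X[k] for k in range(3))
        projected.append((scale * u + translation[0],
                          scale * v + translation[1]))
    return projected


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model_points = [(1.0, 2.0, 5.0), (1.0, 2.0, 50.0)]
print(weak_perspective_project(model_points, identity, 2.0, (1.0, 1.0)))
```

With the identity rotation, both points land on the same pixel although their depths differ. This is the approximation that makes the model linear in the shape parameters once the pose is fixed.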


Tuesday, October 13, 2009

NTU talk: Bridging the Gap between Signal Processing and Machine Learning

Title: Bridging the Gap between Signal Processing and Machine Learning
Speaker: Dr. Y.-C. Frank Wang, Academia Sinica
Time: 2:20pm, Oct 16 (Fri), 2009
Place: Room 103, CSIE building

The advancements of signal processing and machine learning techniques have afforded many applications which benefit people in different areas. The first part of this talk starts with some interesting examples, and explains how “signal processing” and “machine learning” people might interpret the same thing from different points of view. I will discuss why it is vital to bridge the gap between these two areas, and what can be achieved by close collaboration between people from these two communities.

Many real-world machine learning applications involve recognition of multiple classes. However, standard classification methods may not perform well or efficiently on large-scale problems. Designing a classifier with good generalization and scalability is an ongoing research topic for machine learning researchers. Another important issue, which is typically not addressed in prior work, is the rejection of unseen false classes (not of interest). Since we cannot design a classifier by training on data from “unseen” classes, the rejection problem becomes very challenging. In the second part of this talk, I will present my proposed work, the soft-decision hierarchical SVRDM (support vector representation and discrimination machine) classifier, which addresses the aforementioned multiclass classification and rejection problems.

Short Biography:
Yu-Chiang Frank Wang received his B.S. in Electrical Engineering from National Taiwan University in 2001. Before pursuing his graduate study, he worked as a research assistant in the Division of Medical Engineering Research at National Health Research Institutes in Taiwan from 2001 to 2002. He received his M.S. and Ph.D. degrees in Electrical and Computer Engineering from Carnegie Mellon University in 2004 and 2009, respectively. His research projects at CMU included the development of a military automated target recognition system, the design of a multi-modal biometric fusion system, and algorithms for data clustering and multi-class classification problems. His research and graduate study were funded by the US Army Research Office through Carnegie Mellon.

Dr. Wang is currently an assistant research fellow in the Research Center for Information Technology Innovation (CITI) at Academia Sinica, where he also holds the position as an adjunct assistant research fellow in the Institute of Information Science. His research interests span the areas of pattern recognition, machine learning, computer vision, multimedia signal processing and content analysis.

NTU talk: Data-Aware Search: Web Scale Integration, Web Scale Inspiration

Title: Data-Aware Search: Web Scale Integration, Web Scale Inspiration
Speaker: Prof. Kevin C. Chang, UIUC
Time: 2:10pm, Oct 15 (Thu), 2009
Place: Room 210, CSIE building

What have you been searching lately? With so much data on the web, we often look for various "stuff" across many sites-- but current search engines can only find web pages and then only one page at a time.

Towards "data-aware" search over the web as a massive database, we face the challenges of integrating data from everywhere. The barrier boils down to the classic "impedance mismatch" between structured queries over unstructured data-- but now at the Internet scale! I will discuss our lessons learned in the MetaQuerier and WISDM projects at Illinois, in which we develop large-scale data integration by two approaches with duality-- bringing data to match queries, and vice versa. I will demo prototypes and their productization at Cazoodle.

Short Biography:
Kevin C. Chang is an Associate Professor in Computer Science, University of Illinois at Urbana-Champaign. He received a BS from National Taiwan University and PhD from Stanford University, in Electrical Engineering. He likes large scale information access and, with his students, co-founded Cazoodle, a startup from the University of Illinois, for developing "data-aware" search over the web. URL:

Monday, October 12, 2009

Lab Meeting 14/10 (Andi): MMM-classification of 3D Range Data (ICRA 09)

Authors: Anuraag Agrawal, Atsushi Nakazawa, and Haruo Takemura (Osaka University)

Abstract: This paper presents a method for accurately segmenting and classifying 3D range data into particular object classes. Object classification of input images is necessary for applications including robot navigation and automation, in particular with respect to path planning. To achieve robust object classification, we propose the idea of an object feature which represents a distribution of neighboring points around a target point. In addition, rather than processing raw points, we reconstruct polygons from the point data, introducing connectivity to the points. With these ideas, we can refine the Markov Random Field (MRF) calculation with more relevant information with regards to determining “related points”. The algorithm was tested against five outdoor scenes and provided accurate classification even in the presence of many classes of interest.

local copy

Sunday, October 11, 2009

Lab Meeting 10/14, 2009 (Shao-Chen): Distributed Multirobot Localization (IEEE Transactions on Robotics and Automation Oct. 2002)

Title: Distributed Multirobot Localization (IEEE Transactions on Robotics and Automation Oct. 2002)

Authors: Stergios I. Roumeliotis,  George A. Bekey


In this paper, we present a new approach to the
problem of simultaneously localizing a group of mobile robots
capable of sensing one another. Each of the robots collects sensor
data regarding its own motion and shares this information with
the rest of the team during the update cycles. A single estimator,
in the form of a Kalman filter, processes the available positioning
information from all the members of the team and produces
a pose estimate for every one of them. The equations for this
centralized estimator can be written in a decentralized form,
therefore allowing this single Kalman filter to be decomposed
into a number of smaller communicating filters. Each of these
filters processes sensor data collected by its host robot. Exchange
of information between the individual filters is necessary only
when two robots detect each other and measure their relative
pose. The resulting decentralized estimation schema, which we
call collective localization, constitutes a unique means for fusing
measurements collected from a variety of sensors with minimal
communication and processing requirements. The distributed
localization algorithm is applied to a group of three robots and
the improvement in localization accuracy is presented. Finally, a
comparison to the equivalent decentralized information filter is
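The role of a relative measurement in the abstract above can be sketched with a toy Kalman update: two robots on a line, a joint state (x1, x2), and one measurement of x2 - x1 taken when the robots detect each other. This is a simplified, centralized 1D illustration under assumed names, not the paper's decentralized formulation.

```python
def relative_update(x, P, z, r):
    """Kalman update of two 1D robot positions from a relative measurement.

    x: (x1, x2) position estimates
    P: 2x2 joint covariance as nested tuples
    z: measured relative position x2 - x1
    r: variance of the measurement noise
    """
    x1, x2 = x
    (p11, p12), (_, p22) = P
    # Measurement model: z = H x + noise, with H = [-1, 1]
    innovation = z - (x2 - x1)
    s = p11 - 2.0 * p12 + p22 + r      # innovation variance H P H^T + r
    ph1 = p12 - p11                    # first entry of P H^T
    ph2 = p22 - p12                    # second entry of P H^T
    k1, k2 = ph1 / s, ph2 / s          # Kalman gain
    x_new = (x1 + k1 * innovation, x2 + k2 * innovation)
    P_new = ((p11 - ph1 * ph1 / s, p12 - ph1 * ph2 / s),
             (p12 - ph1 * ph2 / s, p22 - ph2 * ph2 / s))
    return x_new, P_new


# Robot 1 is poorly localized, robot 2 is well localized
x, P = relative_update((0.0, 10.0), ((4.0, 0.0), (0.0, 1.0)), z=9.0, r=1.0)
print(x)
print(P)
```

After the update, both variances shrink and the off-diagonal covariance becomes non-zero. The two estimates are now correlated, and a decentralized scheme has to keep track of that cross-correlation between the robots' filters.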


CMU talk: The Inner Workings of Face Processing: From human to computer perception and back

CMU VASC Seminar
Monday, October 12

The Inner Workings of Face Processing: From human to computer perception and back
Aleix Martinez
Ohio State University

Faces and emotions shape our daily life in many different ways. We are so intrigued by such effects that writers, poets and painters have been depicting and portraying them for centuries. Why does the male character in Wood's "American Gothic" seem sad? Why do kids elongate their faces when they are upset? Why do some people always seem angry? Why do we recognize identity from faces so easily? Why is it so hard to learn non-manuals (i.e., facial expressions of grammar) in sign languages, when native signers do this effortlessly? In short, what are the dimensions of our computational (cognitive) space responsible for these face processing tasks? If we are to understand why things appear as they do and how cognitive disorders in autism, schizophrenia and Huntington’s disease develop, we need to define how the brain codes and interprets faces, emotions and grammar. This is also important for the design of technology – as devices need to interact with us. In this talk, I will outline the research framework I use to study face perception and the related topics of emotion and grammar. This consists of a multidisciplinary approach in cognitive science, including psychophysical studies and the design of computer algorithms for the analysis of face images. We will review the big questions about this computational space. In doing so, we will see that the ability of human observers to process face images is truly remarkable, suggesting that some abstract, yet simple representation that is unaffected by a large number of image transformations is at work. We will summarize our current understanding of this representation.

Aleix M. Martinez is an Associate Professor in the Department of Electrical and Computer Engineering at The Ohio State University (OSU), where he is the founder and director of the Computational Biology and Cognitive Science Lab. He is also affiliated with the Department of Biomedical Engineering and with the Center for Cognitive Science. Prior to joining OSU, he was affiliated with the Electrical and Computer Engineering Department at Purdue University and with the Sony Computer Science Lab. He serves as an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence and of Image and Vision Computing and has been an area chair of CVPR. Aleix has spent his time wondering why he is such a bad face recognizer and why people attribute social labels to faces of unknown individuals. His other areas of interest are learning, vision and linguistics.

Tuesday, October 06, 2009

Lab Meeting 10/6, 2009 (Casey): Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic ...

Title: Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach
Author: Joan Solà
(PhD Thesis, 2007/02)
(LAAS-CNRS laboratory in Toulouse, Occitania, France)


1. Undelayed initialization in monocular SLAM
2. Binocular SLAM
3. Moving object detection and tracking