Wednesday, October 31, 2007

CMU RI Seminar: Perceiving the Actions and Intentions of Others

Perceiving the Actions and Intentions of Others: Brain Mechanisms for Social Perception

Kevin Pelphrey
Department of Psychology
Carnegie Mellon University

Mauldin Auditorium (NSH 1305)
Talk 3:30 pm

Humans are intensely social beings that have evolved and develop within highly social environments in which each individual is dependent upon others. Rapid assimilation of information about other individuals is critical. We must be able to recognize specific individuals and use the history of our past interactions to guide our future behavior. We constantly engage in social perception – using cues from facial expressions, gaze shifts, body movements, and language to infer the intentions of others and plan our own actions accordingly. When approached by another individual, interpreting his or her intentions assumes even greater urgency as the distance between us diminishes. This is particularly so if the individual is a stranger, where body size, facial expressions, gestures and gait may differentiate between a potential threat and a potential ally.
Given the importance of our social interactions, it is plausible that specialized brain systems may have evolved that are critical for these different aspects of social cognition. Several candidate regions thought to comprise the social brain have been identified, including the fusiform gyrus for face perception, the posterior superior temporal sulcus for the perception of biological motion and the visual analysis of others’ intentions, and the amygdala and ventral frontal regions for the perception of emotional expression.
Members of my laboratory have been investigating the properties of these brain regions using functional magnetic resonance imaging (fMRI) in typically developing adults and children as well as in children and adults with autism, a neurodevelopmental disorder marked by severe dysfunction in aspects of social perception. In this talk, I will describe these studies in three main parts: First, I will describe our efforts using fMRI to identify the basic brain mechanisms for social perception in typically developing adults. Second, I will discuss our studies of the neural basis of social perception deficits in adults with autism. Finally, I will describe our recent efforts to chart the typical and atypical development of brain mechanisms for social perception in children with and without autism.

Monday, October 29, 2007

Lab Meeting October 30th, 2007 (Jeff): Progress report

I will show some results and problems from my recent work.

Lab Meeting 30 October (ZhenYu): A theory of catadioptric image formation

Title: A theory of catadioptric image formation

Baker, S. Nayar, S.K.

Computer Vision, 1998. Sixth International Conference

Conventional video cameras have limited fields of view which make them restrictive for certain applications in computational vision. A catadioptric sensor uses a combination of lenses and mirrors placed in a carefully arranged configuration to capture a much wider field of view. When designing a catadioptric sensor, the shape of the mirror(s) should ideally be selected to ensure that the complete catadioptric system has a single effective viewpoint. In this paper, we derive the complete class of single-lens single-mirror catadioptric sensors which have a single viewpoint and an expression for the spatial resolution of a catadioptric sensor in terms of the resolution of the camera used to construct it. We also include a preliminary analysis of the defocus blur caused by the use of a curved mirror.


Saturday, October 27, 2007

ICCV07 : Non-metric affinity propagation for unsupervised image categorization


Unsupervised categorization of images or image parts is
often needed for image and video summarization or as a
preprocessing step in supervised methods for classification,
tracking and segmentation. While many metric-based techniques
have been applied to this problem in the vision community,
often, the most natural measures of similarity (e.g.,
number of matching SIFT features) between pairs of images
or image parts is non-metric. Unsupervised categorization
by identifying a subset of representative exemplars can
be efficiently performed with the recently-proposed ‘affinity
propagation’ algorithm. In contrast to k-centers clustering,
which iteratively refines an initial randomly-chosen
set of exemplars, affinity propagation simultaneously considers
all data points as potential exemplars and iteratively
exchanges messages between data points until a good solution
emerges. When applied to the Olivetti face data set
using a translation-invariant non-metric similarity, affinity
propagation achieves a much lower reconstruction error
and nearly halves the classification error rate, compared
to state-of-the-art techniques. For the more challenging
problem of unsupervised categorization of images from the
Caltech101 data set, we derived non-metric similarities between
pairs of images by matching SIFT features. Affinity
propagation successfully identifies meaningful categories,
which provide a natural summarization of the training images
and can be used to classify new input images.
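
The message exchange described in the abstract above can be sketched roughly as follows. This is a minimal pure-Python sketch of the standard responsibility/availability updates, not the authors' implementation. The function name, the damping factor of 0.5, the fixed iteration count, the toy 1-D points, the negative squared distance used as similarity, and the preference value of -50 are all assumptions chosen for illustration.

```python
def affinity_propagation(S, damping=0.5, iterations=200):
    """Return (exemplars, labels) for a square similarity matrix S.

    S[i][k] says how well point k would serve as the exemplar of point i;
    S[k][k] is the "preference" of point k for being an exemplar.
    S does not have to be symmetric or obey the triangle inequality.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how strongly i favours k over its best alternative.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                target = S[i][k] - best_other
                R[i][k] = damping * R[i][k] + (1 - damping) * target
        # Availability: how much support k has gathered from other points.
        for k in range(n):
            support = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(support) - support[k]  # positive support from i != k
            for i in range(n):
                if i == k:
                    target = total
                else:
                    target = min(0.0, R[k][k] + total - support[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * target
    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive (assumes at least one such point exists).
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    labels = [i if i in exemplars else max(exemplars, key=lambda e: S[i][e])
              for i in range(n)]
    return exemplars, labels


# Toy data (assumed): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
S = [[-(a - b) ** 2 for b in points] for a in points]
for k in range(len(points)):
    S[k][k] = -50.0  # shared preference; lower values yield fewer exemplars
exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

Every point starts as a candidate exemplar, and the preference on the diagonal controls how many survive. On this toy input the middle point of each group should end up as its exemplar.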


Thursday, October 25, 2007

[ML Lunch Seminar] Hanghang Tong: Proximity on Graphs

Speaker: Hanghang Tong, MLD, CMU
Title: Proximity on Graphs: Definitions, Fast Solutions and Applications
Venue: NSH 1507
Date: Monday October 29
Time: 12:00 noon

Graphs appear in a wide range of settings, like computer networks, the
world wide web, biological networks, social networks
(MSN/FaceBook/LinkedIn) and many more. How to find the master-mind criminal
given some suspects X, Y and Z? How to find a user-specific pattern (like,
e.g. a money-laundering ring)? How to track the most influential authors
over time? How to automatically associate digital images with proper
keywords? How to answer all these questions quickly on large (disk
resident) graphs? It turns out that the main tool behind these
applications (and many more) is the proximity measurement: given two
nodes A and B in a network, how close is the target node B related
to the source A?

In this talk, I will cover three aspects of the proximity on graphs: (1)
Proximity definitions. I will start with random walk with restart, the
main idea behind Google's PageRank algorithm, and talk about its
variants and generalizations. (2) Computational issues. Many proximity
measurements involve a specific linear system. I will give algorithms on
how to efficiently solve such linear system(s) in several different
settings. (3) Applications. I will show some applications of the
proximity, including link prediction, neighborhood formulation, image
caption, center-piece subgraph, pattern match etc.
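
As an illustration of the first item, random walk with restart can be sketched with a plain power iteration. This is a minimal sketch under stated assumptions, not the speaker's fast solution: the function name, the adjacency-dictionary graph format, the restart probability of 0.15, and the three-node example graph are all hypothetical.

```python
def random_walk_with_restart(adj, source, restart=0.15, iterations=200):
    """Proximity of every node to `source` in an undirected graph.

    adj maps each node to a list of its neighbours (assumed non-empty).
    At each step the walker follows a random edge with probability
    1 - restart, or jumps back to `source` with probability `restart`.
    """
    scores = {v: 0.0 for v in adj}
    scores[source] = 1.0
    for _ in range(iterations):
        nxt = {v: 0.0 for v in adj}
        for u, neighbours in adj.items():
            # Spread the walker's mass at u evenly over u's neighbours.
            share = (1 - restart) * scores[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        nxt[source] += restart  # mass returned by the restart jump
        scores = nxt
    return scores


# Hypothetical path graph A - B - C, with A as the source node.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
scores = random_walk_with_restart(graph, "A")
print(scores)
```

The fixed point of this iteration solves a linear system in the transition matrix. On large graphs that system is what needs a faster solver than repeated iteration.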

Lab Meeting 22 October (韋麒): Design and Control of Five-Fingered Haptic Interface Opposite to Human Hand

This paper presents the design and control of a newly developed five-fingered haptic interface robot named HIRO IIplus. The developed haptic interface can present force and tactile feeling to the five fingertips of the human hand. Its mechanism consists of a 6 degree of freedom (DOF) arm and a 15 DOF hand. The interface is placed opposite the human hand, which ensures safety and freedom of movement, but this arrangement leads to difficulty in designing and controlling the haptic interface, which should accurately track the fingertip positions of the operator. A design concept and optimum haptic finger layout, which maximizes the design performance index, is presented. The design performance index consists of the product space between the operator's finger and the haptic finger, and the opposability of the thumb and fingers. Moreover, in order to reduce the feeling of uneasiness in the operator, a mixed control method is proposed. It consists of a finger-force control and an arm position control intended to maximize the control performance index, which consists of the hand manipulability measure and the norm of the arm-joint angle vector. The experimental results demonstrate the high potential of the multifingered haptic interface robot HIRO IIplus utilizing the mixed control method.

Full Text: PDF (962 KB)

Tuesday, October 23, 2007

Special FRC Seminar - Wednesday, October 24 @ Noon at NSH 1305 - IROS Talk Practice Session

FRC Seminar Schedule

Wednesday, October 24 Noon

Speaker (1) Gil Jones Ph.D. Candidate Robotics Institute
Title Learning-enhanced Market-based Task Allocation for Oversubscribed Domains

Abstract: This paper presents a learning-enhanced market-based task allocation approach for oversubscribed domains. In oversubscribed domains all tasks cannot be completed within the required deadlines due to a lack of resources. We focus specifically on domains where tasks can be generated throughout the mission, tasks can have different levels of importance and urgency, and penalties are assessed for failed commitments. Therefore, agents must reason about potential future events before making task commitments. Within these constraints, existing market-based approaches to task allocation can handle task importance and urgency, but do a poor job of anticipating future tasks, and are hence assessed a high number of penalties.

Speaker (2) Balajee Kannan Research Engineer Robotics Institute
Title Metrics for quantifying system performance in intelligent, fault-tolerant multi-robot teams

Abstract: The quality of the incorporated fault-tolerance has a direct impact on the overall performance of the system. Hence, being able to measure the extent and usefulness of fault-tolerance exhibited by the system would provide the designer with a useful analysis tool for better understanding the system as a whole. Unfortunately, it is difficult to quantify system fault-tolerance on its own for intelligent systems. A more useful metric for evaluation is the "effectiveness" measure of fault-tolerance, i.e., the influence of fault-tolerance towards improving overall performance determines the overall effectiveness or quality of the system. In this paper, we outline application-independent metrics to measure fault-tolerance within the context of system performance. In addition, we also outline potential methods to better interpret the obtained measures towards understanding the capabilities of the implemented system. Furthermore, a main focus of our approach is to capture the effect of intelligence, reasoning, or learning on the effective fault-tolerance of the system, rather than relying purely on traditional redundancy based measures.

HRI08 Workshop: Coding Behavioral Video Data and Reasoning Data in Human-Robot Interaction: Call for submission

Purpose: The purpose of the workshop is to bring together HRI researchers and designers from across the world who are actively engaged - or would like to be - in coding behavioral and/or reasoning data in HRI. We'll share methods from our respective laboratories, and discuss problems encountered and potential solutions. By the end of the workshop:
* Participants will understand different approaches toward constructing coding systems.
* Participants will be positioned better to analyze their own HRI data.
* We will have begun to establish a community of researchers and designers who can share related ideas with one another in the years to come.
* We will move forward with publishing proceedings from the workshop.

Deadline for Submission (for Presenters): December 14, 2007

Peter H. Kahn, Jr.
University of Washington, USA

Takayuki Kanda
Advanced Telecommunications Research (ATR), Japan

Nathan G. Freier
Rensselaer Polytechnic Institute, USA

Rachel L. Severson
University of Washington, USA

Hiroshi Ishiguro
Advanced Telecommunications Research (ATR) and Osaka University, Japan

As the field of human-robot interaction begins to mature, researchers and designers are recognizing the need for systematic, comprehensive, and theoretically-grounded methodologies for investigating people's social interactions with robots. One attractive approach entails the collection of behavioral video data in naturalistic or experimental settings. Another attractive approach entails interviewing participants about their conceptions of human-robot interaction (e.g., during or immediately following an interaction with a specific robot). With behavioral video data and/or reasoning data in hand, the question then emerges: How does one code and analyze such data?

The workshop is divided into two main parts.

Morning. Our collaborative laboratories (from the University of Washington and ATR) will share in some depth the coding system we have developed for coding 90 children's social and moral behavior with and reasoning about a humanoid robot (ATR's Robovie). This coding manual builds from other systems we have developed and disseminated elsewhere as technical reports (Friedman et al., 2005; Kahn et al., 2003, 2005, 2005). Key issues presented in the morning include:

* What is a Coding Manual?
* Getting Started - Iterating between Data and Theory
* Building on Previous Systems, when Applicable
* Hierarchical Organization of Categories
* Time Segmentation of Behavior
* Behavior in Response to Robot-Initiated and Experimenter-Initiated Stimulus
* Coding Social and Moral Reasoning
* How to Deal with Multiple Ways of Coding a Single Behavioral Event or Reason
* Reliability Coding

We'll have plenty of time for discussion of issues as they emerge.

Afternoon: Following a group lunch, we'll then have up to 5 participants present for 20 minutes each (followed by 20 minutes of discussion after each presentation). Presenters will provide a brief overview of one of their HRI research projects (hopefully with some video data or interview data in hand), and then explicate three problems they encountered in coding the data, and then (if at all) how they sought to solve the problems. The 20-minute discussion periods will provide time for participants to discuss the nature of the problems and other possible solution strategies.

Two Types of Participation
There will be two types of participation:

5 Presenters (in addition to the 5 organizers): Presenters will be actively involved in HRI research that involves behavioral and/or reasoning data. As noted above, each presenter will have 20 minutes to present an overview of one of their HRI research projects, and to present three problems encountered and possible solutions.

Other Workshop Participants: Participants will join in the workshop and participate in discussions. The prerequisite is simply an interest in the topic.

Submission Guidelines
As noted above, there will be two types of participation: (1) workshop presenters, and (2) workshop participants. Submission guidelines differ depending on your interests in participating:

(1) Workshop Presenter: Send a one-page single-spaced summary of your HRI research project, and three possible coding problems encountered and possible solutions. Indicate whether you anticipate having some actual data to share (video clips or interview transcripts) that illustrate your issues at hand. Include an additional paragraph that summarizes your background in HRI. These submissions will be peer-reviewed. The deadline for submission is December 14, 2007.

(2) Workshop Participant: Send a one-paragraph summary of your background in HRI and interest in the workshop. Participants will be accepted on a first-come-first-admitted basis.

The workshop will take place March 12, 2008, at the HRI '08 conference site, the beautiful Felix Meritis cultural center in central Amsterdam.

Workshop Proceedings
We plan to publish proceedings of the workshop in the form of a technical report. At this juncture, the technical report will include the full coding system for the UW-ATR study on Children's Social and Moral Relationships with a Humanoid Robot. We would also like to include full coding systems from the other 5 presenters in the workshop. Together, then, we would have created a vibrant initial repository of coding systems for other researchers to draw upon. However, if not all of the presenters have full systems, then we will include a written version of their summary of their project and their 3 problems and solutions presented during the workshop.

Sunday, October 21, 2007

Lab Meeting 22 Oct, (Yi-liu): Dynamic 3D Scene Analysis from a Moving Vehicle

Bastian Leibe, Nico Cornelis, Kurt Cornelis, Luc Van Gool

IEEE Conference on Computer Vision and Pattern Recognition 2007 (CVPR'07)

In this paper, we present a system that integrates fully automatic scene geometry estimation, 2D object detection, 3D localization, trajectory estimation, and tracking for dynamic scene interpretation from a moving vehicle. Our sole input are two video streams from a calibrated stereo rig on top of a car. From these streams, we estimate Structure-from-Motion (SfM) and scene geometry in real-time. In parallel, we perform multi-view/multi-category object recognition to detect cars and pedestrians in both camera images. Using the SfM self-localization, 2D object detections are converted to 3D observations, which are accumulated in a world coordinate frame. A subsequent tracking module analyzes the resulting 3D observations to find physically plausible spacetime trajectories. Finally, a global optimization criterion takes object-object interactions into account to arrive at accurate 3D localization and trajectory estimates for both cars and pedestrians. We demonstrate the performance of our integrated system on challenging real-world data showing car passages through crowded city areas.

Full article: link

SIGGRAPH 07: Seam Carving for Content-Aware Image Resizing

S. Avidan and A. Shamir
Seam Carving for Content-Aware Image Resizing
SIGGRAPH, San Diego, 2007

Adobe's Avidan says that seam carving is fairly straightforward. If, for instance, a person wanted to compress a picture lengthwise by a single pixel, the software would scan the image to find the best pixels to remove. This is usually a zigzagging, vertical seam that is surrounded by pixels on the left and right that have a similar color. The pixel-wide seam is removed and the image is compressed without distorting the objects in the image. What makes the duo's algorithm impressive is that it can find and remove these pixels quickly, so a person can expand and compress a picture quickly. The process works well for photos with backgrounds such as sky or grass, in which there can be little variation in color and pattern, Avidan explains, although it works poorly for people's faces and more varied landscapes.

Video: Image Resizing by Seam Carving
News: New Tricks for Online Photo Editing

Online photo editor Rsizr implements the feature
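
The seam search described above is commonly done with dynamic programming. The following is a minimal sketch and not Avidan and Shamir's code: the function names are hypothetical, and the per-pixel energy grid (low where a pixel resembles its neighbours) is assumed to be computed elsewhere and passed in as a list of rows.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the cheapest top-to-bottom seam.

    The seam may shift by at most one column between adjacent rows.
    """
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cheapest cumulative cost of any seam ending at (y, x).
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(0, x - 1), min(w, x + 2)
            cost[y][x] += min(cost[y - 1][lo:hi])
    # Backward pass: start at the cheapest bottom pixel and walk upwards.
    x = min(range(w), key=lambda c: cost[h - 1][c])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(0, x - 1), min(w, x + 2)
        x = min(range(lo, hi), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Drop the seam pixel from every row, narrowing the image by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]


# Hypothetical 3x4 energy grid with a cheap diagonal path.
energy = [
    [5, 1, 5, 5],
    [5, 5, 1, 5],
    [5, 5, 5, 1],
]
seam = find_vertical_seam(energy)
narrower = remove_seam(energy, seam)
print(seam, narrower)
```

Repeating the two steps removes one seam at a time.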

Friday, October 19, 2007

Lab Meeting 22 October (Stanley): A Human Aware Mobile Robot Motion Planner


Authors: Emrah Akin Sisbot, Luis F. Marin-Urias, Rachid Alami, and Thierry Siméon

Robot navigation in the presence of humans raises new issues for motion planning and control when the humans must be taken explicitly into account. We claim that a human aware motion planner (HAMP) must not only provide safe robot paths, but also synthesize good, socially acceptable and legible paths. This paper focuses on a motion planner that takes explicitly into account its human partners by reasoning about their accessibility, their vision field and their preferences in terms of relative human–robot placement and motions in realistic environments. This planner is part of a human-aware motion and manipulation planning and control system that we aim to develop in order to achieve motion and manipulation tasks in the presence or in synergy with humans.

Thursday, October 18, 2007

Design and Control of Five-Fingered Haptic Interface Opposite to Human Hand


The developed haptic interface can present force and tactile feeling to the five fingertips of the human hand. Its mechanism consists of a 6 degree of freedom (DOF) arm and a 15 DOF hand.

The design performance index consists of the product space between the operator’s finger and the haptic finger, and the opposability of the thumb and fingers. Moreover, in order to reduce the feeling of uneasiness in the operator, a mixed control method is proposed. It consists of a finger-force control and an arm position control intended to maximize the control performance index, which consists of the hand manipulability measure and the norm of the arm-joint angle vector.

Full content:

Intelligence Seminar at School of Computer Science at Carnegie Mellon University

Wed Oct 10 2007

Optimal Multi-Agent Scheduling with Constraint Programming

We consider the problem of computing optimal schedules in multi-agent systems. In these problems, actions of one agent can influence the actions of other agents, while the objective is to maximize the total `quality' of the schedule.
We show how we can model and efficiently solve these problems with constraint programming technology. Elements of our proposed method include constraint-based reasoning, search strategies, problem decomposition, scheduling algorithms, and a linear programming relaxation.


Speaker info.

Tuesday, October 16, 2007

News: Asus' Eee PC Official Prices Announced

Asus announced the official launch and immediate availability of its long-coveted Eee PC this afternoon. Eee PCs will hit the market in four different specs, the cheapest being the Eee PC 2G Surf (256MB DDR2 / 2GB SSD / 4400mAh battery) at NT$7,999 (approximately $245 USD), while the Eee PC 4G Surf (512MB DDR2 / 4GB SSD / 4400mAh battery) goes for NT$9,999 (around $305 USD). The two higher-end models have increased battery capacity, with the Eee PC 4G (512MB DDR2 / 4GB SSD / 5200mAh battery) going for NT$11,100 ($340 USD), and the top-end Eee PC 8G (1GB DDR2 / 8GB SSD / 5200mAh battery) coming in at NT$13,800 ($424 USD). According to ASUS' own estimates, the 4400mAh battery will give you about 2.8 hrs of usage, while the 5200mAh one should last around 3.5 hrs. We'll post our hands-on gallery in a moment!

ASUSTek Computer Inc. - Eee PC Spec

Monday, October 15, 2007

Lab Meeting 15 October (Der-Yeuan): Introduction to Robotics Programming with Microsoft Robotics Studio


Microsoft Robotics Studio (MSRS) is a Windows-based IDE for robotics programming. Its primary components are the Concurrency and Coordination Runtime (CCR) and the Decentralized System Services (DSS). The CCR focuses on scheduling tasks to manage concurrency and load balancing for different applications. The DSS is a service-oriented approach to robot component integration in which every software or hardware component of a design is a service. This web-based architecture allows services within a network to interact. Drawing on experience using MSRS with LEGO NXT bricks, this presentation will provide a brief introduction to CCR and DSS, and give some insight into the maturity of MSRS.

Sunday, October 14, 2007

Lab Meeting 15 October (fish60): Introduction to POMDP

I will introduce the partially observable Markov decision process (POMDP).

Lab Meeting 15 October (Chihao): Sound Location using Structure from Sound

Structure from Sound (SFS) is defined as the simultaneous localization problem of N sound sensors and M acoustic events in the environment detected by these sensors.
I will show the results of acoustic source localization using SFS and discuss the characteristics of its uncertainty.

Robotics Institute Thesis Oral 18 Jul 2007: Activity Recognition for Agent Teams

Abstract :

Proficient teams can accomplish goals that would not otherwise be achievable by groups of uncoordinated individuals. This thesis addresses the problem of analyzing team activities from external observations and prior knowledge of the team's behavior patterns. There are three general classes of recognition cues that are potentially valuable for team activity/plan recognition: (1) spatial relationships between team members and/or physical landmarks that stay fixed over a period of time; (2) temporal dependencies between behaviors in a plan or between actions in a behavior; (3) coordination constraints between agents and the actions that they are performing. This thesis examines how to leverage available spatial, temporal, and coordination cues to perform offline multi-agent activity/plan recognition for teams with dynamic membership.

In physical domains (military, athletic, or robotic), team behaviors often have an observable spatio-temporal structure, defined by the relative physical positions of team members and their relation to static landmarks; we suggest that this structure, along with temporal dependencies and coordination constraints defined by a team plan library, can be exploited to perform behavior recognition on traces of agent activity over time, even in the presence of uninvolved agents. Unlike prior work in team plan recognition where it is assumed that team membership stays constant over time, this thesis addresses the novel problem of recovering agent-to-team assignment for team tasks where team composition, the mapping of agents into teams, changes over time; this allows the analysis of more complicated tasks in which agents must periodically divide into subteams.

This thesis makes four main contributions: (1) an efficient and robust technique for formation identification based on spatial relationships; (2) a new algorithm for simultaneously determining team membership and performing behavior recognition on spatio-temporal traces with dynamic team membership; (3) a general pruning technique based on coordination cues that improves the efficiency of plan recognition for dynamic teams; (4) methods for identifying player policies in team games that lack strong spatial, temporal, and coordination dependencies.


CMU RI Thesis Proposal Oct. 16th 2007: Estimating Mission Reliability for Mobile Robots

Title: Estimating Mission Reliability for Mobile Robots

Speaker: Stephen Stancliff

Current mobile robots generally fall into one of two categories as far as reliability is concerned - highly unreliable, or very expensive. Most fall into the first category, requiring teams of graduate students or staff engineers to coddle them in the days and hours before a brief demonstration. The few robots that exhibit very high reliability, such as those used by NASA for planetary exploration, are very expensive.
In order for mobile robots to become more widely used in real-world environments, they will need to have reliability in between these two extremes. In order to design mobile robots with respect to reliability, we need quantitative models for predicting robot reliability and for relating reliability to other design parameters. To date, however, there has been very little formal discussion of reliability in the mobile robotics literature, and no general method has been presented for quantitatively predicting the reliability of mobile robots.
This thesis proposal focuses on this problem of predicting reliability for mobile robots and for using reliability as a quantitative input into mobile robot mission design.

Proposal Link:

Lab Meeting 15 October (Leo): Weakly Interacting Object Tracking in Indoor Environments

Weakly Interacting Object Tracking in Indoor Environments

authors: Chieh-Chih Wang, Kao-Wei Wan and Tzu-Chien Lo


Interactions between targets have been exploited
to solve the occlusion problem in multitarget tracking but
not to provide higher level scene understanding. In our previous
work [1], a variable structure multiple model estimation
framework with a scene interaction model and a neighboring
object interaction model was proposed to accomplish these
two tasks. The proposed approach was demonstrated in urban
areas using a laser scanner. As indoor environments
are relatively less constrained than urban areas, interactions in
indoor environments are weaker and have more variants. Weak
interactions make scene interaction modeling and neighboring
object interaction modeling challenging. In this paper, a
place-driven scene interaction model is proposed to represent
long-term interactions in indoor environments. To deal with
complicated short-term interactions, the neighboring object
interaction model consists of three short-term interaction
models: following, approaching and avoidance. The moving
model, the stationary process model and these two interaction
models are integrated to accomplish weakly interacting object
tracking. In addition, higher level scene understanding such as
unusual activity recognition and important place identification
is accomplished straightforwardly. The experimental results
using data from a laser scanner demonstrate the feasibility
and robustness of the proposed approaches.


Friday, October 12, 2007

ECE Seminar: Natural Scene Recognition: From Humans to Computers

Title: Natural Scene Recognition: From Humans to Computers
Speaker: Prof. Fei-Fei Li
Date: October 10, 2007

For both humans and machines, the ability to learn and recognize the semantically meaningful contents of the visual world is an essential and important functionality. In this talk, we will examine the topic of natural scene categorization and recognition in human psychophysical and physiological experiments as well as in computer vision modeling.

I will first present a series of recent human psychophysics studies on natural scene recognition. All these experiments converge to one prominent phenomenon of the human visual system: humans are extremely efficient and rapid in capturing the semantic contents of real-world images. Inspired by these behavioral results, we report a recent fMRI experiment that classifies different types of natural scenes (e.g. beach vs. building vs. forest, etc.) based on the distributed fMRI activity. This is achieved by utilizing a number of pattern recognition algorithms in order to capture the multivariate nature of the complex fMRI data.

In the second half of the talk, we present a generative Bayesian hierarchical model that learns to categorize natural images in a weakly supervised fashion. We represent an image by a collection of local regions, denoted as codewords obtained by unsupervised clustering. Each region is then represented as part of a `theme'. In previous work, such themes were learnt from hand-annotations of experts, while our method learns the theme distribution as well as the codewords distribution over the themes without such supervision. We report excellent categorization performances on a large set of 13 categories of complex scenes.

Bio: Prof. Fei-Fei Li's main research interest is in vision, particularly high-level visual recognition. In computer vision, Fei-Fei's interests span from object and natural scene categorization to human activity categorizations in both videos and still images. In human vision, she has studied the interaction of attention and natural scene and object recognition. In a recent project, she also studies the human brain fMRI activities in natural scene categorization by using pattern recognition algorithms. Fei-Fei graduated from Princeton University in 1999 with a physics degree, and a minor in engineering physics. She received her PhD in electrical engineering from the California Institute of Technology in 2005. Fei-Fei was on faculty in the Electrical and Computer Engineering Dept. at the University of Illinois Urbana-Champaign (UIUC) from Sept 2005 to Dec 2006. Starting Jan 2007, Fei-Fei is an Assistant Professor in the Computer Science Department at Princeton University. She also holds courtesy appointments in the Psychology Department and the Neuroscience Program at Princeton. She is a recipient of the 2006 Microsoft Research New Faculty Fellowship. (Fei-Fei publishes under the name L. Fei-Fei.)

Wednesday, October 10, 2007

[VASC Seminar] Three Power-ups for Object Class Recognition

Date: Oct 11, Thursday
Speaker: Marcin Marszalek, INRIA

We will present our recent work from CVPR'07 and describe our winning
method for Pascal VOC Challenge 2007. The talk will therefore consist of
three parts. First, we will introduce shape masks and describe how they
are used for accurate object localization (CVPR'07 oral). Second, we will
show how we learn representations for image classification using a genetic
algorithm (Pascal VOC'07 winning method). Finally, we will discuss the use
of semantic hierarchies for visual object recognition, which is the
current main focus of our research.

Related Paper:
Accurate Object Localization with Shape Masks
Semantic Hierarchies for Visual Object Recognition

Saturday, October 06, 2007

Lab Meeting 8 October (Atwood): Accelerated Training of CRFs with Stochastic Gradient Methods

Title: Accelerated Training of CRFs with Stochastic Gradient Methods
Authors: S. V. N. Vishwanathan, Nicol N. Schraudolph, Mark W. Schmidt, Kevin P. Murphy

We apply Stochastic Meta-Descent (SMD), a stochastic gradient optimization method with gain vector adaptation, to the training of Conditional Random Fields (CRFs). On several large data sets, the resulting optimizer converges to the same quality of solution over an order of magnitude faster than limited-memory BFGS, the leading method reported to date. We report results for both exact and inexact inference techniques.
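To give a feel for stochastic gradient descent with gain vector adaptation, here is a rough sketch on a toy least-squares problem rather than a CRF. The update rules follow the general form of SMD as commonly described (per-parameter gains scaled by max(1/2, 1 - mu * g * v), with an auxiliary vector v updated using a Hessian-vector product); the function name, the toy objective, and all hyperparameter values are assumptions chosen for illustration, not the settings used in the paper.

```python
import random


def smd_least_squares(true_theta, steps=2000, eta0=0.05, mu=0.01, lam=0.99, seed=0):
    """Fit theta on noiseless samples y = x . true_theta, one sample per step.

    Hypothetical toy setup: for the per-sample loss 0.5 * (x . theta - y)^2
    the gradient is err * x and the Hessian-vector product is x * (x . v).
    """
    rng = random.Random(seed)
    n = len(true_theta)
    theta = [0.0] * n   # parameters
    eta = [eta0] * n    # per-parameter gains (the "gain vector")
    v = [0.0] * n       # auxiliary vector tracking the effect of gains on theta
    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        y = sum(xi * ti for xi, ti in zip(x, true_theta))
        err = sum(xi * ti for xi, ti in zip(x, theta)) - y
        g = [err * xi for xi in x]                    # stochastic gradient
        xv = sum(xi * vi for xi, vi in zip(x, v))
        hv = [xi * xv for xi in x]                    # Hessian-vector product
        # Gain adaptation: grow a gain when g and v disagree in sign,
        # shrink it (by at most a factor of two) otherwise.
        eta = [e * max(0.5, 1.0 - mu * gi * vi) for e, gi, vi in zip(eta, g, v)]
        # Parameter step with the adapted gains.
        theta = [t - e * gi for t, e, gi in zip(theta, eta, g)]
        # Auxiliary vector update with decay factor lam.
        v = [lam * vi - e * (gi + lam * hi) for vi, e, gi, hi in zip(v, eta, g, hv)]
    return theta, eta


fitted_theta, final_gains = smd_least_squares([2.0, -3.0])
```

In the CRF setting, the gradient and Hessian-vector product would come from the log-likelihood of the model instead of this closed-form toy loss.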


[VASC Seminar Series] Unsupervised Learning of Categories Appearing in Images

Date: Monday, Oct 8
Title: Unsupervised Learning of Categories Appearing in Images
Speaker: Sinisa Todorovic, UIUC

This talk is about solving the following problem: given a set of images containing frequent occurrences of multiple object categories, learn a compact, multi-category representation that encodes the models of these categories and their inter-category relationships, for the purposes of object recognition and segmentation. The categories are not defined by the user, and whether and where any instances of the categories appear in a specific image is not known. This problem is challenging as it involves the following unanswered questions. What is an object category? To what extent is human supervision necessary to communicate the nature of object categories to a computer vision system? What is an efficient, compact representation of multiple categories, and which inter-category relationships should it capture? I will present an approach that addresses the above-stated problem, wherein a category is defined as a set of 2D objects (i.e., subimages) sharing similar appearance and topological properties of their constituent regions. The approach derives from and closely follows this definition by representing each image as a segmentation tree, whose structure captures recursive embedding of image regions in a multiscale segmentation, and whose nodes contain the associated geometric and photometric region properties. Since the presence of any categories in the image set is reflected in the occurrence of similar subtrees (i.e., 2D objects) within the image trees, the approach: (1) matches the image trees to find these similar subtrees; (2) discovers categories by clustering similar subtrees, and uses the properties of each cluster to learn the model of the associated category; and (3) captures sharing of simpler categories among complex ones, i.e., category-subcategory relationships. The approach can also be used for addressing a less general, subsumed problem, that of unsupervised extraction of texture elements (i.e., texels) from a given image of 2.1D texture, because 2.1D texture can be viewed as composed of repetitive instances of a category (e.g., waterlilies on the water surface).

Thursday, October 04, 2007

Lab Meeting 8 October (Any): SLAM in Large-Scale Cyclic Environments Using the Atlas Framework

Michael Bosse, Paul Newman, John Leonard, Seth Teller

International Journal of Robotics Research 2004 (IJRR'04)

Abstract -- In this paper we describe Atlas, a hybrid metrical/topological approach to simultaneous localization and mapping (SLAM) that achieves efficient mapping of large-scale environments. The representation is a graph of coordinate frames, with each vertex in the graph representing a local frame and each edge representing the transformation between adjacent frames. In each frame, we build a map that captures the local environment and the current robot pose along with the uncertainties of each. Each map’s uncertainties are modeled with respect to its own frame. Probabilities of entities with respect to arbitrary frames are generated by following a path formed by the edges between adjacent frames, computed using either the Dijkstra shortest path algorithm or breadth-first search. Loop closing is achieved via an efficient map-matching algorithm coupled with a cycle verification step. We demonstrate the performance of the technique for post-processing large data sets, including an indoor structured environment (2.2 km path length) with multiple nested loops using laser or ultrasonic ranging sensors.
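The idea of expressing one frame relative to another by following edges in the frame graph can be sketched as follows. This is a simplified illustration, not the Atlas implementation: it uses 2D poses (x, y, theta), finds a path with breadth-first search, and composes the edge transformations along it. It ignores uncertainty propagation entirely, and the function names and the edge dictionary layout are hypothetical.

```python
import math
from collections import deque


def compose(a, b):
    """Apply 2D rigid transform a, then b expressed in a's frame."""
    ax, ay, at = a
    bx, by, bt = b
    return (ax + math.cos(at) * bx - math.sin(at) * by,
            ay + math.sin(at) * bx + math.cos(at) * by,
            at + bt)


def invert(a):
    """Inverse of a 2D rigid transform (x, y, theta)."""
    ax, ay, at = a
    c, s = math.cos(at), math.sin(at)
    return (-(c * ax + s * ay), s * ax - c * ay, -at)


def relative_pose(edges, src, dst):
    """Pose of frame dst expressed in frame src, or None if unreachable.

    `edges` maps (i, j) to the pose of frame j in frame i. Each edge can be
    traversed backwards by inverting its transform.
    """
    adjacency = {}
    for (i, j), t in edges.items():
        adjacency.setdefault(i, []).append((j, t))
        adjacency.setdefault(j, []).append((i, invert(t)))
    queue = deque([(src, (0.0, 0.0, 0.0))])
    visited = {src}
    while queue:
        frame, pose = queue.popleft()
        if frame == dst:
            return pose
        for neighbor, t in adjacency.get(frame, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, compose(pose, t)))
    return None


# Toy graph: B is 1 m ahead of A and rotated 90 degrees; C is 1 m ahead of B.
toy_edges = {("A", "B"): (1.0, 0.0, math.pi / 2), ("B", "C"): (1.0, 0.0, 0.0)}
pose_c_in_a = relative_pose(toy_edges, "A", "C")
pose_a_in_c = relative_pose(toy_edges, "C", "A")
```

Breadth-first search returns a path with the fewest edges; a Dijkstra search with edge weights derived from transformation uncertainty would instead favor paths with lower accumulated error, which this sketch does not model.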

Full Article - Link.
Video - Link.

[Computational Biology Seminar Series] High-throughput reconstruction of brain circuits

Date: October 5, 2007

High-throughput reconstruction of brain circuits: how machine vision will revolutionize neuroscience

Speaker: Dmitri B. Chklovskii

How does electrical activity in neuronal circuits give rise to intelligent behavior? We believe that this question is impossible to answer without a comprehensive description of neurons and synaptic connections between them. Absence of such a description, often called a wiring diagram, has been holding back the development of neuroscience. We believe that recent technological advances in high-resolution imaging and machine vision will make possible the reconstruction of whole wiring diagrams of simpler organisms or significant parts of more complex systems, such as the mammalian neocortex. Such reconstructions promise to revolutionize neuroscience just like human genome sequencing revolutionized molecular biology.

Wednesday, October 03, 2007

Robotics Institute Thesis Oral 2 Oct 2007 (Navigation Among Movable Obstacles)

Robots would be much more useful if they could move obstacles out of the way.

Traditional motion planning searches for collision-free paths from a start to a goal. However, real-world search and rescue, construction, home, and nursing home domains contain debris, materials, clutter, doors, and objects that need to be moved by the robot.

Theoretically, one can represent all possible interactions between the robot and movable objects as a huge search. We present methods that simplify the problem and make Navigation Among Movable Obstacles (NAMO) a practical challenge that can be addressed with existing hardware and computation.

For more information

Monday, October 01, 2007

Robotics Institute Seminar : From Images to Insights

Speaker : Wojciech
Title: From Images to Insights

Time and Place :
Mauldin Auditorium (NSH 1305)
Talk 3:30 pm

Abstract :
With the success of digital photography during the past few years, we have witnessed a revolution in the way photographs and videos are captured and processed. Today, our ability to acquire images and video far outstrips our ability to make sense of that data. This is true not only in personal and commercial applications but also in the sciences, where huge amounts of image data are acquired from scanners, microscopes, telescopes, and various other instruments.
We desperately need better abstractions that can improve our ability to gain insight from large collections of image data. I argue that proper data analysis can transform the data into a meaningful and perceptually intuitive representation. However, conceiving the right representation is not straightforward and it can benefit greatly from appropriate data visualization and human involvement. Fortunately, once the right abstraction is found it leads to a better and simpler acquisition method. To summarize, the whole process often involves a complex interplay among data acquisition, data visualization, and data representation.
In this talk I will discuss a number of data-driven hierarchical representations that tame the complexity of high-dimensional visual data. First, I will address the representation of spatially-varying appearance using a tree-structured factorization method and a new matrix decomposition algorithm. Then I will show how to generalize these ideas to decompose time-lapse video into simple and intuitive components that can be edited. Finally, I will discuss the MERL face-scanning project, where we collected a database of over 400 subjects with thousands of images each in order to build high-quality statistical models of human faces.