Friday, December 30, 2005

GaTech: Frank Dellaert

Check out Frank Dellaert's web page.
Some of the projects are very interesting.

CMU RI: Robosapien Hacking

Click this link for more information on how to write your own code to control the Robosapien humanoid.

The official Robosapien web site:

The CMU Humanoids Course:

Robot Dream Exposition Taiwan.





I am a new member of this lab

Hi all,

My name is Ko-Chih Wang (Casey), and I also often use the ID "CL".
I am currently a second-year master's student at NTU-INM.
I was originally a member of the ubicomp lab, but I will join your group next year (or next semester).
I have no fundamental knowledge of robotics yet.
Therefore, I hope that I can learn a lot from this lab and from everyone.
In the near future, I think I will move to the new lab and work with you all. :)
Anyway, it's my pleasure to meet you all.

My msn: cl_kcw (AT) hotmail (dot) com
Mail (Gmail): caseywang777

Thursday, December 29, 2005

MIT report: Error Weighted Classifier Combination for Multi-modal Human Identification

Yuri Ivanov, Thomas Serre, Jacob Bouvrie


In this paper we describe a technique of classifier combination used in a human identification system. The system integrates all available features from multi-modal sources within a Bayesian framework. The framework allows representing a class of popular classifier combination rules and methods within a single formalism. It relies on a “per-class” measure of confidence derived from performance of each classifier on training data that is shown to improve performance on a synthetic data set. The method is especially relevant in an autonomous surveillance setting where varying time scales and missing features are a common occurrence. We show an application of this technique to the real-world surveillance database of video and audio recordings of people collected over several weeks in the office setting.
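To make the "per-class confidence" idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the use of a row-normalised training confusion matrix as the confidence measure, and the toy numbers for a "face" and a "voice" classifier are all assumptions made for illustration.

```python
def per_class_confidence(confusion):
    """Turn a confusion matrix of counts (rows = predicted class,
    columns = true class) into P(true class | predicted class)."""
    conf = []
    for row in confusion:
        total = sum(row)
        if total:
            conf.append([c / total for c in row])
        else:
            # Class never predicted on training data: fall back to uniform.
            conf.append([1.0 / len(row)] * len(row))
    return conf


def combine(posteriors, confusions, weights):
    """Combine classifier outputs into one distribution over true classes.

    posteriors[k][j] : score of classifier k for class j on the test sample
    confusions[k]    : training confusion matrix of classifier k (counts)
    weights[k]       : prior weight of classifier k
    """
    n_classes = len(posteriors[0])
    combined = [0.0] * n_classes
    for post, confusion, w in zip(posteriors, confusions, weights):
        conf = per_class_confidence(confusion)
        for true_c in range(n_classes):
            # Marginalise over what the classifier says it saw.
            combined[true_c] += w * sum(
                conf[j][true_c] * post[j] for j in range(n_classes)
            )
    total = sum(combined)
    return [p / total for p in combined]


# Hypothetical two-person example: a reliable face classifier and a
# voice classifier that was no better than chance on training data.
face_confusion = [[9, 1], [1, 9]]
voice_confusion = [[5, 5], [5, 5]]
result = combine(
    posteriors=[[0.8, 0.2], [0.1, 0.9]],
    confusions=[face_confusion, voice_confusion],
    weights=[0.5, 0.5],
)
print(result)
```

In this toy case the unreliable voice classifier is flattened to an uninformative vote, so the combined decision follows the face classifier even though the voice classifier was more "certain".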


CMU report: Market-Based Multirobot Coordination: A Survey and Analysis

M.B. Dias, R.M. Zlot, N. Kalra, and A. Stentz. tech. report CMU-RI-TR-05-13, Robotics Institute, Carnegie Mellon University, April, 2005.

Market-based multirobot coordination approaches have received significant attention and gained considerable popularity within the robotics research community in recent years. They have been successfully implemented in a variety of domains ranging from mapping and exploration to robot soccer. The research literature on market-based approaches to coordination has now reached a critical mass that warrants a survey and analysis. This paper addresses this need by providing an introduction to market-based multirobot coordination, a comprehensive review of the state of the art in the field, and a discussion of remaining challenges. The pdf file.
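For readers new to the idea, a market-based allocation can be sketched in a few lines of Python. This is only an illustration under my own assumptions (a greedy sequential single-item auction, straight-line travel distance as the bid, hypothetical robot and task names); the survey covers many other mechanisms.

```python
import math


def auction(tasks, robots):
    """Greedy sequential single-item auction.

    tasks  : dict of task name -> (x, y) location
    robots : dict of robot name -> (x, y) start position

    In every round each robot bids its travel distance for each
    unassigned task. The lowest bid wins, and the winning robot is
    assumed to move to that task before the next round.
    """
    positions = dict(robots)
    unassigned = dict(tasks)
    allocation = {name: [] for name in robots}
    while unassigned:
        _, robot, task = min(
            (math.dist(positions[r], loc), r, t)
            for r in positions
            for t, loc in unassigned.items()
        )
        allocation[robot].append(task)
        positions[robot] = unassigned.pop(task)
    return allocation


allocation = auction(
    tasks={"a": (1, 0), "b": (9, 0), "c": (2, 0)},
    robots={"r1": (0, 0), "r2": (10, 0)},
)
print(allocation)
```

Ties are broken by name here simply because Python compares the tuples element by element; a real system would need an explicit tie-breaking rule.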

MIT report: Automatic Software Upgrades for Distributed Systems (PhD thesis)

Author: Sameer Ajmani

October 6, 2005

Upgrading the software of long-lived, highly-available distributed systems is difficult. It is not possible to upgrade all the nodes in a system at once, since some nodes may be unavailable and halting the system for an upgrade is unacceptable. Instead, upgrades may happen gradually, and there may be long periods of time when different nodes are running different software versions and need to communicate using incompatible protocols. We present a methodology and infrastructure that address these challenges and make it possible to upgrade distributed systems automatically while limiting service disruption. Our methodology defines how to enable nodes to interoperate across versions, how to preserve the state of a system across upgrades, and how to schedule an upgrade so as to limit service disruption. The approach is modular: defining an upgrade requires understanding only the new software and the version it replaces. The upgrade infrastructure is a generic platform for distributing and installing software while enabling nodes to interoperate across versions. The infrastructure requires no access to the system source code and is transparent: node software is unaware that different versions even exist. We have implemented a prototype of the infrastructure called Upstart that intercepts socket communication using a dynamically-linked C++ library. Experiments show that Upstart has low overhead and works well for both local-area and Internet systems.
The pdf file.

CMU report: An Analysis of the Human Odometer

U. Wong, C. Lyons, and S. Thayer. tech. report CMU-RI-TR-05-47, Robotics Institute, Carnegie Mellon University, September, 2005.

The Human Odometer is a personal navigation system developed to provide reliable, lightweight, cost-effective, and embedded absolute 3-D position and communication to firefighters, policemen, EMTs, and dismounted soldiers. The goal of the system is to maintain accurate position information without reliance on external references. The Human Odometer system provides real-time position updates and displays maps of relevant areas to the user on a handheld computer. The system is designed to help a user place himself in a global context and navigate unknown areas under a variety of conditions. This paper provides a quantitative analysis of the in-field operational performance of the system. The pdf file.

Monday, December 26, 2005

My talk this week

I will present this paper:
    Real-time Non-Rigid Surface Detection (pdf)
by Julien Pilet, Vincent Lepetit, Pascal Fua
of Computer Vision Laboratory, École Polytechnique Fédérale de Lausanne, Switzerland

  We present a real-time method for detecting deformable surfaces, with no need whatsoever for a priori pose knowledge.
  Our method starts from a set of wide baseline point matches between an undeformed image of the object and the image in which it is to be detected. The matches are used not only to detect but also to compute a precise mapping from one to the other. The algorithm is robust to large deformations, lighting changes, motion blur, and occlusions. It runs at 10 frames per second on a 2.8 GHz PC and we are not aware of any other published technique that produces similar results.
  Combining deformable meshes with a well designed robust estimator is key to dealing with the large number of parameters involved in modeling deformable surfaces and rejecting erroneous matches for error rates of up to 95%, which is considerably more than what is required in practice.

There are some videos on their project website.

Monday, December 19, 2005

My talk this week: Information Gain-based Exploration

Information Gain-based Exploration Using Rao-Blackwellized Particle Filters

Paper in RSS 2005: (8 pages, pdf, 520KB)

Simultaneous Localization, Mapping and Exploration
Related Works
Rao-Blackwellized Particle Filters (RBPF)
The Uncertainty
Maximizing The Information Gain

Saturday, December 17, 2005

CNN: Sony robot keeps a third eye on things

Friday, December 16, 2005; Posted: 4:09 p.m. EST (21:09 GMT)

TOKYO, Japan (Reuters) -- Robots may not be able to do everything humans can, but the latest version of Sony's humanoid robot has something many people might find useful: a third eye.

The Japanese consumer electronics company's roller-skating robot, QRIO, has now been enlightened with an extra camera eye on its forehead that allows it to see several people at once and focus in on one of them.

The full article


Read this issue online:

The Fourth IEEE International Conference on Pervasive Computing and Communications (PerCom) will act as a platform for pervasive computing researchers to swap ideas and interact through a variety of workshops. The conference will place special emphasis on the significance of pervasive computing as the natural outcome of advances in wireless networks, mobile computing, distributed computing, and other technologies, and will provide individual workshops and work-in-progress sessions for attendees who wish to better understand some of the current technologies contributing to these rapid advancements. PerCom will convene in Pisa, Italy, from 13 to 17 March 2006. For more information, or to register to attend, visit:

A new system of computerized brain-mapping techniques may greatly improve a neurosurgical technique used to treat movement disorders such as Parkinson's disease and multiple sclerosis, say its developers at Vanderbilt University. Called deep brain stimulation (DBS), the process involves implanting electrodes deep in the brain, typically one electrode in each hemisphere, a difficult and expensive operation that can take as long as 12 hours per electrode. The new system works from a three-dimensional brain atlas that combines the scans of 21 postoperative DBS patients using sophisticated computer-mapping methods, the Vanderbilt team says, and then superimposes the atlas on a new patient's scan. The new system automates the most difficult part of the operation: precisely locating pea-sized targets deep in the brain that are not visible in brain scans or to the naked eye, and doing so more quickly and accurately than experienced neurosurgeons, according to researchers writing in IEEE Transactions on Medical Imaging. Read more:

According to D.K. Arvind of the Institute for Computing Systems Architecture at the University of Edinburgh, tiny grain-sized network semiconductors could one day be sprayed onto surfaces to give computer access to places out of reach. Dubbed the "Speck-net", each tiny sensor will have its own processor, about two kilobytes of memory, and a program that gives it the ability to extract information from the environment. The "specks" would be able to communicate wirelessly with one another, to gather information and create a larger picture of a problem. The system is currently under simulation at the Speckled Computing Consortium in the UK. Arvind hopes that one day the Speck-net can be used for real applications, such as detecting structural failures in airborne planes and helping prevent strokes.

Friday, December 16, 2005

MIT Talk: Information Gain-based Exploration for Mobile Robots Using Rao-Blackwellized Particle Filters

Speaker: Cyrill Stachniss, University of Freiburg
Date: Friday, December 16 2005
Time: 1:00PM to 2:00PM
Location: 32-397
Host: Nick Roy
Contact: Nicholas Roy, x3-2517

This talk presents an integrated approach to exploration, mapping, and localization. Our algorithm uses a highly efficient Rao-Blackwellized particle filter to represent the posterior about maps and poses. It applies a decision-theoretic framework which simultaneously considers the uncertainty in the map and in the pose of the vehicle to evaluate potential actions. It trades off the cost of executing an action with the expected information gain and takes into account possible sensor measurements gathered along the path taken by the robot. We furthermore describe how to utilize the properties of the Rao-Blackwellization to efficiently compute the expected information gain. We present experimental results obtained in the real world and in simulation to demonstrate the effectiveness of our approach.
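The trade-off between action cost and expected information gain can be sketched as follows. This is a simplified illustration, not the speaker's algorithm: it only looks at map entropy over occupancy cells (the talk also accounts for pose uncertainty through the particle filter), and the weighting factor `alpha`, the dictionary layout, and the example numbers are my own assumptions.

```python
import math


def entropy(p):
    """Binary entropy (in bits) of an occupancy probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def expected_information_gain(before, after):
    """Expected entropy reduction over the cells an action would observe."""
    return sum(entropy(b) for b in before) - sum(entropy(a) for a in after)


def utility(action, alpha):
    """Expected information gain minus alpha times the execution cost."""
    gain = expected_information_gain(action["before"], action["after"])
    return gain - alpha * action["cost"]


def best_action(actions, alpha=1.0):
    """Pick the candidate action with the highest utility."""
    return max(actions, key=lambda a: utility(a, alpha))


# Hypothetical candidates: drive to an unexplored frontier (expensive,
# informative) or re-observe a nearby, already well-known area (cheap).
candidates = [
    {"name": "explore_frontier", "before": [0.5, 0.5, 0.5],
     "after": [0.1, 0.1, 0.1], "cost": 1.0},
    {"name": "revisit_known", "before": [0.1, 0.1],
     "after": [0.05, 0.05], "cost": 0.2},
]
print(best_action(candidates, alpha=1.0)["name"])
print(best_action(candidates, alpha=3.0)["name"])
```

Raising `alpha` makes travel cost dominate, so the preferred action switches from the frontier to the cheap revisit.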

Cyrill Stachniss studied computer science at the University of Freiburg and received his MSc in 2002. Currently he is a PhD student in the research lab for autonomous intelligent systems headed by Wolfram Burgard at the University of Freiburg. His research interests lie in the areas of mobile robot exploration, SLAM, and collision avoidance. He submitted his PhD thesis titled "Exploration and Mapping with Mobile Robots" in December 2005.


Thursday, December 15, 2005

Latest News from IVsource (December 14, 2005)

New Articles (more info below):

Assistware Looking for New Talent to Expand Team
Assistware Technology, a longtime provider of systems using vision-based lane detection technology, is looking for new staff as they ramp up for some new projects, including IVBSS (see related article). A program manager, embedded hardware engineer, vision systems engineer, and systems engineers are being sought. Check the job descriptions by clicking the link on the IVsource homepage.

Seeing Machines Partners with Australian Researchers to Diagnose Drowsy Drivers
Australia’s ICT Centre of Excellence, National ICT Australia (NICTA) and Seeing Machines Limited, a global leader in computer vision technology, have signed a one-year research collaboration agreement to explore the use of information and communications technologies (ICT) to reduce road accidents relating to driver fatigue. The project will develop ICT solutions to detect the subtle shifts in muscular control and response during the onset of fatigue when driving, the phenomenon which leads to the well-known “micro-nod”.

UMTRI Leads Winning Team for USDOT IVBSS Project
USDOT has awarded $25 million for the Intelligent Vehicle-Based Safety Systems Field Operational Test project to the University of Michigan Transportation Research Institute (UMTRI), which will be the largest FOT project of this type within the current government program.

Wednesday, December 14, 2005

MIT report: Conditional Random People: Tracking Humans with CRFs and Grid Filters

Leonid Taycher, Gregory Shakhnarovich, David Demirdjian, and Trevor Darrell


We describe a state-space tracking approach based on a
Conditional Random Field (CRF) model, where the observation
potentials are learned from data. We find functions
that embed both state and observation into a space where
similarity corresponds to L1 distance, and define an observation
potential based on distance in this space. This potential
is extremely fast to compute and in conjunction with
a grid-filtering framework can be used to reduce a continuous
state estimation problem to a discrete one. We show
how a state temporal prior in the grid-filter can be computed
in a manner similar to a sparse HMM, resulting in
real-time system performance. The resulting system is used
for human pose tracking in video sequences.
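As a rough illustration of the two ingredients mentioned in the abstract, here is a small Python sketch of a grid filter step with a sparse transition table and an observation potential based on L1 distance. The embeddings in the paper are learned from data; here `l1_potential` simply takes two already-embedded vectors, and all names and numbers are hypothetical.

```python
import math


def l1_potential(state_embedding, observation_embedding):
    """Observation potential that decays with L1 distance in the
    (assumed, already computed) shared embedding space."""
    dist = sum(abs(s - o) for s, o in zip(state_embedding, observation_embedding))
    return math.exp(-dist)


def grid_filter_step(belief, transitions, potentials):
    """One predict/update step of a discrete (grid) Bayes filter.

    belief      : list of probabilities, one per grid cell
    transitions : dict cell -> list of (next_cell, probability); only
                  non-zero entries are stored, as in a sparse HMM
    potentials  : list of observation potentials, one per grid cell
    """
    predicted = [0.0] * len(belief)
    for cell, p in enumerate(belief):
        if p == 0.0:
            continue  # skip empty cells; this is where sparsity pays off
        for nxt, t in transitions.get(cell, ()):
            predicted[nxt] += p * t
    unnormalised = [pr * w for pr, w in zip(predicted, potentials)]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("observation is incompatible with every grid cell")
    return [u / total for u in unnormalised]


# Three-cell toy example.
transitions = {
    0: [(0, 0.5), (1, 0.5)],
    1: [(1, 0.5), (2, 0.5)],
    2: [(2, 1.0)],
}
posterior_belief = grid_filter_step([1.0, 0.0, 0.0], transitions, [0.1, 0.9, 0.5])
print(posterior_belief)
```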



Read this issue online:

This month's issue of IEEE Communications Magazine (v. 43, no. 12) presents a special focus on the future of convergent portable devices that integrate not only cameras and cell phones, but also other functions such as wireless LAN (WLAN), personal video recording (PVR), gaming and digital TV. Articles in this issue take a closer look at topics such as mobile imaging, graphics processing capabilities and integration. The guest editorial on the topic is now accessible to all readers at:

Abstracts for the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) are due by 15 February 2006. Papers can focus on a diversity of robotics technologies spanning academic, public, and governmental initiatives. For instance, will robots one day be used as assistants to human beings? What future technologies may allow engineers to design such complicated machines? Topics may range from innovative robot designs to ethical issues in human-robot interaction research. The conference will take place next September in Hertfordshire, U.K. For more information, visit:


Read this issue online:

Two California scientists have created a mathematical theory of surprise based on principles of probability applied to a digital environment and experiments that record eye movements of volunteers. Researchers from the University of Southern California Viterbi School of Engineering and the University of California Irvine Institute for Genomics and Bioinformatics developed their theory using the stream of electronic data making up a video image as a proxy for the complex flood of stimuli in a real environment. By analyzing a data stream, the researchers say they can isolate unique visual stimuli, called "salient," "novelty," and "entropy." The researchers say they have worked out a way of predicting how observing new data will affect the set of beliefs an observer has developed about the world on the basis of data previously received. The scientists analyze a video stream to describe its most "surprising" features, then check the analysis by watching the eye movements of observers viewing the images, to see if the movements correlate with the measure of surprise. Read more:
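The news item does not give the formula, so the following is only a sketch under an assumption: one common way to quantify how much new data changes an observer's beliefs is the Kullback-Leibler divergence between the posterior and the prior. The function names and the two-hypothesis example are made up for illustration.

```python
import math


def bayes_update(prior, likelihoods):
    """Posterior over hypotheses after observing one datum."""
    unnormalised = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def surprise(prior, posterior):
    """KL(posterior || prior) in bits: how far the datum moved the beliefs."""
    return sum(
        q * math.log2(q / p) for q, p in zip(posterior, prior) if q > 0.0
    )


prior = [0.5, 0.5]

# A datum that both hypotheses explain equally well changes nothing.
boring = surprise(prior, bayes_update(prior, [0.5, 0.5]))

# A datum that strongly favours one hypothesis shifts the beliefs.
striking = surprise(prior, bayes_update(prior, [0.9, 0.1]))

print(boring, striking)
```

Under this definition a datum can be rare without being surprising: what counts is whether it changes what the observer believes.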

Monday, December 12, 2005

About my talk

My talk this week will discuss the following paper (the link):
Detecting irregularities in images and in video
(Received Honorable Mention for the 2005 Marr Prize)

Authors: Oren Boiman, Michal Irani (Dept. of Computer Science and Applied Math, The Weizmann Institute of Science, Israel)

Abstract:
We address the problem of detecting irregularities in visual data, e.g., detecting suspicious behaviors in video sequences, or identifying salient patterns in images. The term “irregular” depends on the context in which the “regular” or “valid” are defined. Yet, it is not realistic to expect explicit definition of all possible valid configurations for a given context. We pose the problem of determining the validity of visual data as a process of constructing a puzzle: We try to compose a new observed image region or a new video segment (“the query”) using chunks of data (“pieces of puzzle”) extracted from previous visual examples (“the database”). Regions in the observed data which can be composed using large contiguous chunks of data from the database are considered very likely, whereas regions in the observed data which cannot be composed from the database (or can be composed, but only using small fragmented pieces) are regarded as unlikely/suspicious. The problem is posed as an inference process in a probabilistic graphical model. We show applications of this approach to identifying saliency in images and video, and for suspicious behavior recognition.

Sunday, December 11, 2005


Call for Papers

4th International Conference On Smart homes and health Telematics
ICOST2006 - 26-28 June, 2006 - Belfast, Northern Ireland, UK

University of Ulster

After three successful editions held in France (2003), Singapore (2004), and Canada (2005), ICOST2006 aims to continue to develop an active research community dedicated to explore how Smart Homes and Health Telematics can foster independent living and offer an enhanced quality of life for ageing and disabled people. A Smart Home can be considered to be an augmented environment with the ability to consolidate embedded computers, information appliances, micro/nano systems, and multi-modal sensors to offer people unprecedented levels of access to information and assistance from information and communication technology. Health Telematics makes the most of networks and telecommunications to provide, within the home environment, health services, expertise and information and hence radically transform the way health-related services are conceived and delivered. We believe that in the future ageing and disabled people will use smart assistive technology to perform daily living activities, socialize, and enjoy entertainment and leisure activities. Nowadays networks, microprocessors, memory chips, smart sensors and actuators are faster, cheaper, more intelligent and smaller than ever. Current advances in such enabling technologies coupled with evolving care paradigms allow us to foresee novel applications and services for improving the quality of life for ageing and disabled people both inside and outside of their homes. The conference will present the latest approaches and technical solutions in the area of Smart Homes, Health Telematics, and emerging enabling technologies. Technical topics of interest include, but are not limited to:

o Intelligent Environments / Smart Homes
o Medical Data Collection and Processing
o Human-Machine Interface / Ambient Intelligence
o Modeling of Physical and Conceptual Information in Intelligent Environments
o Vision / Hearing / Cognitive Devices
o Tele-Assistance and Tele-Rehabilitation
o Personal Robotics and Smart Wheelchairs
o Context Awareness / Autonomous Computing
o Home Networks / Residential Gateways
o Wearable Sensors / Integrated Micro/Nano Systems / Home Health Monitoring
o Social / Privacy / Security Issues
o Middleware Support for Smart Home and Health Telematic Services

Each year, ICOST has a specific flavour. ICOST2003 focused on usability. The theme was "Independent living for persons with disabilities and elderly people". The theme for ICOST2004 was "Towards a Human-Friendly Assistive Environment" and for ICOST2005 was "From Smart Homes to Smart Care". This year the conference has the theme "Smart Homes and Beyond". This focuses on promoting personal autonomy and extending the quality of life. Papers or special sessions addressing the following topics are especially encouraged:

o Inclusive smart home services
o Smart services inside and outside of the home
o Situation awareness
o Location-based services
o Mobility of service delivery

Submission of Papers: There will be a combination of presentations including scientific papers, posters, exhibits and technology demonstrations. Prospective authors are invited, in the first instance, to submit papers for oral presentation in any of the areas of interest for this conference as well as proposals for Special Sessions. The initial submission for evaluation should be in the form of a 4-8 page paper outline. Authors are strongly recommended to submit the full 8-page English language version. IOS Press will publish the proceedings as a volume of the Assistive Technology Research Series; therefore, their paper publication format should be used for submission.

Important Dates:
Papers submission: 20 January, 2006
Author notification: 3 March, 2006
Camera-ready copy: 3 April, 2006

Conference Web Page:
Conference Organisation Email:

Conference Venue: ICOST2006 will be held at the Culloden Hotel, Belfast, Northern Ireland, UK

MIT talk: Object and Place Recognition from Invariant Local Features

speaker: David Lowe, University of British Columbia
date: 2005/12/12

Within the past few years, invariant local features have been successfully applied to a wide range of recognition and image matching problems. For recognition applications, it has proved particularly important to develop features that are distinctive as well as invariant, so that a single feature can be used to index into a large database of features from previous images. Robust recognition can then be achieved by identifying clusters of features with geometric consistency followed by detailed model fitting. Efficiency can be obtained with approximate nearest-neighbor methods that identify matches in a large database in real time. Recent work will be presented on applications to location recognition, augmented reality, and the detection of image panoramas from unordered sets of images.

David Lowe is a professor of computer science at the University of British Columbia and a Fellow of the Canadian Institute for Advanced Research. He received his Ph.D. in computer science from Stanford University in 1984. From 1984 to 1987 he was an Assistant Professor at the Courant Institute of Mathematical Sciences at New York University. He is a member of the scientific advisory board for Evolution Robotics. His research interests include object recognition, local invariant features for image matching, robot localization, and models of human visual recognition.

Saturday, December 10, 2005

CVPR Paper: Probabilistic parameter-free motion detection

T. Veit, F. Cao, P. Bouthemy. Probabilistic parameter-free motion detection. In Conf. Computer Vision and Pattern Recognition, CVPR'04, Washington, DC, June 2004.

We propose an original probabilistic parameter-free method for the detection of independently moving objects in an image sequence. We apply a probabilistic perceptual principle, the Helmholtz principle, whose main advantage is the automatization of the detection decision, by providing a tight control of the number of false alarms. Not only does this method localize the moving objects but it also answers the preliminary question of the presence of motion. In particular, the method works even when no assumption on motion presence is made. The algorithm is composed of three independent steps: estimation of the dominant image motion, spatial segmentation of object boundaries and independent motion detection itself. We emphasize that none of these steps needs any parameter tuning. Results on real image sequences are reported and validate the proposed approach.

With background paper on grouping.

CNN: Robot chopper documents Katrina's power: 'Flying camera' may be ready for next hurricane season

By Marsha Walton, CNN
Friday, December 9, 2005; Posted: 12:20 p.m. EST (17:20 GMT)

BILOXI, Mississippi (CNN) -- "And let's go out over the motel roof so we can get the seams ... go out a little further... alright that's good, hold there."

Kevin Pratt communicated with the other members of the helicopter flight team to shoot the best angles of video of two Katrina damaged structures.

What was unique about this flight? The four-person crew was on the ground, while the camera-carrying, 10-pound robotic aircraft flew around the buildings.

Headed by University of South Florida robotics professor Robin Murphy, the team documented damage to multistory buildings hit hard by the hurricane.

The full article

CMU talk: Massively Scalable Computer Vision - The Next Great Challenge

Craig Coulter, HyperActive Technologies, Inc.
Monday, December 12, 2005

Computer vision technologies are only beginning to emerge from research and development and into broader application. Scaling vision algorithms presents enormous, unaddressed challenges to the community and is creating a whole new branch of computer vision research: Massively Scalable Systems.

Consider for a moment the challenges of simply testing a computer vision application. Testing is currently performed by hand, through visual inspection, and usually by the research group itself, across a relatively small batch of test images - usually a few hundred to a few thousand.

HyperActive Technologies is applying computer vision technologies into the quick-service and general retail markets - applications where the same core set of detection and tracking algorithms will operate in tens of thousands of locations daily. Developing a "retail grade" computer vision application for a 20,000 store chain will require building a system that processes some 40 billion images per day, or 15 trillion images per year. Achieving this goal will require the community to completely rethink its approach to application development, testing, and in-field performance monitoring.
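A quick back-of-the-envelope check of those figures in Python. The store count and the daily total come from the abstract; the per-second rate additionally assumes cameras running around the clock, which is my assumption.

```python
stores = 20_000
images_per_day = 40e9  # "some 40 billion images per day"

images_per_store_per_day = images_per_day / stores
# Assumes 24-hour operation (86,400 seconds per day).
images_per_store_per_second = images_per_store_per_day / 86_400
images_per_year = images_per_day * 365

print(f"{images_per_store_per_day:,.0f} images per store per day")
print(f"{images_per_store_per_second:.1f} images per store per second")
print(f"{images_per_year / 1e12:.1f} trillion images per year")
```

So the quoted totals correspond to roughly two million images per store per day, which is about what a single camera produces at video frame rate, and the yearly figure rounds to the 15 trillion mentioned in the abstract.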

This talk will focus on exploring the challenges of defining a new area of computer vision research: "Massively Scalable Computer Vision Systems".

Speaker Bio
Dr. R. Craig Coulter is Co-Founder and Chief Scientist of HyperActive Technologies, Inc., a Pittsburgh-area robotics company that addresses the real-time decision-making problems plaguing high-volume, high-demand markets like quick-service restaurants, retail, and grocery stores.

Dr. Coulter is a robotics scientist focused on the commercialization of intelligent robotics technologies. He began his work at the National Robotics Engineering Consortium (NREC), where he organized a project with Ford Motor Company to commercialize a revolutionary vision-based position estimation system that he co-invented. He co-founded Highlander Systems, Inc., a vision systems engineering company; Tpresence, Inc., an early distributed computing company, where he served as CEO; and Mammoth Ventures, a firm focused on the development of new robotics-related intellectual property, where he currently serves as Managing Partner. In 2001, Dr. Coulter co-founded HyperActive Technologies, Inc. Dr. Coulter is a graduate of the PhD program in Robotics at Carnegie Mellon's School of Computer Science.

Friday, December 09, 2005

A Vision-Based Approach to Collision Prediction at Traffic Intersections

Stefan Atev, Hemanth Arumugam, Osama Masoud, Ravi Janardan, Senior Member, IEEE, and
Nikolaos P. Papanikolopoulos, Senior Member, IEEE


Monitoring traffic intersections in real time and predicting
possible collisions is an important first step towards building
an early collision-warning system. We present a vision-based
system addressing this problem and describe the practical adaptations
necessary to achieve real-time performance. Innovative
low-overhead collision-prediction algorithms (such as the one
using the time-as-axis paradigm) are presented. The proposed
system was able to perform successfully in real time on videos
of quarter-video graphics array (VGA) (320 × 240) resolution
under various weather conditions. The errors in target position
and dimension estimates in a test video sequence are quantified
and several experimental results are presented.

Index Terms

Collision prediction, machine vision, real-time
systems, tracking, traffic control (transportation).


Detection of Text on Road Signs From Video

Wen Wu, Member, IEEE, Xilin Chen, Member, IEEE, and Jie Yang, Member, IEEE

A fast and robust framework for incrementally detecting
text on road signs from video is presented in this paper.
This new framework makes two main contributions. 1) The
framework applies a divide-and-conquer strategy to decompose
the original task into two subtasks, that is, the localization of road
signs and the detection of text on the signs. The algorithms for the
two subtasks are naturally incorporated into a unified framework
through a feature-based tracking algorithm. 2) The framework
provides a novel way to detect text from video by integrating
two-dimensional (2-D) image features in each video frame (e.g.,
color, edges, texture) with the three-dimensional (3-D) geometric
structure information of objects extracted from video sequence
(such as the vertical plane property of road signs). The feasibility
of the proposed framework has been evaluated using 22 video
sequences captured from a moving vehicle. This new framework
gives an overall text detection rate of 88.9% and a false hit rate of
9.2%. It can easily be applied to other tasks of text detection from
video and potentially be embedded in a driver assistance system.

Index Terms
Object detection from video, road sign detection,
text detection, vehicle navigation.


Thursday, December 08, 2005

CMU RI Thesis Proposal: A Constraint Based Approach to Interleaving Planning and Execution for Multirobot Coordination

Speaker: Mary Koes, RI, CMU
Date: 14 Dec. 2005
Time: 10:30 AM
Location: 14 Dec. 2005

Enabling multiple robots to work together as a team is a difficult problem. Robots must decide amongst themselves who should work on which goals and at what time each goal should be achieved. Since the team is situated in some physical environment, the robots must consider travel time in these decisions. This is particularly challenging in time critical domains where goal rewards decrease over time and for tightly coupled coordination where multiple robots must work together on each goal. Further complications arise when the system is subjected to additional constraints on the ordering of the goals, the use of resources, or the allocation of robots to goals. Optimal team behavior can only be achieved when robots simultaneously consider path planning, task allocation, scheduling, and these additional system constraints. In dynamic and uncertain environments, robots need to reevaluate these decisions as they discover new information. Communication failures may mean that robots are unable to consult as a whole team while replanning. The proposed thesis addresses these challenges with four main points.

Further details:
A copy of the thesis proposal document can be found at

Wednesday, December 07, 2005

Talk Today: Affine Structure From Sound

Affine structure from sound,
Sebastian Thrun

We consider the problem of localizing a set of microphones together with a set of external acoustic events (e.g., hand claps), emitted at unknown times and unknown locations. We propose a solution that approximates this problem under an “orthocoustic” model defined in the calculus of affine geometry, and that relies on SVD to recover the affine structure of the problem. We then define low-dimensional optimization techniques for embedding the solution into Euclidean geometry, and further techniques for recovering the locations and emission times of the acoustic events. The approach is useful for the calibration of ad-hoc microphone arrays and sensor networks (though it requires centralized computation).

Tuesday, December 06, 2005

Paper: Learning user models of mobility-related activities through instrumented walking aids

J. Glover, S. Thrun, and J.T. Matthews.

We present a robotic walking aid capable of learning models of users' walking-related activities. Our walker is instrumented to provide guidance to elderly people when navigating their environments; however, such guidance is difficult to provide without knowing what activity a person is engaged in (e.g., where a person wants to go). The main contribution of this paper is an algorithm for learning models of users of the walker. These models are defined at multiple levels of abstraction, and learned from actual usage data using statistical techniques. We demonstrate that our approach succeeds in determining the specific activity in which a user engages when using the walker. One of our prototype walkers was tested in an assisted living facility near Pittsburgh, PA; a more recent model was extensively evaluated in a university environment.

The full paper is available in PDF and gzipped Postscript

Sunday, December 04, 2005

CMU talk: Probabilistic Policy Reuse in Reinforcement Learning

Speaker: Fernando Fernandez Rebollo, CMU
Date: December 05
Abstract: We contribute Policy Reuse as a technique to improve a reinforcement learner with guidance from past learned similar policies. Our method relies on using the past policies in a novel way as a probabilistic bias where the learner faces three choices: the exploitation of the ongoing learned policy, the exploration of random unexplored actions, and the exploitation of past policies. We introduce the algorithm and its major components: an exploration strategy to include the new reuse bias, and a similarity metric to estimate the similarity of past policies with respect to a new one. We provide empirical results demonstrating that Policy Reuse improves the learning performance over different strategies that learn without reuse. Policy Reuse further contributes to learning the structure of a domain. Interestingly and almost as a side effect, Policy Reuse identifies classes of similar policies revealing a basis of "eigen-policies" of the domain. In general, Policy Reuse contributes to the overall goal of lifelong reinforcement learning, as (i) it incrementally builds a policy library; (ii) it provides a mechanism to reuse past policies; and (iii) it learns an abstract domain structure in terms of eigen-policies of the domain.

This is joint work with Prof. Manuela Veloso.
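The three-way choice described in the abstract can be sketched roughly as follows. This is only an illustration based on the abstract, not the speaker's algorithm: the parameter names `psi` (probability of following a past policy) and `epsilon` (probability of a random action otherwise), the dictionary `q_new`, and the function name are all assumptions made for the example.

```python
import random

def choose_action(state, q_new, past_policy, actions, psi, epsilon, rng=random):
    """Pick an action with a three-way probabilistic choice (illustrative only).

    psi     -- assumed probability of following the past policy
    epsilon -- assumed probability of a random exploratory action otherwise
    q_new   -- dict mapping (state, action) to the current value estimate
    """
    if rng.random() < psi:
        # Exploit a previously learned policy.
        return past_policy(state)
    if rng.random() < epsilon:
        # Explore with a uniformly random action.
        return rng.choice(actions)
    # Exploit the policy currently being learned (greedy on its values).
    return max(actions, key=lambda a: q_new.get((state, a), 0.0))
```

With `psi=1.0` the learner always follows the past policy, and with `psi=0.0` it reduces to ordinary epsilon-greedy action selection, which makes the reuse bias easy to switch on and off in experiments.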

Thursday, December 01, 2005

Project: Wheelesley

The link.

This research project started at Wellesley College in January 1995 where Holly Yanco was an Instructor in the Computer Science Department. The project has since moved to the MIT Artificial Intelligence Laboratory.

MIT Tech Report: Accurate and Scalable Surface Representation and Reconstruction from Images

Author[s]: Gang Zeng, Sylvain Paris, Long Quan, Francois Sillion

November 18, 2005

We introduce a new surface representation, the patchwork, to extend the problem of surface reconstruction from multiple images. A patchwork is the combination of several patches that are built one by one. This design potentially allows the reconstruction of an object of arbitrarily large dimensions while preserving a fine level of detail. We formally demonstrate that this strategy leads to a spatial complexity independent of the dimensions of the reconstructed object, and to a time complexity linear with respect to the object area. The former property ensures that we never run out of storage (memory) and the latter means that reconstructing an object can be done in a reasonable amount of time. In addition, we show that the patchwork representation handles equivalently open and closed surfaces whereas most of the existing approaches are limited to a specific scenario (open or closed surface but not both). Most of the existing optimization techniques can be cast into this framework. To illustrate the possibilities offered by this approach, we propose two applications that expose how it dramatically extends a recent accurate graph-cut technique. We first revisit the popular carving techniques. This results in a well-posed reconstruction problem that still enjoys the tractability of voxel space. We also show how we can advantageously combine several image-driven criteria to achieve a finely detailed geometry by surface propagation. The above properties of the patchwork representation and reconstruction are extensively demonstrated on real image sequences.
[PDF] [PS]

IEEE Career Alert

2. Are Asian Scientists Bumping Up Against a Glass Ceiling in the US?

Asians "are known for being great scientists," but probably shouldn't look forward to heading science labs, says Kuan-Teh Jeang, a virologist at the U.S. National Institutes of Health (NIH). Earlier this year, Taiwan-born Jeang compiled statistics in a bid to confirm or refute anecdotal evidence that there were few opportunities for career advancement for Asian researchers at NIH. What he found was disheartening. Though 21.5 percent of the agency's tenure-track investigators are Asian, only 9.2 percent of senior investigators are of Asian descent. And only 4.7 percent of the people heading NIH labs or branches are Asian.

A similar examination of the American Society for Biochemistry and Molecular Biology (ASBMB) by Yi Rao, a neuroscientist at Northwestern University in Evanston, Illinois, uncovered equally bad news for Asian scientists. In letters to the governing boards of ASBMB and the Society for Neuroscience penned in July, Rao wrote, "However the phenomenon can be described, the underlying problem is discrimination. [Asian] Americans tend to be quiet, partly because their voices and concerns are not listened to. But should that mean obedience and subordination forever?"

For more on whether there is a level playing field in scientific research, and to see what officials at these organizations have done in response, read on at: the link

4. Taiwan to Take Center Stage in IC development?

According to Nicky Lu, a former IBM researcher who was a co-inventor of the advanced DRAM technology Big Blue was using when he left the company to return to his native Taiwan in 1991, the global technology market is undergoing a shift that will move semiconductor R&D and other so-called knowledge work from a "pan-Atlantic IC circle" centered in the United States to a "pan-Pacific circle" with Taiwan at its center.

Lu has founded three technology companies in his homeland since a former government minister convinced him to help build the country's nascent IC industry. In an article, he discusses Taiwan's changing role in the global IC market, the importance of intellectual property to generating profit, the entrepreneurial spirit of Taiwanese engineers, and his "pool theory"--of which he says, "the United States has proved that the more open and enjoyable a society is, the more likely it is that all the talent will go there. If you make your pool the cleanest and most beautiful, then people will come over." Read on at (free registration required): the link

Wednesday, November 30, 2005

CMU talk: Preference Elicitation in Constraint-based Decision Problems

Craig Boutilier, Department of Computer Science, University of Toronto

WHEN: 10:30am Wed., Nov. 30
ABSTRACT: Preference elicitation is generally required when making or recommending decisions on behalf of users whose utility function is not known with certainty. Although one can engage in elicitation until a utility function is perfectly known, in practice, this is infeasible. This talk tackles this problem in constraint-based optimization.

I will first describe a graphical model for utility representation and issues associated with elicitation in this model. I then discuss two methods for optimization with imprecise utility information: a Bayesian approach in which utility function uncertainty is quantified probabilistically; and a distribution-free minimax regret model. Finally, I will describe several heuristic strategies for elicitation.

This work describes several joint projects with: Darius Braziunas, Relu Patrascu, Pascal Poupart and Dale Schuurmans.
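To make the distribution-free minimax regret idea concrete, here is a small sketch. It assumes a finite list of decisions and a finite list of candidate utility functions standing in for the utility functions still considered possible; both simplifications, and the function name, are assumptions for illustration rather than the speaker's formulation.

```python
def minimax_regret(decisions, utilities):
    """Return (decision, max_regret) minimizing the worst-case regret.

    decisions -- list of candidate decisions
    utilities -- list of dicts (decision -> value), one per utility function
                 that is still considered possible (illustrative assumption)
    """
    best = None
    for d in decisions:
        # Regret of d under u: how much better the best decision under u is.
        worst = max(max(u[x] for x in decisions) - u[d] for u in utilities)
        if best is None or worst < best[1]:
            best = (d, worst)
    return best

# Two possible utility functions that disagree about "a" and "c".
u1 = {"a": 10, "b": 8, "c": 0}
u2 = {"a": 0, "b": 7, "c": 10}
print(minimax_regret(["a", "b", "c"], [u1, u2]))  # ('b', 3)
```

The compromise decision "b" is never optimal under either utility function, yet it is the safest choice because it loses at most 3 in the worst case.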

SPEAKER BIO: Craig Boutilier received his Ph.D. in Computer Science (1992) from the University of Toronto, Canada. He is Professor and Chair of the Department of Computer Science at the University of Toronto. He was previously an Associate Professor at the University of British Columbia, a consulting professor at Stanford University, and has served on the Technical Advisory Board of CombineNet, Inc. since 2001.
Dr. Boutilier's research interests span a wide range of topics, with a focus on decision making under uncertainty. He has been awarded the Izaak Walton Killam Research Fellowship, and an IBM Faculty Award. He also received the Killam Teaching Award.

CMU talk: Learning Image Manifolds != Manifold Learning

Robert Pless
Washington University in St. Louis

Monday, December 5, 2005

This talk will detail my explorations in applying Manifold Learning techniques to real problems in image processing. Initial experiments with natural image sets (What is the intrinsic dimension of a Charlie Chaplin video clip?... Do cardio-pulmonary MR-images have a natural 2D parameterization?) illuminate several limitations of existing algorithms. First, using Euclidean (sum-of-squared pixel intensity difference) distance is usually a poor choice of image distance functions for natural images. Second, many natural image manifolds have a cyclic topology (and thus cannot be cleanly embedded into a Euclidean space). Third, natural data sets often include unlabeled examples from multiple, intersecting low-dimensional manifolds.

I will talk about several heuristic (and occasionally well founded) algorithms for choosing effective local image distance measures, finding minimal parameterizations for cyclic manifolds, and simultaneously clustering and parameterizing data from multiple intersecting manifolds. These have been brought together in an end-to-end application which automatically learns the 2D manifold structure of (ungated, free-breathing) cardiac MRI images of a patient, and uses the manifold structure of the images to regularize the segmentation of the left ventricle simultaneously in all frames.

Short Bio
Robert Pless is an Assistant Professor of Computer Science, and Assistant Director of the Center for Security Technologies at Washington University. His research interests focus on video processing; motion estimation for video surveillance and manifold learning for applications in biomedical imaging. He received a BS from Cornell University in 1994 and a PhD from the University of Maryland in 2000, and was chairman of the IEEE OMNIVIS workshop in 2003.

CMU talk: Human-Robot Systems for Planetary Exploration

Title: Human-Robot Systems for Planetary Exploration

Speaker: Salvatore Domenick Desiano, Research Scientist, Intelligent Systems Division (QSS Group, Inc), NASA Ames Research Center

Date: Thursday, December 1

Planetary robots will be used in many contexts -- Martian and Lunar, alone and with humans, for construction and for scientific exploration, to name a few. The Intelligent Robotics Group at the NASA Ames Research Center develops cross-cutting capabilities that enable robots to perform autonomously in all of these situations.

In this talk, I will focus on the results of the Collaborative Design Systems FY05 demonstration, performed in September. This was the largest demonstration of integrated robotic systems ever carried out at NASA Ames. The demonstration included visual target tracking, autonomous multi-SCIP (Single Cycle Instrument Placement), constraint-based temporal planning, human-robot collaboration, spoken dialog interfaces, multi-agent systems, and 3D visualization tools.

In addition to the results of this specific demonstration, I will briefly present some of the open research problems on which our group is interested in working or collaborating. I will also provide some inside perspective on the current state of NASA's robotics programs and funding sources.

Speaker Bio:
Salvatore Domenick Desiano is a robotics research scientist at the NASA Ames Research Center. As a member of the Intelligent Robotics Group, he leads the K-9 Rover Team of the CDS project, the most elaborate combination of human and robot planetary exploration ever demonstrated at NASA Ames. His research focuses on developing fundamental navigation capabilities for mobile robots, and he works extensively with the NASA Office of Education. He has also served as the Integration Lead for the Personal Satellite Assistant project. Salvatore is currently on leave from his doctoral studies at the Robotics Institute and will be returning to the program in early 2006.

Tuesday, November 29, 2005

My talk this Wednesday


Multi-Planar Projection by Fixed-Center Pan-Tilt Projectors
Ikuhisa Mitsugami, Norimichi Ukita, Masatsugu Kidode


We describe a new steerable projector, whose projection center precisely corresponds with its rotation center, which we call a “fixed-center pan-tilt (FC-PT) projector.” This mechanism allows it to be set up more easily than other steerable projectors to display graphics precisely on the planes in the environment; wherever we would like to display graphics, all we have to do is locate the FC-PT projector in the environment and direct it to the corners of the planes whose 2D sizes have been measured. Moreover, as the FC-PT projector can automatically recognize whether each plane is connected to others, it can display visual information that lies across the boundary line of two planes, in a similar way to a paper poster folded along the planes.

Multi-Planar Projection by Fixed-Center Pan-Tilt Projectors

Sunday, November 27, 2005

Paper: The Smart Wheelchair Component System

Richard Simpson, PhD, ATP; Edmund LoPresti, PhD; Steve Hayashi, PhD; Illah Nourbakhsh, PhD; David Miller, PhD

Journal of Rehabilitation Research & Development (JRRD) Volume 41, Number 3B, Pages 429–442 May/June 2004

Abstract: While the needs of many individuals with disabilities can be satisfied with power wheelchairs, some members of the disabled community find it difficult or impossible to operate a standard power wheelchair. To accommodate this population, several researchers have used technologies originally developed for mobile robots to create “smart wheelchairs” that reduce the physical, perceptual, and cognitive skills necessary to operate a power wheelchair. We are developing a Smart Wheelchair Component System (SWCS) that can be added to a variety of commercial power wheelchairs with minimal modification. This paper describes the design of a prototype of the SWCS, which has been evaluated on wheelchairs from four different manufacturers. [PDF]

Wednesday, November 23, 2005

What's New @ IEEE in Signal Processing, November 2005

High-frequency ultrasound waves may allow physicians to both visualize the heart's interior in three dimensions and selectively destroy heart tissue with heat to correct arrhythmias, according to engineers at Duke University who are developing the technology. Building on previous work, the Duke team has created dual-function ultrasound probes that use tiny cables, as many as two hundred of them in a three-millimeter catheter. To destroy aberrant tissue in the heart, physicians currently use electrodes that must touch the target tissue, and are guided by x-rays, which do not provide sharp images of soft tissue. The Duke engineers say their prototype can destroy target tissue without touching it, and is guided by much clearer three-dimensional imaging. The work is described in two research papers published last month in the journals "IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control" and "Ultrasonic Imaging." Read more

Researchers at Cambridge Ultrasonics, in conjunction with UK firm Sonatest, have developed an ultrasound sensor that can "see" inside concrete. The sensor works by firing sound waves from up to six different transducers and then registering the returning echoes. A visual map of the inside of the concrete is then displayed as a three-dimensional image. The system, still in the testing stage, was designed to monitor concrete structures for interior corrosion, such as cracks and fissures, particularly in a building's tendons, which act as the skeletons of structures. But the sensor is also of particular interest to police organizations, which could use the device to locate corpses buried in concrete. Bodies buried in concrete break down and leave voids which the sensor would record. Cambridge Ultrasonics is also working on a monitoring system which could be attached to structures to provide regular feedback on corrosion. Read more

What's New @ IEEE in Wireless, November 2005

The National Institute of Standards and Technology (NIST) says it is working with the building industry, public safety officials and information technologists to study how "intelligent" building systems can be used by firefighters, police and other first responders to assess emergency conditions in real-time. NIST is developing standards for various types of communication networks (including wireless networks) to transmit real-time building sensor information on mechanical systems, elevators, lighting, security and fire systems, occupant locations, and temperature and smoke conditions to first responders. According to NIST, the network information would include floor plans and live data from motion, heat, biochemical and other sensors and video cameras. Read more

In related first-responder news, the article "Service-Based Computing on Manets: Enabling Dynamic Interoperability of First Responders" can be found in the current issue of IEEE Intelligent Systems magazine.

Sunday, November 20, 2005

CMU talk: Consistent Segmentation for Optical Flow Estimation

Larry Zitnick

When: Monday, November 21, 3:30 p.m.- 4:45 p.m.
The computation of optical flow in the presence of large displacements and occlusion boundaries is a difficult problem, both in terms of accuracy and computational efficiency. Recently, work in stereo vision has shown promising results using image segmentation to constrain the matching process. Unfortunately, these same segmentation approaches are highly inefficient for optical flow estimation, due to the increased search space (2D vs. 1D) needed for optical flow. We propose a new approach that simultaneously computes a consistent segmentation across images while estimating optical flow. This approach leads to a computationally efficient algorithm while producing accurate results. In addition, we'll present results in video interpolation and exaggerated motion blur using the computed flow fields.

MIT talk: Vision-based robotics: Representation, Mapping and Exploration

Speaker: Robert Sim, University of British Columbia
Date: Tuesday, November 22 2005

Autonomous mobile robot systems have an important role to play in a wide variety of application domains. A key component for autonomy is the capability to explore an unknown environment and construct a representation that a robotic agent can use to localize, navigate, and reason about the world. In this talk I will present results on the automatic construction of visual representations. First, the Visual Map representation will be introduced as a method for modelling the visual structure of the world. Second, I will present a flexible architecture for robust real-time vision-based mapping of an unknown environment. Finally, I will conclude with a discussion of recent progress on the problem of autonomous robotic exploration, and illustrate issues in the problem of developing robotic explorers that are naturally curious about their environment.

The Visual Map framework is an approach to representing the visual world that enables a robot to learn models of the salient visual features of an environment. A key component of this representation is the ability to learn mappings between camera pose and image-domain features without imposing a priori assumptions about the structure of the environment, or the optical characteristics of the visual sensor. These mappings can be employed as generative models in a Bayesian framework for solving the robot localization problem, as well as for visual servoing and path planning.

The second part of this talk demonstrates an architecture for performing simultaneous localization and mapping with vision. The main goals of our work are to facilitate robust large-scale mapping in real time using vision. We employ a Rao-Blackwellised particle filter for managing uncertainty and examine a variety of robust proposal distributions, as well as the run-time and scaling characteristics of our architecture.

The latter part of this talk builds on representation and mapping to address robotic exploration. In order to acquire a representation of the world, a robot must first acquire data. From an information-theoretic point of view, this problem involves moving through the world so as to maximize the information that can be gained from what is observed along the robot's trajectory. However, computing the optimal trajectory is complicated by several factors, including the presence of noise, the time horizon over which the robot plans, the specific objective function that is optimized, and the robot's choice of sensor. I will present several results in this area that lead to the development of robust robotic systems that can plan over the long term and successfully demonstrate an emergent sense of curiosity.

MIT PhD Defense: A Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images

Speaker: Lilla Zollei, MIT CSAIL Vision Research Group
Date: Tuesday, November 22 2005
Time: 10:30AM to 12:00PM
Location: Star Seminar Room, 32-D463, Stata Center
Host: Prof. Eric Grimson, Head, EECS Dept.; MIT CSAIL Vision Research Group

The field of medical image analysis has been rapidly growing for the past two decades. Besides a significant growth in computational power, scanner performance, and storage facilities, this acceleration is partially due to an unprecedented increase in the amount of data sets accessible for researchers. Medical experts traditionally rely on manual comparisons of images, but the abundance of information now available makes this task increasingly difficult. Such a challenge prompts for more automation in processing the images.

In order to carry out any sort of comparison between multiple medical images, one frequently needs to identify the proper correspondence between them. This step allows us to follow the changes that happen to anatomy throughout a time interval, to identify differences between individuals, or to acquire complementary information from different data modalities. Registration achieves such correspondences. In this dissertation we focus on the unified analysis and characterization of statistical registration approaches.

First we formulate and interpret a select group of pair-wise registration methods in the context of a unified statistical and information theoretic framework. This clarifies the implicit assumptions of each method and yields a better understanding of their relative strengths and weaknesses. This guides us to a new registration algorithm that incorporates the advantages of the previously described methods. Next we extend the unified formulation with analysis of the group-wise registration algorithms that align a population as opposed to pairs of data sets. Finally, we present our group-wise registration framework, stochastic congealing. That algorithm runs in a simultaneous fashion, with every member of the population approaching the central tendency of the collection at the same time. It eliminates the need for selecting a particular reference frame a priori, resulting in a non-biased estimate of a digital template. Our algorithm adopts an information theoretic objective function which is optimized via a gradient-based stochastic approximation process embedded in a multi-resolution setting. We demonstrate the accuracy and performance characteristics of stochastic congealing via experiments on both synthetic and real images.
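The abstract does not state the exact objective function. One information-theoretic objective commonly associated with congealing-style group-wise alignment is the sum, over pixel positions, of the entropy of the intensities observed at that position across the whole population. The sketch below assumes that objective and a toy representation of images as flat lists of discrete intensities; it is an illustration, not the thesis's implementation.

```python
import math
from collections import Counter

def stack_entropy(images):
    """Sum over pixel positions of the entropy (in bits) across the image stack.

    images -- list of equally sized flat lists of discrete intensities
              (a toy representation assumed for this sketch)
    """
    n = len(images)
    total = 0.0
    for pixel_values in zip(*images):
        counts = Counter(pixel_values)
        # Entropy of the empirical distribution at this pixel position.
        total -= sum((c / n) * math.log2(c / n) for c in counts.values())
    return total

aligned = [[0, 1, 0], [0, 1, 0]]
shifted = [[0, 1, 0], [1, 0, 0]]
print(stack_entropy(aligned), stack_entropy(shifted))  # 0.0 2.0
```

A lower value means the population agrees more at each position, so an optimizer would adjust each image's transformation to drive this quantity down without designating any single image as the reference.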

11/23 lab meeting

Dear all,

This is the paper I'll present.

Modeling the Static and the Dynamic Parts of the Environment
to Improve Sensor-based Navigation

ICRA 2005

Monday, November 14, 2005

11/16 lab meeting

Dear all,

This is the paper I'll present.

Simultaneous Localization and Mapping with
Detection and Tracking of Moving Objects

C.-C. Wang and C. Thorpe. In IEEE International Conference on Robotics and Automation (ICRA'02), May 2002.

Saturday, November 12, 2005

CMU talk: Automatic Filters for the Detection of Coherent Structure in Spatiotemporal Systems

Cosma Shalizi
November 29, 2005
Abstract: Current methods for identifying coherent structures in spatially-extended systems rely on prior information about the form which those structures take. This talk describes two new approaches to automatically filter the changing configurations of spatial dynamical systems and extract coherent structures. One, local sensitivity filtering, gauges the ability of locally-applied perturbations to produce large-scale changes in the system configuration. The other, local statistical complexity filtering, calculates the amount of information needed for optimal prediction of the system's behavior in the vicinity of a given point. By examining the changing spatiotemporal distributions of these quantities, we can find the coherent structures in a variety of pattern-forming systems, without needing to guess or postulate the form of that structure. The results are at least comparable to those obtained with older techniques based on formal language theory or the statistical-mechanical theory of order parameters. Paper URL:

CMU talk: Scalable Inference in Hierarchical Models of the Neocortex

Tom Dean

November 21, 2005
Title: Scalable Inference in Hierarchical Models of the Neocortex
Borrowing insights from computational neuroscience, we present a class of generative models well suited to modeling perceptual processes and an algorithm for learning their parameters that promises to scale to learning very large models. The models are hierarchical, composed of multiple levels, and allow input only at the lowest level, the base of the hierarchy. Connections within a level are generally local and may or may not be directed. Connections between levels are directed and generally do not span multiple levels. The learning algorithm falls within the general family of expectation maximization algorithms. Parameter estimation proceeds level-by-level starting with components in the lowest level and moving up the hierarchy. Having learned the parameters for the components in a given level, those parameters are fixed and needn't be revisited for the purposes of learning. These parameters do, however, play an important role in learning the parameters for higher-level components by helping to generate the samples used in subsequent parameter estimation. Within levels, learning is decomposed into many local subproblems suggesting a straightforward parallel implementation. The inference required for learning is carried out by local message passing and the arrangement of connections within the underlying networks is designed to facilitate this method of inference. Learning is unsupervised but can be easily adapted to accommodate labeled data. In addition to describing several variants of the basic algorithm, we present preliminary experimental results demonstrating the pattern-recognition capabilities of our approach and some of the characteristics of the approximations that the algorithms produce.

Stanford Talk: Rethinking State, Action, and Reward in Reinforcement Learning

Satinder Singh
November 7, 2005, 4:15PM

Over the last decade and more, there has been rapid theoretical and empirical progress in reinforcement learning (RL) using the well-established formalisms of Markov decision processes (MDPs) and partially observable MDPs or POMDPs. At the core of these formalisms are particular formulations of the elemental notions of state, action, and reward that have served the field of RL so well. In this talk, I will describe recent progress in rethinking these basic elements to take the field beyond (PO)MDPs. In particular, I will briefly describe older work on flexible notions of actions called options, briefly describe some recent work on intrinsic rather than extrinsic rewards, and then spend the bulk of my time on recent work on predictive representations of state. I will conclude by arguing that taken together these advances point the way for RL to address the many challenges of building an artificial intelligence.

About the Speaker
Satinder Singh is an Associate Professor of Electrical Engineering and Computer Science at the University of Michigan, Ann Arbor. His main research interest is in the old-fashioned goal of Artificial Intelligence, that of building autonomous agents that can learn to be broadly competent in complex, dynamic, and uncertain environments. The field of reinforcement learning (RL) has focused on this goal, and accordingly his deepest contributions are in RL.

MIT talk: Visual Recognition: From Generative to Discriminative Models

Speaker: Pietro Perona, Caltech
Date: Monday, November 14 2005

We can easily recognize objects and properties of the world by looking. If machines had this ability they could be much more intelligent and useful. I will present a taxonomy of visual recognition, review the state of the art and discuss a number of fascinating open problems.

Pietro Perona studies the computational aspects of vision; his current focus is visual recognition. He has published on applications of PDEs to image segmentation, human texture perception and segmentation, dynamic vision, grouping, perception of human motion, learning and recognition of object categories, categorization of scenes in human vision, human perception of 3D shape, interaction of attention and recognition. Perona is Professor of Electrical Engineering and of Computation and Neural Systems at the California Institute of Technology (Caltech). He is the Director of the National Science Foundation Engineering Research Center in Neuromorphic Systems Engineering at Caltech.

MIT talk: A robust layered control system for a mobile robot

Speaker: Rodney Brooks, MIT
Date: Tuesday, November 15 2005

Rod will present a historical perspective on robot control, planning, and intelligence and discuss his influential trend-changing paper "A robust layered control system for a mobile robot," published in IEEE Transactions on Robotics and Automation, 2(1), pages 14-23, April 1986.

The paper is available to download here:

Friday, November 11, 2005

CMU talk: Fast Inference and Learning in Large-State-Space HMMs

Speaker: Sajid Siddiqi, CMU

Date: November 14

For Hidden Markov Models (HMMs) with fully connected transition models, the three fundamental problems of evaluating the likelihood of an observation sequence, estimating an optimal state sequence for the observations, and learning the model parameters, all have quadratic time complexity in the number of states. We introduce a novel class of non-sparse Markov transition matrices called Dense-Mostly-Constant (DMC) transition matrices that allow us to derive new algorithms for solving the basic HMM problems in sub-quadratic time. We describe the DMC HMM model and algorithms and attempt to convey some intuition for their usage. Empirical results for these algorithms show dramatic speedups for all three problems. In terms of accuracy, the DMC model yields strong results and outperforms the baseline algorithms even in domains known to violate the DMC assumption.

Fast Inference and Learning in Large-State-Space HMMs
S. Siddiqi and A. Moore
Proceedings of the 22nd International Conference on Machine Learning, August 2005.

Thursday, November 10, 2005

New @ IEEE in Communications, November 2005

The current special issue of IEEE Internet Computing (v. 9, no. 5) explores how ideas from social networking can propel creative designs in the communications technology field. To help people use communication technologies to understand and manage their social networks more effectively, the issue's guest editors have selected three articles that position social networks and social networking in terms of relationships among individuals. One author focuses on the networked ego of the e-mail user to present sociograms as end-user visualizations of connections between individuals who have been co-addressed on e-mail messages. The second article examines social isolation and depression in elderly individuals and uses social networking and computing technologies to help reduce these feelings and provide health feedback displays. The third article connects physical place, mobile technologies, and social networks into the P3 framework, a system which helps designers determine appropriate geographic context clues for specific social interactions. The guest editors' introduction, along with a sidebar entitled "Resources on Social Networks, Social Networking, and Social Analysis," is available to all readers online: the link

Missions to other planets and moons may one day use combined space, aerial and ground vehicles to deploy sensors and a communications network more robust and adaptable than current one-vehicle missions, researchers say. The new concept would ensure that the failure of one instrument or vehicle would not doom a mission, say scientists from the California Institute of Technology, the University of Arizona and the U.S. Geological Survey. The researchers propose multi-tiered robotic space missions that link orbiting spacecraft, blimps and balloons with ground robots, all of which will carry instruments that can communicate and interact with instruments on the other platforms to exploit local weather and geographic conditions.
Read more: the link.

Papers to the 9th International Conference on Information Fusion should be submitted by 15 January 2006. The conference, sponsored by the IEEE Aerospace & Electronic Systems Society, seeks papers on advancements and applications in information fusion, particularly those with special emphasis on non-traditional topics. Some areas of interest include foundational tools, algorithmic developments, technological advancements and applications. The conference will take place in Florence, Italy, next June. For more details, visit: the link.

New @ IEEE for Students, November 2005

Speech technology gets the special-focus treatment in the current issue of IEEE Signal Processing Magazine (v. 22, no. 5). The issue contains nine articles around the theme of "Speech Technology in Human-Machine Communication," along with an introduction by the guest editors, who write that "the full potential of speech technology still remains to be uncovered." The table of contents and abstracts for all papers in the current issue can be found in the IEEE Xplore digital library, where subscribers may also access the full text of the articles: the link.

Scientists studying the world's oceans are limited to short trips during times of the year when weather conditions are most favorable, and the underwater instruments they leave behind lack the power and bandwidth to deliver much useful information. But this month, construction begins on an Internet-connected undersea observatory covering hundreds of thousands of square kilometers of sea floor. When the project, called the North-East Pacific Time Series Undersea Networked Experiments (NEPTUNE), is completed in 2007, instruments such as hydrophones, current sensors, high-definition video cameras and even robotic crawlers will deliver data around the clock.
IEEE Spectrum has more: the link.

Wednesday, November 09, 2005

Jim's presentation today

An Application of Markov Random Fields to Range Sensing by Diebel and Thrun, NIPS 2005.

An Introduction to the Conjugate Gradient Method Without the Agonizing Pain by Jonathan Richard Shewchuk. <-- I have not read it yet.

CMU LTI talk: Natural Language Processing in Bioinformatics: Uncovering Semantic Relations

Speaker: Barbara Rosario, University of California, Berkeley

TITLE: Natural Language Processing in Bioinformatics: Uncovering Semantic Relations

ABSTRACT: Current-generation search engines provide a glimpse of the kinds of activities that can be catalyzed by intelligent processing of large-scale document corpora. Further progress in this area will require the tools of statistical natural language processing, including tools for automatic extraction of propositional information from text. This presentation will explore several lines of research on one of the core problems that arise in this domain---the identification of semantic relations between constituents in sentences. First, I will discuss the problem of identifying relationships between two-word noun compounds (to characterize, for example, the treatment-for-disease relationship between the words of "migraine treatment" versus the method-of-treatment relationship between the words of "aerosol treatment".) Second, I'll describe my work in the area of Information Extraction, in particular the problem of identifying semantic entities such as "treatment" and "disease" from biomedical text. Finally, I will present my recent work on the problem of predicting protein-protein interactions from biological text. A major impediment to such work is the acquisition of appropriately labeled training data; for my experiments I have identified a database that serves as a proxy for training data. In each of these cases I will describe the statistical machine learning methods---both generative and discriminative---used to tackle these tasks.

Tuesday, November 08, 2005

CMU RI talk: Social robots, social development, and social disorders

Brian Scassellati
Department of Computer Science
Yale University


Social robots recognize and respond to human social cues with appropriate behaviors. These robots are unique tools in the study of human social development, and have the potential to play a critical role in the diagnosis and treatment of social disorders such as autism.

In the first part of this talk, I present four vignettes on what the practicality of constructing social robots has taught us about human social development. These vignettes cover topics of perceptual development (vocal prosody), sensorimotor development (declarative and imperative pointing), linguistic development (learning pronouns), and cognitive development (self-other discrimination).

The second half will focus on the application of social robots to the diagnosis and therapy of autism. Autism is a pervasive developmental disorder that is characterized by social and communicative impairments. Based on three years of integration and immersion with a clinical research group which performs more than 130 diagnostic evaluations of children for autism per year, I will discuss how social robots will impact the ways in which we diagnose, treat, and understand autism.

Speaker Biography
Brian Scassellati is an assistant professor of Computer Science at Yale University. His research focuses on the construction of humanoid robots that interact with people using natural social cues. These robots are used both to evaluate models of how infants acquire social skills and to assist in the diagnosis and quantification of disorders of social development (such as autism). He is an associate editor of the International Journal of Humanoid Robotics and the program chair for the upcoming 6th International Conference on Development and Learning. In 2003, he was awarded an NSF CAREER award.

Saturday, November 05, 2005

CMU talk: Snake Robots and Stuff that Makes them Go

Speaker: Howie Choset, Associate Professor, Robotics Institute, Carnegie Mellon University

Date: Thursday, November 10

Snake robots, formally called hyper-redundant mechanisms, are highly articulated devices that can use their many internal degrees of freedom to thread through tightly packed volumes accessing locations that people and machinery otherwise cannot. Moreover, the internal degrees of freedom of hyper-redundant mechanisms give them the ability to achieve different forms of mobility, including crawling, climbing and swimming.

The many degrees of freedom that furnish these robots with their benefits also pose their greatest challenges: mechanism design, control, systems integration and power. This talk discusses my group's work in addressing these challenges and gives an overview of future work. It also summarizes some of the snake-robot applications in which my group is active, including urban search and rescue, minimally invasive surgery, inspection of wings, and site characterization of buried tanks.

CNN: MIT maps wireless users across campus

Friday, November 4, 2005; Posted: 9:54 a.m. EST (14:54 GMT)

CAMBRIDGE, Massachusetts (AP) -- In another time and place, college students wondering whether the campus cafe has any free seats, or their favorite corner of the library is occupied, would have to risk hoofing it over there.

But for today's student at the Massachusetts Institute of Technology, that kind of information is all just a click away.

MIT's newly upgraded wireless network -- extended this month to cover the entire school -- doesn't merely get you online in study halls, stairwells or any other spot on the 9.4 million square foot campus.

It also provides information on exactly how many people are logged on at any given location at any given time. It even reveals a user's identity if the individual has opted to make that data public.

MIT researchers did this by developing electronic maps that track across campus, day and night, the devices people use to connect to the network, whether they're laptops, wireless PDAs or even Wi-Fi equipped cell phones.

The maps were unveiled this week at the MIT Museum, where they are projected onto large Plexiglas rectangles that hang from the ceiling. They are also available online to network users, the data time-stamped and saved for up to 12 hours.

Red splotches on one map show the highest concentration of wireless users on campus. On another map, yellow dots with names written above them identify individual users, who pop up in different places depending where they're logged in.

"With these maps, you can see down to the room on campus how many people are logged on," said Carlo Ratti, director of the school's SENSEable City Laboratory, which created the maps. "You can even watch someone go from room to room if they have a handheld device that's connected."

Researchers use log files from the university's Internet service provider to construct the maps. The files indicate the number of users connected to each of MIT's more than 2,800 access points. The map that can pinpoint locations in rooms is 3-D, so researchers can even distinguish connectivity in multistoried buildings.

"Laptops and Wi-Fi are creating a revolutionary change in the way people work," Ratti said. The maps aim to "visualize these changes by monitoring the traffic on the wireless network and showing how people move around campus."

Some of the results so far aren't terribly surprising for students at the vanguard of tech innovation.

The maps show, for example, that the bulk of wireless users late at night and very early in the morning are logged on from their dorms. During the day, the higher concentration of users shifts to classrooms.

But researchers also found that study labs that once bustled with students are now nearly empty as people, no longer tethered to a phone line or network cable, move to cafes and nearby lounges, where food and comfy chairs are more inviting.

Researchers say this data can be used to better understand how wireless technology is changing campus life, and what that means for planning spaces and administering services.

The question has become, Ratti said, "If I can work anywhere, where do I want to work?"

"Many cities, including Philadelphia, are planning to go wireless. Something like our study will help them understand usage patterns and where best to invest," said researcher Andres Sevtsuk.

Sevtsuk likened the mapping project to a real-time census.

"Instead of waiting every year or every 10 years for data, you have new information every 15 minutes or so about the population of the campus," he said.

While every device connected to the campus network via Wi-Fi is visible on the constantly refreshed electronic maps, the identity of the users is confidential unless they volunteer to make it public.

Those students, faculty and staff who opt in are essentially agreeing to let others track them.

"This raises some serious privacy issues," Ratti said. "But where better than to work these concerns out but on a research campus?"

Rich Pell, a 21-year-old electrical engineering senior from Spartanburg, South Carolina, was less than enthusiastic about the new system's potential for people monitoring. He predicted not many fellow students would opt into that.

"I wouldn't want all my friends and professors tracking me all the time. I like my privacy," he said.

"I can't think of anyone who would think that's a good idea. Everyone wants to be out of contact now and then."

MIT talk: The subjective nature of straight lines: shortest paths for mobile robots

Speaker: Matthew T. Mason, Director, Robotics Institute, CMU
Date: Tuesday, November 8 2005

One way to define a straight line for a mobile robot is to put a bound on the robot's velocity, and then solve for the time-optimal paths using Pontryagin's maximum principle. Different types of mobile robots yield different solutions, corresponding to different notions of straight lines and distance. The resulting robot-specific metrics are useful for motion planning.
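
As a toy illustration of how a velocity bound turns into a robot-specific metric, the sketch below compares two hypothetical planar robots that are not taken from the talk: one whose speed is bounded in magnitude, and a gantry-style one whose x and y velocities are bounded independently. For these two simple cases the minimum travel time is well known and needs no maximum-principle machinery; the more interesting robots in the talk (with heading and wheel constraints) do.

```python
import math


def time_speed_bounded(dx, dy, v_max=1.0):
    """Omnidirectional robot with |velocity| <= v_max.

    The fastest path is the ordinary straight line, so the travel-time
    metric is the Euclidean distance divided by v_max.
    """
    return math.hypot(dx, dy) / v_max


def time_axis_bounded(dx, dy, v_max=1.0):
    """Gantry-style robot with |vx| <= v_max and |vy| <= v_max independently.

    Both axes can run at full speed at once, so the travel time is set by
    the longer of the two axis displacements (a max-norm metric).
    """
    return max(abs(dx), abs(dy)) / v_max


if __name__ == "__main__":
    # Same displacement, two different notions of "distance".
    print(time_speed_bounded(3.0, 4.0))  # 5.0
    print(time_axis_bounded(3.0, 4.0))   # 4.0
```

For the gantry robot, every path that keeps the slower axis within its bound while the longer axis runs at full speed is equally fast, so its "straight line" between two points is not even unique.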

Friday, November 04, 2005

CMU talk: Who Am I If a Robot Can Do My Job?

"Who Am I If a Robot Can Do My Job?
Identity’s Impact on Pre-Implementation Sensemaking and Subsequent Use of New Technology"

Pamela Hinds
November 09

This talk will focus on research that I’ve been doing with Rosanne Siino and others on how people make sense of robots in the work environment. Based on an ethnographic study of the introduction of an autonomous mobile robot into a community hospital, we argue that sensemaking begins prior to the implementation of new technology, as actors learn about and prepare for the arrival of a technology. Using data collected during a community hospital’s pre-implementation of an autonomous mobile robot, we propose that the sensemaking process triggered by a technology’s anticipated introduction into an organization can commit people to certain understandings of the technology that impact its subsequent use. During the pre-implementation phase, individuals make sense of the technology by drawing on cognitive frames related to self- and organizational identities. Individuals take public actions during sensemaking, subsequently justifying those actions, with justifications leading to the actions’ repetition - a cycle that lays the seeds for the reinforcement, transformation and creation of structures. I will discuss the implications of this process for technology design, adoption and use within organizations.

Pamela J. Hinds is an Associate Professor with the Center on Work, Technology, & Organization in the Department of Management Science & Engineering, Stanford University. She conducts research on the effects of technology on groups. Much of her research has focused on the dynamics of geographically distributed work teams, particularly those spanning national boundaries. Most recently, Pamela has been conducting research on professional service robots in the work environment, examining how people make sense of them and how they affect work practices. She serves on the editorial board of Organization Science and is co-editor with Sara Kiesler of the book Distributed Work (MIT Press). Her research has appeared in journals such as Organization Science, Research in Organizational Behavior, Human-Computer Interaction, Journal of Applied Psychology, Journal of Experimental Psychology: Applied, and Organizational Behavior and Human Decision Processes.

Thursday, November 03, 2005

MIT talk: Representations and Algorithms for Monitoring Dynamic Systems

Speaker: Avi Pfeffer, Harvard University
Date: Thursday, November 3 2005

Continually monitoring the state of a dynamic system is an important problem for artificial intelligence. Dynamic Bayesian networks (DBNs) provide for compact representation of probabilistic dynamic models. However the monitoring task is extremely difficult even for well-factored DBNs. Therefore approximate monitoring algorithms are needed. One family of approximate monitoring algorithms is based on the idea of factoring the joint distribution over the state of the system into a product of distributions over factors consisting of subsets of variables. Factoring relies on the notion of weak interaction between subsystems. We identify a new notion of weak interaction called separability, and show that it leads to the property that, in order to compute the factor distributions at one point in time, only the factored distributions at the previous time point are needed. We also define an approximate form of separability. We show that separability and approximate separability lead to very good approximations for the monitoring task.
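
The basic factoring idea (not the separability condition itself, which the abstract only names) can be sketched in a few lines: replace a joint distribution over two state variables by the product of its marginals and look at what is lost. The two-variable joint below is a made-up example.

```python
def marginals(joint):
    """joint maps (a, b) -> probability; return the marginals over A and B."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return pa, pb


def product_of_marginals(pa, pb):
    """Factored approximation: treat A and B as if they were independent."""
    return {(a, b): pa[a] * pb[b] for a in pa for b in pb}


if __name__ == "__main__":
    # Hypothetical joint in which A and B tend to agree.
    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
    pa, pb = marginals(joint)
    approx = product_of_marginals(pa, pb)
    for state in sorted(joint):
        print(state, joint[state], approx[state])
```

The factored form stores one small table per factor instead of one table over the whole state space, at the price of dropping the correlation between factors. How much that costs depends on how weakly the subsystems interact, which is the question the talk addresses.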

Unfortunately, sometimes the factoring approach is computationally infeasible. An alternative approach to approximate monitoring is particle filtering (PF), in which the joint distribution over the state of the system is approximated by a set of samples, or particles. In high dimensional spaces, the variance of PF is high and too many particles are required to provide good performance. We improve the performance of PF by introducing factoring, maintaining particles over factors instead of the global state space. This has the effect of reducing the variance of PF and so reducing its error. Maintaining factored particles also allows us to improve PF by looking ahead to future evidence before deciding which particles to propagate, thus leading to much better accuracy.
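
For readers who have not seen particle filtering before, here is a minimal sketch of the plain (unfactored) bootstrap version on a one-dimensional toy problem. The random-walk motion model, the Gaussian observation model, and all parameter values are assumptions chosen for illustration; the factored particles and look-ahead described in the talk are not implemented here.

```python
import math
import random


def particle_filter_step(particles, observation, rng,
                         process_std=1.0, obs_std=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: push every particle through the (assumed) random-walk model.
    moved = [x + rng.gauss(0.0, process_std) for x in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - x) / obs_std) ** 2)
               for x in moved]
    if sum(weights) == 0.0:
        # All particles are far from the observation; keep them unweighted.
        return moved
    # Resample: draw a new population in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def estimate(particles):
    """Posterior mean approximated by the sample mean of the particles."""
    return sum(particles) / len(particles)


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
    for obs in [2.0] * 10:  # hypothetical sensor readings
        particles = particle_filter_step(particles, obs, rng)
    print(estimate(particles))
```

With a one-dimensional state a few hundred particles are plenty. The variance problem the abstract mentions appears when the state has many dimensions and the same number of particles has to cover a much larger space.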

Monday, October 31, 2005

My talk this Wednesday


Hybrid Simultaneous Localization and Map Building:
A Natural Integration of Topological and Metric


  • Introduction
  • Environment Modeling
  • Localization and Map Building
  • Experimental Results
  • Conclusions and Outlook

Saturday, October 29, 2005

News: Future smart cars could help to cut accidents

Using peppermint, lavender, citrus scents, vibrating seat belts

Updated: 7:25 a.m. ET Sept. 6, 2005

DUBLIN - Whether it is wafting lavender or citrus scents to calm drivers and keep them awake, or vibrating seat belts to get them to slow down, smart cars in the future could help reduce road accidents.


News: Robot dog: Man's best friend or diet nag?

Updated: 9:07 p.m. ET Aug. 31, 2005

MIT researchers plan to recruit Aibo into the obesity police

LONDON - It could be a dream or a nightmare -- scientists have created a robotic dog that tells you when it's time for your daily walk.


CMU ML lunch: Patient-Specific Predictive Modeling

Speaker: Shyam Visweswaran, University of Pittsburgh
Date: October 31


We investigated two patient-specific and four population-wide machine learning methods for predicting dire outcomes in community acquired pneumonia (CAP) patients. Predicting dire outcomes in CAP patients can significantly influence the decision about whether to admit the patient to the hospital or to treat the patient at home. Population-wide methods induce models that are trained to perform well on average on all future cases. In contrast, patient-specific methods specifically induce a model for a particular patient case. We trained the models on a set of 1601 patient cases and evaluated them on a separate set of 686 cases. One patient-specific method performed better than the population-wide methods when evaluated within a clinically relevant range of the ROC curve. Our study provides support for patient-specific methods being a promising approach for making clinical predictions.
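
Evaluating "within a range of the ROC curve" usually means computing a partial area under the curve. The sketch below shows one way to do that with the trapezoid rule. The abstract does not say which range or which exact measure the authors used, so the range limits and the tiny score/label lists here are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, sweeping the threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last case in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def partial_auc(points, fpr_lo, fpr_hi):
    """Trapezoidal area under the ROC curve for fpr_lo <= FPR <= fpr_hi."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        lo, hi = max(x0, fpr_lo), min(x1, fpr_hi)
        if x1 == x0 or hi <= lo:
            continue  # vertical segment or outside the requested range
        slope = (y1 - y0) / (x1 - x0)
        y_lo = y0 + slope * (lo - x0)
        y_hi = y0 + slope * (hi - x0)
        area += 0.5 * (y_lo + y_hi) * (hi - lo)
    return area


if __name__ == "__main__":
    scores = [0.9, 0.7, 0.6, 0.4]  # hypothetical predicted risks
    labels = [1, 0, 1, 0]          # 1 = dire outcome occurred
    pts = roc_points(scores, labels)
    print(partial_auc(pts, 0.0, 1.0))  # full AUC: 0.75
    print(partial_auc(pts, 0.0, 0.5))  # area restricted to FPR <= 0.5: 0.25
```

Two models with the same full AUC can differ a lot on the restricted range, which is why a comparison limited to the clinically relevant part of the curve can rank methods differently from a comparison over the whole curve.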

Latest News from (October 28, 2005)

More NHTSA Collision Avoidance Research Project Results Available
The U.S. National Highway Traffic Safety Administration (NHTSA) has posted four new research reports for download. The reports address the Automotive Collision Avoidance Systems (ACAS) project as well as the Collision Avoidance Metrics Partnership (CAMP).

PReVENT Announces First Results for 3D Camera R&D
The European PReVENT Integrated Project has issued information regarding first results for its 3D CMOS camera research. During the past months, the UseRCams consortium has been working on the first prototype of its 3D CMOS camera, based on the UseRCams general specification deliverable.

PReVENT ProFusion2 Sets Timeframe for Fusion Forum
ProFusion2 works on sensor data fusion (SDF), developing a common SDF framework for automotive active safety applications and carrying out research on environment modeling and data fusion algorithms for object tracking. The Fusion Forum with SDF experts has been established and will organize its first one-day open workshop in March 2006 in Brussels.

PReVENT Participates in MADYMO Passive Safety Meeting
In September, European PReVENT representatives participated in the 5th annual meeting of the MADYMO user community, an event that gathers a large community of experts on passive safety. PReVENT was invited to give a keynote presentation on the integration of passive and active safety. The speech presented by the active safety community attracted a great deal of attention since this field is seen as the next step up in improving passive safety components.

PReVENT ADASIS Forum Holds First Commercial Vehicle Task Force Meeting
The ADASIS Forum held its first Commercial Vehicle Task Force meeting on 21 September in Gothenburg, Sweden. Commercial vehicle makers, ADAS suppliers, map makers and navigation system suppliers met to discuss the future of digital maps for commercial vehicles and their applications.

PReVENT WILLWARN Shares Smarts with Network on Wheels
Within the European PReVENT integrated project, the WILLWARN subproject is sharing its vehicle-vehicle cooperative technology. A joint meeting recently took place between WILLWARN and members of the German Network on Wheels (NOW) project. At the meeting, the WILLWARN consortium agreed upon forming a joint task force for communication issues. The task force, consisting of WILLWARN communication and application experts and NOW representatives, will focus on the integration of the WILLWARN application's communication needs with the NOW project, as well as the use of NOW communication hardware in WILLWARN.

FMCSA Releases Performance Requirements Docs for Safety Systems
Culminating a two-year process, the U.S. Federal Motor Carrier Safety Administration has posted performance requirements for Forward Collision Warning Systems / Adaptive Cruise Control, Lane Departure Warning Systems, and Vehicle Stability Systems.