Thursday, November 30, 2006

Context Aware Computing, Understanding and Responding to Human Intention

[Original Link]

Ted Selker


This talk will demonstrate that Artificial intelligence can competently Improve human interaction with systems and even each other in a myriad of natural scenarios. Humans work to understand and react to each others intentions. The context aware computing group at the MIT Media lab has demonstrated that across most aspects of our life, computers can do this too. The groups demonstrations range from car to office kitchen to and even bed. The goal is to show that human intentions can be recognized considered and responded to appropriately by computer systems. Understanding and acting appropriately to intentions requires more than good sensors, it requires understanding of the value of the input. The context aware demonstrations therefore rely completely on models of what the system can do, what the tasks are that can be performed and what is known about the user . These models of system task and user form a central basis for deciding when and how to respond in a specific situation.

Dr. Ted Selker is an Associate Professor at the MIT Media, the Director of the Context Aware Computing Lab, the MIT director of The Voting Technology Project and the Counter Intelligence/ Design Intelligence special interest group on domestic and product-design of the future. Ted's work strives to demonstrate that peoples intentions can be recognized and respected by the things we design. Context aware computing creates a world in which peoples desires and intentions cause computers to help them. This group is recognized for its creating environments that use sensors and artificial intelligence to create so-called "virtual sensors"; adaptive models of users to create keyboard less computer scenarios. Ted's Design Intelligence work has used technology rich platforms such as kitchens to examine intention based design., Ted's work is also applied to developing and testing user experience technology and security architectures for recording and voter intentions securely and accurately.

Prior to joining MIT faculty in November 1999, Ted was an IBM fellow and directed the User Systems Ergonomics Research lab. He has served as a consulting professor at Stanford University, taught at Hampshire, University of Massachusetts at Amherst and Brown Universities and worked at Xerox PARC and Atari Research Labs.

Ted's research has contributed to products ranging from notebook computers to operating systems. His work takes the form of prototype concept products supported by cognitive science research. He is known for the design of the TrackPoint in-keyboard pointing device found in many notebook computers, and many other innovations at IBM. Ted's technologies are often featured in national and international news media.

Ted is work has resulted in award winning products, numerous patents, papers and is often featured by the press. And was co recipient of computer science policy leader awarded for Scientific American 50 in 2004 and the American Association For People with Disabilities Thomas Paine Award for his work on voting technology.

Wednesday, November 29, 2006

Lab meeting 1 Dec, 2006 (ZhenYu):Detecting Social Interaction of Elderly in a Nursing Home Environment

Datong Chen, Jie Yang, Robert Malkin, and Howard D. Wactlar

Social interaction plays an important role in our daily lives. It is one of the most important indicators of physical or mental changes in aging patients. In this paper, we investigate the problem of detecting social interaction patterns of patients in a skilled nursing facility. Our studies consist of both a “wizard of Oz” study and an experimental study of various sensors and detection models for detecting and summarizing social interactions among aging patients and caregivers. We first simulate plausible sensors using human labeling on top of audio and visual data collected from a skilled nursing facility. The most useful sensors and robust detection models are determined using the simulated sensors. We then present the implementation of some real sensors based on video and audio analysis techniques and evaluate the performance of these implementations in detecting interaction. We conclude the paper with discussions and future work.


Lab meeting 1 Dec, 2006 (Nelson): Randam Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography

Martin A. Fischler and Robert C. Bolles
SRI International

Communication of the ACM
June 1981 Volume 24 Number 6



A new paradigm, Random Sample Consensus
(RANSAC), for fitting a model to experimental data is
introduced. RANSAC is capable of interpreting/
smoothing data containing a significant percentage of
gross errors, and is thus ideally suited for applications
in automated image analysis where interpretation is
based on the data provided by error-prone feature
detectors. A major portion of this paper describes the
application of RANSAC to the Location Determination
Problem (LDP): Given an image depicting a set of
landmarks with known locations, determine that point
in space from which the image was obtained. In
response to a RANSAC requirement, new results are
derived on the minimum number of landmarks needed
to obtain a solution, and algorithms are presented for
computing these minimum-landmark solutions in closed
form. These results provide the basis for an automatic
system that can solve the LDP under difficult viewing

Monday, November 27, 2006

News: Scientists Try to Make Robots More Human

November 22, 2006

George the robot is playing hide-and-seek with scientist Alan Schultz. What's so impressive about robots playing children's games? For a robot to actually find a place to hide, and then hunt for its human playmate is a new level of human interaction. The machine must take cues from people and behave accordingly.

This is the beginning of a real robot revolution: giving robots some humanity.

"Robots in the human environment, to me that's the final frontier," said Cynthia Breazeal, robotic life group director at the Massachusetts Institute of Technology. "The human environment is as complex as it gets; it pushes the envelope."

Robotics is moving from software and gears operating remotely - Mars, the bottom of the ocean or assembly lines - to finally working with, beside and even on people.

"Robots have to understand people as people," Breazeal said. "Right now, the average robot understands people like a chair: It's something to go around."

See the full article.

Wednesday, November 22, 2006

News: Pleo, the sensitive robot

Behold the majesty of Pleo. It's a robot, coming out in the second quarter of 2007, that exhibits emotional reactions to its surroundings. So far, companion robots have been a big flop in the market, but Pleo maker Ugobe hopes to succeed by pricing it for around $250, lower than other companion robots, and giving consumers ways to program it.

Don't be fooled by the eyes; the vision system is in its nose.

When it's in a good mood, the Pleo wags it tail back and forth. It also makes a sort of mooing sound, which is appropriate given that the robot is modeled after the Camarasaurus, a cow-like dinosaur that roamed the Americas in the Jurassic period. A paleontologist helped Ugobe come up with the design.

There's a lot going on under the Pleo's rubbery skin. The robot contains six microprocessors and more than 150 gears. It also has a memory card slot where the sun don't shine.

Credit: Michael Kanellos/CNET

See the related link & video.

Sunday, November 19, 2006

News: Robot Senses Damage, Learns to Walk Again

November 17, 2006

—It may look like a metallic starfish, but scientists say this robot might have more in common with a newborn human.

The four-legged machine is a prototype "resilient robot" with the ability to detect damage to itself and alter its walking style in response.

Josh Bongard, an assistant professor of computer science at the University of Vermont in Burlington, and his colleagues created the robot as part of a NASA pilot project working on technology for the next generation of planetary rovers.

While people and animals can easily compensate for injuries, even a small amount of damage can ground NASA machinery entirely.

See the full article.

Friday, November 17, 2006

CMU ML talk: Machine Learning and Human Learning

Speaker: Prof. Tom Mitchell, CMU
Date: November 20
Time: 12:00 noon

For schedules, links to papers et al, please see the web page:


For the past 30 years, researchers studying machine learning and researchers studying human learning have proceeded pretty much independently. We now know enough about both fields that it is time to re-ask the question: "How can studies of human learning and studies of machine learning inform one another?" This talk will address this question by briefly covering some of the key facts we now understand about both machine learning and human learning, then examining in some detail several specific types of machine learning which may provide surprisingly helpful models for understanding aspects of human learning (e.g., reinforcement learning, cotraining).

Lab meeting 17 Nov, 2006 (Jim): Real-Time Simultaneous Localisation and Mapping with a Single Camera

Real-Time Simultaneous Localisation and Mapping with a Single Camera
Andrew J. Davison, ICCV 2003

PDF, homepage

      Ego-motion estimation for an agile single camera moving through general, unknown scenes becomes a much more challenging problem when real-time performance is required rather than under the off-line processing conditions under which most successful structure from motion work
has been achieved. This task of estimating camera motion from measurements of a continuously expanding set of self-mapped visual features is one of a class of problems known as Simultaneous Localisation and Mapping (SLAM) in the robotics community, and we argue that such real-time mapping research, despite rarely being camera-based, is more relevant here than off-line structure from motion methods due to the more fundamental emphasis placed on propagation of uncertainty.
      We present a top-down Bayesian framework for single-camera localisation via mapping of a sparse set of natural features using motion modelling and an information-guided active measurement strategy, in particular addressing the difficult issue of real-time feature initialisation via a factored sampling approach. Real-time handling of uncertainty permits robust localisation via the creating and active measurement of a sparse map of landmarks such that regions can be re-visited after periods of neglect and localisation can continue through periods when few features are visible. Results are presented of real-time localisation for a hand-waved camera with very sparse prior scene knowledge and all processing carried out on a desktop PC.

Jaron Lanier forecasts the future

* 16 November 2006
* news service
* Jaron Lanier

In the next 50 years, computer science needs to achieve a new unification between the inside of the computer and the outside. The inside is still governed by the mid-20th century approach – that is, every program must have defined inputs and outputs in order to function.

The outside, however, encounters the real world and must analyse data statistically. Robots must assess a terrain in order navigate it. Language translation programs must make guesses in order to function. Because the interface to the outside world involved approximation, it is also capable of adjusting itself to improve the quality of approximation. But the inside of a computer must adhere to protocols to function at all, and therefore cannot evolve automatically.

See the full article.

Eric Horvitz forecasts the future

* 8 November 2006
* news service
* Eric Horvitz

Computation is the fire in our modern-day caves. By 2056, the computational revolution will be recognised as a transformation as significant as the industrial revolution. The evolution and widespread diffusion of computation and its analytical fruits will have major impacts on socioeconomics, science and culture.

Within 50 years, lives will be significantly enhanced by automated reasoning systems that people will perceive as "intelligent". Although many of these systems will be deployed behind the scenes, others will be in the foreground, serving in an elegant, often collaborative manner to help people do their jobs, to learn and teach, to reflect and remember, to plan and decide, and to create. Translation and interpretation systems will catalyse unprecedented understanding and cooperation between people. At death, people will often leave behind rich computational artefacts that include memories, reflections and life histories, accessible for all time.

Robotic scientists will serve as companions in discovery by formulating theories and pursuing their confirmation.

See the full article.

Rodney Brooks forecasts the future

* 18 November 2006
* news service
* Rodney Brooks

Show a two-year-old child a key, a shoe, a cup, a book or any of hundreds of other objects, and they can reliably name its class - even when they have never before seen something that looks exactly like that particular key, shoe, cup or book. Our computers and robots still cannot do this task with any reliability. We have been working on this problem for a while. Forty years ago the Artificial Intelligence Laboratory at the Massachusetts Institute of Technology appointed an undergraduate to solve it over the summer. He failed, and I failed on the same problem in my 1981 PhD.

In the next 50 years we can solve the generic object recognition problem. We are no longer limited by lack of computer power, but we are limited by a natural risk aversion to a problem on which many people have foundered in the past few decades.

See the full article.

Thursday, November 16, 2006

Lab meeitng 17 Nov., 2006 (Any): World modeling for an autonomous mobile robot using heterogenous sensor information

Title: World modeling for an autonomous mobile robot using heterogenous sensor information

Local Copy: [Here]
Related Link: [Here]

Author: Klaus-Werner Jorg
From: Robotics and Autonomous Systems, 1995

An Autonomous Mobile Robot (AMR) has to show both goal-oriented behavior and reflexive behavior in order to be considered fully autonomous. In a classical, hierarchical control architecture these behaviors are realized by using several abstraction levels while their individual informational needs are satisfied by associated world models. The focus of this paper is to describe an approach which utilizes heterogenous information provided by a laser-radar and a set of sonar sensors in order to achieve reliable and complete world models for both real-time collision avoidance and local path planning. The approach was tested using MOBOT-IV, which serves as a test-platform within the scope of a research project on autonomous mobile robots for indoor applications. Thus, the experimental results presented here are based on real data.

Lab meeitng 17 Nov., 2006 (Stanley): Policies Based on Trajectory Libraries

Martin Stolle

We present a control approach that uses a library of trajectories to establish a global control law or policy. This is an alternative to methods for finding global policies based on value functions using dynamic programming and also to using plans based on a single desired trajectory. Our method has the advantage of providing reasonable policies much faster than dynamic programming can provide an initial policy. It also has the advantage of providing more robust and global policies than following a single desired trajectory. Trajectory libraries can be created for robots with many more degrees of freedom than what dynamic programming can be applied to as well as for robots with dynamic model discontinuities. Results are shown for the “Labyrinth” marble maze, both in simulation as well as a real world version. The marble maze is a difficult task which requires both fast control as well as planning ahead.

ICRA 2006
Thesis proposal

Sunday, November 12, 2006

News: Google and Microsoft aim to give you a 3-D world

By Brad Stone

Nov. 20, 2006 issue - The sky over San Francisco is cerulean blue as you begin your descent into the city from 2,000 feet. As you pass over the southern hills, the skyline of the Financial District rises into view. On the descent into downtown, familiar skyscrapers form an urban canyon around you; you can even see the trolley tracks running down the valley formed by Market Street. But then a little pop-up box next to the Bay Bridge explains that an accident has just occurred on the west-ern span, and a thick red line indicates the resulting traffic jam along the highway. A banner ad for Emeryville, Calif., firm ZipRealty hangs incongruously in the air over the Transamerica Pyramid. You are actually staring at your PC screen, not out an airplane window.

Virtual Earth 3D, the online service unveiled last week by Microsoft, is both incomplete (only 15 cities are depicted in 3-D) and imperfect (some of the buildings are shrouded in shadow, and you need a powerful PC running Windows XP or the new Vista to use it). But it is also the start of something potentially big: the 3-D Web. Traditional Web pages give us text, photos and video, unattached to real-world context. Now interactive mapping programs like Google Earth let us zoom around the globe on our PCs and peer down at the topography captured by satellites and aerial photographers. Both Google Earth and Microsoft's Virtual Earth are hugely popular and have been downloaded more than 100 million times each.

See the full article.

[Folks, we can build 4D maps of the real world in a much faster and more reliable way, right? -Bob]

Thursday, November 09, 2006

Lab Meeting 10 Nov talk (Casey): Object Class Recognition Using Multiple Layer Boosting with Heterogeneous Features

Title: Object Class Recognition Using Multiple Layer Boosting with Heterogeneous Features
Authors: Wei Zhang, Bing Yu, Gregory J. Zelinsky, Dimitris Samaras
(CVPR 2005)

We combine local texture features (PCA-SIFT), global features (shape context), and spatial features within a single
multi-layer AdaBoost model of object class recognition.
The first layer selects PCA-SIFT and shape context features and combines the two feature types to form a strong classifier.
Although previous approaches have used either feature type to train an AdaBoost model, our approach is the first to combine these complementary sources of information into a single feature pool and to use Adaboost to select
those features most important for class recognition.
The second layer adds to these local and global descriptions information about the spatial relationships between features.
Through comparisons to the training sample, we first find the most prominent local features in Layer 1, then capture
the spatial relationships between these features in Layer 2.
Rather than discarding this spatial information, we therefore use it to improve the strength of our classifier. We compared our method to [4, 12, 13] and in all cases our approach outperformed these previous methods using a popular
benchmark for object class recognition [4].
ROC equal error rates approached 99%. We also tested our method using a dataset of images that better equates the complexity
between object and non-object images, and again found that our approach outperforms previous methods.

Download: [link]

Robot learns to grasp everyday chores- BRIAN D. LEE

Original link

From left, graduate students Ashutosh Saxena and Morgan Quigley and Assistant Professor Andrew Ng were part of a large effort to develop a robot to see an unfamiliar object and ascertain the best spot to grasp it.

Stanford scientists plan to make a robot capable of performing everyday tasks, such as unloading the dishwasher. By programming the robot with "intelligent" software that enables it to pick up objects it has never seen before, the scientists are one step closer to creating a real life Rosie, the robot maid from The Jetsons cartoon show.

"Within a decade we hope to develop the technology that will make it useful to put a robot in every home and office," said Andrew Ng, an assistant professor of computer science who is leading the wireless Stanford Artificial Intelligence Robot (STAIR) project.

"Imagine you are having a dinner party at home and having your robot come in and tidy up your living room, finding the cups that your guests left behind your couch, picking up and putting away your trash and loading the dishwasher," Ng said.

Cleaning up a living room after a party is just one of four challenges the project has set out to have a robot tackle. The other three include fetching a person or object from an office upon verbal request, showing guests around a dynamic environment and assembling an IKEA bookshelf using multiple tools.

Developing a single robot that can solve all these problems takes a small army of about 30 students and 10 computer science professors—Gary Bradski, Dan Jurafsky, Oussama Khatib, Daphne Koller, Jean-Claude Latombe, Chris Manning, Ng, Nils Nilsson, Kenneth Salisbury and Sebastian Thrun.

From Shakey to Stanley and beyond

Stanford has a history of leading the field of artificial intelligence. In 1966, scientists at the Stanford Research Institute built Shakey, the first robot to combine problem solving, movement and perception. Flakey, a robot that could wander independently, followed. In 2005, Stanford engineers won the Defense Advanced Research Projects Agency (DARPA) Grand Challenge with Stanley, a robot Volkswagen that autonomously drove 132 miles through a desert course.

The ultimate aim for artificial intelligence is to build a robot that can create and execute plans to achieve a goal. "The last serious attempt to do something like this was in 1966 with the Shakey project led by Nils Nilsson," Ng said. "This is a project in Shakey's tradition, done with 2006 technology instead of 1966 AI technology."

To succeed, the scientists will need to unite fragmented research areas of artificial intelligence including speech processing, navigation, manipulation, planning, reasoning, machine learning and vision. "There are these disparate AI technologies and we'll bring them all together in one project," Ng said.

The true problem remains in making a robot independent. Industrial robots can follow precise scripts to the point of balancing a spinning top on a blade, he said, but the problem comes when a robot is requested to perform a new task. "Balancing a spinning top on the edge of a sword is a solved problem, but picking up an unfamiliar cup is an unsolved problem," Ng explained.

His team recently designed an algorithm that allowed STAIR to recognize familiar features in different objects and select the right grasp to pick them up. The robot was trained in a computer-generated environment to pick up five items—a cup, pencil, brick, book and martini glass. The algorithm locates the best place for the robot to grasp an object, such as a cup's handle or a pencil's midpoint. "The robot takes a few pictures, reasons about the 3-D shape of the object, based upon computing the location, and reaches out and grasps the object," Ng said.

In tests with real objects, the robotic arm picked up items similar to those with which it trained, such as cups and books, as well as unfamiliar objects including keys, screwdrivers and rolls of duct tape. To grasp a roll of duct tape, the robot employs an algorithm that evaluates the image against all prior strategies. "The roll of duct tape looks a little like a cup handle and also a little bit like a book," Ng said. The program formulates the best location to clutch based on a combination of all the robot's prior experiences and tells the arm where to go. "It would be a hybrid, or a combination of all the different grasping strategies that it has learned before," Ng said.

The word "robot" originates from a Slavic word meaning "toil," and robots may soon reduce the amount of drudgery in our daily lives. "I think if we can have a robot intelligent enough to do these things, that will free up vast amounts of human time and enable us to go to higher goals," Ng said.

Funding for the project has come from the National Science Foundation, DARPA and industrial technology companies Intel, Honda, Ricoh and Google.

Brian D. Lee is a science writing intern with the Stanford News Service.

[FRC seminar] Learning Robot Control Policies Using the Critique of Teacher

Brenna Argall
Ph.D. Candidate
Robotics Institute

Motion control policies are a necessary component of task execution on mobile robots. Their development, however, is often a tedious and exacting procedure for a human programmer. An alternative is to teach by demonstration; to have the robot extract its policy from the example executions of a teacher. With such an approach, however, most of the learning burden is typically placed with the robot. In this talk we present an algorithm in which the teacher augments the robot's learning with a performance critique, thus shouldering some of the learning burden. The teacher interacts with the system in two phases: first by providing demonstrations for training, and second by offering a critique on learner performance. We present an application of this algorithm in simulation, along with preliminary implementation on a real robot system. Our results show improved performance with teacher critiquing, where performance is measured by both execution success and efficiency.

Speaker Bio:
Brenna is currently a third year Ph.D. candidate in the Robotics Institute at Carnegie Mellon University, affiliated with the CORAL Research Group. Her research interests lie with robot autonomy and heterogeneous team coordination, and how machine learning may be used to build control policies which accomplish these tasks. Prior to joining the Robotics Institute, Brenna investigated functional MRI brain imaging in the Laboratory of Brain and Cognition at the National Institutes of Health. She received her B.S. in Mathematics from Carnegie Mellon in 2002, along with minors in Music and Biology.

Monday, November 06, 2006

NEWS: Robot guide makes appearance at Fukushima hospital

A robot with the ability to recognize speech and show hospital visitors the way to consultation rooms and wards has made its debut at Aizu Central Hospital in Aizuwakamatsu.

When the robot is asked to pinpoint a location, it projects a three-dimensional image showing the route to the destination through a projector on its head, and prints out a map from a built-in printer, which it hands to the visitor.

November 5, 2006

See the full article

[CMU VASC Seminar]Real Time 3D Surface Imaging and Tracking for Radiation Therapy

VASC Seminar Series

Speaker: Maud Poissonnier, Vision RT

Time: Thursday, 11/9

Title: Real Time 3D Surface Imaging and Tracking for Radiation Therapy


Radiation Therapy involves the precise delivery of high energy X-rays to
tumour tissue in order to treat cancer. The current challenge is to ensure
that the radiation is delivered to the correct target location, thus
reducing the volume of normal tissue irradiated and potentially enabling
the escalation of dose. This may be achieved through the exploitation of a
combination of imaging technologies.

Vision RT has developed a 3D surface imaging technology which can image
the entire 3D surface of a patient quickly and accurately. This relies on
close range stereo photogrammetric techniques using pairs of stereo
cameras. Registration algorithms are employed to match surface data
acquired from the same patient in different positions. High speed tracking
techniques have also been developed to allow tracking of regions at speeds
of approximately 20 fps.

AlignRT(r) is Vision RT's patient setup and surveillance system which is
now in use at a variety of clinics. The system acquires 3D surface data
during simulation or imports reference contours from diagnostic Computed
Tomography (CT). It then images the patient prior to treatment and
computes any 3D movement required to correct the patient's position. The
system is also able to monitor any patient movement during treatment. We
will finish by presenting work-in-progress systems which allow real time
tracking of breathing motion to facilitate 4D CT reconstruction and
respiratory gated radiotherapy.


Dr. Maud Poissonnier was educated in France until she decided to
explore Great Britain in 1995. She first graduated at Heriot-Watt
University in Edinburgh with an MSc in Reservoir Evaluation and Management
(Petroleum Engineering). She then joined the Medical Vision Lab (Robotics
Research Group) at the University of Oxford. She obtained her EPSRC-funded
DPhil in 2003 under the supervision of Sir Prof. Mike Brady in the area of
x-ray mammography image processing using physics-based modelling. After a
post-doctoral research on a multimodality, Grid enabled platform for
tele-mammography, she moved very slightly East (to London) and joined
Vision RT Ltd in April 2005 where she is presently Senior Software

Sunday, November 05, 2006

[NEWS]Walking Partner Robot helps old ladies cross the street

Nomura Unison Co., Ltd. has developed a walking-assistance robot that perceives and responds to its environment. The machine, called Walking Partner Robot, was developed with the cooperation of researchers from Tohoku University. It will be unveiled to the general public at the 2006 Suwa Area Industrial Messe on October 19 in Suwa, Nagano prefecture.
The robot is equipped with a system of sensors that detect the presence of obstacles, stairs, etc. while monitoring the motion and behavior of the user. Three sensors monitor the status of the user while detecting and measuring the distance to potential obstacles, and two angle sensors measure the slope of the path in front of the machine. The robot responds to these measurements with voice warnings and by automatically applying brakes when necessary.
Walking Partner Robot is essentially a high-tech walker designed to support users as they walk upright, preventing them from falling over. The user grasps a set of handles while pushing the unmotorized 4-wheeled robot, which measures 110 (H) x 70 (W) x 80 (D) cm and weighs 70 kilograms (154 lbs).
Walking Partner Robot is the second creation from the team responsible for the Partner Ballroom Dance Robot, which includes Tohoku University robotics researchers Kazuhiro Kosuge and Yasuhisa Hirata. The goal was to apply the Partner Ballroom Dance Robot technology, which perceives the intended movement and force of human footsteps, to a robot that can play a role in the realm of daily life. The result is a machine that can perceive its surroundings and provide walking assistance to the elderly and physically disabled.
The developers, who also see potential medical rehabilitation applications, aim to develop indoor and outdoor models of the robot. The company hopes to make the robot commercially available soon at a price of less than 500,000 yen (US$4,200) per unit.

A Braille Writing Tutor to Combat Illiteracy in Developing Communities

Title: A Braille Writing Tutor to Combat Illiteracy in Developing Communities
Speaker: Nidhi Kalra, Tom Lauwers
Date/Time/Location: Tuesday, November 7th, 2006, 11am, NSH 3305
We present the Braille Writing Tutor project, an initiative sponsored by TechBridgeWorld to combat the high rates of illiteracy among the blind in developing communities using an intelligent tutoring system. Developed in collaboration withe Mathru Educational Trust for the Blind in Bangalore, India, the tutor uses a novel input device to capture students' activity on a slate and stylus and uses a range of scaffolding techniques and Artificial Intelligence to teach writing skills to both beginner and advanced students. We conducted our first field study from August to September 2006 at the Mathru School to evaluate its feasibility and impact in a real educational setting. The tutor was met with great enthusiasm by both the teachers and the students and has already demonstrated a concrete impact on the students' writing abilities. Our study also highlights a number of important areas for future research and development which we invite the community to explore and investigate with us.

For more information and videos:

Speaker Bio:
Nidhi Kalra is a fifth year Ph.D. student at the Robotics Institute. She is keenly interested in applying technology to sustainable development and in understanding related public policy issues. Nidhi hopes to start a career in this field after completing her Ph.D. Her thesis area of research is in multirobot coordination and she is advised by Dr. Anthony Stentz. She has an MS in Robotics from the RI and received her BS in computer science from Cornell University in 2002. Nidhi is a native of India.

Tom Lauwers is a fourth year Ph.D. student at the Robotics Institute. He has a long-standing interest in educational robotics, as both a participant in programs like FIRST and later as a designer of a robotics course and education technology. He is currently studying curriculum development and evaluation and hopes that his study of the educational sciences will help him design better and more useful educational technologies. Tom received a BS in Electrical Engineering and a BS in Public Policy from CMU.

Saturday, November 04, 2006

[The ML Lunch talk ]Boosting Structured Prediction for Imitation Learning

Speaker: Nathan Ratliff, CMU

Title: Boosting Structured Prediction for Imitation Learning

Venue: NSH 1507

Date: November 06

Time: 12:00 noon


The Maximum Margin Planning (MMP) algorithm solves imitation learning
problems by learning linear mappings from features to cost functions in
a planning domain. The learned policy is the result of minimum-cost
planning using these cost functions. These mappings are chosen so that
example policies (or trajectories) given by a teacher appear to be lower
cost (with a loss-scaled margin) than any other policy for a given
planning domain. We provide a novel approach, MMPBoost, based on the
functional gradient descent view of boosting that extends MMP by
``boosting'' in new features. This approach uses simple binary
classification or regression to improve performance of MMP imitation
learning, and naturally extends to the class of structured maximum
margin prediction problems. Our technique is applied to navigation and
planning problems for outdoor mobile robots and robotic legged

In this talk, I will first provide an overview of the MMP approach to
imitation learning, followed by an introduction to our boosting
technique for learning nonlinear cost functions within this framework. I
will finish with a number of experimental results and a sketch of how
stuctured boosting algorithms of this sort can be derived.

Friday, November 03, 2006

[FRC Seminar] Learning-enhanced Market-based Task Allocation for Disaster Response

E. Gil Jones
Ph.D. Candidate
Robotics Institute

This talk will introduce a learning-enhanced market-based task allocation system for disaster response domains. I model the disaster response domain as a team of robots cooperating to extinguish a series of fires that arise due to a disaster. Each fire is associated with a time-decreasing reward for successful mitigation, with the value of the initial reward corresponding to task importance, and the speed of decay of the reward determining the urgency of the task. Deadlines are also associated with each fire, and penalties are assessed if fires are not extinguished by their deadlines. The team of robots aims to maximize summed reward over all emergency tasks, resulting in the lowest overall damage from the series of fires.

In this talk I will first describe my implementation of a baseline market-based approach to task allocation for disaster response. In the baseline approach the allocation respects fire importance and urgency, but agents do a poor job of anticipating future emergencies and are assessed a high number of penalties. I will then describe two regression-based approaches to learning-enhanced task allocation. The first approach, task-based learning, seeks to improve agents' valuations for individual tasks. The second method, schedule-based learning, tries to quantify the
tradeoff between performing a given task or not performing the task and retaining the flexibility to better perform future tasks. Finally, I will compare the performance of the two learning methods and the baseline approach over a variety of parameterizations of the disaster response domain.

Speaker Bio:
Gil is a second year Ph.D. student at the Robotics Institute, and is co-advised by Bernardine Dias and Tony Stentz. His primary interest is market-based multi-robot coordination. He received his BA in Computer Science from Swarthmore College in 2001, and spent two years as a software engineer at Bluefin Robotics - manufacturer of autonomous underwater vehicles - in Cambridge, Mass.

Wednesday, November 01, 2006

Lab meeitng 1 Nov., 2006 (Vincent): Fitting a Single Active Appearance Model Simultaneously to Multiple Images

In this talk, I will present the following 2 papers.

Title :
Fitting a Single Active Appearance Model Simultaneously to Multiple Images

Author :
Changbo Hu, Jing Xiao, Iain Matthews, Simon Baker, Jeff Cohn and Takeo Kanade
The Robotics Institute,CMU

Origin :
BMVC 2004

Abstract :
Active Appearance Models (AAMs) are a well studied 2D deformable model. One recently proposed extension of AAMs to multiple images is the Coupled-View AAM. Coupled-View AAMs model the 2D shape and appearance of a face in two or more views simultaneously. The major limitation of Coupled-View AAMs, however, is that they are specific to a particular set of cameras, both in geometry and the photometric responses. In this paper, we describe how a single AAM can be fit to multiple images, captured simultaneously by cameras with arbitrary geometry and response functions. Our algorithm retains the major benefits of Coupled-View AAMs:~the integration of information from multiple images into a single model, and improved fitting robustness.

Title :
Real-Time Combined 2D+3D Active Appearance Models

Author :
Jing Xiao, Simon Baker, Iain Matthews, and Takeo Kanade
The Robotics Institute,CMU

Origin :
CVPR 2004

Abstract :
Active Appearance Models (AAMs) are generative models commonly used to model faces. Another closely related type of face models are 3D Morphable Models (3DMMs). Although AAMs are 2D, they can still be used to model 3D phenomena such as faces moving across pose. We first study the representational power of AAMs and show that they can model anything a 3DMM can, but possibly require more shape parameters. We quantify the number of additional parameters required and show that 2D AAMs can generate model instances that are not possible with the equivalent 3DMM. We proceed to describe how a non-rigid structure-from-motion algorithm can be used to construct the corresponding 3D shape modes of a 2D AAM. We then show how the 3D modes can be used to constrain the AAM so that it can only generate model instances that can also be generated with the 3D modes. Finally, we propose a real-time algorithm for fitting the AAM while enforcing the constraints, creating what we call a "Combined 2D+3D AAM."

[NEWS]iRobot Unveils New Technology for Simultaneous Control of Multiple Robots

iRobot Corp. today released the first public photo of a new project in development, code named Sentinel. This innovative new networked technology will allow a single operator to simultaneously control and coordinate multiple semi-autonomous robots via a touch-screen computer. Funded by the U.S. Army's Small Business Innovation and Research (SBIR) program, the Sentinel technology includes intelligent navigation capabilities that enable the robots to reach a preset destination independently, overcoming obstacles and other challenges along the way without intervention from an operator.Sentinel’s capability will allow warfighters and first responders to use teams of iRobot® PackBot® robots to conduct surveillance and mapping, therefore rendering dangerous areas safe without ever setting foot in a hostile environment.


Lab meeitng 1 Nov., 2006 (Chihao): Microphone Array Speaker Localizers Using Spatial-Temporal Information

Sharon Gannot and Tsvi Gregory Dvorkind

EURASIP Journal on Applied Signal Processing
Volume 2006

A dual-step approach for speaker localization based on a microphone array is addressed in this paper. In the first stage, which is not the main concern of this paper, the time difference between arrivals of the speech signal at each pair of microphones is estimated. These readings are combined in the second stage to obtain the source location. In this paper, we focus on the second stage of the localization task. In this contribution, we propose to exploit the speaker’s smooth trajectory for improving the current position estimate. Three localization schemes, which use the temporal information, are presented. The first is a recursive form of the Gauss method. The other two are extensions of the Kalman filter to the nonlinear problem at hand, namely, the extended Kalman filter and the unscented Kalman filter. These methods are compared with other algorithms, which do not make use of the temporal information. An extensive experimental study demonstrates the advantage of using the spatial-temporalmethods. To gain some insight on the obtainable performance of the localization algorithm, an approximate analytical evaluation, verified by an experimental study, is conducted. This study shows that in common TDOA-based localization scenarios—where the microphone array has small interelement spread relative to the source position—the elevation and azimuth angles can be accurately estimated, whereas the Cartesian coordinates as well as the range are poorly estimated.