This Blog is maintained by the Robot Perception and Learning lab at CSIE, NTU, Taiwan. Our scientific interests are driven by the desire to build intelligent robots and computers, which are capable of servicing people more efficiently than equivalent manned systems in a wide variety of dynamic and unstructured environments.
Thursday, May 04, 2006
CMU ML Lunch talk: Smoothed Dirichlet distribution: Understanding the Cross-entropy
Date: May 08
Title: Smoothed Dirichlet distribution: Understanding the Cross-entropy ranking function in Information Retrieval
Abstract:
Unigram Language modeling is a successful probabilistic framework for Information Retrieval (IR) that uses the multinomial distribution to model documents and queries. An important feature in this approach is the usage of cross-entropy between the query model and document models as a document ranking function. The Naive Bayes model for text classification uses the same multinomial distribution to model documents but in contrast, employs document-log-likelihood as a scoring function. Curiously, the cross-entropy function roughly corresponds to query-log-likelihood w.r.t. the document models, in some sense an inverse of the scoring function used in the Naive Bayes model. It has been empirically demonstrated that cross entropy is a better performer than document-likelihood, but this interesting phenomenon remains largely unexplained. In this work we investigate the cross-entropy ranking function in IR. In particular, we show that the cross entropy ranking function corresponds to the log-likelihood of documents w.r.t. the approximated Smoothed-Dirichlet (SD) distribution, a novel variant of the Dirichlet distribution. We also empirically demonstrate that this new distribution captures term occurrence patterns in documents much better than the multinomial, thus offering a reason behind the superior performance of the cross entropy ranking function compared to the multinomial document-likelihood.
Our experiments in text classification show that a classifier based on the Smoothed Dirichlet performs significantly better than the multinomial based Naive Bayes model and on par with the SVMs, confirming our reasoning. We also construct a well-motivated classifier for IR based on SD distribution that uses the EM algorithm to learn from pseudo-feedback and show that its performance is equivalent to the Relevance model (RM), a state-of-the-art model for IR in the language modeling framework that also uses cross-entropy as its ranking function. In addition, the SD based classifier provides more flexibility than RM in modeling queries of varying lengths owing to a consistent generative framework. We demonstrate that this flexibility translates into a superior performance compared to RM on the task of topic tracking, an on-line classification task.
Link
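As a rough illustration of the ranking function discussed in the abstract above, here is a minimal sketch that scores documents by the negative cross-entropy between a maximum-likelihood query model and smoothed unigram document models. The Jelinek-Mercer smoothing, the weight `lam`, the toy documents, and all function names are assumptions made for this example; none of them come from the talk, and the Smoothed Dirichlet distribution itself is not implemented here.

```python
import math
from collections import Counter

def unigram_model(tokens):
    """Maximum-likelihood unigram distribution over the given tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def smoothed_prob(word, doc_model, collection_model, lam=0.5):
    """Jelinek-Mercer smoothing (an assumed choice): mix document and collection models."""
    return lam * doc_model.get(word, 0.0) + (1.0 - lam) * collection_model.get(word, 0.0)

def cross_entropy_score(query_tokens, doc_tokens, collection_model, lam=0.5):
    """Negative cross-entropy sum_w P(w|q) * log P(w|d); higher means a better match.

    Every query word must occur somewhere in the collection, otherwise log(0) is hit.
    """
    q_model = unigram_model(query_tokens)
    d_model = unigram_model(doc_tokens)
    return sum(p * math.log(smoothed_prob(w, d_model, collection_model, lam))
               for w, p in q_model.items())

# Hypothetical toy collection, for illustration only.
docs = {
    "d1": "robot mapping with occupancy grid".split(),
    "d2": "language model for text retrieval".split(),
}
collection = unigram_model([w for d in docs.values() for w in d])
query = "text retrieval".split()
ranking = sorted(docs,
                 key=lambda name: cross_entropy_score(query, docs[name], collection),
                 reverse=True)
print(ranking)
```

With a maximum-likelihood query model, this score equals the query log-likelihood under the document model divided by the query length, which is the correspondence the abstract alludes to.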
CMU master's thesis: Data Structure for Efficient Dynamic Processing in 3-D
master's thesis, tech. report CMU-RI-TR-06-22, Robotics Institute, Carnegie Mellon University, May, 2006.
[pdf]
abstract
In this paper, we consider the problem of the dynamic processing of large amounts of sparse three-dimensional data. It is assumed that computations are performed in a neighborhood defined around each point in order to retrieve local properties. This general kind of processing can be applied to a wide variety of applications. We propose a new, efficient data structure and corresponding algorithm that significantly improve the speed of the range search operation and that are suitable for on-line operation, where data is accumulated dynamically. The method relies on taking advantage of overlapping neighborhoods and the reuse of previously computed data as the algorithm scans each data point. To demonstrate the dynamic capabilities of the data structure, we use data obtained from a laser radar mounted on a ground mobile robot operating in complex, outdoor environments. We show that this approach considerably improves the speed of an established 3-D perception processing algorithm.
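The abstract does not describe the proposed data structure itself, so the following is only a generic sketch of the operation being accelerated: a range search over 3-D points that are inserted incrementally. It uses a plain uniform voxel grid, which is an assumption for illustration; it does not implement the reuse of overlapping neighborhoods described in the thesis, and the class and method names are hypothetical.

```python
import math
from collections import defaultdict

class VoxelGrid:
    """Uniform grid over 3-D points with incremental insert and radius queries."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # voxel index -> points stored in that voxel

    def _key(self, p):
        # Integer voxel index of a point.
        return tuple(math.floor(c / self.cell_size) for c in p)

    def insert(self, p):
        """Add one point; cheap enough to call as data arrives on-line."""
        self.cells[self._key(p)].append(p)

    def range_search(self, center, radius):
        """Return all stored points within `radius` of `center`."""
        reach = math.ceil(radius / self.cell_size)
        cx, cy, cz = self._key(center)
        found = []
        # Only voxels that can intersect the query sphere are visited.
        for ix in range(cx - reach, cx + reach + 1):
            for iy in range(cy - reach, cy + reach + 1):
                for iz in range(cz - reach, cz + reach + 1):
                    for p in self.cells.get((ix, iy, iz), ()):
                        if math.dist(p, center) <= radius:
                            found.append(p)
        return found

grid = VoxelGrid(cell_size=1.0)
for point in [(0.1, 0.2, 0.0), (0.9, 0.1, 0.3), (5.0, 5.0, 5.0)]:
    grid.insert(point)
neighbors = grid.range_search((0.0, 0.0, 0.0), radius=1.0)
print(neighbors)
```

A query only touches the voxels around its center, so its cost depends on local point density rather than on the total number of points accumulated so far.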
Wednesday, May 03, 2006
PAL lab meeting 4, May, 2006 (Jim): Self-calibration and metric 3D reconstruction from images
M. Pollefeys, Self-calibration and metric 3D reconstruction from uncalibrated image sequences, Ph.D. Thesis, ESAT-PSI, K.U.Leuven, 1999, Scientific Prize BARCO 1999. (PollefeysPhD.pdf)
Pollefeys' website.
Abstract (PollefeysICCV98.pdf):
In this paper the feasibility of self-calibration in the presence of varying internal camera parameters is under investigation. A self-calibration method is presented which efficiently deals with all kinds of constraints on the internal camera parameters. Within this framework a practical method is proposed which can retrieve metric reconstruction from image sequences obtained with uncalibrated zooming/focusing cameras. The feasibility of the approach is illustrated on real and synthetic examples.
Tuesday, May 02, 2006
CMU VASC seminar : Spectral Rounding: with Applications in Image Segmentation and Clustering
Presenter : David Tolliver
Abstract
I'll discuss a novel family of spectral partitioning methods. Edge separators of a graph are produced by iteratively reweighting the edges until the graph disconnects into the prescribed number of components. At each iteration a small number of eigenvectors with small eigenvalue are computed and used to determine the reweighting. In this way spectral rounding directly produces discrete solutions whereas current spectral algorithms must map the continuous eigenvectors to discrete solutions by employing a heuristic geometric separator (e.g., k-means). We show that spectral rounding compares favorably to current spectral approximations on the Normalized Cut criterion (NCut). Results are given for natural image segmentation, medical image segmentation, and clustering. A simple version is shown to converge. This is joint work with Gary Miller in the Computer Science Department at CMU.
More info about VASC seminar.
Machine Learning List
Machine Learning List: Volume 18, Number 4
Monday, May 1, 2006
************************************************************************
Administrative announcements
New ML List format
Calls for Papers and Participation
ILP 2006
ABiALS Workshop 2006
ICGI 2006
ICML Workshop on Learning in Structured Output Spaces
AAI Special Issue on Applications of Grammatical Inference
ICML Workshop on Surveillance and Event Detection
Symposium on Semantic Web for Collaborative Knowledge Acquisition
ICML Workshop on Transfer Learning
ICML Workshop on Applications of Multiple-Instance Learning
ICML Workshop on Knowledge Discovery from Data Streams
ICML Workshop on Learning with Nonparametric Bayesian Methods
Workshop on Machine Learning in Structural and Systems Biology
ICML Workshop on Kernel Machines and Reinforcement Learning
Human-Competitive Competition at GECCO-2006
Summer School on Neural Networks
Book Announcements
Gaussian Processes for Machine Learning
Career Opportunities
Postdoc job in "Evidence" at UCL
************************************************************************
Date: Mon, 1 May 2006
From: Pat Langley
Subject: New ML List format
In this issue of the Machine Learning List, we introduce a new, briefer format that contains only the essential information about conferences, special issues, and similar events. We hope our readers will find it easier to find items that interest them, and that they can then go to the relevant URL to get additional information. We will continue to include longer announcements about open positions.
Sincerely, -Pat Langley
------------------------------------------------------------------------
Date: Mon, 13 Mar 2006 19:09:42 +0000
From: ILP 2006
Subject: ILP 2006
Call for Papers
16th International Conference on Inductive Logic Programming (ILP 2006) Santiago, Spain
http://ilp06.doc.ic.ac.uk
Submission deadline, short paper: July 24, 2006
Acceptance notification, short paper: August 4, 2006
ILP 2006 Conference: August 24-27, 2006
Acceptance notification, selected papers: August 31, 2006
Submission deadline, full paper: September 30, 2006
Acceptance notification, full paper: November 3, 2006
Camera-ready deadline: December 8, 2006
Publication of conference proceedings: Early 2007
------------------------------------------------------------------------
From: Giovanni Pezzulo
Subject: ABiALS Workshop 2006
Date: Mon, 13 Mar 2006 17:07:55 +0100
Call for Papers
ABiALS Workshop 2006
Anticipatory Behavior in Adaptive Learning Systems
ROME, ITALY
http://www-illigal.ge.uiuc.edu/ABiALS
Submission Deadline: June 15, 2006
ABiALS Workshop 2006: September 30, 2006
------------------------------------------------------------------------
Date: 17 Mar 2006 07:18:02 -0000
From: satoshi@cs.uec.ac.jp
Subject: ICGI 2006
CALL FOR PAPERS
Eighth International Colloquium on Grammatical Inference (ICGI 2006)
http://www.tnlab.ice.uec.ac.jp/icgi06/
The University of Electro-Communications, Chofu, Tokyo 182-8585, JAPAN
Submission deadline: May 20, 2006
Acceptance notification: June 19, 2006
Final version of manuscripts: July 16, 2006
Conference date: September 20-22, 2006
------------------------------------------------------------------------
Date: Mon, 20 Mar 2006 11:46:41 +0100
From: Ulf Brefeld
Subject: ICML Workshop on Learning in Structured Output Spaces
Call for Papers
ICML-2006 Workshop on Learning in Structured Output Spaces
Carnegie Mellon University, Pittsburgh, PA
http://www.informatik.hu-berlin.de/~brefeld/lisos
Submission deadline: April 28, 2006
Acceptance notification: May 19, 2006
Final paper deadline: June 9, 2006
Workshop date: June 29, 2006
------------------------------------------------------------------------
Date: Tue, 21 Mar 2006 20:01:31 +1100
From: Menno van Zaanen
Subject: AAI Special Issue on Applications of Grammatical Inference
Call for Submissions
APPLIED ARTIFICIAL INTELLIGENCE
Special Issue on Applications of Grammatical Inference
http://www.ics.mq.edu.au/~menno/AAI06/
Submission deadline: May 1, 2006
Acceptance notification: October 1, 2006
Final versions of accepted papers: December 1, 2006
Publication: Second half of 2007
------------------------------------------------------------------------
Date: Tue, 21 Mar 2006 13:04:59 -0700
From: Terran Lane
Subject: ICML Workshop on Surveillance and Event Detection
Call for Papers and Contributions
ICML-2006 Workshop on Machine Learning Algorithms for Surveillance and Event Detection
Carnegie Mellon University, Pittsburgh, PA
Submissions deadline: April 28, 2006 (tentative)
Acceptance notification: May 19, 2006 (tentative)
Workshop proceedings posted on Web site: June 18, 2006
Workshop date: June 29, 2006
------------------------------------------------------------------------
Date: Tue, 21 Mar 2006 15:52:15 -0600
From: Vasant Honavar
Subject: Symposium on Semantic Web for Collaborative Knowledge Acquisition
AAAI Fall Symposium
Semantic Web for Collaborative Knowledge Acquisition (SWeCKa 2006)
Arlington, VA
Submission deadline: May 1, 2006
Acceptance notification: May 22, 2006
Camera-ready deadline: June 2, 2006
Symposium date: October 12-15, 2006
------------------------------------------------------------------------
Date: Mon, 27 Mar 2006 16:57:39 -0600 (CST)
From: Bikramjit Banerjee
Subject: ICML Workshop on Transfer Learning
CALL FOR PAPERS
Structural Knowledge Transfer for Machine Learning
Workshop at the 23rd International Conference on Machine Learning
Carnegie Mellon University, Pittsburgh, PA
http://www.cs.utexas.edu/~banerjee/icmlws06/
Submission deadline: May 02, 2006
Workshop date: June 29, 2006
------------------------------------------------------------------------
Date: Tue, 28 Mar 2006 16:04:43 -0600 (CST)
From: Stephen D. Scott
Subject: ICML Workshop on Applications of Multiple-Instance Learning
Call for Papers and Participation
Workshop on Applications of Multiple-Instance Learning
at the 23rd International Conference on Machine Learning
Carnegie Mellon University, Pittsburgh, PA
http://www.cs.wisc.edu/~sray/miws.html
Submission deadline: April 28, 2006
Acceptance notification: June 2, 2006
Camera-ready deadline: June 9, 2006
Workshop date: June 29, 2006
------------------------------------------------------------------------
Date: Fri, 31 Mar 2006 13:10:24 -0500
From: Josep Roure
Subject: ICML Workshop on Knowledge Discovery from Data Streams
Call for Papers and Participation
Third International Workshop on Knowledge Discovery from Data Streams
At the 23rd International Conference on Machine Learning
Carnegie Mellon University, Pittsburgh, PA
http://www.cs.cmu.edu/~jroure/iwkdds/iwkdds_icml06.html
Submission deadline: April 28, 2006
Acceptance notification: June 2, 2006
Camera-ready deadline: June 16, 2006
Workshop date: June 29, 2006
------------------------------------------------------------------------
Date: Fri, 31 Mar 2006 20:27:54 +0200
From: Steffen Bickel
Subject: ICML Workshop on Learning with Nonparametric Bayesian Methods
CALL FOR PAPERS / ABSTRACTS
ICML-2006 Workshop on Learning with Nonparametric Bayesian Methods
Carnegie Mellon University, Pittsburgh, PA
Submission deadline: April 28, 2006
Acceptance notification: May 19, 2006
Camera-ready deadline: June 9, 2006
Workshop date: June 29, 2006
------------------------------------------------------------------------
Date: Wed, 05 Apr 2006 13:08:23 +0300
From: Esa Pitkanen
Subject: Workshop on Machine Learning in Structural and Systems Biology
Workshop on Probabilistic Modeling and Machine Learning in
Structural and Systems Biology
Tuusula, Finland
http://www.cs.helsinki.fi/bioinformatiikka/events/pmsb06/
Submission deadline: April 23, 2006
Notification of acceptance: May 7, 2006
Final version due: May 31, 2006
Workshop date: June 17-18, 2006
------------------------------------------------------------------------
Date: Wed, 05 Apr 2006 20:09:17 +0200
From: Remi Munos
Subject: ICML Workshop on Kernel Machines and Reinforcement Learning
CALL FOR CONTRIBUTIONS
ICML-2006 Workshop on Kernel Machines and Reinforcement Learning
Carnegie Mellon University, Pittsburgh, PA
http://www.grappa.univ-lille3.fr/krl.html
Submission deadline: April 30, 2006
Workshop date: June 29, 2006
------------------------------------------------------------------------
Date: Tue, 28 Mar 2006 10:14:27 -0800
From: John Koza
Subject: Human-Competitive Competition at GECCO-2006
CALL FOR ENTRIES FOR
$10,000 IN AWARDS
HUMAN-COMPETITIVE RESULTS
www.human-competitive.org
to be held as part of the
GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO-2006)
July 8-12, 2006 (Saturday-Wednesday)
Renaissance Seattle Hotel, Seattle, Washington, USA
Entry deadline: May 29, 2006
Finalists' notification: June 25, 2006
Submission deadline: July 5, 2006
------------------------------------------------------------------------
Date: Tue, 21 Mar 2006 05:08:55 -0000
From: Jorge Santos
Subject: Summer School on Neural Networks
SUMMER SCHOOL NN2006
Neural Networks in Classification, Regression, and Data Mining
ISEP - Porto, Portugal
http://www.nn.isep.ipp.pt
email: nn-2006@isep.ipp.pt
Early Registration: May 15, 2006
Hotel booking: June 15, 2006
Summer School: July 3-7, 2006
------------------------------------------------------------------------
Date: Thu, 23 Mar 2006 15:26:42 -0500
From: David Weininger
Subject: Gaussian Processes for Machine Learning
This title is available from MIT Press:
http://mitpress.mit.edu/promotions/books/SP2006026218253X
Gaussian Processes for Machine Learning
Carl Edward Rasmussen and Christopher K. I. Williams
------------------------------------------------------------------------
Date: Mon, 27 Mar 2006 23:02:59 +0100
From: Peter Dayan
Subject: Postdoc job in "Evidence" at UCL
A vacancy has arisen for a Postdoctoral Fellow to work on the project "Formal tools for handling evidence", which forms part of the research programme "Evidence, Inference and Enquiry" at University College London -- see http://www.evidencescience.org
This is a 2-year post, available with immediate effect. Applicants should have a PhD in Statistics, Machine Learning, or similar, and be knowledgeable in theoretical and computational aspects of Bayesian Networks. The appointment will be on Grade 6, salary range 20234-23457 plus London Allowance of 2400.
Letters of application, including a Curriculum Vitae and names of 3 referees, should be sent to: Marion Ware, Department of Statistical Science, University College London, Gower Street, London WC1E 6BT, UK, telephone +44 (0)20 7679 1872, e-mail marion@stats.ucl.ac.uk. Full details of the post and the project can be found at:
http://128.40.59.163/evidence/people/emp0305.html
Monday, May 01, 2006
"Loose-Limbed People" Paradigm: Distributed Approach for Articulated Pose Estimation and Tracking
Speaker: Leonid Sigal , Brown University
Date: Monday, May 1 2006
Time: 2:00PM to 3:00PM
Refreshments: 1:30PM
Location: Seminar Room D463 (Star)
Host: C. Mario Christoudias, Gerald Dalley, MIT CSAIL
Contact: C. Mario Christoudias, Gerald Dalley, 3-4278, 3-6095, cmch@csail.mit.edu, dalleyg@mit.edu
Relevant URL:
Abstract:
In recent years we have presented a number of methods for fully automatic pose estimation and tracking of human bodies in 2D and 3D. Initialization and failure recovery in these methods are facilitated by the use of a loose-limbed body model in which limbs are connected via learned probabilistic constraints. The pose estimation and tracking can then be formulated as inference in a loopy graphical model and approximate belief propagation can be used to estimate the pose of the body. Each node in the graphical model represents the position and orientation of the limb, and the directed edges between nodes represent statistical dependencies between limbs. There are a number of significant advantages of this paradigm as compared to the more traditional methods for tracking human motion.
In this talk I will introduce the loose-limbed model paradigm and its application to 3D and 2D pose estimation and tracking. I will also show some preliminary results of a fully-automatic 3D hierarchical inference framework for pose estimation and tracking from a single view, where a 2D loose-limbed body model serves as an intermediate representation in the inference hierarchy.
PAL lab meeting 4, Mar, 2006 (Tailion) Occupancy Grid Maps
Alberto Elfes
COMPUTER MAGAZINE
Jun. 1989
Abstract:
This article reviews an approach to robot perception and world modeling that uses a probabilistic tessellated representation of spatial information called the occupancy grid. The occupancy grid is a multidimensional random field that maintains stochastic estimates of the occupancy state of the cells in a spatial lattice. To construct a sensor-derived map of the robot’s world, the cell state estimates are obtained by interpreting the incoming range readings using probabilistic sensor models. Bayesian estimation procedures allow the incremental updating of the occupancy grid using readings taken from several sensors over multiple points of view.
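The incremental Bayesian update mentioned in the abstract is commonly written in log-odds form, where fusing a new reading is just an addition per cell. Below is a minimal sketch for a 1-D grid. The inverse sensor model values (0.7 for the cell where the beam ends, 0.3 for the cells it passed through), the uniform 0.5 prior, and the function names are all assumptions for illustration and are not taken from Elfes's article.

```python
import math

def logit(p):
    """Probability -> log-odds."""
    return math.log(p / (1.0 - p))

def prob(log_odds):
    """Log-odds -> probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))

P_HIT, P_FREE = 0.7, 0.3  # assumed inverse sensor model, not from the article

def update(grid, hit_index):
    """Fuse one range reading: cells before the hit look free, the hit cell looks occupied.

    With a 0.5 prior the prior term logit(0.5) is zero, so it is omitted.
    """
    for i in range(hit_index):
        grid[i] += logit(P_FREE)
    grid[hit_index] += logit(P_HIT)

grid = [0.0] * 5  # log-odds 0 corresponds to the 0.5 prior (unknown)
for _ in range(3):  # three consistent readings ending in cell 3
    update(grid, hit_index=3)
probs = [round(prob(l), 3) for l in grid]
print(probs)
```

Cells the beam never reaches (index 4 here) stay at the prior, while repeated consistent readings push the other cells toward free or occupied.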
CMU CFR Seminar: Hierarchical Simultaneous Localization and Mapping (HSLAM)
Deryck Morales
Time : 5:00pm
Place : NSH 1507
Abstract
H-SLAM is an autonomous localization and mapping strategy that scales well to large indoor environments by decomposing the work space into subregions. This is achieved using a topological graph representation and associating a high resolution local map to each graph edge. This organized collection of maps forms the Hierarchical Atlas.
In this talk I will present the H-SLAM method in the context of established mapping strategies and discuss the applications of the atlas to path planning and global localization. I will present experimental results verifying the addressed applications, and compare the computational complexity of the H-SLAM approach to other recent SLAM solutions. The most current work towards using natural landmarks will be presented, and finally, future extensions of this approach will be discussed.
http://www.cs.cmu.edu/~cfr
Saturday, April 29, 2006
CMU RI FRC Seminar: Teaching a Robot to Avoid Obstacles
Speaker: Bradley Hamner, FRC Staff / Masters Student, Robotics Institute, Carnegie Mellon University
Date: Thursday, May 4, 2006
Time: Noon
Location: NSH 1109
Abstract:
Many obstacle avoidance methods have been presented in the literature, all of which rely on tuning a set of parameters to a control function. Frequently, programmers tune gains by hand until the robot behaves as desired, a nonintuitive and frustrating process. In this talk I will present a method of learning the gains of an obstacle avoidance system automatically by observing how a human operator manually drives the vehicle. I will present an obstacle avoidance algorithm, and its parameters, and show how parameters learned by our method outperform parameters which were hand-tuned. I will also show preliminary results from learning the parameters for multiple vehicles which perform in different environments.
Speaker Bio:
Brad Hamner received a B.S. in Mathematics from Carnegie Mellon University in 2002. Since then he has worked as a staff member in the Field Robotics Center. He entered the Robotics Institute masters program in 2005. His research interests include mobile robot navigation and obstacle avoidance.
CMU RI FRC Seminar: Navigation Autonomy for Legged Machines
Speaker: James Kuffner, Assistant Professor, Robotics Institute, Carnegie Mellon University
Date: Thursday, April 27, 2006
Time: Noon
Location: NSH 1109
Abstract:
Legged robots are complex dynamic systems whose technology has evolved rapidly during the past decade. Presently, several companies are developing commercial prototype biped and quadruped robots. In this talk, I will present research aimed at improving the autonomy of legged robots through the development of practical motion planning algorithms that can be applied in dynamic unstructured environments. Specifically, I will discuss footstep placement planning over rough terrain, our "intelligent joystick" design for semi-autonomous control, and navigation among movable obstacles (NAMO). Experimental results obtained by implementations running on Honda's ASIMO, the AIST HRP2 humanoid, the H7 Humanoid (U. Tokyo), and the Boston Dynamics Little Dog quadruped robot will be shown.
Speaker Bio:
James Kuffner is an Assistant Professor at the Robotics Institute, Computer Science Dept., Carnegie Mellon University. He received a B.S. and M.S. in Computer Science from Stanford University in 1993 and 1995, and a Ph.D. from the Stanford University Dept. of Computer Science Robotics Laboratory in 1999. He was a Japan Society for the Promotion of Science (JSPS) Postdoctoral Research Fellow at the University of Tokyo from 1999 to 2001. He joined the faculty at Carnegie Mellon University in May 2002. His research interests include robotics, motion planning, and computer graphics and animation.
Friday, April 28, 2006
Computer scientists at Sheffield Hallam University, UK, have developed a new face recognition software which can produce an exact 3D image of a face within 40 milliseconds. A pattern of light is projected on your face, creating a 2D image, from which an accurate 3D representation is generated. This technology should speed airport check-ins, but it could also be used in banks or for checking ID cards as it allows full identification in less than one second.
This technology was developed at Sheffield Hallam University by the Geometric Modelling and Pattern Recognition Research Group of the Materials and Engineering Research Institute (MERI).
Here is what MERI Professor Marcos Rodrigues says about this new technology.
"This technology could be used anywhere there is a need for heightened security. It is well suited to a range of applications including person identification from national databases, access control to public and private locations, matching 3D poses to 2D photographs in criminal cases, and 3D facial biometric data for smart cards such as ID and bank cards. We have developed a viable, working system at the cutting edge of 3D technology."
Below are two screenshots showing the technology at work. (Credit: MERI)
These two screenshots have been extracted from a short movie available in different formats from this page about 3D Imaging at MERI.
But why have similar systems failed until now? The answer is provided by an article from Vision Systems Design, "Imaging technology may speed airport check-in."
Other 3-D systems, requiring 16 shots of the face, have proved unworkable because of the time it takes to construct a picture. The chance of movement during such a multishot process is extremely high, and if the face moves even a fraction then the 2-D to 3-D image is unworkable.
This is where the MERI's technology brings something new, including its accuracy -- and its low cost.
MERI also claims several other advantages for its technology. Hardware requirements are a projector and a single camera, making setup inexpensive--a few hundred pounds, compared with up to £40,000 for older systems. These need at least three or four cameras to capture an image, which means time-consuming parameters and complex calibrations.
Besides airports and banks, this technology could be used for industrial applications.
"Objects can go on a conveyor belt, and, instead of using a flat image, a 3-D image can help locate defects in them. Although we are focusing on security applications now, there is great potential in the future," said Rodrigues.
I sure hope that this system will go through extensive tests before being adopted.
Sources: Sheffield Hallam University news release, February 20, 2006; Vision Systems Design, February 27, 2006; and various web sites
Here is the link to the demo video:
http://www.shu.ac.uk/research/meri/gmpr/projects/projects1a.html
Glasses that hear well
If you live in the Netherlands and don't hear well, you'll soon be able to buy a new hearing aid, a pair of Varibel glasses. These special glasses originally developed at Delft University of Technology have four small interconnected microphones in each leg of their frames. And these microphones can "selectively intensify the sounds that come from the front, while dampening the surrounding noise." So these glasses offer a better sound quality than other hearing aids.
Before going further, here is a picture of these hearing glasses with their tiny microphones (Credit: Varibel).
Now, why are people using current hearing aids not satisfied?
Many hearing aids intensify sounds from all directions. The result is that people hear noise, but not the people they are speaking to. Because people have such difficulty understanding what others are saying, many people -- in spite of their hearing aid -- have less social contact with others or must retire from their jobs earlier than desired. The hearing-glasses can provide a solution to this problem, say the experts and users who have tried and tested the Varibel.
So what is the solution brought by Varibel?
The Varibel cannot be compared to traditional hearing aids. In each leg of the glass' frame there is a row of four tiny, interconnected microphones, which selectively intensify the sounds that come from the front, while dampening the surrounding noise. The result is a directional sensitivity of +8.2 dB. In comparison, regular hearing aids have a maximum sensitivity of +4 dB. With this solution, the user can separate the desired sounds from the undesired background noise.
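To put the two decibel figures quoted above into linear terms, here is a small worked conversion. It assumes the figures are power-ratio decibels, i.e. dB = 10 * log10(ratio); the source does not say how the sensitivity was measured, so treat the result as a rough reading of the numbers rather than a specification.

```python
def db_to_power_ratio(db):
    """Convert decibels to a linear power ratio, assuming dB = 10 * log10(ratio)."""
    return 10.0 ** (db / 10.0)

varibel = db_to_power_ratio(8.2)   # figure quoted for the Varibel
regular = db_to_power_ratio(4.0)   # figure quoted for regular hearing aids
advantage = varibel / regular      # same as db_to_power_ratio(8.2 - 4.0)
print(f"Varibel: x{varibel:.2f}, regular aid: x{regular:.2f}, advantage: x{advantage:.2f}")
```

Under that assumption, the 4.2 dB difference means sound from the front is favored over surrounding noise by a bit more than two and a half times as much, in power terms, as with a regular aid.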
Below is a picture of a full Varibel package as you'll be able to buy before the end of April 2006 (Credit: Varibel).
And will it be of good help in public places?
Martin de Jong, audio-technician, says: "With the Varibel, the natural sounds that people enjoy are retained. This works surprisingly well. People can hear good and at the same time clearly – and especially in rooms such as in a cafe or at a birthday party."
For more information, you can visit the Varibel web site -- if you read Dutch.
Sources: Delft University of Technology news release, April 7, 2006; and various web sites
Link
Finding a Better Way to Quiet Noisy Environments
“Noise cancellation is a hidden technology that most consumers aren’t aware of, but vehicles made by BMW, Mercedes, Honda, and other companies are now using it,” said Raymond de Callafon, co-author of the paper and a professor of mechanical and aerospace engineering at UCSD’s Jacobs School of Engineering. “Our new technique should greatly expand the potential of active noise-cancellation technologies.”
Basic active noise-cancellation is composed of four inter-related parts: a microphone that measures incoming noise and feeds that information to a computer, a computer processor that converts the noise information into anti-noise instructions, and an audio speaker that is driven by the anti-noise signal to broadcast sound waves that are exactly 180 degrees out of phase with the unwanted signal and of the same magnitude. In addition, a downstream microphone monitors residual noise and signals the computer as part of a process to optimize the anti-noise signal.
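The core idea of the anti-noise signal can be shown with a single tone: adding a copy that is 180 degrees out of phase and of equal magnitude cancels it, while even a small phase error leaves a residual. The sketch below is only that toy illustration; the sample rate, tone frequency, and phase error are assumed values, and it does not model the adaptive algorithm or the acoustic coupling discussed in the article.

```python
import math

def tone(freq_hz, t, amplitude=1.0, phase=0.0):
    """Sample of a sine tone at time t (seconds)."""
    return amplitude * math.sin(2.0 * math.pi * freq_hz * t + phase)

SAMPLE_RATE = 8000  # Hz, assumed
FREQ = 200.0        # Hz, assumed noise tone
times = [n / SAMPLE_RATE for n in range(80)]  # two periods of the tone

noise = [tone(FREQ, t) for t in times]
anti_noise = [tone(FREQ, t, phase=math.pi) for t in times]      # exactly 180 degrees out of phase
mistuned = [tone(FREQ, t, phase=math.pi + 0.1) for t in times]  # 0.1 rad phase error

# Peak of what a downstream microphone would pick up after the two waves add.
residual_ideal = max(abs(n + a) for n, a in zip(noise, anti_noise))
residual_mistuned = max(abs(n + a) for n, a in zip(noise, mistuned))
print(residual_ideal, residual_mistuned)
```

The ideal residual is zero up to floating-point rounding, whereas the mistuned case leaves roughly a tenth of the original amplitude, which is why the downstream microphone is used to keep optimizing the anti-noise signal.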
This “feedforward” active-noise control can reduce unwanted helicopter and cabin noise or the steady roar of industrial air handling systems by 40 decibels or more. However, most commercial systems suffer from acoustic feedback because the anti-noise signal produced by the noise-cancellation speakers can feed back into the microphone and become amplified repeatedly until the resulting sound becomes an ear-splitting squeal or whistle.
“Most people ignore this acoustic coupling but we took it into account and designed the feedforward noise cancellation knowing that the acoustic coupling is there,” said de Callafon.
Some makers of active noise cancellation avoid acoustic coupling by shielding microphones from speakers, or by using directional microphones and speakers that are pointed away from each other. “This works fine in the case of noise-reduction headphones and air-conditioning ducts, but it’s impractical in hundreds of other applications,” de Callafon said.
For example, the algorithm developed by de Callafon and Ph.D. candidate J. Zeng may be adapted to cancel unwanted complex signals that are moving, such as the sound of bustling urban traffic coming through a ventilation opening.
“We think we’ve developed a totally new approach that works by generating the ‘feedforward’ noise cancellation signals and adaptively changing them in the presence of acoustic coupling,” de Callafon said. “This has been a really complicated problem to solve and we think the approach we’ve taken will have a significant impact on the field.”
Source: University of California, San Diego
Link
Center for the Foundations of Robotics Seminar, April 26, 2006: Methodology for Design and Analysis of Physically Cooperating Mobile Robots
Ashish D Deshpande
Time and Place:
Newell Simon Hall 1507
Refreshments 4:45 pm
Talk 5:00 pm
Abstract:
A team of small, low-cost robots instead of a single large, complex robot is useful in operations such as search and rescue, urban exploration, etc. However, the performance of such a team is limited due to the restricted mobility of the team members. The first part of my talk will present the results obtained toward the goal of enhancing mobility of a team of mobile robots by physical cooperation among the robots. We have carried out static as well as dynamic analysis of a cooperating mobile robot system and developed 2-robot hardware to demonstrate cooperative behaviors.
There is a need to develop a methodology to design and analyze cooperative maneuvers involving multiple mobile robots. The second part of my talk will present our efforts toward the development of such a methodology. Our approach is to treat the linked mobile robots as a multiple degree-of-freedom object, comprising an articulated open kinematic chain, which is being manipulated by pseudo robots (p-robots) at the ground interaction points. Such rearrangement of the problem facilitates the adaptation of ideas from the cooperative manipulation literature. We present the new methodology by carrying out static as well as dynamic analysis for a 2-robot cooperation case. Also, we have demonstrated that the introduction of redundant actuation, by an additional (third) robot, can help in improving the friction requirements. We also present our ideas for employing this newly designed methodology to analyze other interesting multi-body robotic systems.
Bio
Ashish Deshpande is a doctoral candidate under Dr. Jonathan Luntz in the Mechanical Engineering Dept. at the University of Michigan, Ann Arbor. His areas of interest include mobile robotics, multi-body dynamics, controls and engineering design. Ashish received a B.E. from VNIT, Nagpur, India in 1999 and an M.S. from the University of Massachusetts, Amherst in 2002.
http://www.cs.cmu.edu/~cfr/talks/2006-Apr-26.html
Thursday, April 27, 2006
Detecting and tracking multiple interacting objects without class-specific models
Bose, Biswajit
Wang, Xiaogang
Grimson, Eric
Abstract:
We propose a framework for detecting and tracking multiple interacting objects from a single, static, uncalibrated camera. The number of objects is variable and unknown, and object-class-specific models are not available. We use background subtraction results as measurements for object detection and tracking. Given these constraints, the main challenge is to associate pixel measurements with (possibly interacting) object targets. We first track clusters of pixels, and note when they merge or split. We then build an inference graph, representing relations between the tracked clusters. Using this graph and a generic object model based on spatial connectedness and coherent motion, we label the tracked clusters as whole objects, fragments of objects or groups of interacting objects. The outputs of our algorithm are entire tracks of objects, which may include corresponding tracks from groups of objects during interactions. Experimental results on multiple video sequences are shown.
Link
Tuesday, April 25, 2006
CMU RI Special FRC Seminar: Optimal Rough Terrain Trajectory Generation for Wheeled Mobile Robots
Date: *Tuesday*, April 25, 2006
Time: **3pm** (not Noon)
Location: NSH 1109
Refreshments will be served
Speaker: Thomas Howard, PhD Candidate, Robotics Institute, Carnegie Mellon University
Abstract:
In order to operate competently in any environment, a mobile robot must understand the effects of its own dynamics and of its interactions with the terrain. It is therefore natural to incorporate models of these effects in a trajectory generator which determines the controls necessary to achieve motion between a prescribed set of boundary states. This talk addresses recent work in developing a general algorithm for continuous motion primitive trajectory generation for arbitrary vehicle models on rough three dimensional terrain. The generality of the method derives from linearizing and inverting forward models of propulsion, suspension, and motion to minimize boundary state error and path cost given a parameterized set of controls. The simulation-based approach can accommodate effects such as rough terrain, wheel slip, and predictable vehicle dynamics. We will present this algorithm for local motion planning and discuss applications in planetary rovers and unmanned ground vehicles.
Related Links:
- Terrain-Adaptive Generation of Optimal Continuous Trajectories for Mobile Robots, T. Howard and A. Kelly, Proceedings of the 8th International Symposium on Artificial Intelligence, Robotics, and Automation in Space (i-SAIRAS '05), September, 2005.
- Trajectory Generation on Rough Terrain Considering Actuator Dynamics, T. Howard and A. Kelly, Proceedings of the 5th International Conference on Field and Service Robotics (FSR '05), July, 2005.
[Robot Perception and Learning] PAL lab meeting 27, April, 2006 (Casey) Robust Real-Time Face Detection
From: International Journal of Computer Vision 2004
Abstract:
This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
Paper link: http://www.vision.caltech.edu/html-files/EE148-2005-Spring/pprs/viola04ijcv.pdf
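The integral image idea from the abstract can be sketched in a few lines of plain Python. This is only an illustration: the function names, the zero-padded table layout, and the 3x3 toy image are assumptions made here, not code from the paper.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of all pixels img[r][c] with r < y and c < x.
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of any rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half: a horizontal two-rectangle feature."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
```

Once the table is built, every rectangle sum costs the same four lookups regardless of rectangle size, which is what makes evaluating many rectangle features per window cheap.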
CMU Thesis Proposal : Face View Synthesis Using A Single Image (3 May 2006)
Robotics Institute
Carnegie Mellon University
Abstract
Face view synthesis involves using one view of a face to artificially render another view. It is an interesting problem in computer vision and computer graphics, and can be applied in the entertainment industry, for example in animated movies or video games. The fact that the input is only a single image makes the problem very difficult. Previous approaches perform machine learning on pairs of poses from 2D training data and then predict the unknown pose in the test example. Such 2D approaches are much more practical than approaches requiring 3D data and more computationally efficient. However, they perform inadequately when dealing with large angles between poses. In this proposal we seek to improve performance through better choices in probabilistic modeling. As a first step, we have implemented a statistical model combining distance in feature space (DIFS) and distance from feature space (DFFS) for such pairs of poses. Such a representation leads to better performance. Furthermore, we have observed that statistical dependency varies among different groupings of pixels. In particular, a given pixel variable is often statistically correlated with only a small number of other pixel variables. We propose to exploit this statistical structuring by modeling the synthesis problem using graphical probability models. Such representations concisely describe the synthesis problem, providing a rich model with reduced susceptibility to over-fitting.
More detail : http://www.cs.cmu.edu/~jiangni/thesis/jiang_proposal.pdf
Monday, April 24, 2006
PAL lab meeting 27, April, 2006 (Any) Solving Partially Observable Markov Decision Processes
Abstract:
This paper describes the POMDP framework and presents some well-known results from the field. It then presents a novel method called the witness algorithm for solving POMDP problems and analyzes its computational complexity. We argue that the witness algorithm is superior to existing algorithms for solving POMDPs in an important complexity-theoretic sense.
Outlines:
- Introduction to MDP
- CO(Completely Observable)-MDP vs. POMDP
- Definition of POMDP
- Solving POMDP
- POMDP Value Iteration
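As background for the outline above, here is a minimal sketch of the standard belief-state update that POMDP solvers build on: b'(s') is proportional to O(o | s', a) times the sum over s of T(s' | s, a) b(s). It is not the witness algorithm itself. The dictionary layout, the function name, and the two-state "tiger"-style toy numbers are assumptions chosen for illustration.

```python
def belief_update(belief, action, observation, transition, observe):
    """Bayes update of a belief over hidden states.

    belief[s]                  : current probability of state s
    transition[s][a][s2]       : P(s2 | s, a)
    observe[a][s2][o]          : P(o | s2, a)
    """
    states = list(belief)
    unnormalized = {}
    for s2 in states:
        predicted = sum(transition[s][action][s2] * belief[s] for s in states)
        unnormalized[s2] = observe[action][s2][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical toy problem: listening does not move the tiger and is 85% accurate.
T = {
    "tiger-left": {"listen": {"tiger-left": 1.0, "tiger-right": 0.0}},
    "tiger-right": {"listen": {"tiger-left": 0.0, "tiger-right": 1.0}},
}
O = {
    "listen": {
        "tiger-left": {"hear-left": 0.85, "hear-right": 0.15},
        "tiger-right": {"hear-left": 0.15, "hear-right": 0.85},
    }
}
prior = {"tiger-left": 0.5, "tiger-right": 0.5}
posterior = belief_update(prior, "listen", "hear-left", T, O)
```

In a completely observable MDP the agent knows the state directly; in a POMDP it carries a belief like `posterior` forward instead, and value iteration is carried out over beliefs rather than states.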
Tuesday, April 18, 2006
PAL lab meeting 20th, April, 2006 (Stanley) CMU RI Thesis Oral: Visual Feedback Manipulation for Hand Rehabilitation in a Robotic Environment
18 April 2006
Abstract: In this thesis, I examine how manipulations of the visual feedback given to a patient can be used to make robotic therapy more effective than traditional human-assisted therapy and previous robotic rehabilitation applications. Patients may not strive for difficult goals in therapy due to entrenched habits or personality variables such as low self-efficacy or a fear of failure. Visual feedback manipulation can be used to encourage patients to move beyond an established level of performance. Specifically, I examine two types of visual feedback manipulation: visual distortion and visual progression. By “visual progression,” I mean veridical visual feedback emphasizing and encouraging gradual improvements in performance; by “visual distortion,” I mean visual feedback that establishes a metric of performance for a given rehabilitation task and then gradually changes this metric such that improved performance is required for the same visual response. For a therapeutic program involving distortion to be most effective, patients must not detect the visual distortions. Thus, the first set of experiments I conducted addressed the limits of imperceptible visual distortion with unimpaired subjects. Further experiments with unimpaired subjects were conducted to show that vision dominates kinesthetic feedback in our robotic rehabilitation environment and that gradual visual distortion can be used to control force production and movement distance within a single experimental session. I also examined the effects of distortion during a difficult two-finger coordination task. Based on this work, I designed paradigms applying visual feedback manipulation to the rehabilitation of chronic stroke and traumatic brain injury patients. I performed initial tests with three patients, each of whom participated in a 6-week rehabilitation protocol. 
Patients' performances during the initial assessment at each therapeutic session were found to be an underestimate of their actual abilities and a poor metric for setting the difficulty level of therapeutic exercise. All three patients were willing and able to improve their performance by following distortion or progression, and all patients showed functional improvements after participation in the study. Visual feedback manipulation may provide a way to help a patient move beyond his or her self-assessed “best” performance, improving the outcome of robotic rehabilitation.
http://www.cs.cmu.edu/~broberts/Dissertation.pdf
PAL lab meeting 20th, April, 2006 (Vincent) Face recognition using eigenfaces
Matthew Turk and Alex P. Pentland
Media Lab., MIT, Cambridge, MA, USA
This paper appears in:
Computer Vision and Pattern Recognition, 1991. Proceedings CVPR '91., IEEE Computer Society Conference on
Publication Date: 3-6 June 1991
Abstract :
An approach to the detection and identification of human faces is presented, and a working, near-real-time face recognition system which tracks a subject's head and then recognizes the person by comparing characteristics of the face to those of known individuals is described. This approach treats face recognition as a two-dimensional recognition problem, taking advantage of the fact that faces are normally upright and thus may be described by a small set of 2-D characteristic views. Face images are projected onto a feature space ('face space') that best encodes the variation among known face images. The face space is defined by the 'eigenfaces', which are the eigenvectors of the set of faces; they do not necessarily correspond to isolated features such as eyes, ears, and noses. The framework provides the ability to learn to recognize new faces in an unsupervised manner.
Here is the link of this paper.
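The projection onto face space can be illustrated with a small plain-Python sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the function names, the use of power iteration to find only the leading eigenface, and the toy 4-pixel "images". It also assumes the training faces are not all identical.

```python
import math


def mean_face(faces):
    """Pixel-wise average of equally sized, flattened face images."""
    count = len(faces)
    return [sum(face[i] for face in faces) / count for i in range(len(faces[0]))]


def leading_eigenface(faces, iterations=100):
    """Return (mean, eigenface) for the direction of greatest variation."""
    mean = mean_face(faces)
    centered = [[p - m for p, m in zip(face, mean)] for face in faces]
    size = len(centered)
    # Use the small (faces x faces) Gram matrix instead of the pixel covariance.
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in centered] for ci in centered]
    vec = [1.0 + 0.1 * i for i in range(size)]  # arbitrary start vector
    for _ in range(iterations):
        nxt = [sum(gram[i][j] * vec[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Map the Gram-matrix eigenvector back to pixel space and normalize it.
    pixels = [sum(vec[k] * centered[k][i] for k in range(size))
              for i in range(len(mean))]
    norm = math.sqrt(sum(x * x for x in pixels))
    return mean, [x / norm for x in pixels]


def project(face, mean, eigenface):
    """Coordinate of a face along the eigenface (its face-space weight)."""
    return sum((p - m) * e for p, m, e in zip(face, mean, eigenface))


# Toy "faces" that differ only in their first pixel.
faces = [[1.0, 0.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0]]
mean, eigenface = leading_eigenface(faces)
weights = [project(face, mean, eigenface) for face in faces]
```

A recognizer in this style would compare the weights of a new image with the stored weights of known individuals and pick the nearest one; a full system would keep several eigenfaces rather than one.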
Saturday, April 15, 2006
CNN: Snake robots could aid in rescues
'Breadstick' and 'Pepperoni' are being tested
Wednesday, April 12, 2006; Posted: 9:20 p.m. EDT (01:20 GMT)
PITTSBURGH, Pennsylvania (AP) -- For most people, snakes seem unpleasant or even threatening. But Howie Choset sees in their delicate movements a way to save lives.
The 37-year-old Carnegie Mellon University professor has spent years developing snakelike robots he hopes will eventually slither through collapsed buildings in search of victims trapped after natural disasters or other emergencies.
In recent weeks, Choset and some of his students made what he said was an industry breakthrough: enabling the articulated, remote-controlled devices to climb up and around pipes.
Full Article
Friday, April 14, 2006
CMU VASC talk: 3D Photography: Reconstructing Photorealistic 3D Models of Large-Scale Scenes
Monday, April 17, 2006
Abstract:
Recently there has been an increased interest in the photorealistic modeling and rendering of large-scale scenes, such as urban structures. This requires a fusion of range sensing technology and traditional digital photography. A major bottleneck in this process is the automated registration of a large number of geometrically complex 3D range scans and high-resolution 2D images in a common frame of reference. In this talk we will present a novel system that integrates automated 3D registration techniques with multiview geometry for texture mapping 2D images onto 3D range data. Our methods utilize range segmentation and feature extraction algorithms. We will also describe our approach in 3D mesh generation. The produced 3D representations are useful for urban planning, historical preservation, or entertainment applications. We will present results of scanning large urban structures, such as the interior of the Grand Central Terminal in New York.
Bio: Ioannis Stamos is an associate professor of computer science and director of the Vision & Graphics Laboratory at Hunter College of the City University of New York (2001-present). He is also a member of the doctoral faculty of the Graduate Center of CUNY. His research interests include 3D segmentation, range to image registration and 3D modeling. Stamos received a PhD, an MPhil and an MS in computer science from Columbia University. He received an Engineering Diploma in computer engineering & informatics from the University of Patras, Greece. Stamos is a recipient of the Faculty Early Career Development Award (CAREER) by the National Science Foundation.
CMU ML talk: Bayesian Inference for Gaussian Mixed Graph Models
http://www.cs.cmu.edu/~rbas
Date: April 17
Abstract: We introduce priors and algorithms to perform Bayesian inference in Gaussian models defined by acyclic directed mixed graphs. Such a class of graphs, composed of directed and bi-directed edges, is a representation of conditional independencies that is closed under marginalization and arises naturally from causal models which allow for unmeasured confounding. Monte Carlo methods and a variational approximation for such models are presented. Our algorithms for Bayesian inference allow the evaluation of posterior distributions for several quantities of interest, including causal effects that are not identifiable from data alone but could otherwise be inferred where informative prior knowledge about confounding is available.
Joint work with Zoubin Ghahramani
Thursday, April 13, 2006
What's New @ IEEE April 2006
"IEEE Spectrum" has issued its fourth annual list of the top 10 tech cars. The article focuses on production cars now in showrooms or soon to be available, but this year also singles out three concept cars for special mention. Cars on this year's list include the 2006 Chrysler Heritage Edition, whose headlights automatically switch to low beams when the car detects approaching vehicles and the 2007 Mercedes-Benz E 320 Bluetec, which will have the cleanest diesel engine on the planet. Read more: http://www.spectrum.ieee.org/apr06/3173
3. PROJECT SEEKS SOLUTIONS FOR FUTURE OF WIRELESS NETWORKS
With an increasing amount of embedded wireless sensor technology being developed, system designers are now faced with the challenge of deciding the best direction to take for future research so that the full capabilities of the networks can be realized. As a result, the European Commission's Information Society Technologies project Embedded WiseNts is focusing on finding solutions to the problems associated with the production of Wireless Sensor Networks and their applications, particularly in the form of Cooperative Objects. The team's goal is to acquire a general vision of these networks and predict technical progress over the next 10 years. The project will conclude in December 2006 and team members have already identified several key areas of weaknesses, including the lack of a middleware layer for the adaptation of diverse application software and the need for better energy efficiency in both hardware and software. Read more: http://www.eurekalert.org/pub_releases/2006-03/ir-ptr032706.php
4. SMARTPHONES NOW AND IN THE FUTURE
A new report appearing in "IEEE Distributed Systems Online" (v. 7, no. 3) discusses what makes a cellphone a smartphone and looks at the future of the market. According to the article, smartphones are broken into three categories -- high-end phones, PDAs, and enhanced wireless email devices such as Blackberrys. The components that comprise them, such as internal memory, location-based services, and screen display, are common on all, but differ slightly depending on model. For instance, some use SVGA screens while others still use VGAs. Their operating systems consist mostly of Windows-based and Linux-based systems, with Symbian OS considered the leader. As these technologies improve, and WiFi hot spots increase worldwide, users can expect to find more location-specific services, especially in the realm of commerce programs that will cater to shopping centers. Additionally, M-commerce, the ability to use a phone to pay for items, is also something software developers are trying to streamline. Read more: the link
6. TELEPHONY'S NEXT ACT: "IEEE SPECTRUM" REPORTS
Will Voice Over Internet Protocol wreak havoc with the systems of the Internet, or will it make our lives easier and better? Folding traditional telephony into the Internet is tricky, according to an article in this month's issue of "IEEE Spectrum" magazine. Their hardware and software are different and, perhaps hardest of all, today they involve totally different databases. The thing they do most differently is called signaling -- keeping track of all of the potential communicating parties, their equipment, and their services, and selecting the right combination for each contact. The next seven years will be key. Read more:
http://www.spectrum.ieee.org/apr06/3204
Wednesday, April 12, 2006
MIT CSAIL talk: Functional Specificity in the Cortex: Selectivity, Experience, & Generality
Functional MRI has revealed several cortical regions in the ventral visual pathway in humans that exhibit a striking degree of functional specificity: the fusiform face area (FFA), parahippocampal place area (PPA), and extrastriate body area (EBA). I will briefly review this work and then discuss more recent studies that investigate the specificity, origins, and generality of domain specificity in the cortex. In particular these studies ask i) how specialized is the FFA for faces and what exactly does it do with faces?, ii) how do cortical responses to visually presented objects change with experience and is extensive experience ever sufficient to create them?, and iii) are domain specific regions of cortex found only in the visual system, or can they sometimes be found for very abstract high-level cognitive functions as well?
Monday, April 10, 2006
PAL Lab Meeting 4/13(Eric): 3D Scanner Demo
1. Projector-camera system calibration.
2. How to compute the object depth.
3. Demo.
PAL Lab Meeting 4/13(ChiHao): Robust Speaker's Location Detection in a Vehicle Environment Using GMM Models
Author: Jwu-Sheng Hu, Member, IEEE, Chieh-Cheng Cheng, and Wei-Han Liu
Abstract:
Human–computer interaction (HCI) using speech communication is becoming increasingly important, especially in driving where safety is the primary concern. Knowing the speaker's location (i.e., speaker localization) not only improves the enhancement results of a corrupted signal, but also provides assistance to speaker identification. Since conventional speech localization algorithms suffer from the uncertainties of environmental complexity and noise, as well as from the microphone mismatch problem, they are frequently not robust in practice. Without high reliability, the acceptance of speech-based HCI would never be realized. This work presents a novel speaker's location detection method and demonstrates high accuracy within a vehicle cabin using a single linear microphone array. The proposed approach utilizes Gaussian mixture models (GMM) to model the distributions of the phase differences among the microphones caused by the complex characteristics of room acoustics and microphone mismatch. The model can be applied both in near-field and far-field situations in a noisy environment. The individual Gaussian component of a GMM represents some general location-dependent but content and speaker-independent phase difference distributions. Moreover, the scheme performs well not only in nonline-of-sight cases, but also when the speakers are aligned toward the microphone array but at different distances from it. This strong performance can be achieved by exploiting the fact that the phase difference distributions at different locations are distinguishable in the environment of a car. The experimental results also show that the proposed method outperforms the conventional multiple signal classification (MUSIC) technique at various SNRs.
Link
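The core decision rule suggested by the abstract, one GMM per candidate location and a choice by likelihood, can be sketched as follows. This is a simplified, assumed illustration: it uses a single scalar phase-difference feature, hand-written mixture parameters, and hypothetical location names, whereas the paper models phase differences across a microphone array and learns its models from data.

```python
import math


def gmm_log_likelihood(samples, components):
    """Log-likelihood of 1-D samples under a mixture of Gaussians.

    components is a list of (weight, mean, variance) tuples.
    Assumes every sample has non-zero density under the mixture.
    """
    total = 0.0
    for x in samples:
        density = sum(
            w * math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
            for w, mu, var in components
        )
        total += math.log(density)
    return total


def detect_location(samples, location_models):
    """Pick the location whose GMM explains the observed samples best."""
    return max(
        location_models,
        key=lambda name: gmm_log_likelihood(samples, location_models[name]),
    )


# Hypothetical models: phase differences (radians) seen from two seats.
models = {
    "driver": [(0.6, -1.0, 0.1), (0.4, -0.6, 0.2)],
    "passenger": [(1.0, 0.9, 0.15)],
}
observed = [-0.9, -1.1, -0.7]
location = detect_location(observed, models)
```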
Saturday, April 08, 2006
Stanford AI talk: Toward a Geometrically Coherent Image Interpretation
April 10, 2006, 3:15PM (NOT 4:15PM)
http://graphics.stanford.edu/ba-colloquium/
Abstract
Image interpretation, the ability to see and understand the three-dimensional world behind a two-dimensional image, goes to the very heart of the computer vision problem. The ultimate objective is, given an image, to automatically produce a coherent interpretation of the depicted scene. This requires not only recognizing specific objects (e.g. people, houses, cars, trees), but understanding the underlying structure of the 3D scene where these objects reside.
In this talk I will describe some of our recent efforts toward this lofty goal. I will present an approach for estimating the coarse geometric properties of a scene by learning appearance-based models of geometric classes. Geometric classes describe the 3D orientation of image regions with respect to the camera. This geometric information is then combined with camera viewpoint estimation and local object detection producing a prototype for a coherent image-interpretation framework.
Joint work with Derek Hoiem and Martial Hebert at CMU.
MIT CSAIL Thesis Oral: Multi-Stream Speech Recognition: Theory and Practice
Date: Monday, April 10 2006
In this thesis, we have focused on improving the acoustic modeling of speech recognition systems to increase the overall recognition performance. We formulate a novel multi-stream speech recognition framework using multi-tape finite-state transducers (FSTs). The multi-dimensional input labels of the multi-tape FST transitions specify the acoustic models to be used for the individual feature streams. An additional auxiliary field is used to model the degree of asynchrony among the feature streams. The individual feature streams can be linear sequences such as fixed-frame-rate features in traditional hidden Markov model (HMM) systems, and the feature streams can also be directed acyclic graphs such as segment features in segment-based systems. In a single-tape mode, this multi-stream framework also unifies the frame-based HMM and the segment-based approach.
Systems using the multi-stream speech recognition framework were evaluated on an audio-only and an audio-visual speech recognition task. On the Wall Street Journal speech recognition task, the multi-stream framework combined a traditional frame-based HMM with segment-based landmark features. The system achieved word error rate (WER) of 8.0%, improved from both the WER of 8.8% of the baseline HMM-only system and the WER of 10.4% of the landmark-only system. On the AV-TIMIT audio-visual speech recognition task, the multi-stream framework combined a landmark model, a segment model, and a visual HMM. The system achieved a WER of 0.9%, which also improved from the baseline systems. These results demonstrate the feasibility and versatility of the multi-stream speech recognition framework.
Thesis Supervisor: James R. Glass
Committee: Victor Zue, Michael Collins, Herb Gish
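For readers unfamiliar with the metric, word error rate is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a generic illustration with assumed function names and a made-up sentence pair; it is not the scoring tool used in the thesis.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1] / len(ref)


def relative_reduction(baseline, improved):
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
wsj_gain = relative_reduction(8.8, 8.0)
```

With the figures quoted above, going from 8.8% to 8.0% WER is a relative reduction of roughly 9% against the HMM-only baseline.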
MIT CSAIL talk: Steps Toward the Creation of a Retinal Implant for the Blind
Date: Monday, April 10 2006
Abstract: This talk describes the efforts at MIT and the Massachusetts Eye and Ear Infirmary over the past 15 years to develop a chronically implantable retinal prosthesis. The goal is to restore some useful level of vision to patients suffering from outer retinal diseases, primarily retinitis pigmentosa and macular degeneration. We initially planned to build an intraocular implant, wirelessly supplied with signal and power, to stimulate the surviving cells of the retina. In this design electrical stimulation is applied through an epiretinal microelectrode array attached to the inner (front) surface of the retina. We have carried out a series of six acute surgical trials on human volunteers (five of whom were blind with retinitis pigmentosa and one with normal vision and cancer of the orbit) to assess electrical thresholds and the perceptions resulting from epiretinal stimulation. The reported perceptions often corresponded poorly to the spatial pattern of the stimulated electrodes. In particular, no patient correctly recognized a letter. We hope that chronically implanted patients will adapt over time to better interpret the abnormal stimuli supplied by such a prosthesis.
Experiences with both animals and humans exposed surgical, biocompatibility, thermal and packaging difficulties with this epiretinal approach. Two years ago we altered our approach to a subretinal design which will, we believe, reduce these difficulties. Our current design places essentially the entire bulk of the implant on the temporal outer wall of the eye, with only a tiny sliver of the 10 micron thick microelectrode array inserted through a scleral flap beneath the retina. In this design the entire implant lies in a sterile area behind the conjunctiva. We plan to have a wireless prototype version of this design ready for chronic animal implantation this Spring.
about posting your talk...
This is regarding posting your talk on the blog. As "my talk this week" does not say much, please use this format "PAL Lab Meeting date(Name): talk title". For instance, PAL Lab Meeting 4/13(ChiHao): Super Microphone Array Localization.
Best,
-Bob
CMU RI Thesis Oral: Visual Feedback Manipulation for Hand Rehabilitation in a Robotic Environment
18 April 2006
Further Details
A copy of the thesis oral document can be found at http://www.cs.cmu.edu/~broberts/Dissertation.pdf.
Wednesday, April 05, 2006
PAL Lab meeting (Tailion 4/5): slammot demo
I'll show my slammot demo tomorrow.
In this demo, I will show you the latest version of the slammot interface and the problems I encountered.
-tailion
My talk this week
I will show a program that tracks vehicles at NTU using a Kalman filter.
Tuesday, April 04, 2006
CMU VASC talk: Visual patterns with matching subband statistics and higher order image
Abstract: Statistical representations of visual patterns are commonly used in computer vision. One such representation is a distribution measured from the output of a bank of filters (Gaussian, Laplacian, Gabor, wavelet etc). Both marginal and joint distributions of filter responses have been advocated and effectively used for a variety of vision tasks.
We begin by examining the ability of these representations to discriminate between an arbitrary pair of visual stimuli. Examples of patterns are derived that possess the same statistical properties, yet are "visually distinct." The existence of these patterns suggests the need for more powerful early visual representations.
It has been argued that the primary role of early vision is the modeling of statistical redundancy in natural imagery. One of the most striking properties of images is scale invariance. In the second part of this talk, this property is examined and a novel image representation, the higher order pyramid, is introduced. The representation is tuned to the scale invariant properties of images and constitutes a form of "higher order signal whitening."
BIO: Joshua Gluckman received the BS degree in economics from the University of Virginia (1992), the MS degree in computer science from the College of William and Mary (1995), and the PhD degree in computer science from Columbia University (2000). Since 2001, he has held the position of assistant professor of computer science at Polytechnic University in Brooklyn, NY. His area of research is computer vision.
Friday, March 31, 2006
CMU ML talk: Rodeo: Sparse Nonparametric Regression in High Dimensions
Date: April 03
Time: 12:00 noon
Abstract:
We present a method for simultaneously performing bandwidth selection and variable selection in nonparametric regression. The method starts with a local linear estimator with large bandwidths, and incrementally decreases the bandwidth in directions where the gradient of the estimator with respect to bandwidth is large. When the unknown function satisfies a sparsity condition, the approach avoids the curse of dimensionality. The method, called rodeo (regularization of derivative expectation operator), conducts a sequence of hypothesis tests, and is easy to implement. A modified version that replaces testing with soft thresholding may be viewed as solving a sequence of lasso problems. When applied in one dimension, the rodeo yields a method for choosing the locally optimal bandwidth.
Joint work with John Lafferty.
Videos about the Great Robot Race.
http://www.pbs.org/wgbh/nova/darpa
CMU ML talk: Calibration, Regret and Learning in Games
Speaker: Rakesh Vohra
Abstract:
This talk will be a survey of the connections between calibration (a measure of the accuracy of a probability forecast), regret (a measure of how well a decision rule performs, which in Wordsworth's words 'looks before and after and pines for what is not'), and the question of learning in games (will boundedly rational players in repeated play of a game converge to one's favorite equilibrium of the game?).
http://www.kellogg.northwestern.edu/faculty/vohra/htm/vohra.htm
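For readers new to the two measures named in the abstract, here is a small sketch of how they might be computed. The definitions used (a frequency-weighted squared calibration error, and external regret against the best fixed action) are common textbook choices and are assumptions here, not taken from the talk. The function names are made up.

```python
from collections import defaultdict

def calibration_score(forecasts, outcomes):
    """Group rounds by forecast value p and compare p with the empirical
    frequency of the event in that group (0 means perfectly calibrated)."""
    groups = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        groups[p].append(y)
    n = len(forecasts)
    return sum(len(ys) / n * (p - sum(ys) / len(ys)) ** 2 for p, ys in groups.items())

def external_regret(payoffs, chosen):
    """payoffs[t][a] is the payoff of action a in round t; chosen[t] is the
    action actually played. Regret is measured against the best fixed action."""
    earned = sum(row[a] for row, a in zip(payoffs, chosen))
    best_fixed = max(sum(row[a] for row in payoffs) for a in range(len(payoffs[0])))
    return best_fixed - earned

well_calibrated = calibration_score([0.5, 0.5, 1.0, 1.0], [1, 0, 1, 1])
poorly_calibrated = calibration_score([0.8, 0.8, 0.8, 0.8], [1, 0, 0, 1])
regret = external_regret([[1, 0], [0, 1], [1, 0]], [1, 0, 1])
```

A forecaster who says 0.5 and is right half the time scores zero, while one who always says 0.8 for an event that happens half the time does not.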
Thursday, March 30, 2006
CFP: JFR Special Issue on Space Robotics
Special Issue On Space Robotics
Guest Editors: David Wettergreen and Alonzo Kelly, CMU, and Larry Matthies, JPL
It seems impossible to get a robot farther afield than by putting it into space. Space applications present many challenges to robotic systems: from extremes of temperature, vacuum, shock and gravity, to limitations on power and communication, from the intricate complexity of systems engineering, to requirements for reliability, robustness and autonomy.
The Journal of Field Robotics (JFR) [ http://www.journalfieldrobotics.org ] announces a special issue on space robotics to examine these and other issues related to robots and space. This special issue will present and discuss the state of the art in space robots, their theory and practice.
We invite papers that exhibit theory and methods applied to robotic systems in space including:
- specification and evaluation of system concepts and designs;
- effects of the space environment on robotic devices;
- methods of sensing, actuation, and mobility;
- experiments in manipulation, assembly, construction and excavation;
- algorithms for localization and navigation, and task or mission planning;
- efforts related to deep space navigation and autonomous operation;
- techniques for safe and precise entry, descent, and landing; and
- analysis of human robot interaction and robot autonomy.
Papers for this special issue must also provide technical descriptions of systems and results and analysis of experimentation with orbital robots and spacecraft or planetary landers or rovers or with system prototypes in terrestrial analogue environments. Lessons learned in development and operation are also pertinent.
We encourage papers addressing all aspects of space robotic systems. Our emphasis is on systems that fulfill a specific space-relevant application. Robotic systems in Earth orbit, traveling in deep space, and operating on the surfaces of planets, moons, comets, or asteroids are of particular interest, as well as systems envisioned for space application but developed and demonstrated in relevant environments here on Earth.
The JFR encourages multimedia content and this special issue seeks inclusion of movies illustrating system concept and operation, engineering experiments, and of course space operation.
Deadlines:
June 2, 2006 – Submit manuscripts
July 14, 2006 – Reviews completed
August 4, 2006 – Decisions and author notification
September 1, 2006 – Final manuscripts for publication
Authors interested in submitting to this issue can discuss submissions with the special issue editors, David Wettergreen
Robot PAL lab (Any): Car-like Robot Control
Outline:
Modeling the car-like robot
System identification issues
Path following of our toy car
Path planning
Wednesday, March 29, 2006
Robot PAL Lab Meeting (Casey): Fast and Accurate Hand Pose Detection for Human-Robot Interaction
Authors: Luis Antón-Canalís, Elena Sánchez-Nielsen, and Modesto Castrillón-Santana
From: IbPRIA 2005, LNCS 3522, pp. 553–560, 2005
Abstract: Enabling natural human-robot interaction using computer vision
based applications requires fast and accurate hand detection. However, previous
works in this field assume different constraints, like a limitation in the number
of detected gestures, because hands are highly complex objects difficult to locate.
This paper presents an approach which integrates temporal coherence cues
and hand detection based on wrists using a cascade classifier. With this approach,
we introduce three main contributions: (1) a transparent initialization
mechanism without user participation for segmenting hands independently of
their gesture, (2) a larger number of detected gestures as well as a faster training
phase than previous cascade classifier based methods and (3) near real-time performance
for hand pose detection in video streams.
Tuesday, March 28, 2006
Lab meeting schedule changed this week!
Sorry for the late notice. This Wednesday's lab meeting has been rescheduled to 10:30 AM this Thursday. The place is the same: CSIE R524. No advisee meeting this week.
Any and Casey, could you please post your talk titles?
Best,
-Bob
Monday, March 27, 2006
CMU RI talk: Distributed Estimation and Control of Multi-Agent Systems
Northwestern University
We are pursuing a framework for systematic design of emergent behaviors in sensing and communication networks of mobile agents. The problem is to design a control law to run on each agent, based on sensor and communication input, so that the desired collective behavior emerges. Example tasks include sensor coverage, formation control, multi-agent pursuer-evader, and other types of self-organization. The key constraints are that each agent may have significant dynamics and limited sensing, computation, motion, and communication capabilities. The behavior of the system should improve or degrade gracefully as agents are added or deleted; in other words, the approach should be scalable, robust, and require no central controller.
Our approach requires each agent to simultaneously (1) estimate properties of the global behavior of the system and (2) use those estimates in a motion control law. This suggests a systematic approach of separately designing the estimator and controller, and then ensuring that the coupled system retains desired performance properties. I will give an example applying this framework to swarm formation control, where the desired formation is described by inertial moments. Implementing a simple gradient control law on each agent, the coupled estimation and control system is globally convergent to the desired family of formations.
Speaker Biography: Kevin Lynch was a member of Carnegie Mellon's first class of robotics Ph.D. students. After graduation in 1996 he spent a year as a postdoctoral fellow at the AIST Mechanical Engineering Laboratory in Tsukuba, Japan. Since 1997 he has been on the faculty of the Mechanical Engineering Department at Northwestern University, where he co-directs the Laboratory for Intelligent Mechanical Systems. He was the recipient of the IEEE Early Career Award in Robotics and Automation in 2001, and he currently serves as Editor of the IEEE Transactions on Robotics. He is a co-author of Principles of Robot Motion, MIT Press, along with Howie Choset, George Kantor, and others. His research interests are in robot motion planning and manipulation, underactuated systems, human-robot interaction, and distributed multi-agent systems.
CMU ML Lunch talk: Dynamic Contextual Friendship Networks
http://www.cs.cmu.edu/~alicez/
Date: March 27
For schedules, links to papers, etc., please see the web page:
http://www.cs.cmu.edu/~learning/
Abstract:
The study of social networks has gained new importance with the recent rise of large on-line communities. Most current approaches focus on deterministic (descriptive) models and are usually restricted to a preset number of social actors. Moreover, the dynamic aspect is often treated as an addendum to the static model. Taking inspiration from real-life friendship formation patterns, we propose a new generative model of evolving social networks that allows for birth and death of social ties and addition of new actors. Each actor has a distribution over social interaction spheres, which we term "contexts." We study the robustness of our model by examining statistical properties of simulated networks relative to well known properties of real social networks. A Gibbs sampling procedure is developed for parameter learning.
Sunday, March 26, 2006
CMU RI oral: Statistical Modeling and Localization of Nonrigid and Articulated Shapes
Carnegie Mellon University
An articulated object can be loosely defined as a structure or mechanical system composed of links and joints. The human body is a good example of a nonrigid, articulated object. Localizing body shapes in still images remains a fundamental problem in computer vision, with potential applications in surveillance, video editing/annotation, human computer interfaces, and entertainment.
In this thesis, we present a 2D model-based approach to human body localization. We first consider a fixed viewpoint scenario (side-view) by introducing a triangulated model of the nonrigid and articulated body contours. Four types of image cues are combined to relate the model configuration to the observed image, including edge gradient, silhouette, skin color, and region similarity. The model is arranged into a sequential structure, enabling simple yet effective spatial inference through Sequential Monte Carlo (SMC) sampling.
We then extend the system to situations where the viewpoint of the human target is unknown. To accommodate large viewpoint changes, a mixture of view-dependent models is employed. Each model is decomposed based on the concept of parts, with anthropometric constraints and self-occlusion explicitly treated. Inference is done by direct sampling of the posterior mixture, using SMC enhanced with annealing. The fitting method is independent of the number of mixture components, and does not require the preselection of a “correct” viewpoint.
Finally, we return to the generic setting of single image, arbitrary pose, and arbitrary viewpoint. The constraints on the body pose and background subtraction that have been used in previous systems are no longer required. Our proposed solution is a hybrid search facilitated by a 3-level hierarchical decomposition of the model. We first fit a simple tree-structured model defined on a compact landmark set along the body contours by Dynamic Programming (DP). The output is a series of proposal maps that encode the probabilities of partial body configurations. Next, we fit a mixture of view-dependent models by SMC, which handles self-occlusion, anthropometric constraints, and large viewpoint changes. DP and SMC are designed to search in opposite directions such that the DP proposals are utilized effectively to initialize and guide the SMC inference. This hybrid strategy of combining deterministic and stochastic search ensures both the robustness and efficiency of DP, and the accuracy of SMC. Finally, we fit an expanded mixture model with increased landmark density through local optimization.
The models were trained on a large number of gait images. Extensive tests on cluttered images with varying poses including walking, dancing and various types of sports activities justified the feasibility of the proposed approach.
CMU talk: Dynamic Models of Human Behavior
Zoran Popovic, Associate Professor
University of Washington
In this talk I will describe two models of human locomotion that attempt to describe both micro (stylistic variation of locomotion) and macro (complex crowd behavior) motion behavior patterns of humans through a set of tuned differential equations.
The first model of human locomotion incorporates several important aspects of human biology, including relative preferences for using some muscles more than others, elastic mechanisms at joints due to the mechanical properties of tendons, ligaments, and muscles, and variable stiffness at joints depending on the task. When used in a spacetime optimization framework, the parameters of this model define a wide range of styles of natural human movement. Due to the complexity of biological motion, these style parameters are too difficult to design by hand. To address this, I will describe the process of Nonlinear Inverse Optimization, an algorithm for estimating optimization parameters from motion capture data. We show how salient physical parameters can be extracted from a single short motion sequence. Once captured, this representation of style is extremely flexible: motions can be generated in the same style but performing different tasks, and styles may be edited to change the physical properties of the body.
The second part of the talk will present a real-time model of crowd dynamics that is based on the continuum computations instead of per-agent simulations. This formulation yields a set of continuous velocity and potential fields that guide all people simultaneously. A dynamic potential field integrates both local collision avoidance and global navigation, efficiently solving for smooth realistic motion for large crowds without the need for collision detection. Simulations created with our system run at interactive rates, exhibit smooth flow under a variety of conditions, and naturally exhibit emergent phenomena that have been observed in real crowds.
This talk describes joint work with Karen C. Liu, Aaron Hertzmann, Adrien Treuille, and Seth Cooper.
Speaker Biography: Zoran Popovic is an Associate Professor in computer science at the University of Washington. He received an Sc.B. with Honors from Brown University, and M.S. and Ph.D. degrees in Computer Science from Carnegie Mellon University. He has held research positions at Sun Microsystems, Justsystem Research Center, and the University of California at Berkeley. Zoran's research interests lie in computer animation, primarily in physically based modeling, high-fidelity human modeling, and control of realistic natural motion. His contributions to the field of computer graphics have recently been recognized by a number of awards, including the NSF CAREER Award, the Alfred P. Sloan Fellowship, and the ACM SIGGRAPH Significant New Researcher Award.
Monday, March 20, 2006
My first talk
Authors: Jur P. van den Berg and Mark H. Overmars
From: IEEE TRANSACTIONS ON ROBOTICS, VOL. 21, NO. 5, OCTOBER 2005, p.885-897
Abstract:
In this paper, a new method is presented for motion
planning in dynamic environments, that is, finding a trajectory for
a robot in a scene consisting of both static and dynamic, moving obstacles.
We propose a practical algorithm based on a roadmap that
is created for the static part of the scene. On this roadmap, an approximately
time-optimal trajectory from a start to a goal configuration
is computed, such that the robot does not collide with any
moving obstacle. The trajectory is found by performing a two-level
search for a shortest path. On the local level, trajectories on single
edges of the roadmap are found using a depth-first search on an
implicit grid in state-time space. On the global level, these local
trajectories are coordinated using an A*-search to find a global
trajectory to the goal configuration. The approach is applicable to
any robot type in configuration spaces with any dimension, and
the motions of the dynamic obstacles are unconstrained, as long as
they are known beforehand. The approach has been implemented
for both free-flying and articulated robots in three-dimensional
workspaces, and it has been applied to multirobot motion planning,
as well. Experiments show that the method achieves interactive
performance in complex environments.
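As a rough illustration of the global level described above, here is a minimal A*-search over a tiny roadmap. It is only a sketch under stated assumptions: each edge carries a single precomputed travel time standing in for the local state-time search, the robot's maximum speed is taken to be 1 so that straight-line distance is an admissible heuristic, and the roadmap, the travel times, and the name `a_star` are all made up.

```python
import heapq
import math

def a_star(nodes, edges, start, goal):
    """nodes: name -> (x, y); edges: name -> list of (neighbor, travel_time).
    Returns (total_time, path) or (inf, []) if the goal is unreachable."""
    def h(n):
        (x1, y1), (x2, y2) = nodes[n], nodes[goal]
        return math.hypot(x1 - x2, y1 - y2)  # lower bound on time at unit speed

    open_heap = [(h(start), 0.0, start, [start])]
    best = {start: 0.0}
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return g, path
        if g > best.get(node, math.inf):
            continue  # stale heap entry
        for nbr, t in edges.get(node, []):
            g2 = g + t
            if g2 < best.get(nbr, math.inf):
                best[nbr] = g2
                heapq.heappush(open_heap, (g2 + h(nbr), g2, nbr, path + [nbr]))
    return math.inf, []

# Hypothetical roadmap: edge B->D is slow because the robot must wait
# for a moving obstacle to pass (the wait is folded into the travel time).
nodes = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
edges = {
    "A": [("B", 1.0), ("C", 1.5)],
    "B": [("D", 3.0)],
    "C": [("D", 1.0)],
}
total_time, path = a_star(nodes, edges, "A", "D")
```

In the paper the edge costs are not fixed numbers; they come from the local search in state-time space, so the arrival time at a node affects the cost of the next edge.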
Sunday, March 19, 2006
My talk this Wednesday.
First, I'll give a brief demo from my last talk (about AdaBoost).
Second, I'll talk about extensions of AdaBoost.
This time I'll introduce the AdaBoost algorithm in the multiclass setting.
My talk is based on the following two papers:
1.
Title : A decision-theoretic generalization of on-line learning and an application to boosting
Author : Yoav Freund and Robert E. Schapire in AT&T Lab
This paper appears in : Journal of Computer and System Sciences, 55(1):119-139, August 1997
Abstract :
In the first part of the paper we consider the problem of dynamically apportioning resources among a set of options in a worst-case on-line framework. The model we study can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting. We show that the multiplicative weight-update rule of Littlestone and Warmuth can be adapted to this model yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems. We show how the resulting learning algorithm can be applied to a variety of problems, including gambling, multiple-outcome prediction, repeated games and prediction of points in R^n. In the second part of the paper we apply the multiplicative weight-update technique to derive a new boosting algorithm. This boosting algorithm does not require any prior knowledge about the performance of the weak learning algorithm. We also study generalizations of the new boosting algorithm to the problem of learning functions whose range, rather than being binary, is an arbitrary finite set or a bounded segment of the real line.
link
2.
Title : Improved Boosting Algorithms Using Confidence-rated Predictions
Author : Robert E. Schapire and Yoram Singer in AT&T Lab
This paper appears in : Machine Learning, 37(3):297-336, 1999.
Abstract :
We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a simplified analysis of AdaBoost in this setting, and we show how this analysis can be used to find improved parameter settings as well as a refined criterion for training weak hypotheses. We give a specific method for assigning confidences to the predictions of decision trees, a method closely related to one used by Quinlan. This method also suggests a technique for growing decision trees which turns out to be identical to one proposed by Kearns and Mansour. We focus next on how to apply the new boosting algorithms to multiclass classification problems, particularly to the multi-label case in which each example may belong to more than one class. We give two boosting methods for this problem, plus a third method based on output coding. One of these leads to a new method for handling the single-label case which is simpler but as effective as techniques suggested by Freund and Schapire. Finally, we give some experimental results comparing a few of the algorithms discussed in this paper.
link
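As background for the multiclass extensions, here is a minimal sketch of the basic two-class AdaBoost loop with the multiplicative weight update. It is an illustration under assumptions, not code from either paper: the weak learner is a one-dimensional threshold stump chosen by exhaustive search, and the names `train_adaboost` and `predict` and the toy data are made up.

```python
import math

def stump(x, t, sign):
    """Weak hypothesis: predict `sign` when x >= t, otherwise -sign."""
    return sign if x >= t else -sign

def train_adaboost(xs, ys, rounds):
    n = len(xs)
    w = [1.0 / n] * n                      # start with uniform example weights
    model = []
    for _ in range(rounds):
        # Pick the stump with the smallest weighted training error.
        best = None
        for t in sorted(set(xs)):
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(xs, ys, w) if stump(xi, t, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, t, sign)
        err, t, sign = best
        err = min(max(err, 1e-12), 1 - 1e-12)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, t, sign))
        # Multiplicative update: misclassified examples gain weight.
        w = [wi * math.exp(-alpha * yi * stump(xi, t, sign)) for xi, yi, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def predict(model, x):
    score = sum(alpha * stump(x, t, sign) for alpha, t, sign in model)
    return 1 if score >= 0 else -1

# Toy data: positives form an interval, so no single stump fits it.
xs = list(range(10))
ys = [-1, -1, -1, 1, 1, 1, 1, -1, -1, -1]
model = train_adaboost(xs, ys, rounds=3)
```

No single stump separates this data, but the weighted vote of three stumps classifies every training point correctly.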
Thursday, March 16, 2006
Computational Thinking
Computational Thinking, CACM vol. 49, no. 3, March 2006, pp. 33-35. slides.
Folks, you must read this article!! -Bob
MIT ME Talk: Robotics -the next globally disruptive technology?
Date: Friday, March 17 2006
Time: 2:30PM to 3:30PM
Location: 3-370
Host: John Leonard, MIT
Abstract: Over the course of human history the emergence of certain new technologies has globally transformed life as we know it. Disruptive technologies like fire, the printing press, oil, and television have dramatically changed both the planet we live on and mankind itself, most often in extraordinary and unpredictable ways. In pre-history these disruptions took place over hundreds of years. With the time compression induced by our rapidly advancing technology, they can now take place in less than a generation. We are currently at the edge of one such event. In ten years robotic systems will fly our planes, grow our food, explore space, discover life-saving drugs, fight our wars, sweep our homes and deliver our babies. In the process, this robotics-driven disruptive event will create a new 200 billion dollar global industry and change life as you now know it, forever. Just as my children cannot imagine a world without electricity, your children will never know a world without robots. Come take a bold look at the future and the opportunities for Mechanical Engineers that wait there.
WHAT'S NEW @ IEEE IN CIRCUITS March 2006
Power consumption was a hot topic at February's 2006 FPGA, most notably the difference between FPGAs (Field-Programmable Gate Arrays) and ASICs (Application-Specific Integrated Circuits). Many researchers argued that ASICs are more practical than FPGAs, but not everyone agreed. One presenter, Tim Tuan of Xilinx Inc., said his company is building a low-power architecture based on the company's Spartan 3 fabric that will apply such optimizations as voltage scaling, power gating, low-leakage configuration memory and sleep mode. This Pika architecture is said to produce 46 percent less active power and 99 percent less standby power than the baseline Spartan 3. Pika claims to lessen the problem of dissipating standby power milliamps, bringing FPGAs into an acceptable range for mobile, battery-powered products. Despite the arguments that these advancements will improve FPGAs' marketability, other researchers were not so sure, arguing that despite the scaling, FPGAs are still 20 times more power-hungry than ASICs. For more on this and other topics discussed at the conference, visit: http://www.powermanagementdesignline.com/news/181400903
IEEE Tech Alert for 15 Mar 2006
Passing a significant milestone, the IEEE 802.11 working group has announced its adoption of a proposed basis for a standard that will extend Wi-Fi wireless distribution by means of mesh points. In a mesh network, computers become transceivers, forwarding packets of data for other nearby computers on the network. By sending packets only as far as the next computer, instead of a distant base station, meshed computers use less power, emit fewer interfering signals, and have higher data rates.
The 802.11s extension will mesh the intermediate access points in a network, but not each and every individual computer, to somewhat boost the performance and efficiency of Wi-Fi systems.
For further information, go to the IEEE Standards Web site at: http://standards.ieee.org
Bob: Mesh Robots?
4. New handheld helps reduce stress
Considering the proliferation of handheld devices, all with their own little alerts and alarms, it may seem that stress is getting worse, not better. But now a new handheld device promises to help. A sleek, solid, handheld biofeedback device called the StressEraser, is designed as an aid for deep breathing exercises, which are commonly prescribed to alleviate stress. The device tells you just when to inhale and when to stop.
See Calm in Your Palm, by Samuel K. Moore: http://www.spectrum.ieee.org/mar06/3044
5. Prototype planetary rover tested in Chilean desert
A hardy band of researchers has braved freezing nights, bad food, and high winds in the Chilean desert to test a rover that could be the prototype for the next generation of vehicles to explore the surface of the Moon or Mars. Weighing in at 180 kilograms, the rover, dubbed Zoë, looks something like a motorized, overgrown ice cream cart. But it is beautiful in the one way that really matters to planetary scientists: unlike all the rovers built thus far, Zoë can roam autonomously.
See Halfway to Mars, by Jean Kumagai: http://www.spectrum.ieee.org/mar06/3059
Sunday, March 12, 2006
My talk this week
Based on the idea presented in "Affine Structure From Sound", I am going to show a simulation in which, given a structured microphone array and some arbitrary audio events, the system reconstructs the locations of the microphones and of the audio events.
reference:
S. Thrun. Affine Structure From Sound. In Proceedings of the 2005 Conference on Neural Information Processing Systems (NIPS). MIT Press, 2006.
link
My talk this week
Author: Nelson L. Chang
From: IEEE International Workshop on Projector-Camera Systems (PROCAMS) 2003
Abstract:
Establishing reliable dense correspondences is crucial for many 3-D and projector-related applications. This paper proposes using temporally encoded patterns to directly obtain the correspondence mapping between a projector and a camera without any searching or calibration. The technique naturally extends to efficiently solve the difficult multiframe correspondence problem across any number of cameras and/or projectors. Furthermore, it automatically determines visibility across all cameras in the system and scales linearly in computation with the number of cameras. Experimental results demonstrate the effectiveness of the proposed technique for a variety of applications.
[PDF]
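To show what "temporally encoded patterns" can look like, here is a small sketch that encodes each projector column as a sequence of on/off frames and decodes the column from the bits one camera pixel observes over time. It assumes Gray-coded binary stripes, a common choice in structured light; the exact patterns used in the paper may differ. Thresholding of real camera images is left out, and all function names are made up.

```python
def gray_encode(n):
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)

def gray_decode(g):
    """Inverse of gray_encode."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def make_patterns(width, bits):
    """patterns[k][col] is 1 if projector column `col` is lit in frame k
    (most significant bit first)."""
    return [[(gray_encode(col) >> (bits - 1 - k)) & 1 for col in range(width)]
            for k in range(bits)]

def decode_pixel(observed_bits):
    """Recover the projector column from the on/off sequence seen at one
    camera pixel over the frames; no search over the image is needed."""
    g = 0
    for b in observed_bits:
        g = (g << 1) | b
    return gray_decode(g)

# Simulate a camera pixel that looks at projector column 5 of 8.
patterns = make_patterns(width=8, bits=3)
seen = [frame[5] for frame in patterns]
column = decode_pixel(seen)
```

Because every pixel is decoded on its own from its own time sequence, the cost grows linearly with the number of pixels and cameras, which matches the scaling claim in the abstract.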