Speaker: Eyal Amir, Computer Science Department, University of Illinois, Urbana-Champaign
Date: Friday, March 10 2006
Time: 3:00PM to 4:00PM
Location: Seminar Room 32-G449 (Kiva/Patil)
Host: Professor Leslie Kaelbling, MIT CSAIL
Contact: Teresa Cataldo, 617-452-5005, cataldo@csail.mit.edu
Many complex domains offer limited information about their exact state and the way actions affect them. There, agents need to learn action models to act effectively, at the same time that they track the state of the domain.
In this presentation I will describe polynomial-time algorithms for learning logical models of actions' effects and preconditions in deterministic partially observable domains. These algorithms represent the set of possible action models compactly, and update it after every action execution and partial observation. This approach is the first tractable learning algorithm for partially observable dynamic domains. I will mention recent extensions of this work to relational domains, and will also discuss potential applications of this work to agents playing adventure games and to active web mining.
Relevant papers:
Partially Observable Deterministic Action Models, IJCAI'05.
Learning partially observable action models, CogRob'04, part of ECAI'04.
This Blog is maintained by the Robot Perception and Learning lab at CSIE, NTU, Taiwan. Our scientific interests are driven by the desire to build intelligent robots and computers, which are capable of servicing people more efficiently than equivalent manned systems in a wide variety of dynamic and unstructured environments.
Saturday, March 11, 2006
CMU FRC talk: Recognizing Things in Images
Speaker: Martial Hebert, Professor, Robotics, Carnegie Mellon University
Date: Thursday, March 16, 2006
Abstract: Finding things (objects, regions, events) in images or video is the main objective of computer vision. A common view of this problem is to start with tentative labeling of parts of the image as possible locations of objects or possible types of regions, followed by reasoning about context and relations between these elements. We have been working on tools for recognition that are very effective for this type of task. In particular, we have developed tools for representing relations between scene elements (features, objects, regions, etc.) and for enforcing geometric constraints between elements.
I'll review some of the recent results in using relations and context between image elements for recognition and classification. The applications include finding salient structures in images, recognizing individual objects, and segmenting the image into labeled regions. Most of the examples deal with single images, but the techniques can also be used for recognizing things in video sequences, and I will show some results in this area from a recently completed project. Applications include detection of useful landmarks, object localization for navigation and manipulation, and surveillance.
Speaker Bio: Martial Hebert is Professor at the Robotics Institute, Carnegie Mellon University. He has led many major computer vision and robotics projects, funded by DARPA, NASA, NSF, ONR, DOE, and industry. Prof. Hebert has worked in multiple areas of robotics: computer vision, autonomous mobile robots, and sensors. His current research interests include object recognition in images, video, and range data, scene understanding using context representations, and model construction from images and 3-D data. His group has explored applications in the areas of autonomous mobile robots, both in indoor and in unstructured, outdoor environments, automatic model building for 3D content generation, and video monitoring. He has published more than 150 technical papers and reports in these areas.
Thursday, March 09, 2006
CMU & MIT talk: Visual classification by a hierarchy of semantic fragments
Boris Epshtein, Weizmann Institute
CVPR 2005 Oral paper
We describe a novel technique for identifying semantically equivalent parts in images belonging to the same object class (e.g., eyes, license plates, aircraft wings). The visual appearance of such object parts can differ substantially, and therefore traditional image similarity-based methods are inappropriate for this task. The technique we propose is based on the use of common context. We first retrieve context fragments, which consistently appear together with a given input fragment in a stable geometric relation. We then use the context fragments in new images to infer the most likely position of equivalent parts. Given a set of image examples of objects in a class, the method can automatically learn the part structure of the domain: it identifies the main parts and how their appearance changes across objects in the class. Two applications of the proposed algorithm are shown: the detection and identification of object parts, and object recognition.
PDF Slides
What's New @ IEEE in Communications, March 2006
7. REMOTE 'WEAR AND TEAR' SENSORS BEING DEVELOPED
A new type of wireless sensor is being developed to remotely monitor mechanical parts and systems such as gearboxes, engines, and door mechanisms, to predict machinery and transportation breakdowns, according to scientists at the University of Manchester. Developers say the sensors could be in service in the next four years, and would greatly reduce maintenance costs in the manufacturing, automotive and plant machinery industries by predicting when parts require maintenance or need replacing before the machinery fails. Different kinds of sensors would measure a range of selected parameters, such as vibration, temperature, and pressure, or the concentrations of metallic elements in lubricating oil created through machinery wear and tear. Read more: the link
8. RFID TAGS CAN BE HACKED USING CELL PHONES, RESEARCHER SAYS
Passwords for the most popular brand of RFID tags can be obtained using a directional antenna and digital oscilloscope to monitor power used by the tags while they are being read, according to a cryptographer and professor of computer science at the Weizmann Institute. Patterns in power use could be analyzed to determine when the tag received correct and incorrect password bits, according to the researcher, who said the brand of RFID tag he tested was "totally unprotected," and that a cell phone has all the ingredients necessary to compromise all RFID tags in its immediate vicinity. Read more: the link
9. NASA'S NEW SOFTWARE GETS COMPUTERS THINKING TOGETHER
A new NASA computer program that operates as a collective on many computers at once has designed an antenna that will be launched into space to study the Earth's magnetosphere. The revolutionary AI program uses Darwin's theory of evolution to determine what the best outcome will be for a given project. To create the antenna in question, attributes of thousands of antennae were given to the program. Eighty computers then combined their "brains" over a period of ten hours, a significantly smaller amount of time than could have been achieved by humans, to create an optimal design. The resulting antenna looks like a bent paperclip and can receive commands and send data to Earth. The writers of the program say the evolutionary AI software can invent and create new structures, computer chips and various other machines, and it can operate on up to 120 personal computers at once. Read more: the link
What's New @ IEEE for Students, March 2006
2. 3-D TECHNOLOGIES FOCUS OF "PROCEEDINGS OF THE IEEE" SPECIAL ISSUE
The March 2006 special issue of "Proceedings of the IEEE" (v. 94, no. 3) examines the broad subject of three-dimensional (3-D) imaging, display and visualization technologies. Writing in their introduction to the issue, Guest Editors Bahram Javidi and Fumio Okano say that 3-D technologies are "important applications of information systems in a society that is increasingly dependent on the presentation of information." Overview papers in this issue present the fundamental ideas, theory, experiments and application of some leading 3-D technologies, illustrated with examples, simulations and experiment results. A preview is available online:
http://www.ieee.org/web/publications/procieee/current.html
7. PROTOTYPING FOR HUMAN INTERACTION
How can inventors determine if their designs are a good fit for human users? A new program called d.Tools prototypes consumer products by blending the interactive components and physical design interface with the software's intents and purposes. The program's developers argue that many devices fail because the inventors try to mimic or supplant physical attributes with computer software, forgetting that people are inherently hands-on. They hope d.Tools will help bring about new technologies that are more in tune with what humans want. Read more:
http://www.physorg.com/news11112.html
8. INCREASED USE OF BIOMETRICS SEEN TO STOP IDENTITY THEFT
Biometrics, such as the digital record of an individual's fingerprints or iris patterns, are increasingly being used as a more secure way to confirm user identity in a variety of systems, writes Alfred C. Weaver in the current issue of "Computer" (v. 39, no. 2). Weaver identifies three broad classes of personal identification: what an individual knows (such as a password); what an individual carries (such as an ID card); and who an individual is (based on fingerprints, DNA, or some other physical or behavioral measurement). Of the three, biometric identification is the most reliable proof of identity, Weaver says, and is being implemented in more and more places like airports and border crossings, where the stakes are highest for positive identification. According to Weaver, the security of biometric identification is highly dependent upon who is collecting the data, and on the data being stored as mathematical templates so that it cannot be used to recreate the users' identifying characteristics. Read more: the link
Monday, March 06, 2006
Paper: Distributed Localization of Networked Cameras
Authors: Stanislav Funiak, Carlos Guestrin, Mark Paskin, Rahul Sukthankar
Conf: IPSN 2006
Abstract:
Camera networks are perhaps the most common type of sensor network and are deployed in a variety of real-world applications including surveillance, intelligent environments and scientific remote monitoring. A key problem in deploying a network of cameras is calibration, i.e., determining the location and orientation of each sensor so that observations in an image can be mapped to locations in the real world. This paper proposes a fully distributed approach for camera network calibration. The cameras collaborate to track an object that moves through the environment and reason probabilistically about which camera poses are consistent with the observed images. This reasoning employs sophisticated techniques for handling the difficult nonlinearities imposed by projective transformations, as well as the dense correlations that arise between distant cameras. Our method requires minimal overlap of the cameras' fields of view and makes very few assumptions about the motion of the object. In contrast to existing approaches, which are centralized, our distributed algorithm scales easily to very large camera networks. We evaluate the system on a real camera network with 25 nodes as well as simulated camera networks of up to 50 cameras and demonstrate that our approach performs well even when communication is lossy.
PDF Movies
Sunday, March 05, 2006
My talk this week
An Introduction to the Kalman Filter
Author : Greg Welch, Gary Bishop
Introduction
The Kalman filter is a mathematical power tool that is playing an increasingly important role in computer graphics as we include sensing of the real world in our systems. The good news is you don’t have to be a mathematical genius to understand and effectively use Kalman filters. This tutorial is designed to provide developers of graphical systems with a basic understanding of this important mathematical tool.
Link
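To make the predict/correct cycle of the filter concrete, here is a minimal sketch for the simplest case: estimating a single constant value from noisy readings. It assumes a scalar state that does not change between steps; the function name `kalman_1d` and the parameters `q` (process noise variance), `r` (measurement noise variance), `x0` and `p0` (initial estimate and its variance) are illustrative choices, not taken from the tutorial.

```python
def kalman_1d(measurements, q=1e-5, r=0.01, x0=0.0, p0=1.0):
    """Estimate a constant scalar from noisy measurements.

    q, r, x0 and p0 are hypothetical tuning values chosen for illustration.
    """
    x, p = x0, p0          # current estimate and its error variance
    estimates = []
    for z in measurements:
        # Time update (predict): the state is assumed constant,
        # so only the uncertainty grows, by the process noise q.
        p = p + q
        # Measurement update (correct): blend prediction and measurement.
        k = p / (p + r)            # Kalman gain
        x = x + k * (z - x)        # corrected estimate
        p = (1.0 - k) * p          # corrected error variance
        estimates.append(x)
    return estimates
```

Calling `kalman_1d(readings)` returns one estimate per reading. A larger `r` makes the filter trust each new measurement less, so the estimate moves more slowly.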
Saturday, March 04, 2006
lab meeting : Rigid-Body Alignment
Rigid-Body Alignment
Abstract:
This section of the course covers techniques for pairwise (i.e., scan-to-scan) and global (i.e., involving more than 2 scans) alignment, given that the algorithms are constrained to obtain a rigid-body transformation.
ICCV 2005 Short Course: 3D Scan Matching and Registration
Szymon Rusinkiewicz, Princeton University
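As a small illustration of what "constrained to obtain a rigid-body transformation" means for a pair of scans, the sketch below computes the least-squares rotation and translation between two 2D point sets. It assumes the point correspondences are already known and that the data is 2D; both are simplifications made here, not statements about the course material. The function name `align_rigid_2d` is hypothetical.

```python
import math

def align_rigid_2d(src, dst):
    """Least-squares rigid alignment of paired 2D points (dst ~ R * src + t).

    Assumes src[i] corresponds to dst[i]; returns (theta, (tx, ty)).
    """
    n = len(src)
    # Centroids of both point sets.
    cxs = sum(p[0] for p in src) / n
    cys = sum(p[1] for p in src) / n
    cxd = sum(q[0] for q in dst) / n
    cyd = sum(q[1] for q in dst) / n
    # Accumulate the cosine and sine terms of the centered cross-covariance.
    cos_term = 0.0
    sin_term = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        px, py = px - cxs, py - cys
        qx, qy = qx - cxd, qy - cyd
        cos_term += px * qx + py * qy
        sin_term += px * qy - py * qx
    # The optimal rotation angle maximizes cos_term*cos(t) + sin_term*sin(t).
    theta = math.atan2(sin_term, cos_term)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    tx = cxd - (c * cxs - s * cys)
    ty = cyd - (s * cxs + c * cys)
    return theta, (tx, ty)
```

When correspondences are not known in advance, iterative methods such as ICP alternate between guessing correspondences and re-solving a step of this kind.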
Friday, March 03, 2006
CMU FRC talk: Parameterizing Deformable Systems to Tame Complexity
Speaker: Doug James, Assistant Professor, Computer Science and Robotics, Carnegie Mellon University
Date: Thursday, March 9, 2006
Abstract: The complexity and beauty of physical deformation phenomena in our lives is truly amazing. It fundamentally affects our appearance (skin, hair, clothing), our composition (protein folding), the sounds we make (talking, clapping), beauty in nature (irises blowing in the wind), our creations (aerospace design), and important decisions (surgical intervention). Computer modeling of deformation has made enormous progress, but the complexity of the world is humbling. We still do not know how to create immersive, realistic, real-time computer simulations of our ever-changing and deforming world.
In this talk, I will discuss our recent work on data-driven approaches for preprocessing and parameterizing deformable systems to enable greater interactivity. These techniques exploit the structure of deformable motion to build efficient output-sensitive algorithms in several key areas: subspace dynamics integration, output-sensitive collision processing, haptic force-feedback rendering, dynamic illumination modeling, and hardware-accelerated mesh animation.
CMU VASC talk: A Spectral Technique for Correspondence Problems Using Pairwise Constraints
Marius Leordeanu
Monday, March 6, 2006
Abstract:
We present an efficient spectral method for finding consistent correspondences between two sets of features. We build the adjacency matrix M of a graph whose nodes represent the potential correspondences and the weights on the links represent pairwise agreements between potential correspondences. Correct assignments are likely to establish links among each other and thus form a strongly connected cluster. Incorrect correspondences establish links with the other correspondences only accidentally, so they are unlikely to belong to strongly connected clusters. We recover the correct assignments based on how strongly they belong to the main cluster of M, by using the principal eigenvector of M and imposing the mapping constraints required by the overall correspondence mapping (one-to-one or one-to-many). The experimental evaluation shows that our method is robust to outliers and accurate in terms of matching rate, while being several orders of magnitude faster than existing methods.
Short Bio: Marius Leordeanu received a double BA in Mathematics and Computer Science from Hunter College of The City University of New York. From 2002 to 2003 he worked in the vision lab at Hunter College in the area of 3D registration and modeling. Since 2003, he has been a PhD student at the Robotics Institute of Carnegie Mellon University. At CMU his main research focuses on object recognition.
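The idea in the abstract above (principal eigenvector of M, then mapping constraints) can be sketched in a few lines. This is an illustration under assumptions rather than the authors' implementation: the eigenvector is found by plain power iteration, the one-to-one constraint is enforced greedily, and the names `principal_eigenvector` and `spectral_match` are hypothetical.

```python
def principal_eigenvector(M, iters=200):
    """Power iteration on a symmetric, nonnegative matrix given as nested lists."""
    n = len(M)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def spectral_match(candidates, M, eps=1e-12):
    """Pick one-to-one correspondences from candidate (i, j) pairs.

    M[a][b] is the pairwise agreement between candidates a and b.
    The greedy selection below is an assumed way to impose the constraint.
    """
    v = principal_eigenvector(M)
    # Visit candidates from strongest to weakest cluster membership.
    order = sorted(range(len(candidates)), key=lambda a: v[a], reverse=True)
    used_i, used_j, accepted = set(), set(), []
    for a in order:
        if v[a] <= eps:
            break  # remaining candidates do not belong to the main cluster
        i, j = candidates[a]
        if i in used_i or j in used_j:
            continue  # conflicts with an already accepted correspondence
        accepted.append((i, j))
        used_i.add(i)
        used_j.add(j)
    return accepted
```

For example, with candidates `[(0, 0), (1, 1), (2, 2), (0, 1), (1, 0)]`, where the first three agree strongly with each other and the last two agree only weakly with each other, the first three dominate the eigenvector and the conflicting pairs are rejected.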
Monday, February 27, 2006
My talk this week
Probabilistic Cooperative Localization and Mapping in Practice
Author: Ioannis Rekleitis, Gregory Dudek and Evangelos Milios
From: International Conference on Robotics and Automation, 2003.
Abstract:
In this paper we present a probabilistic framework for the reduction in the uncertainty of a moving robot pose during exploration by using a second robot to assist. A Monte Carlo Simulation technique (specifically, a Particle Filter) is employed in order to model and reduce the accumulated odometric error. Furthermore, we study the requirements to obtain an accurate yet timely pose estimate. A team of two robots is employed to explore an indoor environment in this paper, although several aspects of the approach have been extended to larger groups. The concept behind our exploration strategy has been presented previously and is based on having one robot carry a sensor that acts as a “robot tracker” to estimate the position of the other robot. By suitable use of the tracker as an appropriate motion-control mechanism we can sweep areas of free space between the stationary and the moving robot and generate an accurate graph-based description of the environment. This graph is used to guide the exploration process. Complete exploration without any overlaps is guaranteed as a result of the guidance provided by the dual graph of the spatial decomposition (triangulation) of the environment. We present experimental results from indoor experiments in our laboratory and from more complex simulated experiments.
Paper: Cooperative Localization and Multi-Robot Exploration
Related Materials: Particle Filter Tutorial for Mobile Robots
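For readers new to particle filters, the sketch below shows one predict/weight/resample cycle for a robot moving along a line. It is a simplified illustration, not the paper's system: the pose is one-dimensional, the "tracker" is assumed to measure the moving robot's position directly, and the names `particle_filter_step`, `motion_std` and `meas_std` are hypothetical.

```python
import math
import random

def particle_filter_step(particles, control, measurement, rng,
                         motion_std=0.2, meas_std=0.1):
    """One particle filter update for a 1D pose.

    particles: list of pose hypotheses; control: commanded displacement;
    measurement: observed position (assumed to come from the tracker);
    rng: a random.Random instance, passed in for reproducibility.
    """
    # Predict: apply the odometry reading with added noise to every particle.
    moved = [x + control + rng.gauss(0.0, motion_std) for x in particles]
    # Weight: Gaussian likelihood of the measurement given each hypothesis.
    weights = [math.exp(-0.5 * ((measurement - x) / meas_std) ** 2)
               for x in moved]
    if sum(weights) == 0.0:
        return moved  # measurement rules out every particle; keep prediction
    # Resample: draw a new population in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))
```

Repeating this step after every move keeps the particle cloud concentrated near the measured pose, which is how accumulated odometric error is kept in check in this toy setting.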
Nature: Efficient auditory coding
Evan C. Smith & Michael S. Lewicki, CMU
Nature 439, 978-982 (23 February 2006)
The auditory neural code must serve a wide range of auditory tasks that require great sensitivity in time and frequency and be effective over the diverse array of sounds present in natural acoustic environments. It has been suggested that sensory systems might have evolved highly efficient coding strategies to maximize the information conveyed to the brain while minimizing the required energy and neural resources. Here we show that, for natural sounds, the complete acoustic waveform can be represented efficiently with a nonlinear model based on a population spike code. In this model, idealized spikes encode the precise temporal positions and magnitudes of underlying acoustic features. We find that when the features are optimized for coding either natural sounds or speech, they show striking similarities to time-domain cochlear filter estimates, have a frequency-bandwidth dependence similar to that of auditory nerve fibres, and yield significantly greater coding efficiency than conventional signal representations. These results indicate that the auditory code might approach an information theoretic optimum and that the acoustic structure of speech might be adapted to the coding capacity of the mammalian auditory system.
[PDF]
Sunday, February 26, 2006
MIT talk: Hierarchical Abstractions for Planning & Control of Robotic Swarms
Speaker: Calin Belta, Boston University
Date: Tuesday, February 28 2006
Host: Daniela Rus, MIT
Abstract:
Specifying, planning, and controlling the motion of large groups of mobile agents (swarms) are difficult problems that have received a lot of attention in recent years. I will present some recent results on reducing the dimension and complexity of such problems by defining abstractions. First, I will focus on continuous abstractions, which are obtained by extracting a small set of essential features of a swarm that can be used for planning and control. Second, I will show how discrete abstractions can be used to construct a finite-dimensional description of the problem. Third, I will present an example in which the above two types of abstractions are seamlessly linked into a hierarchical abstraction framework, in which high-level swarm specifications given as temporal logic formulas over features of interest are automatically converted into provably correct robot control laws.
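One way to picture a continuous abstraction is to summarize a planar swarm by its centroid and covariance, then steer the swarm through that summary. The sketch below is an illustration under assumptions, not the speaker's method: the choice of features, the function names, and the proportional control step are all hypothetical.

```python
def centroid(positions):
    """Mean position of the swarm: the 'where is it' feature."""
    n = len(positions)
    return (sum(x for x, _ in positions) / n, sum(y for _, y in positions) / n)


def covariance(positions):
    """Entries (sxx, sxy, syy) of the 2x2 position covariance: the 'shape' feature."""
    mx, my = centroid(positions)
    n = len(positions)
    sxx = sum((x - mx) ** 2 for x, _ in positions) / n
    sxy = sum((x - mx) * (y - my) for x, y in positions) / n
    syy = sum((y - my) ** 2 for _, y in positions) / n
    return (sxx, sxy, syy)


def step_toward(positions, target, gain=0.5):
    """Give every robot the same displacement, proportional to the centroid error.

    A common displacement moves the centroid toward the target while leaving
    the covariance (the swarm's shape feature) unchanged.
    """
    mx, my = centroid(positions)
    ux, uy = gain * (target[0] - mx), gain * (target[1] - my)
    return [(x + ux, y + uy) for x, y in positions]


swarm = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
moved = step_toward(swarm, target=(5.0, 1.0))
```

The planner here reasons about five numbers (two for the centroid, three for the covariance) regardless of how many robots are in `swarm`, which is the dimension reduction the abstract refers to.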
MIT talk: Medical Image Registration in Healthcare, Biomedical Research and Drug Discovery
Speaker: Daniel Rueckert, Imperial College London
Date: Tuesday, February 28 2006
Contact: Polina Golland, x38005, polina@csail.mit.edu
Abstract:
Imaging technologies are developing at a rapid pace allowing for in-vivo 3D and 4D imaging of the anatomy and physiology in humans and animals. This is opening up unprecedented opportunities for research and clinical applications ranging from imaging for drug discovery and delivery, over imaging for diagnosis and therapy, to imaging for basic research such as brain mapping. In this talk we will focus on how computational techniques based on non-rigid image registration can be used to address the image analysis challenges in healthcare, biomedical research and drug discovery.
Saturday, February 25, 2006
CMU FRC talk: Online and Structured Learning Techniques for Outdoor Robotics
Speaker: Drew Bagnell, Research Scientist, Robotics Institute
Date: Thursday, March 2, 2006
Abstract:
This presentation is based on joint work with Nathan Ratliff, Boris Sofman, Ellie Lin, Nicolas Vandapel, and Anthony Stentz.
Programming behaviors for outdoor mobile robot navigation is hard. Machine learning promises to alleviate this difficulty, but existing techniques often fall short. For instance, some features, while potentially powerful for improving navigation, prove difficult to profit from because they generalize poorly to novel situations. Overhead imagery data, for instance, has the potential to greatly enhance autonomous robot navigation in complex outdoor environments. In practice, reliable and effective automated interpretation of imagery from diverse terrain, environmental conditions, and sensor varieties proves challenging. I'll discuss online, probabilistic models to effectively learn to use these scope-limited features by leveraging other features that, while perhaps otherwise more limited, generalize reliably.
I'll also discuss work on mobile robot learning based on demonstrated trajectories. This is a natural and potentially powerful approach to teaching a system. Unfortunately, most existing techniques to learn based on demonstrated trajectories face at least two important difficulties. First, it is very hard to get "negative examples" in this framework; we can't actually drive the robot off a cliff or into a boulder. Secondly, it is very difficult to acquire long-horizon and goal-directed behavior by imitating a trainer. I'll talk about a new approach that addresses both concerns. It learns to map features of the world into costs for a planner in such a way that the resulting optimal plans mimic the trainer's behavior. This approach is powerful, as the behavior that a designer wishes the planner to execute is often clear, while specifying costs that engender this behavior is often much more difficult.
CMU ML talk: Machine Learning in TAC SCM (Trading Agent Competition in Supply Chain Management)
Speaker: Michael Benisch, COS, CMU. http://www.cs.cmu.edu/~mbenisch/
Date: February 27
Abstract:
Supply chains aid the manufacturing of many complex goods. Traditionally, supply chains have been maintained by human negotiators through long-term, static contracts, despite uncertain and dynamic market conditions. However, there has been growing interest recently, from both industry and academia, in the potential for automating more efficient supply chain processes. The TAC SCM (Trading Agent Competition in Supply Chain Management) scenario is an international competition that provides a research platform facilitating the application of new academic technologies to the problem of managing a dynamic supply chain. Since the inception of TAC SCM, machine learning has emerged as an essential aspect of successful agent design. Many agents, such as Carnegie Mellon's 2005 entry, CMieux, utilize learning techniques to estimate market conditions and model opponent behavior. In this talk, we will discuss some specific learning problems faced by these agents, including the problem of forecasting future demand, the problem of predicting auction closing prices, and the problem of approximating supply availability. We will also discuss various solutions developed by researchers to address them, including a new extension of M5 regression trees used by CMieux, called distribution trees.
CMU thesis proposal: Real-time Planning for Single Agents and Multi-agent Teams in Unknown and Dynamic Environments
David Ferguson, Robotics Institute, Carnegie Mellon University
3 Mar 2006
Abstract
As autonomous agents make the transition from solving simple, well-behaved problems to being useful entities in the real world, they must deal with the added complexity and uncertainty inherent in real environments. In particular, agents navigating through the real world can be confronted with incomplete or imperfect information (e.g. when prior maps are absent or incomplete), large state spaces (e.g. for robots with several degrees of freedom or teams of robots), and dynamic elements (e.g. when there are humans or other agents in the environment). In this work, we propose to address the problem of path planning and replanning in both static and dynamic environments for which prior information may be incomplete or imperfect. We intend to develop a set of planning algorithms that will enable single agents and multi-agent teams to operate more effectively in a wider range of realistic scenarios.
A copy of the thesis proposal document can be found at http://gs2045.sp.cs.cmu.edu/downloads/proposal.pdf.
Thursday, February 23, 2006
What's New @ IEEE in Wireless, February 2006
4. WHEELED NETWORKS REQUIRE NEW SECURITY SOLUTIONS
Greater numbers of vehicles equipped for wireless networking present new security challenges due to the short contact times between different mobile nodes and the large size of the networks, according to researchers studying the issue. The German-funded Network on Wheels (NoW) project incorporates security considerations into network development. Researchers say those concerns include continuous system availability (a system is robust even in the presence of malicious or faulty nodes); privacy, including un-traceability of actions to a user and un-linkability of the actions of a node; and secure communication. Current work on NoW includes detecting attacks on the different parts of the system and estimating both their impact and probability, researchers say. Read more: http://www.primidi.com/2006/02/01.html
7. WIRELESS RESCUE SYSTEM TO BE TESTED IN U.S. MINES
Wireless systems that locate trapped miners and send them text messages are being tested by the U.S. Mine Safety and Health Administration (MSHA), including one system which pinpoints the location of individual miners, according to researchers. One of the systems uses a transmitter worn by miners that sends out a signal unique to each individual, researchers say, while another device is a personal receiver that allows rescuers to send text messages to the miners. Both technologies operate on a network of wireless radio transmitters installed in the tunnels, and were developed by the Australian firm Mine Site Technologies. Read more: http://www.physorg.com/news10522.html
13. ENGLAND'S WINES PROJECT AGING NICELY
Four groups funded by England's Wired and Wireless Networked Systems (WINES) program -- which studies the creation of massive-scale ubiquitous and pervasive computing environments -- are examined in this month's issue of IEEE Distributed Systems Online. TIME-EACM, a collaboration between the University of London and Birkbeck College, is studying how wired and wireless systems can improve traffic flow and congestion in urban areas. BiosensorNet, comprised of several teams from Imperial College London, hopes to improve the medical industry with state-of-the-art wireless sensors implanted in the body. Cityware, a project including the University of Bath, Imperial College London, and University College London, is studying how new integrated information systems placed in architecture will affect people's relationships with their environment. Finally, NEMO, comprised of departments at Lancaster University, is looking at embedding sensors in everyday objects -- called smart artifacts -- in order to enable physical entities to capture and share their "experiences." A new round of WINES funding is set to be unleashed next month. Read more: the link
PASCAL Visual Object Classes Recognition Challenge 2006
Subject: PASCAL Visual Object Classes Recognition Challenge 2006
Date: Fri, 17 Feb 2006 20:32:18 GMT
From: Andrew Zisserman
Dear All,
We are running a second PASCAL Visual Object Classes Recognition Challenge. This time there are more classes (ten), more challenging images, and the possibility of confusion between classes with similar visual appearance (cars/bus, bicycle/motorbike).
As before, participants can recognize any or all of the classes, and there is a classification and a detection track.
The development kit (Matlab code for evaluation, and baseline algorithms) and training data are now available at:
http://www.pascal-network.org/challenges/VOC/voc2006/index.html
where further details are given. The timetable of the challenge is included below.
It would be great if each of you or your groups could participate.
Best wishes,
Andrew Zisserman
Mark Everingham
Chris Williams
Luc Van Gool
TIMETABLE
* 14 Feb 2006: Development kit (training and validation data plus evaluation software) made available.
* 31 March 2006: Test set made available
* 21 April 2006: DEADLINE for submission of results
* 7 May 2006: Half-day (afternoon) challenge workshop to be held in conjunction with ECCV06, Graz, Austria.
IEEE Career Alert: Tech Jobs Are Jumping
3. Start Up, Not at the Bottom
The latest trend in entry-level jobs is to avoid them altogether. More and more recent college graduates are heading start-up businesses, writes The Boston Globe. In fact, at high-powered schools like Harvard and Carnegie Mellon, thirty to forty percent of students create their own companies within five years of graduating. For students thinking of leaping to the top of their own corporate ladder, certain skills may come in handy. For one, they may have to learn to market themselves. More advice can be found at:
What's New @ IEEE in Signal Processing, February 2006
2. RADAR ON THE SCOPE OF "SIGNAL PROCESSING MAGAZINE" SPECIAL ISSUE
The latest issue of "IEEE Signal Processing Magazine" (v. 23, no. 1) includes a feature section on knowledge-based systems for adaptive radars. Topics covered include knowledge-based systems for adaptive radar, cognitive radar, and space-time adaptive processing, as well as several others. The table of contents and abstracts for all articles are available online, where subscribers may also access the full text of all papers: http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=33529
Also now online, the latest issue of "IEEE Signal Processing Letters" (v. 13, no. 3), covering signal modification for ADPCM based on analysis-by-synthesis framework, a new gradient search interpretation of super-exponential algorithms among other topics: http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=33543
7. AIMING FOR MORE ACCURATE FISH POPULATION COUNTS
Many environmentalists and scientists believe the world's fish populations are shrinking, and new developments in signal processing technology seek to arm researchers with techniques that provide more accurate fish population data. Off the coast of Monterey, California, USA, a team of scientists demonstrated a new sonar technique to detect squid egg clusters in the ocean's depths. By towing a sidescan sonar with the California State University Seafloor Mapping Lab's research vessel, the team was able to conduct experiments that tested various ways to tune sound wave frequencies. After signals were drawn out, the sound data was translated into sonar images in the form of seafloor maps which displayed where egg clusters could be found, providing a portrayal of future populations. Meanwhile, researchers at the Massachusetts Institute of Technology have created a remote sensor system that allows scientists to monitor large fish populations over a 10,000-square-kilometer area. While old surveying methods provide a smaller amount of data with high-frequency sonar beams, this new system employs low-frequency sonar beams that can travel farther distances, bringing data back in sharper detail through less intense signals. Read more about these developments:
http://www.eurekalert.org/pub_releases/2006-02/miot-oft013006.php
& http://www.eurekalert.org/pub_releases/2006-02/whoi-nsm020706.php
10. WORLD'S FASTEST CAMERA TO CATCH TRACES OF ELUSIVE PARTICLE
The Regional Calorimeter Trigger, the world's fastest image processor, can analyze a billion proton collisions per second, according to its developers at the University of Wisconsin-Madison, and will be used in the Large Hadron Collider (LHC) in Geneva, Switzerland, to capture traces of the subatomic Higgs boson. The US$6 million device is composed of integrated circuits on 300 parallel processing computer cards, researchers say, creating a massive image processor capable of analyzing one trillion bits of data per second. The Higgs boson is one of the particles researchers say is necessary to complete the standard model of physics, the evidence for which has been sought for 20 years. When protons crash in a collider, the event lasts no more than two-billionths of a second, according to researchers. Read more:
http://www.physorg.com/news10589.html
and http://www.primidi.com/2006/02/08.html#a1436
Wednesday, February 22, 2006
Fast Extrinsic Calibration of a Laser Rangefinder to a Camera
Ranjith Unnikrishnan, Martial Hebert
Abstract:
External calibration of a camera to a laser rangefinder is a common prerequisite on today’s multi-sensor mobile robot platforms. However, the process of doing so is relatively poorly documented and almost always time-consuming. This document outlines an easy and portable technique for external calibration of a camera to a laser rangefinder. It describes the usage of the Laser-Camera Calibration Toolbox (LCCT), a Matlab-based graphical user interface that is meant to accompany this document and facilitates the calibration procedure. We also summarize the math behind its development.
[Link]
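To show what an extrinsic calibration is used for once it has been estimated, the sketch below maps a laser point into the camera frame with a rotation and translation, then projects it with a pinhole model. It is not the LCCT interface, which is Matlab-based; the function names and the intrinsic values (`fx`, `fy`, `cx`, `cy`) are assumptions chosen for illustration, and lens distortion is ignored.

```python
def laser_to_camera(point, rotation, translation):
    """Apply p_cam = R * p_laser + t, with R given as a 3x3 nested list."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection to pixel coordinates; None if the point is behind the camera."""
    x, y, z = point_cam
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical extrinsics: laser and camera frames aligned, no offset.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
NO_OFFSET = (0.0, 0.0, 0.0)

laser_point = (1.0, 0.5, 2.0)  # metres, in the laser frame
pixel = project(laser_to_camera(laser_point, IDENTITY, NO_OFFSET),
                fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

With real calibration output, `IDENTITY` and `NO_OFFSET` would be replaced by the estimated rotation and translation, and overlaying the projected laser points on the image gives a quick visual check of the calibration.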
CMU VASC talk: Learning to Transform Time Series with a Few Examples
Ali Rahimi, Intel Lab Seattle
Monday, Feb 27, 2006
Abstract:
I describe a semi-supervised regression algorithm that learns to transform one time series into another time series given examples of the transformation. I apply this algorithm to tracking, where one transforms a time series of observations from sensors to a time series describing the pose of a target. Instead of defining and implementing such transformations for each tracking task separately, I suggest learning memoryless transformations of time series from a few example input-output mappings. Our algorithm searches for a smooth function that fits the training examples and, when applied to the input time series, produces a time series that evolves according to assumed dynamics. The learning procedure is fast and lends itself to a closed-form solution. I relate this algorithm and its unsupervised extension to nonlinear system identification and manifold learning techniques. I demonstrate it on the tasks of tracking RFID tags from signal strength measurements, recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences, and tracking a target in a completely uncalibrated network of sensors.
For these tasks, this algorithm requires significantly fewer examples compared to fully-supervised regression algorithms or semi-supervised learning algorithms that do not take the dynamics of the output time series into account.
Speaker Bio:
Ali Rahimi is interested in developing machine learning tools for solving difficult sensing problems. His focus is on example-based tracking and efficient approximation methods for estimation. He received a PhD from the MIT Computer Science and AI Lab in 2005, an MS in Media Arts and Science from the MIT Media Lab, and a BS in Electrical Engineering and Computer Science from UC Berkeley.
Tuesday, February 21, 2006
The Boosting Approach to Machine Learning
An Overview
Robert E. Schapire
AT&T Labs - Research
Shannon Laboratory
Abstract
Boosting is a general method for improving the accuracy of any given learning algorithm. Focusing primarily on the AdaBoost algorithm, this chapter overviews some of the recent work on boosting including analyses of AdaBoost’s training error and generalization error; boosting’s connection to game theory and linear programming; the relationship between boosting and logistic regression; extensions of AdaBoost for multiclass classification problems; methods of incorporating human knowledge into boosting; and experimental and applied work using boosting.
Here is the link
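As a rough illustration of the AdaBoost algorithm that the chapter focuses on, here is a minimal sketch of discrete AdaBoost over one-dimensional threshold stumps. The toy data, the function names (`train_stump`, `adaboost`, `predict`), and the choice of stumps as the weak learner are assumptions made for this example; they are not taken from the chapter.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: +polarity when x >= threshold, otherwise -polarity."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train an ensemble of (alpha, threshold, polarity) triples."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        # Clamp the error so the logarithm below stays finite.
        error = min(max(error, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified examples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data: labels are +1, -1, +1 along the line,
# so no single threshold separates the classes.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=3)
predictions = [predict(model, x) for x in xs]
```

On this toy set no single stump classifies all six points correctly, but the weighted vote of three stumps does, which is the basic effect boosting relies on.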
Monday, February 20, 2006
My talk this week (Casey)
My talk has the following parts:
1. Related work: Robust Real-time Object Detection (authors: Viola & Jones)
2. Detection approach of the HandVu system
3. Tracking approach of the HandVu system
4. Recognition
Paper information:
Published in the IEEE Intl. Conference on Automatic Face and Gesture Recognition, May 2004.
Robust Hand Detection
Mathias Kölsch and Matthew Turk
Department of Computer Science, University of California, Santa Barbara, CA
Abstract
Vision-based hand gesture interfaces require fast and extremely robust hand detection. Here, we study view-specific hand posture detection with an object recognition method recently proposed by Viola and Jones. Training with this method is computationally very expensive, prohibiting the evaluation of many hand appearances for their suitability to detection. As one contribution of this paper, we present a frequency analysis-based method for instantaneous estimation of class separability, without the need for any training. We built detectors for the most promising candidates, their receiver operating characteristics confirming the estimates. Next, we found that classification accuracy increases with a more expressive feature type. As a third contribution, we show that further optimization of training parameters yields additional detection rate improvements. In summary, we present a systematic approach to building an extremely robust hand appearance detector, providing an important step towards easily deployable and reliable vision-based hand gesture interfaces.
Below is the author's Ph.D. thesis, "Vision Based Hand Gesture Interfaces for Wearable Computing and Virtual Environments".
You can download these two papers and find the author's information at this link: http://www.movesinstitute.org/~kolsch/publications.html
Thursday, February 16, 2006
What's New @ IEEE in Computing, February 2006
4. ALGORITHMS TO AID DEVELOPMENT OF PATTERN RECOGNITION SOFTWARE DEVELOPED
Departing from traditional approaches to promoting the development of pattern recognition software, researchers at Ohio State University have created a new method that tests machine vision algorithms to evaluate which algorithm is most successful for a given application. Using two databases, one consisting of objects such as apples and pears and another of faces with various expressions, the researchers found the tasks of sorting objects and identifying expressions to be distinct in such a way that an algorithm could be good at doing one but not the other. The end result allows for a faster, more efficient way to gather data from pattern recognition software, according to Aleix Martinez, assistant professor of electrical and computer engineering at Ohio State. This work may have an effect on research in areas as varied as neuroscience, genetics, and economics. Read more:
http://www.eurekalert.org/pub_releases/2006-01/osu-anw012406.php
5. INTELLIGENT TRANSPORTATION PAPERS NEEDED
The IEEE Pervasive Computing Magazine has announced a call for papers for a special issue on intelligent transportation. Authors are asked to submit articles describing the application of pervasive computing technologies, systems, and applications to vehicles, roads, and other transportation systems. Also encouraged are articles that discuss the security, privacy, social, and human-related issues of intelligent transportation, and case studies of experiences with existing pervasive technologies in use in transportation. Deadline for submission is 31 May 2006. For details, visit:
http://www.computer.org/portal/pages/pervasive/content/cfp4.html
Tuesday, February 14, 2006
MIT Report : Learning Semantic Scene Models by Trajectory Analysis
Authors
Xiaogang Wang, Kinh Tieu, Eric Grimson
Abstract
In this paper, we describe an unsupervised learning framework to segment a scene into semantic regions and to build semantic scene models from long-term observations of moving objects in the scene. First, we introduce two novel similarity measures for comparing trajectories in far-field visual surveillance. The measures simultaneously compare the spatial distribution of trajectories and other attributes, such as velocity and object size, along the trajectories. They also provide a comparison confidence measure which indicates how well the measured image-based similarity approximates true physical similarity. We also introduce novel clustering algorithms which use both similarity and comparison confidence. Based on the proposed similarity measures and clustering methods, a framework to learn semantic scene models by trajectory analysis is developed. Trajectories are first clustered into vehicles and pedestrians, and then further grouped based on spatial and velocity distributions. Different trajectory clusters represent different activities. The geometric and statistical models of structures in the scene, such as roads, walk paths, sources and sinks, are automatically learned from the trajectory clusters. Abnormal activities are detected using the semantic scene models. The system is robust to low-level tracking errors.
Link
MIT talk: Object Class and Subclass Recognition Using Relational Object Models
Speaker: Aharon Bar-Hillel, Hebrew University
Date: Wednesday, February 15 2006
Host: Prof. Tomaso Poggio, M.I.T., McGovern Institute, BCS & CSAIL
Abstract: In the first part of the talk I will present a new learning method for object class recognition, combining a generative constellation model with a discriminative optimization technique. Specifically we use a 'star'-like Bayesian network model, but learn its parameters using an extended boosting technique which iterates between inference and part learning. Learning complexity is linear in the number of model parts and image features, compared to an exponential learning complexity for similar models in a generative framework. This allows the construction of rich models with many distinctive parts, leading to improved classification accuracy.
In the second part of the talk I will address the problem of sub-ordinate class recognition (like the distinction between cross and sport motorcycles), relying on the above-mentioned learning technique. Our approach to this problem is motivated by observations from cognitive psychology, which identify parts as the defining component of basic level categories, while sub-ordinate categories are more often defined by modified parts. Accordingly, we suggest a two-stage algorithm: First a model of the inclusive class is learned (e.g., motorcycles in general) using the technique introduced earlier, and then subclass classification is made based on the part correspondence implied by the model. The two-stage algorithm typically outperforms a competing one-step algorithm, which builds distinct constellation models for each subclass. This performance advantage critically relies on modeling of the spatial relations between parts, and on having models with a large number of parts.
The talk is based on a joint work with Tomer Hertz and Prof. Daphna Weinshall.
MIT Thesis Defense: Learning a Dictionary of Shape-Components in Visual Cortex: Comparison with Neurons, Humans and Machine
Speaker: Thomas Serre, Dept. of Brain & Cognitive Sciences and McGovern Institute for Brain Research
Date: Wednesday, February 15 2006
Host: Prof. Tomaso Poggio, McGovern Institute for Brain Research
Relevant URL: http://web.mit.edu/serre/www/
In this talk I will describe a quantitative model that accounts for the circuits and computations of the feedforward path of the ventral stream of visual cortex. This model is consistent with a general theory of visual processing that extends the hierarchical model of Hubel & Wiesel from primary to extrastriate visual areas and attempts to explain the first few hundred milliseconds of visual processing. One of the key elements in the approach I will describe is the learning of a generic dictionary of shape-components from V2 to IT, which provides an invariant representation to task-specific categorization circuits in higher brain areas. This vocabulary of shape-tuned units is learned in an unsupervised manner from natural images, and constitutes a large and redundant set of image features with different complexities and invariances. This theory significantly extends an earlier approach by Riesenhuber & Poggio (1999) and builds upon several existing neurobiological models and conceptual proposals.
I will present evidence to show that not only can the model duplicate the tuning properties of neurons in various brain areas when probed with artificial stimuli (like the ones typically used in physiology), but it can also handle the recognition of objects in the real-world, to the extent of competing with the best computer vision systems. Following this, I will present a comparison between the performance of the model and the performance of human observers in a rapid animal vs. non-animal recognition task for which recognition is fast and cortical back-projections are likely to be inactive. Results indicate that the model predicts human performance extremely well when the delay between the stimulus and the mask is about 50 ms. These results suggest that cortical back-projections may not play a significant role when the time interval is in this range, and the model may therefore provide a satisfactory description of the feedforward path.
Taken together, the evidence I will present shows that we may have the skeleton of a successful theory of visual cortex. In addition, this may be the first time that a neurobiological model, faithful to the physiology and the anatomy of visual cortex, not only competes with some of the best computer vision systems thus providing a realistic alternative to engineered artificial vision systems, but also achieves performance close to that of humans in a categorization task involving complex natural images.
CMU VASC talk: Video visualization - Beyond pixels and frames
Yaron Capsi, Tel Aviv University
Monday, Feb 20, 2006
Abstract: Video data is represented by pixels and frames. This restricts the way it is captured, accessed and visualized. On one hand, visual information is distributed across all frames, and therefore, in order to depict the visual information, the entire video sequence must be viewed sequentially, frame by frame. On the other hand, important visual information is lost by the limited frame rate. Similarly in the spatial domain, sensor and optics limit the capturing process, while huge redundancy prevents an efficient visualization of information. In this talk I will show how to exceed both limitations of capturing devices and of visual displays. In particular, I will show how fusing information from multiple sources makes it possible to exceed temporal and spatial limitations, and how visualization of video data can benefit from importance ranking. I will describe a process that depicts the essence of video or animation, by embedding high dimensional data in low dimensional Euclidean space. I will also show how super-pixels (in contrast to pixels) contribute to the exploitation of temporal redundancy for the task of spatial segmentation of regions with high importance.
Monday, February 13, 2006
Paper: Computer Vision for Music Identification
Y. Ke, D. Hoiem, and R. Sukthankar. In Proceedings of Computer Vision and Pattern Recognition, 2005.
Abstract:
We describe how certain tasks in the audio domain can be effectively addressed using computer vision approaches. This paper focuses on the problem of music identification, where the goal is to reliably identify a song given a few seconds of noisy audio. Our approach treats the spectrogram of each music clip as a 2-D image and transforms music identification into a corrupted sub-image retrieval problem. By employing pairwise boosting on a large set of Viola-Jones features, our system learns compact, discriminative, local descriptors that are amenable to efficient indexing. During the query phase, we retrieve the set of song snippets that locally match the noisy sample and employ geometric verification in conjunction with an EM-based “occlusion” model to identify the song that is most consistent with the observed signal. We have implemented our algorithm in a practical system that can quickly and accurately recognize music from short audio samples in the presence of distortions such as poor recording quality and significant ambient noise. Our experiments demonstrate that this approach significantly outperforms the current state-of-the-art in content-based music identification.
project link
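To make the "spectrogram as a 2-D image" idea concrete, here is a minimal sketch that turns a one-dimensional signal into a grid of frequency magnitudes, one row per time frame. It uses a naive DFT from the standard library rather than an FFT, and the function name `spectrogram`, the frame and hop sizes, and the test tone are all assumptions for illustration; the paper's actual feature pipeline is not reproduced here.

```python
import cmath
import math


def spectrogram(samples, frame_size, hop_size):
    """Return one row per frame holding DFT magnitudes for bins 0..frame_size//2."""
    image = []
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        frame = samples[start:start + frame_size]
        row = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT coefficient for frequency bin k of this frame.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n, x in enumerate(frame))
            row.append(abs(coeff))
        image.append(row)
    return image


# Hypothetical test tone: a cosine that completes two cycles every 8 samples,
# so its energy should fall into frequency bin 2 of an 8-sample frame.
tone = [math.cos(2 * math.pi * 2 * n / 8) for n in range(32)]
image = spectrogram(tone, frame_size=8, hop_size=4)
strongest_bins = [max(range(len(row)), key=row.__getitem__) for row in image]
```

Each row of `image` can then be treated like a row of pixels, which is what allows image-style local features to be computed over audio.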
Sunday, February 12, 2006
My talk this week
1. Structure from sound (review and more details)
S. Thrun. Affine Structure From Sound. In Proceedings of the 2005 Conference on Neural Information Processing Systems (NIPS). MIT Press, 2006.
link
2. Sound Object Localization and Retrieval in Complex Audio Environments
D. Hoiem, Y. Ke, and R. Sukthankar, "SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2005.
link
ABSTRACT:
The ability to identify sounds in complex audio environments is highly useful for multimedia retrieval, security, and many mobile robotic applications, but very little work has been done in this area. We present the SOLAR system, a system capable of finding sound objects, such as dog barks or car horns, in complex audio data extracted from movies. SOLAR avoids the need for segmentation by scanning over the audio data in fixed increments and classifying each short audio window separately. SOLAR employs boosted decision tree classifiers to select suitable features for modeling each sound object and to discriminate between the object of interest and all other sounds. We demonstrate the effectiveness of our approach with experiments on thirteen sound object classes trained using only tens of positive examples and tested on hours of audio data extracted from popular movies.
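The scanning step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names, the toy signal, and the mean-amplitude `is_loud` classifier are hypothetical stand-ins, whereas SOLAR itself uses boosted decision tree classifiers.

```python
def sliding_windows(samples, window_size, hop_size):
    """Yield (start_index, window) pairs taken at fixed increments."""
    for start in range(0, len(samples) - window_size + 1, hop_size):
        yield start, samples[start:start + window_size]


def detect(samples, window_size, hop_size, classifier):
    """Classify each window separately and return the start indices that fire."""
    return [start
            for start, window in sliding_windows(samples, window_size, hop_size)
            if classifier(window)]


def is_loud(window):
    """Hypothetical stand-in classifier: mean absolute amplitude above 0.75."""
    return sum(abs(x) for x in window) / len(window) > 0.75


# Toy signal: silence, a short burst, then silence again.
signal = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
hits = detect(signal, window_size=4, hop_size=2, classifier=is_loud)
```

Because every window is classified independently, no prior segmentation of the audio is needed; the trade-off is that the hop size bounds how precisely a sound can be localized in time.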
Robot Dream Exposition Taiwan 2006

I went to the robot exposition yesterday and took some photos. You can access these photos at this link.
Saturday, February 11, 2006
CNN: Toy makers hawk robotic playmates
Toy fair to feature robotic pets, 'Let's Dance' Barbie
Friday, February 10, 2006; Posted: 6:38 p.m. EST (23:38 GMT)
NEW YORK (AP) -- If children didn't get their fill of high-tech toys during the 2005 holiday season, they should brace themselves for more wizardry later this year.
With young consumers growing out of toys faster and preferring iPod digital music players and video games, the nation's toy makers are working harder to come up with more high-tech products, particularly robotic playmates.
Such robotic toys, which are even more lifelike than a year ago, are among the thousands of toys to be featured at American International Toy Fair, officially beginning Sunday.
See more.
Friday, February 10, 2006
Robot news & videos...
ROBOTS - our cutting-edge new Special Report
From our homes to the operating theatre, from war zones into space, robots are on the march. Follow their progress, plus our Expert Guide including an Instant Expert, robot video Top Ten and more...
http://www.newscientist.com/channel/mech-tech/robots
Thursday, February 09, 2006
Cognitive-Developmental Learning for a Humanoid Robot: A Caregiver's Gift
Title: Cognitive-Developmental Learning for a Humanoid Robot: A Caregiver's Gift
Authors: Arsenio, Artur Miguel
Keywords: AI, Humanoid Robots; Developmental Learning; Perception; Human-robot Interactions
Issue Date: 22-Dec-2005
Series no: Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory
Abstract:
The goal of this work is to build a cognitive system for the humanoid robot, Cog, that exploits human caregivers as catalysts to perceive and learn about actions, objects, scenes, people, and the robot itself. This thesis addresses a broad spectrum of machine learning problems across several categorization levels. Actions by embodied agents are used to automatically generate training data for the learning mechanisms, so that the robot develops categorization autonomously. Taking inspiration from the human brain, a framework of algorithms and methodologies was implemented to emulate different cognitive capabilities on the humanoid robot Cog. This framework is effectively applied to a collection of AI, computer vision, and signal processing problems. Cognitive capabilities of the humanoid robot are developmentally created, starting from infant-like abilities for detecting, segmenting, and recognizing percepts over multiple sensing modalities. Human caregivers provide a helping hand for communicating such information to the robot. This is done by actions that create meaningful events (by changing the world in which the robot is situated) thus inducing the "compliant perception" of objects from these human-robot interactions. Self-exploration of the world extends the robot's knowledge concerning object properties.
This thesis argues for enculturating humanoid robots using infant development as a metaphor for building a humanoid robot's cognitive abilities. A human caregiver redesigns a humanoid's brain by teaching the humanoid robot as she would teach a child, using children's learning aids such as books, drawing boards, or other cognitive artifacts. Multi-modal object properties are learned using these tools and inserted into several recognition schemes, which are then applied to developmentally acquire new object representations. The humanoid robot therefore sees the world through the caregiver's eyes.
Building an artificial humanoid robot's brain, even at an infant's cognitive level, has been a long quest which still lies only in the realm of our imagination. Our efforts towards such a dimly imaginable task are developed according to two alternate and complementary views: cognitive and developmental.
Check here for details.
Rats Smell in Stereo
By Larry O'Hanlon, Discovery News
Feb. 8, 2006— Rats need only one sniff to take their bearings on a tasty morsel, say researchers who have discovered what may be the olfactory equivalent to stereo hearing in the common rodents.
It turns out that rats use their two nostrils with what appears to be far more efficiency than humans do, and may be a lot like some other scent-oriented animals.
Read more.
What's New @ IEEE in Communications, February 2006
11. COMPACT WIRELESS PROJECT AIMS TO UNITE CAMPUSES
A new social computing research project called SmartCampus plans to unite students through compact wireless communication devices, in an effort to facilitate social interaction on campus. A team of experts at the New Jersey Institute of Technology has brought together a group of faculty and students from diverse fields to integrate their resources and spearhead the project. The project identifies places where students are likely to gather through the use of software that allows access to participant profiles taken from mobile communication devices.
Read more: http://www.physorg.com/news10224.html
What's New @ IEEE for Students, February 2006
10. MATHEMATICAL METHOD MAKES FOR ROBOTS WITH INCREASED STRENGTH, MOTION
A spherical robot that contains three curved plates within each other is just one of many new designs that could aid in applications for compact structures that expand into larger structures. Two engineers from opposite sides of the globe have bridged the gap between kinematics and statics, using the mathematics of the two theorems to improve the design process of computer-controlled robots. Gordon R. Pennock, a mechanical engineer at Purdue University, USA, and Offer Shai, a civil engineer at Tel Aviv University in Israel believe the new theorems represent a common language which reflects the connections between kinematics and statics, emphasizing the benefits of creating robots with enhanced stability and motion. Engineers can also use this knowledge for creating structures that are more resistant to damage from motion. The theorems offer the possibility of creating a new class of functional "multiple-platform robots" that retain their structure and stability even after being damaged or reconfigured.
Read more:
http://www.sciencedaily.com/releases/2006/01/060112034757.htm
Tuesday, February 07, 2006
CMU project: SOLAR
SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments
Our goal is to detect and identify sound objects, such as car horns or dog barks, in audio. Our system, called SOLAR (sound object localization and retrieval) is the first, to our knowledge, that is capable of finding a large variety of sounds in audio data from movies and other complex audio environments. Our approach is to perform a windowed scan over audio data and classify each window using a cascade of boosted decision tree classifiers. See the presentations section for a good overview of our system. This work is performed by Derek Hoiem, Yan Ke, and Rahul Sukthankar and is supported by Intel Research Pittsburgh.
click this LINK
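The windowed scan described above can be sketched roughly as follows. This is only an illustration of scanning audio in fixed increments and classifying each window separately; the function names, the window and hop sizes, and the energy-threshold classifier are all assumptions made here for the example. The energy check is a hypothetical stand-in and is not the cascade of boosted decision tree classifiers that SOLAR uses.

```python
from typing import Callable, List, Sequence, Tuple


def scan_windows(
    samples: Sequence[float],
    window: int,
    hop: int,
    classify: Callable[[Sequence[float]], bool],
) -> List[Tuple[int, bool]]:
    """Classify every full window of `window` samples, advancing by `hop`."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    results = []
    for start in range(0, len(samples) - window + 1, hop):
        # Each window is judged on its own, so no prior segmentation is needed.
        results.append((start, classify(samples[start:start + window])))
    return results


def energy_classifier(win: Sequence[float]) -> bool:
    """Hypothetical stand-in classifier: mean energy above 0.25 counts as a hit."""
    return sum(x * x for x in win) / len(win) > 0.25


# Toy signal: silence, a loud burst, then silence again.
audio = [0.0] * 8 + [1.0] * 4 + [0.0] * 4
hits = [start for start, hit in scan_windows(audio, 4, 2, energy_classifier) if hit]
print(hits)  # [6, 8, 10]
```

Overlapping windows (a hop smaller than the window) mean a sound that straddles a window boundary is still covered by a neighbouring window, at the cost of more classifier calls.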
Confidence weighted classifier combination for multi-modal human identification
Title:
Confidence weighted classifier combination for multi-modal human identification
Authors:
Ivanov, Yuri; Serre, Thomas; Bouvrie, Jacob
Issue Date:
22-Dec-2005
Series/Report no.:
Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory
Abstract:
In this paper we describe a technique of classifier combination used in a human identification system. The system integrates all available features from multi-modal sources within a Bayesian framework. The framework allows representing a class of popular classifier combination rules and methods within a single formalism. It relies on a per-class measure of confidence derived from the performance of each classifier on training data, which is shown to improve performance on a synthetic data set. The method is especially relevant in autonomous surveillance settings, where varying time scales and missing features are a common occurrence. We show an application of this technique to a real-world surveillance database of video and audio recordings of people collected over several weeks in an office setting.
pdf file can be found at this page
https://dspace.mit.edu/handle/1721.1/30590
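To make the idea of per-class confidence weighting concrete, here is a minimal sketch. It is a simplified weighted-sum rule written for this post, not the Bayesian formulation from the paper; the function name, the data layout, and the numbers are all assumptions. Each classifier's score for a class is scaled by how reliable that classifier was assumed to be for that class on training data, and a classifier with a missing feature (given as `None`) is simply skipped.

```python
from typing import List, Optional, Sequence


def combine(
    posteriors: Sequence[Optional[Sequence[float]]],
    confidence: Sequence[Sequence[float]],
) -> List[float]:
    """Combine per-classifier class scores using per-class confidence weights.

    posteriors[k][c] is classifier k's score for class c, or posteriors[k] is
    None when that modality is missing. confidence[k][c] is a hypothetical
    reliability weight for classifier k on class c.
    """
    n_classes = len(confidence[0])
    scores = [0.0] * n_classes
    for post, conf in zip(posteriors, confidence):
        if post is None:  # missing feature: this classifier does not vote
            continue
        for c in range(n_classes):
            scores[c] += conf[c] * post[c]
    total = sum(scores)
    if total == 0:
        return [1.0 / n_classes] * n_classes  # no evidence: fall back to uniform
    return [s / total for s in scores]


# Two classifiers (say, face and voice), two identities.
confidence = [[0.9, 0.5], [0.3, 0.7]]
both = combine([[0.6, 0.4], [0.2, 0.8]], confidence)
face_only = combine([[0.6, 0.4], None], confidence)
print(both.index(max(both)), face_only.index(max(face_only)))  # 1 0
```

With both modalities present the second identity wins because the voice classifier is trusted more for that class; with voice missing the decision falls back to the face classifier alone.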
A Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images
| Title: | A Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images |
| Authors: | Zollei, Lilla |
| Advisors: | Eric Grimson |
| Other Contributors: | Vision |
| Keywords: | population alignment, spatial normalization, congealing |
| Issue Date: | 25-Jan-2006 |
| Series/Report no.: | Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory |
News: Dino-robot is latest toy from Furby creator
By Dean Takahashi
Mercury News
When Caleb Chung has a big idea, the toy world pays attention. The co-inventor of the Furby doll created a sensation in 1998 that sold 40 million of the talking furry creatures.
Now at a Bay Area start-up, he is launching a new dinosaur robot for kids that he hopes will build upon his dream of creating lifelike, emotionally responsive mechanical animals.
Chung's new brainstorm is called Pleo and it will debut this fall. He is unveiling it today at the Demo conference in Scottsdale, Ariz., and has also taken the wraps off his Emeryville-based company, Ugobe, which is making Pleo.
"People are in love with robots, but the feedback we have is people need to have a more engaging relationship with their products," said Bob Christopher, chief executive of Ugobe. "They want to treat something like a pet. So we need robots that show and feel emotion and that evolve over time."
See the full article.
CNN News: Google Earth-VW plan navigation system
Monday, February 6, 2006; Posted: 11:45 a.m. EST (16:45 GMT)
(Reuters) -- Volkswagen of America Inc. said Friday it is working on a prototype vehicle which features Google Inc.'s satellite mapping software to give drivers a bird's eye view of the road ahead.
The two companies are working with the graphics chipmaker Nvidia Corp. to build an in-car navigation map system and a three-dimensional display so passengers can recognize where they are in relation to the surrounding topography.
See the full article.
Sunday, February 05, 2006
My Talk (2006/02/08)
1. Description of the calibration parameters
2. Paper: Autocalibration of a Projector-Camera System
Paper abstract:
This paper presents a method for calibrating a projector-camera system that consists of multiple projectors (or multiple poses of a single projector), a camera, and a planar screen. We consider the problem of estimating the homography between the screen and the image plane of the camera or the screen-camera homography, in the case where there is no prior knowledge regarding the screen surface that enables the direct computation of the homography. It is assumed that the pose of each projector is unknown while its internal geometry is known. Subsequently, it is shown that the screen-camera homography can be determined from only the images projected by the projectors and then obtained by the camera, up to a transformation with four degrees of freedom. This transformation corresponds to arbitrariness in choosing a two-dimensional coordinate system on the screen surface and when this coordinate system is chosen in some manner, the screen-camera homography as well as the unknown poses of the projectors can be uniquely determined. A noniterative algorithm is presented, which computes the homography from three or more images. Several experimental results on synthetic as well as real images are shown to demonstrate the effectiveness of the method.
Link
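For readers new to the term, a homography is a 3x3 matrix that maps points on one plane (here, the screen) to points on another (the camera image) in homogeneous coordinates. The sketch below only shows how such a matrix is applied to a point; it does not implement the paper's autocalibration algorithm, and the function name and example matrices are made up for illustration.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]


def apply_homography(H: Matrix3, point: Tuple[float, float]) -> Tuple[float, float]:
    """Map a 2D point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)  # divide by w to return to ordinary coordinates


# An affine example (scale by 2, then shift) and a genuinely projective one.
H_affine = [[2.0, 0.0, 1.0], [0.0, 2.0, 3.0], [0.0, 0.0, 1.0]]
H_projective = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
print(apply_homography(H_affine, (1.0, 1.0)))      # (3.0, 5.0)
print(apply_homography(H_projective, (1.0, 2.0)))  # (0.5, 1.0)
```

Because the result is divided by w, multiplying H by any nonzero constant gives the same mapping, which is why a homography has eight degrees of freedom rather than nine.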
Thursday, February 02, 2006
IEEE Tech Alert for 1 Feb 2006
4. Higher-speed Martian communications
When NASA's Mars Reconnaissance Orbiter reaches the Red Planet this month, it will immediately seek out areas where water once flowed, try to identify habitats where ancient life might have thrived, and start mapping the entire planet in unprecedented detail. But the orbiter's arrival at Mars will also set the stage for a new epoch in spacecraft telecommunications. Its onboard Electra UHF relay transceiver will serve as an engineering test bed for new communications and navigation technology that will be required for all future orbiters, landers, and rovers, to provide the faster data rates required for transfer of information from rovers and landers on the Martian surface to orbiters circling above.
See "Mars Gets Broadband Connection," by Barry E. DiGregorio: http://www.spectrum.ieee.org/feb06/2810
5. Solving Sudokus for fun and mathematical profit
Millions of people around the world are tackling one of the hardest problems in computer science -- without even knowing it. The logic game Sudoku is a miniature version of a longstanding mathematical challenge, and it entices both puzzlers, who see it as an enjoyable plaything, and researchers, who see it as a laboratory for algorithm design. This is because the Sudoku is representative of a fundamental mathematical challenge known as P = NP, where, roughly speaking, P stands for tasks that can be solved efficiently, and NP stands for tasks whose solution can be verified efficiently.
See "Sudoku Science," by Lauren Anderson: http://www.spectrum.ieee.org/feb06/2809
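The "verified efficiently" half of that description is easy to see in code. The sketch below checks a completed 9x9 grid in a fixed number of passes over its 81 cells; finding a solution in the first place is the hard part. The function names are my own, and the example grid comes from a simple shifting pattern chosen only because it is easy to generate.

```python
from typing import List

Grid = List[List[int]]


def is_valid_solution(grid: Grid) -> bool:
    """Return True if every row, column and 3x3 box holds the digits 1-9 once."""
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(group == digits for group in rows + cols + boxes)


def pattern_grid() -> Grid:
    """Build a valid filled grid by shifting the digit sequence from row to row."""
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]


solved = pattern_grid()
broken = [row[:] for row in solved]
broken[0][0], broken[0][1] = broken[0][1], broken[0][0]  # row still fine, columns not
print(is_valid_solution(solved), is_valid_solution(broken))  # True False
```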
CMU VASC talk: Photo Quality Assessment: Classifying Between Professional Photos and Amateur Snapshots
Yan Ke, CMU
Monday, Feb 6, 2006
We propose a principled method for designing high level features for photo quality assessment. Our resulting system can classify between high quality professional photos and low quality snapshots. Instead of using the bag of low-level features approach, we first determine the perceptual factors that distinguish between professional photos and snapshots. Then, we design high level semantic features to measure the perceptual differences. We test our features on a large and diverse dataset and our system is able to achieve a classification rate of 72% on this difficult task. Since our system is able to achieve a precision of over 90% in low recall scenarios, we show excellent results in a web image search application.
Bio:
Yan Ke is a fourth year graduate student in the CMU Computer Science Department. His interests are in computer vision. He spent four months in Beijing, China. When he was not busy touring China and eating good food, he worked on the photo quality assessment project at Microsoft Research Asia.
CMU FRC talk: Geographic Routing in Autonomous Sensor Systems without Location Information
Speaker: Bin Yu, Postdoctoral Fellow, Robotics Institute, Carnegie Mellon University
Date: Thursday, February 2, 2006
Autonomous sensor systems of the near future are envisioned to consist of hundreds of robots and UAVs (unmanned aerial vehicles). These networked autonomous sensors play strong roles in civilian and military operations, such as disaster rescue and battlefield surveillance. One of the important problems in autonomous sensor systems is data fusion, as the raw data from each sensor cannot be used directly for team coordination and needs to be fused with other relevant data in the system. In this talk I will discuss several routing algorithms for distributed data fusion in an autonomous sensor system with group mobility, including a geographic routing algorithm without the use of location information. Moreover, I will provide a detailed analysis of the effectiveness of the routing algorithms for data fusion. The simulation results show that controlled data flows significantly increase the probability of relevant data being fused.
Speaker Bio:
Dr. Bin Yu is a Postdoctoral Fellow in the School of Computer Science at CMU. He received his Ph.D. in Computer Science from North Carolina State University in 2002. His research interests lie in the areas of artificial intelligence and distributed sensor systems, with an emphasis on multiagent and multirobot systems. Dr. Yu has authored more than 20 technical papers in artificial intelligence, peer-to-peer systems, and distributed sensor systems. One of his papers appeared at the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-05) and was nominated for the best paper award.
Latest News from IVsource (February 1, 2006)
Continental’s media department has painted an ambitious picture of tomorrow’s cars based on their active distance sensor technology.
The Institution of Electrical Engineers (UK) has scheduled their second annual conference on Automotive Electronics for March 20-21 in London.
Nissan Motor Co. is developing a third-generation Advanced Safety Vehicle (ASV) installed with a Nissan-developed vehicle-to-vehicle communications system which alerts the driver to potential collisions in five common driving scenarios.
CyberCars 2 is all about development and demonstration of co-operative systems for automated vehicles (cyber-cars) to improve transport capacity and safety.
Wednesday, February 01, 2006
MIT Thesis Defense: Hyperglue, An infrastructure for Human-Centered Computing in Distributed Intelligent Environments
Speaker: Stephen Peters, MIT CSAIL
Date: Wednesday, February 1 2006
Contact: Stephen Peters, 617-253-8338, slp@csail.mit.edu
As intelligent environments (IEs) move from simple kiosks and meeting rooms into the everyday offices, kitchens, and living spaces we use, the need for these spaces to communicate not only with users, but also with each other, will become increasingly important. Users will want to be able to shift their work environment between localities easily, and will also need to communicate with others as they move about. These IEs will thus require knowledge representations which can keep track of people and their relationships to the world; and communication mechanisms that can mediate interactions.
This thesis seeks to define and explore one way of creating this infrastructure, by creating societies of agents that can act on behalf of real-world entities such as users, physical spaces, or informal groups of people. Just as users interact with each other and with objects in their physical location, the agent societies interact with each other along communication channels organized along these same relationships. By organizing the infrastructure through analogies to the real world, we hope to achieve a simpler conceptual model for the users, as well as a communication hierarchy which can be realized efficiently.
CMU Talk: Human System Integration in the DoD: Challenges and Opportunities
Greg Zacharias (the talk link)
The ability of humans to cope with information processing demands has become a limiting factor on system performance, especially as systems have become more complex, layered with automation, and fielded in more demanding dynamic environments, all while the roles and responsibilities of the human operator have evolved in the face of greater computational and communications capabilities. The ability to successfully deal with these challenges has important implications not only for individual operator performance, but also for team performance, safety, organizational staffing requirements, and overall human-system effectiveness of large scale systems. This is especially true in the Department of Defense (DoD). To illustrate, we summarize a recent study conducted for the Air Force to assess the state of the art in applying Human Systems Integration (HSI) practices to modern weapons systems design and acquisition, and to recommend improvements in the overall process. We then provide a brief overview of Charles River Analytics (www.cra.com), which has been providing HSI tools and services to the DoD since its inception in the mid-80s, and describe some of our current design projects that attempt to address some of the critical information processing demands facing today’s soldier.
CMU talk: Scalable Approaches to Deploying Swarms of Vehicles and Sensors
Vijay Kumar
Department of Mechanical Engineering and Applied Mechanics
University of Pennsylvania
The talk will address the fundamental problems and practical issues underlying the deployment of large numbers of autonomously functioning vehicles, with insights from field experiments with UAVs and UGVs in urban environments. I will present decentralized controllers and estimators that allow large numbers of robots to maintain a desired shape (formation) while following a desired trajectory. Finally, I will describe our ongoing SWARMS project whose goals are to develop a framework and methodology for the analysis of swarming behavior in biology and the synthesis of bio-inspired swarming behavior for engineered systems.
The link.
Thursday, January 26, 2006
Human-Robot Interaction Conference 2006
The advance program is available. Check out the state-of-the-art research in this field.
Wednesday, January 25, 2006
CMU talk: Game theory, biology, and the binding game
Time: 3:30pm, Tuesday Jan. 24
Location: Wean 5409
Speaker: Tommi Jaakkola, MIT CSAIL
Link
Abstract:
Biological processes span across vastly different scales and necessarily have to be understood at multiple levels of abstraction. Towards clarifying the role that computation plays in such understanding, we have recently developed a class of game theoretic models for capturing coordinate operation of DNA binding regulators. Our work builds in part on the argument that the roles of various molecular interactions cannot be understood in isolation but that it is necessary to also capture the context provided by other mutually constraining processes. Our game theoretic model allocates proteins to neighborhoods of sites, and to sites themselves, in a resource constrained manner, while explicitly capturing coordinate and competitive relations among proteins with affinity to the site or region. We provide examples of known biological subsystems that are naturally translated into our framework, and illustrate predictions that can be derived from the model. The focus of the talk will be on mathematical foundations of the modeling approach and requires little or no biological background. This is joint work with Luis Perez-Breva, Luis Ortiz, and Chen-Hsiang Yeang.
Speaker Bio:
Tommi S. Jaakkola received the M.Sc. degree in theoretical physics from Helsinki University of Technology, Finland, and the Ph.D. from MIT in computational neuroscience. Following a postdoctoral position in computational molecular biology (DOE/Sloan fellow, UCSC), he joined the MIT EECS faculty in 1998. His research interests include many aspects of machine learning, statistical inference and estimation in the context of graphical models, and analysis and development of algorithms for various modern estimation problems such as those involving multiple predominantly incomplete data sources. His applied research focuses on problems in computational biology such as transcriptional regulation.
Hi, everybody!
I'm the new lab member, Stanley.
I'm now getting familiar with using the blog.
My family and I will go back to I-Lan tomorrow.
Happy (Chinese) new year to all of you.
-----------------------------------
My email/MSN: b90203019@ntu.edu.tw
Tuesday, January 24, 2006
Design Once for Both FPGA & Structured ASIC
FYI.
Only Altera allows you to develop your high-density logic design using Stratix II FPGAs and then migrate to a HardCopy II structured ASIC without any need for redesign or additional timing closure efforts. Learn how Stratix II FPGAs and HardCopy II structured ASICs together provide a unique synergy from design to production.
http://boldfish.ieee.org:80/u/1657/41409275
My Talk (2005/01/24)
Paper:
Detection and Tracking of Moving Objects from a Moving Platform in Presence of Strong Parallax
Abstract :
We present a novel approach to detect and track independently moving regions in a 3D scene observed by a moving camera in the presence of strong parallax. Detected moving pixels are classified into independently moving regions or parallax regions by analyzing two geometric constraints: the commonly used epipolar constraint, and the structure consistency constraint. The second constraint is implemented within a “Plane+Parallax” framework and represented by a bilinear relationship which relates the image points to their relative depths. This newly derived relationship is related to trilinear tensor, but can be enforced into more than three frames. It does not assume a constant reference plane in the scene and therefore eliminates the need for manual selection of reference plane. Then, a robust parallax filtering scheme is proposed to accumulate the geometric constraint errors within a sliding window and estimate a likelihood map for pixel classification. The likelihood map is integrated into our tracking framework based on the spatio-temporal Joint Probability Data Association Filter (JPDAF). This tracking approach infers the trajectory and bounding box of the moving objects by searching the optimal path with maximum joint probability within a fixed size of buffer. We demonstrate the performance of the proposed approach on real video sequences where parallax effects are significant.
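For readers who have not met the epipolar constraint mentioned in the abstract, here is a minimal sketch of the idea. It is not the paper's implementation. It assumes calibrated cameras (normalized image coordinates) and a pure translation between the two views, so the essential matrix is simply the skew-symmetric matrix of the translation; the helper names `skew`, `project` and `epipolar_residual` are made up for this illustration.

```python
def skew(t):
    """Skew-symmetric matrix [t]_x, so that skew(t) applied to p equals t x p."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def project(point):
    """Pinhole projection to normalized homogeneous image coordinates."""
    X, Y, Z = point
    return (X / Z, Y / Z, 1.0)


def epipolar_residual(E, x1, x2):
    """Return |x2^T E x1|; it is zero for a static point seen in both views."""
    Ex1 = [sum(E[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return abs(sum(x2[i] * Ex1[i] for i in range(3)))


# Assumption: the second camera frame is the first one shifted by t, with no
# rotation, so a static point P in frame 1 has coordinates P + t in frame 2.
t = (1.0, 0.0, 0.0)
E = skew(t)

P = (0.5, 0.2, 4.0)                      # 3D point in the first camera frame
x1 = project(P)

static_x2 = project((P[0] + t[0], P[1] + t[1], P[2] + t[2]))
# The same point after it has also moved by 0.3 along the y axis.
moving_x2 = project((P[0] + t[0], P[1] + t[1] + 0.3, P[2] + t[2]))

print("static residual:", epipolar_residual(E, x1, static_x2))
print("moving residual:", epipolar_residual(E, x1, moving_x2))
```

A point that moves within its own epipolar plane still gives a zero residual, so this test alone cannot flag it. That is presumably one reason the abstract pairs the epipolar constraint with a second, structure-based constraint.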
Link
Saturday, January 21, 2006
CMU talk: Representations and Algorithms for Monitoring Dynamic Systems
Time: 3:30pm Tue., Jan. 17
Location: Wean Hall 5409.
Speaker: Avi Pfeffer, Associate Professor of Computer Science, Harvard University
Check the link for details about CMU AI Seminar.
Check the link for details about the speaker.
Abstract:
Continually monitoring the state of a dynamic system is an important problem for artificial intelligence. Dynamic Bayesian networks (DBNs) provide for compact representation of probabilistic dynamic models. However the monitoring task is extremely difficult even for well-factored DBNs. Therefore approximate monitoring algorithms are needed. One family of approximate monitoring algorithms is based on the idea of factoring the joint distribution over the state of the system into a product of distributions over factors consisting of subsets of variables. Factoring relies on the notion of weak interaction between subsystems. We identify a new notion of weak interaction called separability, and show that it leads to the property that, in order to compute the factor distributions at one point in time, only the factored distributions at the previous time point are needed. We also define an approximate form of separability. We show that separability and approximate separability lead to very good approximations for the monitoring task. Unfortunately, sometimes the factoring approach is computationally infeasible. An alternative approach to approximate monitoring is particle filtering (PF), in which the joint distribution over the state of the system is approximated by a set of samples, or particles. In high dimensional spaces, the variance of PF is high and too many particles are required to provide good performance. We improve the performance of PF by introducing factoring, maintaining particles over factors instead of the global state space. This has the effect of reducing the variance of PF and so reducing its error. Maintaining factored particles also allows us to improve PF by looking ahead to future evidence before deciding which particles to propagate, thus leading to much better accuracy.
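As a rough illustration of the plain particle filtering (PF) idea that the abstract builds on, here is a minimal bootstrap particle filter for a one-dimensional state. This is the standard unfactored filter, not the factored variant the talk proposes. The random-walk motion model, the Gaussian observation model, the noise levels and the function names are all assumptions chosen for the sketch.

```python
import math
import random


def gaussian_likelihood(x, mean, std):
    """Unnormalized Gaussian likelihood of x given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)


def particle_filter_step(particles, observation, rng,
                         process_std=0.5, obs_std=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # 1. Predict: push every particle through the (assumed) random-walk model.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]

    # 2. Weight: score each particle by how well it explains the observation.
    weights = [gaussian_likelihood(observation, p, obs_std) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # 3. Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Posterior mean approximated by the particle average."""
    return sum(particles) / len(particles)


rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
for _ in range(20):
    # Assumed scenario: the sensor keeps reporting a reading of 5.0.
    particles = particle_filter_step(particles, 5.0, rng)
print("estimated state:", estimate(particles))
```

In one dimension a few hundred particles are plenty. The abstract's point is that in high-dimensional state spaces the variance of this scheme grows and the particle count needed becomes impractical, which motivates keeping particles over factors rather than over the whole state.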
Speaker bio:
Avi Pfeffer is an Associate Professor of Computer Science at Harvard University. His research is directed towards achieving rational behavior in intelligent systems, based on the principles of probability theory, decision theory, Bayesian learning and game theory. He received his PhD in 2000 from Stanford University, where his dissertation on probabilistic reasoning received the Arthur Samuel Thesis Award. Dr Pfeffer has published technical papers on probabilistic reasoning, strategic reasoning, agent modeling, temporal reasoning, and database systems. He was awarded the NSF Career Award in 2001 for work on strategic reasoning, and the Alfred P. Sloan Foundation Research Fellowship in 2002. Dr Pfeffer serves on the editorial board of the Journal of Artificial Intelligence Research, and on the program committees of a number of leading conferences in artificial intelligence.
Wednesday, January 18, 2006
The Information-Form Data Association Filter
Brad Schumitsch, Sebastian Thrun, Gary Bradski, and Kunle Olukotun
This paper presents a filter for online data association problems in high-dimensional spaces. The key innovation is a representation of the data association posterior in information form, in which the "proximity" of objects and tracks is expressed by numerical links. Updating these links requires linear time, compared to the exponential time required for computing posterior probabilities. The paper derives the algorithm formally, and provides comparative results using data obtained by a real-world camera array and by a large-scale sensor network simulation.
The full paper is available in PDF
Tuesday, January 17, 2006
News: A genuine milestone for artificial intelligence
January 16, 2006
Richard Macey, the link
ROBOTS have a reason to party: this year is the 50th anniversary of artificial intelligence.
In 1956 John McCarthy, a scientist at the Massachusetts Institute of Technology, convened a meeting of computer specialists at Dartmouth College, New Hampshire.
"It was the dawn of the computing era," said Claude Sammut, professor of computer science and engineering at the University of NSW and leader of its Artificial Intelligence Research Group.
Professor McCarthy's meeting "brought together the small number of people who were writing AI programs. He had to invent some new name for what they were doing, so he called it artificial intelligence". The name stuck.
"When most people think of artificial intelligence they think of robots like C3PO in Star Wars," Professor Sammut said. But intelligent robots did not always adopt humanoid shapes.
He noted a Brisbane container terminal had been fully automated, with cranes programmed to locate and collect containers. "The cranes are basically robots. They can operate without a driver and know which containers have to be taken off which ships.
"And there's a company in Sydney that makes programs that help pathologists interpret blood tests. The computer generates a detailed report. The pathologist is still there, checking, but it speeds up the pathologist's job."
Sunday, January 15, 2006
Emergency Response/Robotics Joint Topical Meeting Feb. 11-16 in Salt Lake City
The American Nuclear Society, led by the Idaho Section, is sponsoring a forum in Salt Lake City Feb. 11-16 on emergency response preparations and robotics. This topical will discuss solutions to challenges that often cut across boundaries, applications, circumstances, markets and technologies. The Environmental Sciences Division 9th Topical Meeting and the Robotics and Remote Systems Division 11th Topical Meeting have joined together around the theme "Sharing Solutions for Emergencies and Hazardous Environments." This is the first time two such ANS topical meetings have been held jointly. You can learn more about this forum and register at www.2006sharingsolutions.com
Papers, panels, workshops, exhibits and demonstrations will offer research and practical field topics that will provide appeal and will promote provocative and beneficial interactions for most robotics professionals. Featured speakers will include Admiral Joseph Krol, associate administrator of emergency operations at the National Nuclear Security Administration, Ken Brockman of the International Atomic Energy Agency, Dr. Harold McFarlane of Idaho National Laboratory (INL) (who is president-elect of the American Nuclear Society) and Dr. Harold Blackman of INL. For more information, visit the above Web site or contact co-chairs Eric Loewen (208-526-9404, Eric.Loewen@inl.gov) or Ron Lujan (208-526-4045, Ronald.Lujan@icp.doe.gov).
Monday, January 09, 2006
Paper: DETECTING GROUP INTEREST-LEVEL IN MEETINGS
Daniel Gatica-Perez, Iain McCowan, Dong Zhang, and Samy Bengio
ICASSP 2005, pp I489-492
Abstract:
Finding relevant segments in meeting recordings is important for summarization, browsing, and retrieval purposes. In this paper, we define relevance as the interest-level that meeting participants manifest as a group during the course of their interaction (as perceived by an external observer), and investigate the automatic detection of segments of high-interest from audio-visual cues. This is motivated by the assumption that there is a relationship between segments of interest to participants, and those of interest to the end user, e.g. of a meeting browser. We first address the problem of human annotation of group interest-level. On a 50-meeting corpus, recorded in a room equipped with multiple cameras and microphones, we found that the annotations generated by multiple people exhibit a good degree of consistency, providing a stable ground-truth for automatic methods. For the automatic detection of high-interest segments, we investigate a methodology based on Hidden Markov Models (HMMs) and a number of audio and visual features. Single- and multi-stream approaches were studied. Using precision and recall as performance measures, the results suggest that the automatic detection of group interest-level is promising, and that while audio in general constitutes the predominant modality in meetings, the use of a multi-modal approach is beneficial.
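The abstract reports results in terms of precision and recall. As a quick reminder of what those two measures compute, here is a small sketch over per-segment binary labels (1 = high interest). The label lists are invented for the example and have nothing to do with the paper's corpus, and `precision_recall` is a name chosen for this sketch.

```python
def precision_recall(predicted, actual):
    """Precision and recall for binary labels given as equal-length sequences."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)

    # Precision: fraction of detected segments that really are high interest.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: fraction of high-interest segments that were detected.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up detector output and ground truth for six meeting segments.
detected = [1, 1, 0, 0, 1, 0]
annotated = [1, 0, 0, 1, 1, 0]
precision, recall = precision_recall(detected, annotated)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here two of the three detections are correct and two of the three annotated segments are found, so both measures come out at 2/3.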
PDF
Thursday, January 05, 2006
Image Parsing: Unifying Segmentation, Detection, and Recognition
Zhuowen Tu, Xiangrong Chen, Alan L. Yuille, Song-Chun Zhu
University of California, Los Angeles
Abstract
We propose a general framework for parsing images into regions and objects. In this framework, the detection and recognition of objects proceed simultaneously with image segmentation in a competitive and cooperative manner. We illustrate our approach on natural images of complex city scenes where the objects of primary interest are faces and text. This method makes use of bottom-up proposals combined with top-down generative models using the Data Driven Markov Chain Monte Carlo (DDMCMC) algorithm which is guaranteed to converge to the optimal estimate asymptotically. More precisely, we define generative models for faces, text, and generic regions, e.g. shading, texture, and clutter. These models are activated by bottom-up proposals. The proposals for faces and text are learnt using a probabilistic version of AdaBoost. The DDMCMC combines reversible jump and diffusion dynamics to enable the generative models to explain the input images in a competitive and cooperative manner. Our experiments illustrate the advantages and importance of combining bottom-up and top-down models and of performing segmentation and object detection/recognition simultaneously.
Link Here
Wednesday, January 04, 2006
CNN: Mars rovers keep exploring Red Planet
Twin robots mark second anniversary
Monday, January 2, 2006; Posted: 3:23 p.m. EST (20:23 GMT)
LOS ANGELES, California (AP) -- The warranty expired long ago on NASA's twin robots motoring around Mars.
In two years, they have traveled a total of seven miles. Not impressed? Try keeping your car running in a climate where the average temperature is well below zero and where dust devils can reach 100 mph.
These two golf cart-sized vehicles were only expected to last three months.
The full article.
Tuesday, January 03, 2006
[IVsource.net]: Latest News from IVsource.net (January 2, 2006)
COOPERS Project to Address Road-to-Vehicle Comms Techniques
The European Commission recently funded the COOPERS Integrated Project as one of a suite of projects addressing cooperative vehicle-highway systems.
INTERSAFE Reports Impressive Test Results
The PReVENT INTERSAFE project for intersection collision avoidance recently completed testing of its advanced intersection positioning and dynamic object detection system.
PReVENT UseRCams Plans Testing of 3D Camera on Vehicles
The UseRCams team held a workshop recently in Lindau, Germany.
INSAFES Defines Functions for Demonstrator Vehicles
The PReVENT INSAFES team held their second plenary meeting in early December.
ProFusion2 Publishes New Requirements
The PReVENT subproject ProFusion2 has recently published its requirements for sensor data fusion deployment in active/preventive safety applications. This report defines and presents the system requirements, use cases and test scenarios for the sensor data fusion framework to be developed in ProFusion2.
Monday, January 02, 2006
(1/4) My talk in lab meeting
Hi all,
This technical report was written by Sven Ginka, who was Bob's visiting student at ACFR, University of Sydney, from March to July 2005. The report is mainly about Bob's research, and Sven also provides solutions to some problems in SLAMMOT.
This report is available on our FTP server.
-tailion
Sunday, January 01, 2006
EPFL SmartRob Contest
The SmartRob Contest is a national competition open to students of the Swiss Federal Institute of Technology in Lausanne (EPFL) and other technical high-schools. The competition is jointly organized every year by the Autonomous Systems Laboratory (ASL) and the Laboratory of Intelligent Systems (LIS).
The link.
The 50 Best Robots Ever
Wired Magazine, January 2006
They're exploring the deep sea and distant planets. They're saving lives in the operating room and on the battlefield. They're transforming factory floors and filmmaking. They're - oh c'mon, they're just plain cool! From Qrio to the Terminator, here are our absolute favorites (at least for now).
The link