Tuesday, February 27, 2007

News: DIY Bluetooth Accelerometer

The College of Computing at Georgia Tech has a great DIY Bluetooth accelerometer, with source and schematics included.

Overview:

This is a small wireless sensor platform providing a Bluetooth SPP link to three axes of accelerometer data. The accelerometers are sampled by a PIC microcontroller (onboard ADC) at roughly 100Hz (the rate can be changed in firmware). Data from the ADC conversion is sent to a remote computer using the PIC's UART in conjunction with a drop-in Bluetooth serial part. Even with two dual-axis accelerometers onboard, there are up to 17 free I/O lines and two additional ADC channels, depending on the device configuration. Three sockets provide access to all PIC signals. Over-the-air programming allows for easy firmware updates and rapid prototyping without the need for a PIC programmer or special cable. Schematics, parts lists, and firmware sources are available online.

Feature List:
  • Bluetooth serial port profile (SPP) for standardized interface
  • 3-Axis accelerometer data, 3.9mg resolution (ADXL202JE)
  • Reprogrammable PIC (16LF873-04I)
  • 17 free I/O lines
  • In-Circuit programming connector
  • Over-the-air programming via bootloader
  • Battery life ~60hrs on an 840mAh 3.7V battery, full TX mode
  • 13mA @ 3.7V TX mode, 3mA @ 3.7V standby
  • Very Simple Firmware
  • 35mm × 35mm × 5mm
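
For readers who want to grab the stream on the host side, here is a minimal Python sketch, assuming the SPP link is bound to a local serial port (e.g. /dev/rfcomm0) and that the firmware emits one "x y z" ASCII line per sample; the actual framing and baud rate are set by the published firmware, so check the sources.

    import serial  # pyserial

    # Port name, baud rate, and line format are assumptions; see the firmware.
    with serial.Serial("/dev/rfcomm0", 115200, timeout=1) as port:
        while True:
            line = port.readline().decode("ascii", errors="ignore").strip()
            if not line:
                continue
            try:
                x, y, z = (int(v) for v in line.split())
            except ValueError:
                continue  # skip partial or corrupted frames
            print(f"accel counts: x={x} y={y} z={z}")  # ~100Hz per the firmware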

Contextual Computing Group: Bluetooth Accelerometer - Link

CMU VASC talk: Clustering and Classification via Lossy Data Compression

Clustering and Classification via Lossy Data Compression

Yi Ma, UIUC
Monday, Feb 26

For many problems in computer vision, image processing, and pattern recognition, we need to process and analyze massive amounts of high-dimensional mixed data, such as images and gene expression data. By mixed data, we mean that the given data set consists of multiple heterogeneous subsets (which have different geometric or statistical characteristics), but each subset can be more easily modeled or represented than the whole data set together.

In this talk, we address two fundamental questions: how do we cluster, and how do we classify, such high-dimensional mixed data? We contend that both the (unsupervised) clustering and (supervised) classification problems can be cast as a lossy data compression problem and solved efficiently within a unified mathematical framework. In theory, this approach offers some distinct advantages over conventional methods for clustering and classification, especially in dealing with several difficult issues that often arise in practice: regularization of degenerate distributions, selection of models with different complexities, and rejection of outliers.

Our work establishes a strong connection between information theory, especially rate-distortion theory, and data clustering and classification, and it leads to extremely simple but effective algorithms. We will demonstrate the success of these algorithms on a few popular but difficult problems, including natural image segmentation, microarray data clustering, and handwritten digit and face recognition.
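
To make the compression view concrete, here is a hedged Python sketch of the coding-length idea behind this line of work: code a cluster of points up to an allowable distortion ε², and merge two clusters only if coding them jointly is cheaper. The exact constants loosely follow the Gaussian coding-length used in this framework and should be treated as an assumption, not a transcription of the talk's formula.

    import numpy as np

    def coding_length(X, eps=0.1):
        """Approximate bits to code n d-dimensional samples up to distortion eps^2."""
        n, d = X.shape
        Xc = X - X.mean(axis=0)
        cov = Xc.T @ Xc / n
        _, logdet = np.linalg.slogdet(np.eye(d) + (d / eps**2) * cov)
        return (n + d) / 2.0 * logdet / np.log(2)

    def should_merge(X1, X2, eps=0.1):
        """Greedy agglomerative rule: merge when joint coding is cheaper."""
        joint = coding_length(np.vstack([X1, X2]), eps)
        return joint < coding_length(X1, eps) + coding_length(X2, eps)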

Monday, February 26, 2007

CMU talk: Interacting Physically with Robots and Virtually on the Global Digital Campus

SCS DISTINGUISHED LECTURE SERIES
Yuichiro Anzai
President, Keio University

Abstract:
The lecture presents two topics: one for the research on human-robot interaction conducted in the Anzai-Imai laboratory and the other for the activities of the Research Institute for Digital Media and Content, both at Keio University.
Our research on human-robot interaction, embarked upon in 1991, is concerned with designing technologies that facilitate the smooth interaction of humans with robots. We initially started by designing software and hardware systems that support human-robot interaction, and then moved forward, with Michita Imai and others, to designing robots that can smoothly interact with humans. In some cases we conducted behavioral experiments to find out how a human behaves in an interaction with a robot, and fed the results back to engineering. The first part of the lecture provides a summary of the efforts at our lab during these fifteen years.
The second part of the lecture will focus on the activities of the Research Institute for Digital Media and Content, established in 2004. One of its goals is to use various technologies to extend the reach of our physical campus so that students and faculty members can distribute their academic knowledge to a global audience, interact with people around the world, and have convenient access to globally shared knowledge. We have already set up what we call Global Digital Studios in Tokyo, Seoul, Beijing, Cambridge (UK) and San Francisco, with others scheduled to open in New York and some other locations. Twenty-four higher learning institutions in twelve South-East Asian countries are also tied to this network via satellite Internet. The Studios and sites can be connected online at any time, and are used for many different purposes: the network can be regarded as an early version of our Global Digital Campus. The second part of the lecture gives a glimpse of this effort at Keio University.

Bio:
Born in 1946 in Tokyo, Yuichiro Anzai received his Ph.D. in engineering from Keio University in 1974. After serving at Keio as an assistant professor until 1985, he joined the faculty of Hokkaido University as an associate professor in behavioral science. In 1988 he returned to Keio as a professor in electrical engineering, and became the dean of the Faculty of Science and Technology in 1993. He worked on the reform of the undergraduate departments and graduate programs for more than seven years, and launched new educational and research programs with an innovative structure. Since 2001, he has served as president of Keio University, the oldest modern institution of higher learning in Japan (http://www.keio.ac.jp/index-en.htm), which will celebrate its 150th anniversary in 2008, as well as a professor in the Department of Information and Computer Science and the School of Open and Environmental Systems. At present, much of his time is devoted to driving forward the commemorative fundraising campaign and associated programs (http://keio150.jp/english).
Professor Anzai was a post-doc in the Departments of Psychology and Computer Science, and a visiting assistant professor in the Department of Psychology, Carnegie-Mellon University, in 1976-78, and in 1981-82 respectively. He was also a visiting professor at the Center for Medical Education, McGill University, in 1990. His fields of research include cognitive science and computer science, particularly cognitive processes in learning and problem solving, and human-robot-computer interaction. He has published about 20 books, single- and co-authored, and more than 120 technical papers in those fields. For public service, he is serving as president of the Information Processing Society of Japan, as president of the Japan Association of Private Universities and Colleges, as a member of the Science Council of Japan, and as a member of the Central Council for Education.

CMU talk: The Structure and Acquisition of Semantic Knowledge

The Structure and Acquisition of Semantic Knowledge

Charles Kemp
Department of Brain & Cognitive Sciences
Massachusetts Institute of Technology

Thursday, February 22, 2007

Humans regularly make inferences that go beyond the data they have seen. Two questions immediately arise: what is the knowledge that supports these inferences, and how is this knowledge acquired? I will present a hierarchical Bayesian approach to inductive reasoning that addresses both questions. When making inferences about the distribution of a novel property, people draw on rich semantic knowledge that can often be captured using structured representations of the relationships between the entities in a domain. For instance, given that gazelles have T4 cells and carry E. spirus bacteria, taxonomic relations are useful for predicting which other animals are likely to have T4 cells, but predator-prey relations are more useful when reasoning about the distribution of E. spirus bacteria. I will show that our formal framework provides close quantitative fits to human inferences about several kinds of properties when supplied with appropriate knowledge representations for each task. Different inductive tasks often draw on different kinds of knowledge which are best captured by qualitatively different kinds of representations. For instance, anatomical features of biological species are best captured by a taxonomic tree, political views are best captured by a linear spectrum, and friendship relations are best captured by a set of discrete cliques. Our hierarchical framework helps to explain how humans can discover the best kind of representation for a given inductive context.
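
As a toy illustration of the structured-Bayesian idea (not Kemp's actual model), the sketch below does property induction over a hand-built hypothesis space with a size-principle likelihood; the animals, hypotheses, and uniform prior are all made up for the example.

    animals = ["gazelle", "lion", "zebra", "hyena"]

    # Hypotheses: candidate extensions of the novel property (e.g., subtrees
    # of a taxonomy, or groups induced by predator-prey relations).
    hypotheses = {
        "grazers":   {"gazelle", "zebra"},
        "predators": {"lion", "hyena"},
        "mammals":   set(animals),
    }
    prior = {h: 1.0 / len(hypotheses) for h in hypotheses}

    def p_has_property(observed, query):
        """P(query has the property | observed animals have it)."""
        post = {}
        for h, ext in hypotheses.items():
            if all(o in ext for o in observed):
                # Size principle: examples drawn uniformly from the extension.
                post[h] = prior[h] * (1.0 / len(ext)) ** len(observed)
            else:
                post[h] = 0.0
        Z = sum(post.values())
        return sum(p for h, p in post.items() if query in hypotheses[h]) / Z

    print(p_has_property(["gazelle"], "zebra"))  # high: shared "grazers" hypothesis
    print(p_has_property(["gazelle"], "lion"))   # lower: only via "mammals"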

CMU talk: Adaptive Online Allocation Mechanisms for Single-Valued Domains

Adaptive Online Allocation Mechanisms for Single-Valued Domains

David C. Parkes, Harvard University

Abstract: Mechanism design studies the problem of designing protocols that implement desirable outcomes in multi-agent systems with self-interest and private information. Many interesting domains are inherently dynamic, with uncertainty about both supply and demand; e.g., selling seats on an airplane, adverts on a search engine, or computational resources. The classic Vickrey-Clarke-Groves mechanism extends to dynamic environments but is non-adaptive and much less robust than when used offline. Our interest in this talk is in the design of adaptive, online allocation mechanisms that are able to leverage a probabilistic (perhaps incorrect) model of the environment. We focus on single-valued domains, in which each agent is indifferent among a set of equivalent allocations. A truthful, online stochastic optimization algorithm coupled with historical sampling and an "ironing" procedure is presented, along with examples showing that the optimal policy is generally not truthfully implementable. Simulation analysis illustrates the cost of truthfulness in an application to selling a computational resource.
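
For background, the sketch below implements the static single-item Vickrey auction, the simplest special case of the VCG family that the talk generalizes to dynamic settings; the bids are illustrative.

    def vickrey(bids):
        """bids: {agent: reported value}. The highest bidder wins and pays the
        second-highest bid, making truthful reporting a dominant strategy."""
        ranked = sorted(bids, key=bids.get, reverse=True)
        winner, runner_up = ranked[0], ranked[1]
        return winner, bids[runner_up]

    print(vickrey({"a": 10, "b": 7, "c": 3}))  # ('a', 7)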

Bio: David C. Parkes is the John L. Loeb Associate Professor of the Natural Sciences and Associate Professor of Computer Science at Harvard University. He received his Ph.D. degree in Computer and Information Science from the University of Pennsylvania in 2001, and an M.Eng. (First class) in Engineering and Computing Science from Oxford University in 1995. He was awarded the prestigious NSF CAREER Award in 2002, an IBM Faculty Partnership Award in 2002 and 2003, and the Alfred P. Sloan Fellowship in 2005. Parkes has published extensively on topics related to electronic markets, computational mechanism design, auction theory, and multi-agent systems. He serves on the editorial board of the Journal of Artificial Intelligence Research and the Electronic Commerce Research Journal, and has served on the Program Committee of a number of leading conferences in artificial intelligence, multiagent systems and electronic commerce, including ACM-EC, AAAI, IJCAI, UAI and AAMAS. Parkes is the co program-chair of the ACM Conference on Electronic commerce, 2007 and the Int. Conf. on Autonomous Agents and Multiagent systems, 2008.

http://www.eecs.harvard.edu/~parkes

[Thesis Proposal] Predictive Exploration for Autonomous Science

Speaker: David Thompson

Abstract:
Planetary science is entering a new era in which exploration robots can outrun their own ability to collect science data. Autonomous navigation will soon permit single-command traverses of multiple kilometers. Nevertheless, the time for taking measurements and the bandwidth available for transmitting them to Earth will remain relatively constant. A growing body of research addresses these bottlenecks with onboard data understanding. Autonomous rovers can use pattern recognition, learning and planning technologies to place instruments and take measurements without human supervision. These robots autonomously choose the most important features to observe and transmit, traveling longer distances without sacrificing our understanding of the visited terrain.

I argue that intelligent explorer agents must exploit structure in their environment. In other words, they must be mapmakers. Maps can represent spatial structure (similarities from one locale to the next) and inter-sensor structure (correlations between different sensing modes). “Predictive exploration” formulates mapmaking as an experimental design problem. Generative spatial models guide the agent to informative areas while minimizing redundant measurements. Information gain over the map determines exploration decisions, while a similar criterion suggests the best data products for downlink. We will demonstrate these principles with a rover system that autonomously builds kilometer-scale geologic maps.
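
A toy sketch of the information-gain criterion on an occupancy-style map: score each candidate sensing location by the entropy of the cells it would observe, and move where uncertainty is highest. The grid, sensor footprint, and numbers below are illustrative, not taken from the proposal.

    import numpy as np

    def entropy(p):
        """Per-cell binary entropy of an occupancy belief, in bits."""
        p = np.clip(p, 1e-9, 1 - 1e-9)
        return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

    belief = np.full((10, 10), 0.5)   # unknown cells: maximum entropy
    belief[:5, :5] = 0.05             # already-mapped free space

    def info_gain(cell, radius=2):
        """Total entropy of the cells a sensor at `cell` would observe."""
        r, c = cell
        patch = belief[max(0, r - radius):r + radius + 1,
                       max(0, c - radius):c + radius + 1]
        return entropy(patch).sum()

    candidates = [(2, 2), (2, 8), (8, 8)]
    print(max(candidates, key=info_gain))  # picks a cell in unmapped territory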

A copy of the thesis proposal document:
http://www.cs.cmu.edu/~drt/ThompsonProposal.pdf.

Sunday, February 25, 2007

News: European Researchers Developing 'Emotional Robot'

The link. February 24, 2007 2:35 p.m. EST

Som Patidar - All Headline News Staff Writer

London, Britain (AHN) - A joint research project by a European team led by British researchers is developing a robot that can interact with people emotionally.

The research project, Feelix Growing, involves six countries and 25 roboticists, developmental psychologists and neuroscientists.

The project's coordinator Lola Canamero, from Britain's University of Hertfordshire, said that the aim is to develop robots that grow up and adapt to humans in everyday environments.

"If robots are to be truly integrated in humans everyday lives as companions or careers, they cannot be just taken off the shelf and put into a real-life setting, they need to adapt to their environment," Canamero said.

Thursday, February 22, 2007

News: SheekGeek educational kits

The W.A.S.P. Original Robotic Kit is designed to help introduce children aged 12 and up to robotics, electronics, and mechanics.

The name W.A.S.P. stands for "Wiggling And Spinning Photovore," which describes both the action and the type of robot in the kit. A photovore is a light-seeking robot: the W.A.S.P. wiggles towards a light source and spins in circles when it finds the brightest spot.

The W.A.S.P. is very reactive: it closely follows the beam of a flashlight, letting the user steer it wherever they want it to go. It is also very quick (especially with fresh batteries), covering 5 feet in 5 seconds!

Setup of the W.A.S.P. is great for beginners. The circuit is very simple, and many of the pieces are everyday, recognizable items like chenille stems (pipe cleaners) and cable ties.
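
In code, the behaviour amounts to a Braitenberg-style cross-wiring of light sensors to motors; the real W.A.S.P. does this with a simple analog circuit rather than software, so the Python below is purely illustrative.

    def motor_speeds(left_light, right_light, gain=1.0):
        """Brighter light on one side drives the opposite motor harder,
        turning the robot toward the light."""
        left_motor = gain * right_light
        right_motor = gain * left_light
        return left_motor, right_motor

    print(motor_speeds(0.9, 0.3))  # (0.3, 0.9): veers left, toward the light
    # At the brightest spot both sensors saturate and the speeds match, so any
    # small asymmetry makes the robot wiggle and spin in place.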


SheekGeek Educational Kits - Link
SheekGeek Original W.A.S.P. Robotic Kit - Link
The W.A.S.P. Original Robotic Kit Contents - Link

Saturday, February 17, 2007

Call For Papers: Special Issue on Network Robot Systems (Robotics and Autonomous System Journal)

**************************************************
SPECIAL ISSUE ON NETWORK ROBOT SYSTEMS (NRS)
(ROBOTICS AND AUTONOMOUS SYSTEMS JOURNAL)
**************************************************

The last decade has witnessed unprecedented interaction between technological developments in computing and communications, which has led to the design and implementation of robotic and automation systems consisting of networked vehicles, sensors, and actuators. These developments enable researchers and engineers not only to design new robotic systems but also to develop systems that could not have been imagined before. There is now a need for a unifying paradigm within the robotics community to address the design of these networked automation systems.

The name Networked Robots (NR) was created in May 2004 within the IEEE RAS Technical Committee, as a consequence of preliminary work on Internet-based tele-operated robots initiated in 2001 and its expansion to reflect a broader set of problems and applications. There are several definitions of NRS, coming from the US and Japan, but a simple and comprehensive definition is:

“A Network Robot System is a group of artificial autonomous systems that are mobile and that make important use of wireless communications among them or with the environment and living systems in order to fulfil their tasks”.

Network Robot Systems (NRS) call for the integration of several fields: robotics, perception (sensor systems), ubiquitous computing, and network communications. Some of the key issues that must be addressed in the design of Network Robot Systems are cooperative localization and navigation, cooperative environment perception, cooperative map building, cooperative planning and planning for cooperation, human-robot interaction, network tele-operation, and communications.

The topic of Network Robot Systems transcends "conventional" robotics in the sense that, for this type of distributed heterogeneous system, there exists an interrelation among a community of robots, environment sensors, and humans. Applications include network robot teams (for example, for space applications), human-robot networked teams (for example, a community of robots that assist people), robots networked with the environment (for example, for tasks in urban settings or rescue), and geminoid robots (a replica of a human with its own autonomy that is partially tele-operated through the network).

The topics of interest include, but are not limited to:
- human robot symbiosis
- networked environing sensing/actuation
- distributed environing system
- interaction between human and environing components
- networked human-robot interaction
- coordination and cooperation among multiple types of robots
- self-configuration of a network robot system
- monitoring and self-repair of a network robot system
- network robot platform
- security for network robot systems
- socially situated network robots
- applications of network robot systems

IMPORTANT DATES
First Call for papers: 15 February 2007
Paper submission deadline: 15 May 2007
Revised notification: 10 September 2007
Final paper submission: 22 October 2007
Final decision notification: 16 November 2007

REVIEWING PROCESS
Expected contributions should be around 12 pages long. Submissions should be sent to the Guest Editors (sanfeliu@iri.upc.es, hagita@atr.jp, asaffio@aass.oru.se) in electronic form (PDF files). The Guest Editors will first evaluate all manuscripts. Manuscripts meeting the minimum criteria are passed on for peer review by two external experts. Review for this special issue will be single-blind: the referees remain anonymous throughout the process. The Guest Editors are responsible for the final decision to accept or reject each article, based on the recommendations of the reviewers. Accepted papers must be sent to the Guest Editors electronically, both as source files (LaTeX or MS Word, including all original figures/tables and references) and as a printable version (PDF). Please follow the instructions at http://ees.elsevier.com/robot/.

GUEST EDITORS
• Alberto Sanfeliú, Technical University of Catalonia, Spain, sanfeliu@iri.upc.es
• Norihiro Hagita, ATR Intelligent Robotics and Communication Laboratories, Japan, hagita@atr.jp
• Alessandro Saffiotti, Örebro University, Sweden, asaffio@aass.oru.se

RELATED LINKS:
- Research Atelier on Network Robot Systems, http://turina.upc.es/nrs
- Japan Network Robot Forum, http://www.scat.or.jp/nrf/English/
- IEEE RAS Technical Committee, http://www.informatik.uni-freiburg.de/~burgard/tc/

Friday, February 16, 2007

Non-rigid point set registration : Coherent Point Drift

Authors :

Andriy Myronenko
Xubo Song
Miguel Á. Carreira-Perpiñán

OGI School of Science and Engineering
Oregon Health and Science University

Title :

Non-rigid point set registration : Coherent Point Drift

Abstract :

We introduce Coherent Point Drift (CPD), a novel probabilistic method for nonrigid registration of point sets. The registration is treated as a Maximum Likelihood (ML) estimation problem with motion coherence constraint over the velocity field such that one point set moves coherently to align with the second set. We formulate the motion coherence constraint and derive a solution of regularized ML estimation through the variational approach, which leads to an elegant kernel form. We also derive the EM algorithm for the penalized ML optimization with deterministic annealing. The CPD method simultaneously finds both the non-rigid transformation and the correspondence between two point sets without making any prior assumption of the transformation model except that of motion coherence. This method can estimate complex non-linear non-rigid transformations, and is shown to be accurate on 2D and 3D examples and robust in the presence of outliers and missing points.
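
As a flavor of the method, here is a minimal sketch of the correspondence (E-) step: soft assignments between the moving set Y and the fixed set X under an isotropic Gaussian mixture with a uniform outlier term. The outlier weight w and variance handling are assumptions for the sketch; see the paper for the full EM updates and the coherence-regularized M-step.

    import numpy as np

    def cpd_e_step(X, Y, sigma2, w=0.1):
        """X: (N, d) fixed points; Y: (M, d) current moving points.
        Returns P[m, n] = posterior that x_n was generated by component m."""
        N, d = X.shape
        M = Y.shape[0]
        d2 = ((X[None, :, :] - Y[:, None, :]) ** 2).sum(-1)   # (M, N) sq. dists
        K = np.exp(-d2 / (2.0 * sigma2))
        # Uniform outlier term; w is the assumed outlier weight.
        c = (2 * np.pi * sigma2) ** (d / 2.0) * (w / (1 - w)) * M / N
        return K / (K.sum(axis=0, keepdims=True) + c)
    # The M-step (not shown) solves for a coherent velocity field that moves Y,
    # regularized so nearby points drift together, then shrinks sigma2.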

Link :
paper
project page

Patent: Underground GPS

13:48 12 February 2007
NewScientist.com news service
Barry Fox

Underground GPS

Satellite navigation is becoming a vital tool for the modern motorist. But GPS (Global Positioning System) receivers need to compare signals from at least three orbiting satellites to determine their position. This means satellite navigation does not normally work inside a tunnel, underground or in a heavily built up area.
Two inventors from Cambridge in the UK are now patenting a system that could let satellite equipment acquire positioning information even when satellite signals are blocked. The roof of the building, or the ground above the tunnel, is fitted with at least four directional antennae focused on different patches of the sky. These antennae receive GPS signals, then amplify and re-broadcast them using transmitters positioned at specific points below ground. A GPS device is then fooled into behaving as if it were out in the open, providing accurate positional data from inside a tunnel, in an underground car park or in a heavily built up city. The same trick could also let GPS devices work inside buildings.

See the patent application.

Spatial Reasoning: Planning Among Movable Obstacles

Author:
Mike Stilman
Robotics Institute
Carnegie Mellon University

Abstract:
Autonomous robots operating in real-world, unstructured environments cannot rely on the existence of collision-free paths or feasible trajectories. Search and rescue, construction, and planetary exploration domains contain debris that obstructs the robot's path. Theoretically, one can represent all possible interactions between the robot and these objects as a single search problem. However, the resulting nonlinear state space would be exponentially large. In this thesis we explore methods for reasoning about the robot's state space to reduce problem dimensionality and accomplish autonomous motion in the presence of movable objects.

Further Details:
A copy of the thesis proposal document can be found at http://www.cs.cmu.edu/~mstilman/proposal/stilman-proposal.pdf.

CMU Intelligence Seminar: Bayesian models of human learning and inference

Bayesian models of human learning and inference
Josh Tenenbaum, MIT

Faculty Host: Tom Mitchell

Bayesian methods have revolutionized major areas of artificial intelligence, machine learning, natural language processing and computer vision. Recently Bayesian approaches have also begun to take hold in cognitive science, as a principled framework for explaining how humans might learn, reason, perceive and communicate about their world. This talk will sketch some of the challenges and prospects for Bayesian models in cognitive science, and also draw some lessons for bringing probabilistic approaches to artificial intelligence closer to human-level abilities.

The focus will be on learning and reasoning tasks where people routinely make successful generalizations from very sparse evidence. These tasks include word learning and semantic interpretation, inference about unobserved properties of objects and relations between objects, reasoning about the goals of other agents, and causal learning and inference. These inferences can be modeled as Bayesian computations operating over constrained representations of world structure -- what cognitive scientists have called "intuitive theories" or "schemas". For each task, we will consider how the appropriate knowledge representations are structured, how these representations guide Bayesian learning and reasoning, and how these representations could themselves be learned via Bayesian methods. Models will be evaluated both in terms of how well they capture quantitative or qualitative patterns of human behavior, and their ability to solve analogous real-world problems of learning and inference. The models we discuss will draw on -- and hopefully, offer new insights for -- several directions in contemporary machine learning, such as semi-supervised learning, modeling relational data, structure learning in graphical models, hierarchical Bayesian modeling, and Bayesian nonparametrics.

Speaker Bio
Josh Tenenbaum studies learning and reasoning in humans and machines, with the twin goals of understanding human intelligence in computational terms and bringing artificial intelligence closer to human-level capacities. He received his Ph.D. from MIT in 1999, and from 1999-2002, he was a member of the Stanford University faculty in the Departments of Psychology and (by courtesy) Computer Science. In 2002, he returned to MIT, where he currently holds the Paul E. Newton Career Development Chair in the Department of Brain and Cognitive Sciences, and is a member of the Computer Science and Artificial Intelligence Laboratory. He has published extensively in cognitive science, machine learning and other AI fields, and his group has received several outstanding paper or student-paper awards at NIPS, CVPR, and Cognitive Science. He received the 2006 New Investigator Award from the Society for Mathematical Psychology, and the 2007 Young Investigator Award from the Society of Experimental Psychologists. He serves as an associate editor of the journal Cognitive Science and is currently co-organizing a summer school on "Probabilistic Models of Cognition: The Mathematics of Mind" for July 2007 at IPAM, the Institute of Pure and Applied Mathematics at UCLA.

Simulating Thought to Model Terrorists

A rock star among game developers, Barry Silverman and his team of 20 researchers and graduate students at the University of Pennsylvania's Ackoff Center for Advancement of Systems Approaches have gone well beyond any video game in existence. They imbue agents with detailed physiologies that respond to hunger, fatigue, and stress, as well as minds that encompass complex reasoning skills, value systems, and up to 22 emotions. This is the closest a computer comes to simulating a real person, and it is at the cutting edge of computational behavior modeling.

[LINK]

Thursday, February 15, 2007

CMU RI Thesis Proposal: Integrated Localization, Mapping, and Planning in Unstructured 3D Environments

Nathaniel Fairfield (than@cmu.edu)
Robotics Institute
Carnegie Mellon University

Abstract:
The ability to explore an unknown environment is a prerequisite for most useful mobile robot applications. Exploration can be decomposed into the tasks of perceiving the environment to build a map, localizing within that map, and planning where to explore next. Over the past ten years or so, the field of simultaneous localization and mapping (SLAM) has been active and increasingly applied. More recently, work has been directed towards the problem of planning as an integral part of exploration and SLAM. Another persistent challenge is scale: many SLAM formulations have problems with large exploration areas. We are interested in developing an integrated mapping, localization, and planning approach that can handle large-scale three-dimensional environments and sparse sensor data. As a start, we have developed a method for doing SLAM using a Rao-Blackwellized Particle Filter and evidence grid-based maps, and demonstrated successful SLAM using an autonomous underwater vehicle in a 3D environment. The two major limitations of our current method are its inability to scale the evidence grid approach to truly large environments (hundreds of meters and millions of observations), and its lack of planning ability for picking exploration and/or uncertainty-reducing actions. We propose to address the first limitation by developing SLAM on multiple scales: local submaps and global maps, in effect using the submaps as features at the larger scale. We propose to address the second limitation, planning, by integrating the tasks of mapping, localizing, and planning under an information-theoretic framework. The planning algorithm will use models of unmapped regions and the entropy of the predicted SLAM state to choose the action with the greatest estimated information gain. The combination of multi-scale SLAM and information gain-based planning raises the possibility of hierarchical exploration, where the robot's current task determines its exploration strategy. In this proposal we describe our current work, motivation, and proposed solutions, with the goal of building a system capable of exploring large-scale 3D environments.
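
To illustrate the Rao-Blackwellized particle filter structure, here is a schematic 1-D toy: each particle carries a pose hypothesis and its own (here, one-landmark) map, weights come from the measurement model, and resampling triggers on effective sample size. The real system uses 3-D poses and evidence grids; every model below is a stand-in.

    import numpy as np

    rng = np.random.default_rng(0)
    LANDMARK = 5.0   # toy world: one true landmark on a line

    def rbpf_step(particles, control, z_range):
        for p in particles:
            p["pose"] += control + rng.normal(0.0, 0.1)          # sample motion
            expected = p["landmark"] - p["pose"]                 # predicted range
            p["w"] *= np.exp(-0.5 * ((z_range - expected) / 0.2) ** 2)
            # "Mapping with known pose": refine this particle's own map.
            p["landmark"] += 0.1 * (p["pose"] + z_range - p["landmark"])
        w = np.array([p["w"] for p in particles])
        w /= w.sum()
        if 1.0 / (w ** 2).sum() < len(particles) / 2:            # low ESS: resample
            idx = rng.choice(len(particles), size=len(particles), p=w)
            particles = [dict(particles[i], w=1.0) for i in idx]
        return particles

    true_pose = 0.0
    particles = [{"pose": 0.0, "landmark": 4.0, "w": 1.0} for _ in range(100)]
    for _ in range(20):
        true_pose += 0.2
        z = LANDMARK - true_pose + rng.normal(0.0, 0.2)
        particles = rbpf_step(particles, control=0.2, z_range=z)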

Further Details: http://gs4435.sp.cs.cmu.edu/fairfield_proposal.pdf

Wednesday, February 14, 2007

ICRA07: Identification and Control of an Autonomous Blimp

Gaussian Processes and Reinforcement Learning for Identification and Control of an Autonomous Blimp

Abstract:

Blimps are a promising platform for aerial robotics and have been studied extensively for this purpose. Unlike other aerial vehicles, blimps are relatively safe and also possess the ability to loiter for long periods. These advantages, however, have been difficult to exploit because blimp dynamics are complex and inherently non-linear. The classical approach to system modeling represents the system as an ordinary differential equation (ODE) based on Newtonian principles. A more recent modeling approach is based on representing state transitions as a Gaussian process (GP). In this paper, we present a general technique for system identification that combines these two modeling approaches into a single formulation. This is done by training a Gaussian process on the residual between the non-linear model and ground truth training data. The result is a GP-enhanced model that provides an estimate of uncertainty in addition to giving better state predictions than either ODE or GP alone. We show how the GP-enhanced model can be used in conjunction with reinforcement learning to generate a blimp controller that is superior to those learned with ODE or GP models alone.
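
The core modeling idea is easy to sketch: fit a GP to the residual between the ODE prediction and logged transitions, then add the GP mean (and carry its variance) at prediction time. The sketch below uses a synthetic 1-D system and scikit-learn as stand-ins for the paper's blimp model and GP machinery.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    def f_ode(x):
        """Stand-in for the Newtonian (ODE) one-step prediction."""
        return 0.9 * x

    # Logged transitions; here synthetic, with an effect the ODE misses.
    X = np.random.uniform(-2, 2, size=(200, 1))
    y_true = 0.9 * X[:, 0] + 0.1 * np.sin(3 * X[:, 0])

    # Train the GP on the residual between ground truth and the ODE model.
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(X, y_true - f_ode(X[:, 0]))

    def f_enhanced(x):
        """ODE prediction corrected by the learned residual, with uncertainty."""
        mu, std = gp.predict(np.atleast_2d(x), return_std=True)
        return f_ode(x) + mu, std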

Original link:
http://www.cs.washington.edu/homes/fox/abstracts/gp-blimp-icra-07.abstract.html

Paper link:
http://www.cs.washington.edu/homes/fox/postscripts/gp-blimp-icra-07.pdf

Tuesday, February 13, 2007

Stanford Talk: Large Scale Detection of Irregularities in Accounting Data

Large Scale Detection of Irregularities in Accounting Data

Stephen Bay, Center for Advanced Research, PricewaterhouseCoopers LLP

Abstract:
In recent years, there have been several large accounting frauds where a company's financial results have been intentionally misrepresented by billions of dollars. In response, regulatory bodies have mandated that auditors perform analytics on detailed financial data with the intent of discovering such misstatements. For a large auditing firm, this may mean analyzing millions of records from thousands of clients. In this talk, I will discuss techniques for automatic analysis of company general ledgers on such a large scale to identify irregularities -- which may indicate fraud or just honest errors -- for additional review by auditors. These techniques have been implemented in a prototype system, called Sherlock, which combines aspects of both outlier detection and classification. In developing Sherlock, we faced three major challenges: developing an efficient process for obtaining data from many heterogeneous sources, training classifiers with only positive and unlabeled examples, and presenting information to auditors in an easily interpretable manner.

MIT CSAIL talk: Neural Discrimination of Complex Natural Sounds in Songbirds

Title: Neural Discrimination of Complex Natural Sounds in Songbirds
Speaker: Dr. Kamal Sen, Neural Coding Laboratory, Hearing Research Center, Boston University
Date: Wednesday, February 14 2007

Discrimination and recognition of complex natural stimuli is a fundamental problem that arises in a wide variety of fields, e.g., neuroscience and computer science. In neuroscience, an important problem is to understand how animals and humans discriminate between complex sounds, e.g., the vocal communication sounds of two different individuals. In computer science, speech recognition algorithms must solve a similar problem. Moreover, such discrimination must often be performed in noisy environments, e.g., a cocktail party. How does the brain solve this problem? Currently, relatively little is known about the neural basis for complex sound discrimination and recognition. An attractive model system for investigating this question is the songbird, which shows striking analogies to humans in the context of speech. In this talk, I will describe our ongoing work on the neural discrimination of birdsongs in field L, the analog of primary auditory cortex, in zebra finches. I will present some of our findings on the accuracy and time-scales of neural discrimination, and on sensitivity vs. invariance to stimulus parameters, e.g., intensity, and then discuss how we are extending this paradigm to investigate more complex auditory scenes, e.g., a cocktail party.

Monday, February 12, 2007

CMU VASC seminar: Computer Vision in Archaeology: Recent Case Studies

Computer Vision in Archaeology: Recent Case Studies
Kevin Cain
Institute for Study and Integration of Graphical Heritage Techniques

Computing for archaeology is a study in contrasts: graphics and vision techniques are still somewhat exotic, but interesting (and difficult) problems abound! In this talk, we'll present a snapshot of current needs in archaeology, framing the discussion with results from the past seven seasons of field work at the memorial temple of Ramses II in Egypt. Topics include: 3D representations of ancient sites, large-scale orthomosaics of inscribed wall surfaces, lighting capture, relighting, and site reconstructions. We'll also take a look at efforts to present archaeological results in novel environments, including Maya Skies, a new NSF 'full dome' film project, and a large digital projection installation in Egypt's Valley of the Kings.

CMU VASC seminar: Observations from Parsing Images of Architectural Scenes

Observations from Parsing Images of Architectural Scenes
Alexander Berg
UC Berkeley

Computational models for visual recognition show promise for some tasks. I will review our success in this area and show some information-theoretic comparisons with our ongoing work on parsing scenes. For images of architectural scenes, we have observed that very simple, independent local features provide a great deal of information about what components -- building, sky, ground, etc. -- make up a scene. If, in addition, a few carefully chosen image-wide latent variables are added to the model, then even more information is available. Finally, given this coarse-level parsing, it is possible to effectively identify features such as windows and roof-lines that would be difficult to parse in isolation.

CMU ML lunch: Discrete Markov Random Fields -- the Inference story

Discrete Markov Random Fields -- the Inference story

Speaker: Pradeep Ravikumar, CMU
http://www.cs.cmu.edu/~pradeepr

Abstract: Markov random fields, or undirected graphical models, are graphical representations of probability distributions. Each graph represents a family of distributions -- the nodes of the graph represent random variables, the edges encode independence assumptions, and weights over the edges and cliques specify a particular member of the family.

In this talk, I will give the high-level intuitions behind the wide array of inference techniques for discrete Markov random fields.

The problem of inference in Markov random fields is, in full generality, the problem of answering queries about the probability distribution the field represents. Key inference tasks include partition function estimation, event probability estimation, and computing the most probable configuration. The talk will give a high-level picture of these queries and the methods used to answer them.
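
On a graph small enough to enumerate, all three queries can be answered exactly by brute force, which makes the definitions concrete; the tiny chain MRF below is made up, and real inference methods exist precisely because this enumeration is exponential in the number of nodes.

    import itertools
    import numpy as np

    n_nodes, n_states = 4, 2
    edges = [(0, 1), (1, 2), (2, 3)]   # a small chain
    rng = np.random.default_rng(0)
    node_pot = rng.uniform(0.5, 1.5, (n_nodes, n_states))
    edge_pot = {e: rng.uniform(0.5, 1.5, (n_states, n_states)) for e in edges}

    def unnorm(x):
        """Unnormalized probability: product of node and edge potentials."""
        p = np.prod([node_pot[i, xi] for i, xi in enumerate(x)])
        for i, j in edges:
            p *= edge_pot[(i, j)][x[i], x[j]]
        return p

    configs = list(itertools.product(range(n_states), repeat=n_nodes))
    Z = sum(unnorm(x) for x in configs)                         # partition function
    p_event = sum(unnorm(x) for x in configs if x[0] == 1) / Z  # P(x0 = 1)
    x_map = max(configs, key=unnorm)                            # most probable config
    print(Z, p_event, x_map)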

Saturday, February 10, 2007

Invention: On-road warning signs

On-road warning signs

* 16:27 05 February 2007
* NewScientist.com news service
* Barry Fox

Could real-time traffic information be projected directly onto the road ahead?

Philips thinks so and proposes attaching laser projectors, each with a rapidly-moving mirror that deflects its beam, to ordinary lampposts. These would be used to project images and words onto the road just ahead of approaching cars.

The solution would be cheaper than installing a large video display and safer too, since drivers would not need to take their eyes off the road. Also, a warning about ice or danger on the road ahead would not need a full colour screen, so the projector could use just a single-colour laser.

Each lamppost would have its own IP address and would connect wirelessly, or via a cable, to a central traffic control centre. The projectors could also tap into the power already used to illuminate streetlamps.

As well as providing warning signs, the laser projectors could paint temporary lanes onto the road, steering traffic round an obstruction, or away from the main highway and onto a side road. It's a neat idea, but how well would it work in busy traffic?

Read the full on-road warning signs patent application.

News: (Invention) Covert iris scanner

Invention: Covert iris scanner

* 16:27 05 February 2007
* NewScientist.com news service
* Barry Fox

Covert iris scanner

Sarnoff Labs in New Jersey, US, has been working on a clever homeland security system for the US government. It scans people's irises as they walk towards a checkpoint, without them even knowing it.

Current systems require a person to stand still and look directly into a single digital camera from close range. The new system will instead use an array of compact, high resolution cameras, all of which point in slightly different directions and focus at slightly different distances.

As a subject walks into range, a sensor triggers a powerful infrared strobe light. The strobing is synchronised with the camera exposures, capturing pictures of the subject's face thirty times per second to create a bank of different images.

At least one of these shots should provide a clear, high-definition image of the target's iris. Clarity could also be enhanced by combining two similar shots. Sarnoff reckons this could be done at a distance around 3 metres, and a database could be queried fast enough to sound the alarm if the subject warrants a closer check. Let's just hope the target is not wearing sunglasses.

Read the full covert iris scanner patent application.

Friday, February 09, 2007

CMU Intelligence Seminar: Adaptation, Inference, and Optimization: Speech Driven Machine Learning

Adaptation, Inference, and Optimization: Speech Driven Machine Learning
Jeff Bilmes
University of Washington

Speech applications (such as speech recognition) have a long history of utilizing statistical learning methodology. In this talk, we will describe how machine learning research can be motivated by the application of speech processing, including speech recognition and speech-based human-computer interaction. First, dynamic graphical models (e.g., DBNs and CRFs) can be used to express many novel speech recognition procedures. We describe new methods to perform fast exact and approximate inference in such models. These methods involve, as expected, graph triangulation, conditioning, and search, but perhaps more surprisingly, they also involve optimal bin packing, max-flow procedures, and submodular matchings. In the second part of the talk, we will describe a new speech application, the Vocal Joystick, for specifying multi-dimensional continuous control parameters using non-verbal vocalizations. This application has resulted in sample complexity bounds for model adaptation, and adaptation strategies for discriminative classifiers (SVMs and neural networks).


Speaker Bio.: Jeff A. Bilmes is an Associate Professor in the Department of Electrical Engineering at the University of Washington, Seattle (adjunct in Linguistics and in Computer Science and Engineering). He co-founded the Signal, Speech, and Language Interpretation Laboratory at the University. He received a masters degree from MIT, and a Ph.D. in Computer Science at the University of California, Berkeley. Jeff is the main author and designer of the graphical model toolkit (GMTK), and has done much research on both structure learning of and fast probabilistic inference in dynamic Graphical models. His main research lies in statistical graphical models, speech, language and time series, machine learning, human-computer interaction, combinatorial optimization, and high-performance computing. He was a general co-chair for IEEE ASRU 2003, and HLT/NAACL 2006. He is a member of the IEEE, ACM, and ACL, is a 2001 CRA Digital-Government Research Fellow, and is a 2001 recipient of the NSF CAREER award.

Wednesday, February 07, 2007

IEEE news: MOVIES IN THE ROUND (Cool Stuff!)

Most of the so-called 3-D displays you've seen use stereoscopic tricks to create a feeling of depth, but it's just an illusion: move your head, and you won't see around any corners. Now comes Holografika, a Budapest firm, with the real deal: a flat-panel display that exploits the principles of holography to present movies that look different to people standing at different points. The link.

IEEE news: THOUGHT POWERED WHEELCHAIRS IN DEVELOPMENT

A new wheelchair commanded by thoughts is currently in development by researchers at the University of Zaragoza in Spain. The chair will use a process called brain-computer interface, or BCI for short, which involves attaching electroencephalogram electrodes to a rider's scalp; these record brain rhythms and convey them to the chair's computer. Two 800-MHz Intel computers mounted on the wheelchair will process these readings and send instructions to the wheels. While the signals are crude, advances in decoding algorithms have made it possible to train the software to recognize simple commands such as turn left or turn right. Over time, more specific commands, such as moving to a certain room by thinking of it, will become understood by the software. To guard against misinterpreted commands, a laser will be attached to the front of the chair to avoid collisions with the surroundings. The technology is still a couple of years away from being perfected, and the researchers are looking at 2008 or 2009 before they will have a working prototype. Read more: the link

Tuesday, February 06, 2007

[CMU VASC seminar] Free-Viewpoint Image Synthesis from Multi-View Images

Title : Free-Viewpoint Image Synthesis from Multi-View Images
Speaker : Keita Takahashi, University of Tokyo

Abstract :
This talk introduces an image-based rendering method using multi-view images. Using a densely arranged 2-D array of cameras as the model for data acquisition, we adopted a layered-depth approach for synthesizing free-viewpoint images. Instead of explicit shape reconstruction, we proposed a signal processing approach based on our "focus measure" scheme: first, synthesize an image for each of the depth layers; then, find the "in-focus" parts of each image and integrate them into a final image. Since our method requires no off-line processing, it runs at interactive frame rates on a commodity PC. Several experimental results are presented to show the effectiveness of our method.
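
A compact sketch of the layer-then-select idea, with reprojection stubbed out: synthesize a candidate image per depth layer by averaging the (warped) views, score per-pixel focus by disagreement across views, and keep each pixel from its best layer. The variance-based focus measure here is an assumption standing in for the authors' scheme.

    import numpy as np

    def reproject(view, depth):
        """Placeholder: warp one camera view onto the plane at `depth`.
        A real system applies the plane-induced homography per camera."""
        return view

    def free_viewpoint(views, depths):
        H, W = views[0].shape[:2]
        best = np.full((H, W), np.inf)          # best focus score so far
        out = np.zeros_like(views[0], dtype=float)
        for d in depths:
            warped = np.stack([reproject(v, d) for v in views]).astype(float)
            layer = warped.mean(axis=0)                 # synthesized layer image
            score = warped.var(axis=0).sum(axis=-1)     # low variance = "in focus"
            mask = score < best
            best[mask] = score[mask]
            out[mask] = layer[mask]
        return out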

Here are the related links for this research :
Author's homepage
Full paper of this research

MIT CSAIL report: Phonetic Classification Using Hierarchical, Feed-forward, Spectro-temporal Patch-based Architectures

Authors: Ryan Rifkin, Jake Bouvrie, Ken Schutte, Sharat Chikkerur, Minjoon Kouh, Tony Ezzat, Tomaso Poggio

Issue Date: 1-Feb-2007

Abstract: A preliminary set of experiments is described in which a biologically-inspired computer vision system (Serre, Wolf et al. 2005; Serre 2006; Serre, Oliva et al. 2006; Serre, Wolf et al. 2006) designed for visual object recognition was applied to the task of phonetic classification. During learning, the system processed 2-D wideband magnitude spectrograms directly as images, producing a set of 2-D spectro-temporal patch dictionaries at different spectro-temporal positions, orientations, scales, and of varying complexity. During testing, features were computed by comparing the stored patches with patches from novel spectrograms. Classification was performed using a regularized least squares classifier (Rifkin, Yeo et al. 2003; Rifkin, Schutte et al. 2007) trained on the features computed by the system. On a 20-class TIMIT vowel classification task, the model features achieved a best result of 58.74% error, compared to 48.57% error using state-of-the-art MFCC-based features trained with the same classifier. This suggests that hierarchical, feed-forward, spectro-temporal patch-based architectures may be useful for phonetic analysis.
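
The classifier itself is simple to sketch: regularized least squares solves a single linear system in the kernel matrix. The RBF kernel, the regularization constant, and the binary toy data below are illustrative; the paper's 20-class task would use, e.g., one-vs-all codes.

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def rls_train(X, y, lam=1e-2):
        """Solve (K + lam*n*I) c = y; c are the expansion coefficients."""
        K = rbf_kernel(X, X)
        return np.linalg.solve(K + lam * len(X) * np.eye(len(X)), y)

    def rls_predict(X_train, c, X_test):
        return rbf_kernel(X_test, X_train) @ c   # sign gives the class

    # Toy binary usage: two separable blobs with +/-1 labels.
    X = np.vstack([np.random.randn(20, 2) + 2, np.random.randn(20, 2) - 2])
    y = np.hstack([np.ones(20), -np.ones(20)])
    c = rls_train(X, y)
    print(np.sign(rls_predict(X, c, X[:3])))
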
pdf, link

CMU RI Seminar: UAV-Enabled Wilderness Search and Rescue (WiSAR)

Wilderness Search and Rescue (WiSAR) operations involve finding and giving assistance to humans who are lost or injured in mountain, desert, lake, river, or other remote settings. WiSAR is a challenging task that requires many hours of effort by people with specialized training. These searches, which consume thousands of man-hours and hundreds of thousands of dollars per year in Utah alone, are often very slow because of the large distances and challenging terrain that must be searched. Moreover, timeliness is critical: for every hour that passes, the search radius must increase by approximately 3km, and the probability of finding and successfully aiding the victim decreases.

This talk will present research on using small Unmanned Aerial Vehicles (UAVs) to assist in WiSAR tasks. Topics include an analysis of how WiSAR is currently done, how UAVs can support current efforts, and the development of key technologies that make UAV-enabled WiSAR possible. Discussion will include designing UAV autonomy, modeling victim behavior, creating interfaces that allow the UAV to be efficiently tasked, and presenting imagery in a way that increases the probability of finding a victim.

Bio:
Michael A. Goodrich is an associate professor in the Computer Science Department at Brigham Young University. Before joining BYU, he completed a Ph.D. degree in Electrical and Computer Engineering and spent two years as a post-doctoral research associate at Nissan Cambridge Basic Research. His research is driven by a desire to understand intelligence. Toward this goal, he works on problems in human-robot interaction, multi-agent learning, and intelligent vehicle systems.

Monday, February 05, 2007

MIT CSAIL defense : A few days of a robot

Speaker: Lijin Aryananda, MIT CSAIL
Relevant URL: http://people.csail.mit.edu/lijin

Abstract:

This thesis presents an implementation of a robotic head, Mertz, designed to explore incremental face recognition through natural interaction. We have seen many recent efforts in the path of integrating robots into the home for assisting with elder care, domestic chores, etc. In order to be effective in human-centric tasks, the robots must distinguish not only between people and inanimate objects, but also among different family members in the household. This thesis was driven by two specific limitations of the current technology. First, current automatic face recognition technology mostly explores the supervised solutions which are limited to a fixed training set and require cumbersome data collection and labelling procedures. Second, the lack of robustness and scalability to unstructured environments create a large gap between current research robots and commercial home products. The goal of this thesis is to advance toward a framework which would allow the robots to incrementally "get to know" each individual in an unsupervised way through daily interaction. In contrast to the target of a stand-alone and maximally optimized face recognition system, our approach is to develop an integrated robotic system as a step toward the ultimate end-to-end system capable of incremental individual recognition in a real human environment. Our main emphasis is to develop Mertz as a robotic creature with adequate overall robustness to be embedded in the dynamic and noisy human environment. Thus, we require the robot to operate for a few hours at a time and interact with a large number of passersby with minimal constraints at public locations. The robot then autonomously detects, tracks, and segments face images during these interactions and automatically generates a training set for its face recognition system. In this talk, we present the robot implementation and its unsupervised incremental face recognition framework. We describe an algorithm for clustering SIFT features extracted from a large set of face sequences automatically generated by the robot. We demonstrate the robot's capabilities and limitations in a series of experiments at a public lobby. In a final experiment, the robot interacted with a few hundred individuals in an eight day period and generated a training set of over a hundred thousand face images. We evaluate the clustering algorithm performance across a range of parameters on this automatically generated training data and also the Honda-UCSD video face database. Lastly, we present some recognition results using the self-labelled clusters.
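
As a rough sketch of the descriptor-clustering stage, the Python below extracts SIFT descriptors from face crops and clusters them with k-means; OpenCV's SIFT and scikit-learn's k-means stand in for the thesis's own automatically generated data and incremental clustering algorithm.

    import cv2
    import numpy as np
    from sklearn.cluster import KMeans

    sift = cv2.SIFT_create()

    def face_descriptors(face_images):
        """Pool SIFT descriptors from a list of grayscale face crops."""
        descs = []
        for img in face_images:
            _, d = sift.detectAndCompute(img, None)
            if d is not None:
                descs.append(d)
        return np.vstack(descs)

    def cluster_descriptors(descs, k=50):
        """Each cluster is a candidate recurring facial feature."""
        return KMeans(n_clusters=k, n_init=10).fit(descs)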

Saturday, February 03, 2007

News: ASIMO robot falls down stairs

Most of the time all you see are slick videos of the new bipedal robots flawlessly trotting about, but we like this one, where things go a little wrong... - Link.

The action starts 58 seconds in.

Originated from MAKE:

News: A robot for your digital camera?

Posted by Roland Piquepaille @ 9:43 am, February 2nd, 2007

According to the Pittsburgh Post-Gazette, researchers from Carnegie Mellon University and NASA Ames will release in March a $200 robot that will transform your digital camera into a powerful image-maker, almost without your help. Attached to almost any model of digital camera, the GigaPan robot platform takes continuous snapshots of a place or an event. The software provided by the research team then builds a panoramic image from all these snapshots. And these images are zoomable. This means you'll have the best of both worlds: big panoramas and startling details.

The GigaPan platform was developed at Carnegie Mellon University by Illah Nourbakhsh, an associate professor of robotics, with the help of the NASA Ames Intelligent Robot Group. It is part of the Global Connection Project.
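
Readers who want to try the stitching half of the idea by hand can get surprisingly far with off-the-shelf tools; the sketch below uses OpenCV's high-level Stitcher on a few hypothetical snapshot files, which is of course far short of GigaPan's gigapixel-scale pipeline.

    import cv2

    # Hypothetical filenames for snapshots taken by the panning platform.
    images = [cv2.imread(f"shot_{i:02d}.jpg") for i in range(3)]
    stitcher = cv2.Stitcher_create()            # default panorama mode
    status, panorama = stitcher.stitch(images)
    if status == 0:                             # Stitcher::OK
        cv2.imwrite("panorama.jpg", panorama)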

See the full article. Go to the GIGAPAN web site.

Friday, February 02, 2007

MIT talk: Do robots offer a quantum leap in studying whales?

Speaker: Roger Payne, Ocean Alliance, Lincoln, MA
Date: Tuesday, February 6 2007
Relevant URL: http://www.oceanalliance.org/wci

Abstract:

Payne will review his 39-year study of Patagonian right whales, one conclusion of which is that a key to conserving any species is learning to live with it, and, from that experience, building it into human consciousness. But learning to live with whales that migrate 8,000 miles each year requires overcoming the obstacle of keeping up with them, something that has failed in spite of numerous attempts. Storms are what usually cause those who tag and follow whales to lose them. However, robots show great promise in enabling boats to keep up with whales, something that may offer a chance for a major leap in our understanding of whales, and that may even someday allow future generations to guide whales into waters in which there is no whaling industry.

Thursday, February 01, 2007

Lab Meeting 1 Feb 2007 (Yu-Chun): Integrating the OCC Model of Emotions in Embodied Characters

Christoph Bartneck
Workshop on Virtual Conversational Characters: Applications, Methods, and Research Challenges, 2002
[Link]

Abstract:
The OCC (Ortony, Clore, & Collins, 1988) model has established itself as the standard model for emotion synthesis. A large number of studies have employed the OCC model to generate emotions for their embodied characters. Many developers of such characters believe that the OCC model will be all they ever need to equip their character with emotions. This paper points out what the OCC model is able to do for an embodied emotional character and what it does not. Missing features include a history function, a personality designer, and the interaction of the emotional categories.
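
The OCC model's appeal to character builders is that appraisal reduces to a few structured rules; the toy sketch below shows the shape of two of its branches (well-being and attribution). The thresholds and rule set are illustrative, and, as the paper notes, a real character would also need the missing pieces: a history function, personality parameters, and interaction between categories.

    def appraise_event(desirability):
        """Well-being branch: reaction to an event's consequences for self."""
        return "joy" if desirability > 0 else "distress"

    def appraise_action(praiseworthiness, self_is_agent):
        """Attribution branch: reaction to the agent responsible for an action."""
        if self_is_agent:
            return "pride" if praiseworthiness > 0 else "shame"
        return "admiration" if praiseworthiness > 0 else "reproach"

    print(appraise_event(0.7), appraise_action(-0.4, self_is_agent=False))
    # -> joy reproach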

MIT report: Online Active Learning in Practice

Title: Online Active Learning in Practice
Authors: Monteleoni, Claire and Kaariainen, Matti
Advisor: Tommi Jaakkola
Issue Date: 23-Jan-2007

Abstract: We compare the practical performance of several recently proposed algorithms for active learning in the online setting. We consider two algorithms (and their combined variants) that are strongly online, in that they do not store any previously labeled examples, and for which formal guarantees have recently been proven under various assumptions. We perform an empirical evaluation on optical character recognition (OCR) data, an application that we argue to be appropriately served by online active learning. We compare the performance between the algorithm variants and show significant reductions in label-complexity over random sampling.
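
The "strongly online" pattern the paper evaluates can be sketched in a few lines: keep only the current hypothesis, query a label only when an example lands inside the current uncertainty margin, and tighten the margin as confidence grows. The update rule and constants below are simplified stand-ins, not the analyzed algorithms.

    import numpy as np

    def online_active_learn(stream, b=0.5, shrink=0.9):
        w = None
        labels = 0
        for x, get_label in stream:
            x = x / np.linalg.norm(x)
            if w is None or abs(w @ x) <= b:   # uncertain region: query the oracle
                y = get_label()
                labels += 1
                w = y * x if w is None else w + y * x   # perceptron-style update
            else:
                b *= shrink                    # confident round: tighten the margin
        return w, labels

    # Toy usage on a separable stream; the oracle is a hidden hyperplane.
    rng = np.random.default_rng(0)
    w_true = np.array([1.0, -1.0])
    stream = [(x, (lambda x=x: float(np.sign(w_true @ x))))
              for x in rng.normal(size=(200, 2))]
    w, labels = online_active_learn(stream)
    print(labels, "labels queried out of 200")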

URI: http://hdl.handle.net/1721.1/35784

CMU RI Seminar: Socially Guided Machine Learning

Andrea Thomaz
Post-Doctoral Associate
MIT

Abstract
There is a surge of interest in having robots leave the labs and factory floors to help solve critical issues facing our society, ranging from eldercare to education. A critical issue is that we will not be able to preprogram these robots with every skill they will need to play a useful role in society; robots will need the ability to interact and learn new things 'on the job' from everyday people. This talk introduces a paradigm, Socially Guided Machine Learning, that reframes the Machine Learning problem as a human-machine interaction, asking: How can systems be designed to take better advantage of learning from a human partner and the ways that everyday people approach the task of teaching?

In this talk I describe two novel social learning systems, on robotic and computer game platforms. Results from these systems show that designing agents to better fit human expectations of a social learning partner both improves the interaction for the human and significantly improves the way machines learn.

Sophie is a virtual robot that learns from human players in a video game via interactive Reinforcement Learning. A series of experiments with this platform uncovered and explored three principles of Social Machine Learning: guidance, transparency, and asymmetry. For example, everyday people were able to use an attention direction signal to significantly improve learning on many dimensions: a 50% decrease in actions needed to learn a task, and a 40% decrease in task failures during training.
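
The interaction pattern is easy to sketch: ordinary tabular Q-learning, except the scalar reward is delivered by a human coach and an optional guidance signal restricts action selection to the object the teacher points at. The state/action encoding below is illustrative, not Sophie's actual one.

    import random
    from collections import defaultdict

    Q = defaultdict(float)
    ALPHA, GAMMA, EPS = 0.3, 0.9, 0.1

    def choose(state, actions, guided_object=None):
        """Actions are (verb, object) pairs, e.g. ("pick-up", "kettle")."""
        if guided_object is not None:            # teacher-directed attention
            guided = [a for a in actions if a[1] == guided_object]
            actions = guided or actions
        if random.random() < EPS:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])

    def update(state, action, human_reward, next_state, next_actions):
        """Standard Q-update; human_reward is the coach's scalar in [-1, 1]."""
        best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
        Q[(state, action)] += ALPHA * (human_reward + GAMMA * best_next
                                       - Q[(state, action)])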

On the Leonardo social robot, I describe my work enabling Leo to participate in social learning interactions with a human partner. Examples include learning new tasks in a tutelage paradigm, learning via guided exploration, and learning object appraisals through social referencing. An experiment with human subjects shows that Leo's social mechanisms significantly reduced teaching time by aiding in error detection and correction.