This blog is maintained by the Robot Perception and Learning lab at CSIE, NTU, Taiwan. Our scientific interests are driven by the desire to build intelligent robots and computers that are capable of serving people more efficiently than equivalent manned systems in a wide variety of dynamic and unstructured environments.
Tuesday, January 22, 2008
CMU RI seminar: Robust Sensor Placements, Active Learning and Submodular Functions
Speaker: Carlos Guestrin, Carnegie Mellon University
Date: Jan 25, 2008
In this talk, we tackle a fundamental problem that arises when using sensors to monitor the ecological condition of rivers and lakes, the network of pipes that bring water to our taps, or the activities of an elderly individual sitting in a chair: Where should we place the sensors in order to make effective and robust predictions?
Optimizing the informativeness of the observations collected by the sensors is an NP-hard problem, even in the simplest settings. We will first identify a fundamental property of sensing tasks, submodularity, an intuitive diminishing-returns property. By exploiting submodularity, we develop effective approximation algorithms for the placement problem that have strong theoretical guarantees on the quality of the solution. These algorithms address settings where, in addition to sensing, nodes must maintain effective wireless connectivity, the data may be collected by mobile robots, or we seek solutions that are robust to adversaries.
We demonstrate our approach in several real-world settings, including data from real deployments, from an activity-recognition chair we built, from stories propagating through blogs, and from a sensor placement competition.
This talk is primarily based on joint work with Andreas Krause.
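To make the diminishing-returns idea concrete, here is a minimal sketch (our own illustration, not material from the talk) of the classic greedy algorithm whose near-optimality for monotone submodular objectives underlies guarantees of this kind; the coverage objective and location names below are toy placeholders:

```python
# Greedy maximization of a monotone submodular set function. For such
# functions the greedy placement is within a factor (1 - 1/e) of optimal
# (Nemhauser et al., 1978).
def greedy_placement(candidates, utility, k):
    chosen = set()
    for _ in range(k):
        # Add the candidate with the largest marginal gain.
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: utility(chosen | {c}) - utility(chosen))
        chosen.add(best)
    return chosen

# Toy coverage objective: each candidate location "covers" some river
# segments; coverage functions like this are monotone submodular.
coverage = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}, "D": {1, 6}}

def covered(placement):
    return len(set().union(*(coverage[p] for p in placement))) if placement else 0

print(greedy_placement(coverage, covered, k=2))  # e.g. {'A', 'C'}
```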
Speaker Biography: Carlos Guestrin's current research spans the areas of planning, reasoning and learning in uncertain dynamic environments, focusing on applications in sensor networks. He is an assistant professor in the Machine Learning and Computer Science Departments at Carnegie Mellon University. Previously, he was a senior researcher at the Intel Research Lab in Berkeley. Carlos received his MSc and PhD in Computer Science from Stanford University in 2000 and 2003, respectively, and a Mechatronics Engineer degree from the Polytechnic School of the University of Sao Paulo, Brazil, in 1998. Carlos Guestrin's work has received awards at a number of conferences and a journal: KDD 2007, IPSN 2005 and 2006, VLDB 2004, NIPS 2003 and 2007, UAI 2005, ICML 2005, and JAIR in 2007. He is also a recipient of the NSF CAREER Award, the Alfred P. Sloan Fellowship, the IBM Faculty Fellowship, the Siebel Scholarship, and the Stanford Centennial Teaching Assistant Award.
[Lab meeting] Jan. 22nd, 2008 (Kuo-Hwei Lin): Mathematical Model Derivation of SLAMMOT
[Lab meeting] Jan. 22nd, 2008 (Andi): Extrinsic Self-Calibration of a Camera and a 3D Laser Range Finder from Natural Scenes
Autonomous Systems Laboratory (ASL), Swiss Federal Institute of Technology Zurich (ETH), Switzerland
From: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2007
Abstract: In this paper, we describe a new approach for the extrinsic calibration of a camera with a 3D laser range finder that can be done on the fly. This approach does not require any calibration object. Only a few point correspondences are used, which are manually selected by the user from a scene viewed by the two sensors. The proposed method relies on a novel technique to visualize the range information obtained from a 3D laser scanner. This technique converts the visually ambiguous 3D range information into a 2D map in which natural features of a scene are highlighted. We show that, with the features enhanced, the user can easily find the points that correspond to points in the camera image, so visually identifying laser-camera correspondences becomes as easy as image pairing. Once the point correspondences are given, extrinsic calibration is performed using the well-known PnP algorithm followed by a nonlinear refinement process. We show the performance of our approach through experimental results; in these experiments, we use an omnidirectional camera. An important implication of this method is that it brings 3D computer vision systems out of the laboratory and into practical use.
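As a side note, the final step the abstract describes (PnP on a handful of hand-picked correspondences, followed by nonlinear refinement of the reprojection error) can be sketched with standard tools. Everything below is our own illustration, not the authors' code: the correspondences are synthesized rather than hand-clicked, and the camera intrinsics are placeholders.

```python
import cv2
import numpy as np
from scipy.optimize import least_squares

K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])  # placeholder intrinsics
dist = np.zeros(5)  # assume an undistorted image

# Stand-in for the manually selected laser/image correspondences.
rng = np.random.default_rng(0)
pts_3d = rng.uniform(-1, 1, (8, 3)) + [0, 0, 5]          # laser points (meters)
true_r, true_t = np.array([0.05, -0.02, 0.01]), np.array([0.1, 0.0, 0.2])
pts_2d = cv2.projectPoints(pts_3d, true_r, true_t, K, dist)[0].reshape(-1, 2)

# Step 1: initial extrinsics from the PnP algorithm.
ok, rvec, tvec = cv2.solvePnP(pts_3d, pts_2d, K, dist)

# Step 2: nonlinear refinement, minimizing the total reprojection error.
def residuals(pose):
    proj = cv2.projectPoints(pts_3d, pose[:3], pose[3:], K, dist)[0]
    return (proj.reshape(-1, 2) - pts_2d).ravel()

pose = least_squares(residuals, np.concatenate([rvec.ravel(), tvec.ravel()])).x
print(pose)  # ~ [0.05, -0.02, 0.01, 0.1, 0.0, 0.2]
```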
Monday, January 21, 2008
[Lab meeting] Jan. 22nd, 2008 (Stanley): Apprenticeship Learning via Inverse Reinforcement Learning
From: Proceedings of the 21st International Conference on Machine Learning, Banff, Canada, 2004.
Link
Abstract: We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we want to learn to perform. This setting is useful in applications (such as the task of driving) where it may be difficult to write down an explicit reward function specifying exactly how different desiderata should be traded off. We think of the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and give an algorithm for learning the task demonstrated by the expert. Our algorithm is based on using "inverse reinforcement learning" to try to recover the unknown reward function. We show that our algorithm terminates in a small number of iterations, and that even though we may never recover the expert's reward function, the policy output by the algorithm will attain performance close to that of the expert, where here performance is measured with respect to the expert's unknown reward function.
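For readers who want the shape of the algorithm, here is a sketch of its projection variant. The two callbacks, solve_mdp and feature_expectations, are hypothetical placeholders for an MDP planner and a policy evaluator you would supply; the loop itself follows the paper's feature-expectation-matching recipe.

```python
import numpy as np

def apprenticeship_learning(mu_expert, solve_mdp, feature_expectations,
                            eps=1e-3, max_iters=100):
    # mu_expert: the expert's empirical feature expectations, shape (d,).
    # solve_mdp(w): returns an optimal policy for the reward R(s) = w . phi(s).
    # feature_expectations(pi): returns that policy's feature expectations.
    pi = solve_mdp(np.zeros_like(mu_expert))      # arbitrary initial policy
    mu_bar = feature_expectations(pi)
    for _ in range(max_iters):
        w = mu_expert - mu_bar                    # direction toward the expert
        if np.linalg.norm(w) <= eps:              # expert's performance matched
            break
        pi = solve_mdp(w)                         # best response to reward w . phi
        mu = feature_expectations(pi)
        # Projection step: move mu_bar along the segment toward mu, to the
        # point closest to mu_expert.
        d = mu - mu_bar
        mu_bar = mu_bar + (d @ (mu_expert - mu_bar)) / (d @ d) * d
    return pi, w
```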
Saturday, January 12, 2008
CMU Intelligence Seminar: Steps towards human-level AI
Tuesday 1/15, 3:30pm in Wean 5409
Speaker: Kenneth D. Forbus, Northwestern University
Title: Steps towards human-level AI
Faculty Host: Scott Fahlman
Appointments: Barbara Grandillo <bag+@cs.cmu.edu>
Abstract
A confluence of three factors is changing the kinds of AI experiments that can be done: (1) increasing computational power, (2) off-the-shelf representational resources, and (3) steady scientific progress, both in AI and in other areas of Cognitive Science. Consequently, I believe it is time for the field to spend more of its energy experimenting with larger-scale systems, and attempting to capture larger constellations of human cognitive abilities. This talk will summarize experiments with two larger-scale systems we have built at Northwestern: (1) Learning to solve AP Physics problems, in the Companions cognitive architecture. In an evaluation conducted by the Educational Testing Service, a Companion showed it was able to transfer knowledge across multiple types of variant problems. (2) Learning by reading, using the Learning Reader prototype. Learning Reader includes a novel process, rumination, where the system improves its learning by asking itself questions about material it has read.
Speaker Bio
Kenneth D. Forbus is the Walter P. Murphy Professor of Computer Science and Professor of Education at Northwestern University. His research interests include qualitative reasoning, analogy and similarity, sketch understanding, spatial reasoning, cognitive simulation, reasoning system design, articulate educational software, and the use of AI in computer gaming. He received his degrees from MIT (Ph.D. in 1984). He is a Fellow of the American Association for Artificial Intelligence, the Cognitive Science Society, and the Association for Computing Machinery. He serves on the editorial boards of Cognitive Science, the AAAI Press, and on the Advisory Board of the Journal of Game Development.
[VASC Seminar] Fast IKSVM and other Generalizations of Linear SVMs
Speaker: Alexander C. Berg, Yahoo! Research
Date: Monday, Jan 14
Abstract:
We show that one can build histogram intersection kernel SVMs (IKSVMs) with runtime complexity of the classifier logarithmic in the number of support vectors, as opposed to linear for the standard approach. We further show that by pre-computing auxiliary tables we can construct an approximate classifier with constant runtime and space requirements, independent of the number of support vectors, with negligible loss in classification accuracy on various tasks.
This result is based on noticing that the IKSVM decision function is a sum of piecewise-linear functions of each coordinate. We generalize this notion and show that the resulting classifiers can be learned efficiently. The practical results are classifiers strictly more general than linear SVMs that in practice provide better classification performance for a range of tasks, all at reasonable computational cost.
We also introduce novel features based on multi-level histograms of oriented edge energy and present experiments on various detection datasets. On the INRIA pedestrian dataset, an approximate IKSVM classifier based on these features has a miss rate 13% lower at 10^-6 false positives per window than the linear SVM detector of Dalal and Triggs, while being only twice as slow for classification. On the DaimlerChrysler pedestrian dataset, IKSVM gives comparable accuracy to the best results (based on quadratic SVMs) while being 15x faster. In these experiments our approximate IKSVM is up to 2000x faster than a standard implementation and requires 200x less memory. Finally, we show that a 50x speed-up is possible using approximate IKSVM based on spatial pyramid features on the Caltech 101 dataset with negligible loss of accuracy.
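The log-time evaluation hinges on that per-coordinate decomposition: for the intersection kernel, f(x) = sum_i h_i(x_i) + b, where each h_i is piecewise linear in the support vectors' i-th coordinates. Below is a NumPy sketch of the idea (our reconstruction, not the authors' released code; alpha stands for the dual coefficients already multiplied by the labels). The constant-time variant in the abstract goes one step further, replacing the binary search with a pre-tabulated approximation of each h_i.

```python
import numpy as np

def build_tables(sv, alpha):
    # Per coordinate i, sort the support-vector values and precompute
    # prefix sums so that h_i(s) = sum_{v<=s} a*v + s * sum_{v>s} a
    # can be read off after a binary search.
    order = np.argsort(sv, axis=0)
    sv_sorted = np.take_along_axis(sv, order, axis=0)
    a_sorted = alpha[order]
    zeros = np.zeros((1, sv.shape[1]))
    csum_av = np.vstack([zeros, np.cumsum(a_sorted * sv_sorted, axis=0)])
    csum_a = np.vstack([zeros, np.cumsum(a_sorted, axis=0)])
    return sv_sorted, csum_av, csum_a

def decision(x, tables, b=0.0):
    sv_sorted, csum_av, csum_a = tables
    m = sv_sorted.shape[0]
    f = b
    for i, s in enumerate(x):  # O(d log m) instead of O(d m)
        r = np.searchsorted(sv_sorted[:, i], s, side="right")
        f += csum_av[r, i] + s * (csum_a[m, i] - csum_a[r, i])
    return f

# Sanity check against the exact O(m d) kernel evaluation.
rng = np.random.default_rng(0)
sv, alpha, x = rng.random((50, 8)), rng.standard_normal(50), rng.random(8)
exact = alpha @ np.minimum(x, sv).sum(axis=1)
assert np.isclose(decision(x, build_tables(sv, alpha)), exact)
```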
Related Papers:
Histogram intersection kernel for image classification
Generalized Histogram Intersection Kernel for Image Recognition
Biography:
Alex Berg's research concerns computational visual recognition. He is a research scientist at Yahoo! Research and a visiting scholar at U.C. Berkeley. He has worked on general object recognition in images, action recognition in video, human pose identification in images, image parsing, face recognition, image search, and machine learning for computer vision. His Ph.D. at U.C. Berkeley developed a novel approach to deformable template matching. He earned a BA and MA in Mathematics from Johns Hopkins University.
Wednesday, January 09, 2008
CMU ML/Google Seminar: RECENT DEVELOPMENTS IN MARKOV LOGIC
Speaker: Pedro Domingos, Associate Professor, University of Washington
Title: RECENT DEVELOPMENTS IN MARKOV LOGIC
Abstract:
Intelligent agents must be able to handle the complexity and uncertainty of the real world. Logical AI has focused mainly on the former, and statistical AI on the latter. Markov logic combines the two by attaching weights to first-order formulas and viewing them as templates for features of Markov networks. Learning and inference algorithms for Markov logic are available in the open-source Alchemy system. In this talk I will discuss some of the latest developments in Markov logic, including lifted first-order probabilistic inference, relational decision theory, statistical predicate invention, efficient second-order algorithms for weight learning, extending the representation to continuous features, and transferring learned knowledge across domains. I will give an overview of recent and ongoing applications in natural language processing, robot mapping, social network analysis, and computational biology, and conclude with a discussion of open problems and exciting research directions. (Joint work with Jesse Davis, Stanley Kok, Daniel Lowd, Hoifung Poon, Aniruddh Nath, Matt Richardson, Parag Singla, Marc Sumner, and Jue Wang.)
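As a toy illustration of the template idea (our own, not Alchemy's API), consider the classic smokers rule. Grounding a weighted formula over a finite domain yields one binary feature per grounding, and a world's unnormalized probability is the exponentiated weighted count of satisfied groundings:

```python
import itertools, math

people = ["Anna", "Bob"]
w = 1.5  # weight of the rule: Smokes(x) ^ Friends(x, y) => Smokes(y)

def n_true_groundings(smokes, friends):
    # Count groundings of the rule that a given world satisfies.
    return sum((not (smokes[x] and friends[x, y])) or smokes[y]
               for x, y in itertools.product(people, repeat=2))

def unnormalized_prob(smokes, friends):
    # In a Markov logic network, P(world) is proportional to
    # exp(sum_j w_j * n_j(world)) over all weighted formulas j.
    return math.exp(w * n_true_groundings(smokes, friends))

smokes = {"Anna": True, "Bob": False}
friends = {pair: True for pair in itertools.product(people, repeat=2)}
print(unnormalized_prob(smokes, friends))  # one violated grounding -> exp(3w)
```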
Speaker Bio: I received an undergraduate degree (1988) and M.S. in Electrical Engineering and Computer Science (1992) from IST, in Lisbon. I received an M.S. (1994) and Ph.D. (1997) in Information and Computer Science from the University of California at Irvine. I spent two years as an assistant professor at IST, before joining the faculty of the University of Washington in 1999. I'm the author or co-author of over 100 technical publications in machine learning, data mining, and other areas. I'm a member of the editorial board of the Machine Learning journal and the advisory board of JAIR, and a co-founder of the International Machine Learning Society. I was program co-chair of KDD-2003, and I've served on the program committees of AAAI, ICML, IJCAI, KDD, SIGMOD, WWW, and others. I've received a Sloan Fellowship, an NSF CAREER Award, a Fulbright Scholarship, an IBM Faculty Award, two KDD best paper awards, and other distinctions.
CMU talk: The Maximum Entropy Principle
Miroslav Dudik, Postdoctoral Researcher, MLD, CMU
The link
Abstract
The maximum entropy principle (maxent) has been applied to solve density estimation problems in physics (since 1871), statistics and information theory (since 1957), as well as machine learning (since 1993). According to this principle, we should represent available information as constraints and among all the distributions satisfying the constraints choose the one of maximum entropy. In this overview I will contrast various motivations of maxent with the main focus on applications in statistical inference. I will discuss the equivalence between robust Bayes, maximum entropy, and regularized maximum likelihood estimation, and the implications for principled statistical inference. Finally, I will describe how maxent has been applied to model natural languages and geographic distributions of species.
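A small numerical illustration of the equivalence mentioned in the abstract (our own toy, using Jaynes' famous die): the maxent distribution under moment constraints is a Gibbs distribution p(x) proportional to exp(lambda . f(x)), and finding lambda by ascending the dual is exactly maximum likelihood in that exponential family.

```python
import numpy as np

xs = np.arange(1, 7)                 # faces of a die
f = xs[None, :].astype(float)        # one feature per row: here, the face value
b = np.array([4.5])                  # constraint: E[face] = 4.5

lam = np.zeros(1)
for _ in range(5000):                # gradient ascent on the dual lam.b - log Z
    p = np.exp(lam @ f)
    p /= p.sum()                     # Gibbs distribution p(x) ~ exp(lam . f(x))
    lam += 0.1 * (b - f @ p)         # dual gradient is b - E_p[f]
print(np.round(p, 4), f @ p)         # skewed toward 6; mean converges to 4.5
```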
Monday, January 07, 2008
Lab Meeting January 8th, 2008 (Yu-Hsiang): CRF-Matching: Conditional Random Fields for Feature-Based Scan Matching
Fabio Ramos, Dieter Fox, Hugh Durrant-Whyte
Abstract:
Matching laser range scans observed at different points in time is a crucial component of many robotics tasks, including mobile robot localization and mapping. While existing techniques such as the Iterative Closest Point (ICP) algorithm perform well under many circumstances, they often fail when the initial estimate of the offset between scans is highly uncertain. This paper presents a novel approach to 2D laser scan matching. CRF-Matching generates a Conditional Random Field (CRF) to reason about the joint association between the measurements of the two scans. The approach is able to consider arbitrary shape and appearance features in order to match laser scans. The model parameters are learned from labeled training data. Inference is performed efficiently using loopy belief propagation. Experiments using data collected by a car navigating through urban environments show that CRF-Matching is able to reliably and efficiently match laser scans even when no a priori knowledge about their offset is given. They additionally demonstrate that our approach can seamlessly integrate camera information, thereby further improving performance.
link
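For contrast with the learned matcher above, a minimal point-to-point ICP for 2D scans looks like the sketch below (our own; cKDTree is used only for nearest neighbors). Its hard nearest-neighbor data association is exactly what makes it sensitive to a bad initial offset, the failure mode CRF-Matching targets.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_2d(src, dst, iters=30):
    # Align scan `src` (N,2) to `dst` (M,2); returns rotation R and translation t.
    R, t = np.eye(2), np.zeros(2)
    tree = cKDTree(dst)
    cur = src.copy()
    for _ in range(iters):
        matched = dst[tree.query(cur)[1]]        # hard nearest-neighbor association
        mc, md = cur.mean(axis=0), matched.mean(axis=0)
        U, _, Vt = np.linalg.svd((cur - mc).T @ (matched - md))
        Ri = Vt.T @ U.T                          # closed-form rigid alignment (Kabsch)
        if np.linalg.det(Ri) < 0:                # guard against reflections
            Vt[-1] *= -1
            Ri = Vt.T @ U.T
        ti = md - Ri @ mc
        cur = cur @ Ri.T + ti
        R, t = Ri @ R, Ri @ t + ti               # accumulate the incremental motion
    return R, t
```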
New Laboratory Robot Can Lift The Burden Of Boring Work
...
LISA is equipped with a sensing gripper arm designed to hold plastic dishes but not injure human beings. Its “artificial skin” consists of conductive foam and textiles combined with intelligent signal-processing electronics. This skin immediately senses and cushions inadvertent jostling. A thermographic camera additionally registers body heat and indicates, for instance, if a human colleague's hand is in the way.
...
Link
Lab Meeting January 8th, 2008 (Yu-chun): Socially Distributed Perception: GRACE plays Social Tag at AAAI 2005
Autonomous Robots, 2007
This paper presents a robot search task (social tag) that uses social interaction, in the form of asking for help, as an integral component of task completion. Socially distributed perception is defined as a robot's ability to augment its limited sensory capacities through social interaction. We describe the task of social tag and its implementation on the robot GRACE for the AAAI 2005 Mobile Robot Competition & Exhibition. We then discuss our observations and analyses of GRACE's performance as a situated interaction with conference participants. Our results suggest we were successful in promoting a form of social interaction that allowed people to help the robot achieve its goal. Furthermore, we found that different social uses of the physical space had an effect on the nature of the interaction. Finally, we discuss the implications of this design approach for effective and compelling human-robot interaction, considering its relationship to concepts such as dependency, mixed initiative, and socially distributed cognition.
link
Tuesday, January 01, 2008
[CMU RI Thesis] On the Multi-View Fitting and Construction of Dense Deformable Face Models
Author: K. Ramnath
Abstract:
Active Appearance Models (AAMs) are generative, parametric models that have been successfully used in the past to model deformable objects such as human faces. Fitting an AAM to an image consists of minimizing the error between the input image and the closest model instance; i.e. solving a nonlinear optimization problem. In this thesis we study three important topics related to deformable face models such as AAMs: (1) multi-view 3D face model fitting, (2) multi-view 3D face model construction, and (3) automatic dense deformable face model construction.
The original AAMs formulation was 2D, but they have recently been extended to include a 3D shape model. A variety of single-view algorithms exist for fitting and constructing 3D AAMs but one area that has not been studied is multi-view algorithms. In the first part of this thesis we describe an algorithm for fitting a single AAM to multiple images, captured simultaneously by cameras with arbitrary locations, rotations, and response functions. This algorithm uses the scaled orthographic imaging model used by previous authors, and in the process of fitting computes, or calibrates, the scaled orthographic camera matrices. We also describe an extension of this algorithm to calibrate weak perspective (or full perspective) camera models for each of the cameras. In essence, we use the human face as a (nonrigid) calibration grid. We demonstrate that the performance of this algorithm is roughly comparable to a standard algorithm using a calibration grid. We then show how camera calibration improves the performance of AAM fitting.
A variety of non-rigid structure-from-motion algorithms, both single-view and multi-view, have been proposed that can be used to construct the corresponding 3D non-rigid shape models of a 2D AAM. In the second part of this thesis we show that constructing a 3D face model using non-rigid structure-from-motion suffers from the Bas-Relief ambiguity and may result in a “scaled” (stretched/compressed) model. We outline a robust non-rigid motion-stereo algorithm for calibrated multi-view 3D AAM construction and show how using calibrated multi-view motion-stereo can eliminate the Bas-Relief ambiguity and yield face models with higher 3D fidelity.
An important step in computing dense deformable face models such as 3D Morphable Models (3DMMs) is to register the input texture maps using optical flow. However, optical flow algorithms perform poorly on images of faces because of the appearance and disappearance of structure such as teeth and wrinkles, and because of the non-Lambertian, textureless cheek regions. In the final part of this thesis we propose a different approach to building dense face models. Our algorithm iteratively builds a face model, fits the model to the input image data, and then refines the model. The refinement consists of three steps: (1) the addition of more mesh points to increase the density, (2) image-consistent re-triangulation of the mesh, and (3) refinement of the shape modes. Using a carefully collected dataset containing hidden marker ground-truth, we show that our algorithm generates dense models that are quantitatively better than those obtained using off-the-shelf optical flow algorithms. We also show how our algorithm can be used to construct dense deformable models automatically, starting with a rigid planar model of the face that is subsequently refined to model the non-planarity and the non-rigid components.
The full text can be found here.
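A note on the fitting formulation at the heart of the thesis: minimizing the error between the input image and the closest model instance is a nonlinear least-squares problem over warp and appearance parameters. The 1D toy below is entirely synthetic; a sub-pixel shift stands in for the piecewise-affine warp, and real AAM fitters use specialized Gauss-Newton schemes such as the inverse-compositional algorithm. It only shows the structure of the objective.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy "AAM": mean appearance A0, one appearance mode A, and a 1-parameter
# warp (a sub-pixel shift). Fitting finds the shift p and mode weight lam
# minimizing || I(W(x; p)) - (A0 + A @ lam) ||^2.
x = np.linspace(0, 2 * np.pi, 200)
A0, A = np.sin(x), np.cos(x)[:, None]
image = np.sin(x - 0.3) + 0.2 * np.cos(x - 0.3)   # synthetic input instance

def residuals(params):
    p, lam = params[0], params[1:]
    warped = np.interp(x, x - p, image)           # sample the image under the warp
    return warped - (A0 + A @ lam)

fit = least_squares(residuals, x0=np.zeros(2))
print(fit.x)  # approximately [0.3, 0.2]: the true shift and mode weight
```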