Saturday, November 21, 2009

CMU talk: Imitation Learning and Purposeful Prediction

Machine Learning Lunch (http://www.cs.cmu.edu/~learning/)
Speaker: Prof. Drew Bagnell
Date: Monday, November 23, 2009

Imitation Learning and Purposeful Prediction

Programming robots is hard. While demonstrating a desired behavior may be easy, designing a system that behaves this way is often difficult, time consuming, and ultimately expensive. Machine learning promises to enable "programming by demonstration" for developing high-performance robotic systems. Unfortunately, many approaches that utilize the classical tools of supervised learning fail to meet the needs of imitation learning. Perhaps foremost, classical statistics and supervised machine learning exist in a vacuum: predictions made by these algorithms are explicitly assumed to not affect the world in which they operate. I'll discuss the problems that result from ignoring the effect of actions influencing the world, and I'll highlight simple "reduction-based" approaches that, both in theory and in practice, mitigate these problems.

Additionally, robotic systems are often built atop sophisticated planning algorithms that efficiently reason far into the future; consequently, ignoring these planning algorithms in lieu of a supervised learning approach often leads to poor and myopic performance. While planners have demonstrated dramatic success in applications ranging from legged locomotion to outdoor unstructured navigation, such algorithms rely on fully specified cost functions that map sensor readings and environment models to a scalar cost. Such cost functions are usually manually designed and programmed. Recently, our group has developed a set of techniques that learn these functions from human demonstration by applying an /Inverse Optimal Control/ (IOC) approach to find a cost function for which planned behavior mimics an expert's demonstration. These approaches shed new light on the intimate connections between probabilistic inference and optimal control. I'll consider case studies in activity forecasting of drivers and pedestrians as well as the imitation learning of robotic locomotion and rough-terrain navigation. These case-studies highlight key challenges in applying the algorithms in practical settings.

No comments: