Thursday, October 26, 2006

[CMU Intelligence Seminar] Improving Systems Management Policies Using Hybrid Reinforcement Learning

More information about the Intelligence Seminar: [link]

Topic: Improving Systems Management Policies Using Hybrid Reinforcement Learning

Speaker: Gerry Tesauro (IBM Watson Research)

Abstract:
Reinforcement Learning (RL) provides a promising new approach to systems
performance management that differs radically from standard
queuing-theoretic approaches making use of explicit system performance
models. In principle, RL can automatically learn high-quality management
policies without explicit performance models or traffic models, and with
little or no built-in system-specific knowledge. Previously we showed
that online RL can learn to make high-quality server allocation
decisions in a multi-application prototype Data Center scenario. The
present work shows how to combine the strengths of both RL and queuing
models in a hybrid approach, in which RL trains offline on data
collected while a queuing model policy controls the system. By training
offline we avoid suffering potentially poor performance in live online
training. Our latest results show that, in both open-loop and
closed-loop traffic, hybrid RL training can achieve significant
performance improvements over a variety of initial model-based policies.
We also give several interesting insights as to how RL, as expected, can
deal effectively with both transients and switching delays, which lie
outside the scope of traditional steady-state queuing theory.
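
To make the hybrid idea concrete, here is a minimal Python sketch of the
general pattern the abstract describes: log transitions while a
model-based policy controls the system, then train RL offline on that
log. It is not the speaker's implementation. The toy queue dynamics, the
reward, the occasional random actions during logging, the tabular
Q-learning update, and every name (model_policy, collect_log,
fit_q_offline, N_QUEUE, N_SERVERS) are assumptions made purely for
illustration.

```python
import random

N_QUEUE = 6      # queue-length states 0..5 (hypothetical toy discretization)
N_SERVERS = 3    # actions: allocate 0, 1 or 2 servers (hypothetical)


def model_policy(queue_len):
    """Stand-in for a queuing-model policy: more backlog, more servers."""
    return min(N_SERVERS - 1, queue_len // 2)


def step(queue_len, servers, rng):
    """Toy dynamics (assumed): random arrivals, each server clears one job."""
    arrivals = rng.choice([0, 1, 2])
    next_len = max(0, min(N_QUEUE - 1, queue_len + arrivals - servers))
    reward = -next_len - 0.5 * servers  # backlog penalty plus server cost
    return reward, next_len


def collect_log(steps, seed=0, explore=0.2):
    """Let the model policy run the system and record (s, a, r, s') tuples.

    A small fraction of random actions (an assumption of this sketch) gives
    the log some coverage of actions the model policy would not choose.
    """
    rng = random.Random(seed)
    log, s = [], 0
    for _ in range(steps):
        if rng.random() < explore:
            a = rng.randrange(N_SERVERS)
        else:
            a = model_policy(s)
        r, s_next = step(s, a, rng)
        log.append((s, a, r, s_next))
        s = s_next
    return log


def fit_q_offline(log, n_states, n_actions, gamma=0.5, alpha=0.5, sweeps=200):
    """Batch Q-learning: repeated sweeps over a fixed log, no live interaction."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        for s, a, r, s_next in log:
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state (first one wins ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    log = collect_log(5000)
    q = fit_q_offline(log, N_QUEUE, N_SERVERS)
    print("model policy :", [model_policy(s) for s in range(N_QUEUE)])
    print("learned policy:", greedy_policy(q))
```

Because all learning happens on the recorded log, the running system is
only ever exposed to the model-based policy (plus whatever exploration
the operator chooses to allow); the learned policy can be inspected
before it is deployed.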



Speaker Bio:
Gerry Tesauro received a PhD in theoretical physics from Princeton
University in 1986, and owes his subsequent conversion to machine
learning research in no small part to the first Connectionist Models
Summer School, held at Carnegie Mellon in 1986. Since then he has worked
on a variety of ML applications, including computer virus recognition,
intelligent e-commerce agents, and most notoriously, TD-Gammon, a
self-teaching program that learned to play backgammon at human world
championship level. He has also been heavily involved for many years in
the annual NIPS conference, and was NIPS Program Chair in 1993 and
General Chair in 1994. He is currently interested in applying the latest
and greatest ML approaches to a huge emerging application domain of
self-managing computing systems, where he foresees great opportunities
for improvements over current state-of-the-art approaches.
