Forum for AI: Michael Littman

FAI has started up again for the spring. This spring we’re folding the AI-Lab’s invited speaker series into FAI, so we’ll have lots of good speakers from other universities early in the semester. Hopefully Matt and I will get some local speakers to close out the series in April/May.

Anyway, I totally forgot to blog the semester’s first FAI talk, by Michael Littman of Rutgers University. The talk was in the ACES Auditorium. Here’s the title and abstract:

Reinforcement Learning for Autonomous Diagnosis and Repair

This talk describes ongoing work on an application of reinforcement learning in the context of autonomous diagnosis and repair. I will present a new formal model, cost-sensitive fault remediation (CSFR), which is a simplified partially observable environment model. CSFR is powerful enough to capture some real-world problems — we’ve looked at network, disk-system, and web-server maintenance. However, it also admits simplified algorithms for planning, learning, and exploration, which I’ll discuss.

It was a very cool talk. The agent he used as an example was a network connection repair agent, equipped with some corrective actions (e.g. renewing the DHCP lease) as well as some diagnostic actions. Every action has an associated cost, which may differ depending on whether the action succeeds. They then use a case-based reinforcement learning method to learn a policy for repairing a network failure at the least cost. (In this case, cost is time.)

Having diagnostic actions, that is, actions that are taken only to gain information, is one of my favorite ideas. It is by no means new, but it strikes me in many ways as the right way to do perception: the agent acquires information as it needs it and integrates it into its internal state representation. In this case, the state representation was simply the history of the current repair episode (i.e. the actions taken and their results). Episodes were compared for compatibility with cases stored in memory, producing a probabilistic belief distribution over the compatible stored cases. This distribution was used to choose the action that minimizes the expected cost of the current episode.
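To make the last step concrete for myself, here is a minimal Python sketch of how I understand the idea, not Littman's actual implementation. Everything in it is an assumption for illustration: the action names, the costs (in seconds), the three stored cases, the uniform weighting over compatible cases, and the simplification that every stored case records an outcome for every action. The episode history filters the case memory, and a small recursive planner picks the action with the lowest expected cost to reach a successful repair.

```python
from functools import lru_cache

# Hypothetical actions: name -> (is_repair, cost_if_success, cost_if_failure).
# Costs are made-up times in seconds.
ACTIONS = {
    "ping_gateway":  (False, 1.0, 5.0),    # diagnostic: only yields information
    "renew_dhcp":    (True, 10.0, 15.0),   # repair: success ends the episode
    "reset_adapter": (True, 30.0, 30.0),   # repair: success ends the episode
}

# Hypothetical case memory: the outcome of every action in three past failures.
# Assumption: each case has at least one repair that works, so planning terminates.
CASES = (
    {"ping_gateway": True,  "renew_dhcp": False, "reset_adapter": True},
    {"ping_gateway": False, "renew_dhcp": True,  "reset_adapter": True},
    {"ping_gateway": False, "renew_dhcp": True,  "reset_adapter": True},
)


def compatible(episode):
    """Indices of stored cases that agree with every outcome seen so far."""
    return tuple(
        i for i, case in enumerate(CASES)
        if all(case[action] == outcome for action, outcome in episode.items())
    )


def belief(episode):
    """Uniform belief over compatible cases (the weighting is an assumption)."""
    live = compatible(episode)
    return {i: 1.0 / len(live) for i in live}


@lru_cache(maxsize=None)
def plan(live):
    """Return (expected_cost, best_action) for a tuple of compatible cases."""
    best_cost, best_action = float("inf"), None
    for name, (is_repair, cost_ok, cost_fail) in ACTIONS.items():
        ok = tuple(i for i in live if CASES[i][name])
        fail = tuple(i for i in live if not CASES[i][name])
        # Skip repairs that cannot succeed and tests that cannot split the belief.
        if not ok or (not is_repair and not fail):
            continue
        p_ok = len(ok) / len(live)
        cost = p_ok * cost_ok
        if not is_repair:
            cost += p_ok * plan(ok)[0]      # a test never ends the episode
        if fail:
            cost += (1.0 - p_ok) * (cost_fail + plan(fail)[0])
        if cost < best_cost:
            best_cost, best_action = cost, name
    return best_cost, best_action


def choose_action(episode):
    """Pick the next action for the current repair episode."""
    return plan(compatible(episode))[1]
```

With this toy memory and an empty episode, `choose_action({})` picks `ping_gateway`: the cheap diagnostic has an expected total cost of about 20.3 seconds, versus about 21.7 for trying `renew_dhcp` straight away. After the ping succeeds, `choose_action({"ping_gateway": True})` goes straight to `reset_adapter`, since the only compatible case says the DHCP renewal will not help.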
