back icon
close icon

Capture phrases in quotes for more specific queries (e.g. "rocket ship" or "Fred Lynn")

  November 1, 2020

Interestingness Elements for Explainable Reinforcement Learning: Understanding Agents’ Capabilities and Limitations

SRI Authors Melinda Gervasio, Pedro Sequeira



Sequeira, P. and Gervasio, M. (2020). Interestingness elements for explainable reinforcement learning: Understanding agents’ capabilities and limitations. Artificial Intelligence, 288, 103367.


We propose an explainable reinforcement learning (XRL) framework that analyzes an agent’s history of interaction with the environment to extract interestingness elements that explain its behavior. The framework relies on data readily available from standard RL algorithms, augmented with data that can easily be collected by the agent while learning. We describe how to create visual explanations of an agent’s behavior in the form of short video-clips highlighting key interaction moments, based on the proposed elements. We also report on a user study where we evaluated the ability of humans in correctly perceiving the aptitude of agents with different characteristics, including their capabilities and limitations, given explanations automatically generated by our framework. The results show that the diversity of aspects captured by the different interestingness elements is crucial to help humans correctly identify the agents’ aptitude in the task, and determine when they might need adjustments to improve their performance.


How can we help?

Once you hit send…

We’ll match your inquiry to the person who can best help you. Expect a response within 48 hours.

Our Privacy Policy