Existing calibration algorithms address the problem of covariate shift via unsupervised domain adaptation. However, these methods suffer from the following limitations: 1) they require unlabeled data from the target domain, which may not be available at the stage of calibration in real-world applications, and 2) their performance depends heavily on the disparity between the […]
Interestingness Elements for Explainable Reinforcement Learning: Understanding Agents’ Capabilities and Limitations
We propose an explainable reinforcement learning (XRL) framework that analyzes an agent’s history of interaction with the environment to extract interestingness elements that explain its behavior. The framework relies on data readily available from standard RL algorithms, augmented with data that can easily be collected by the agent while learning. We describe how to create visual explanations of an agent’s behavior in the form of short video clips highlighting key interaction moments, based on the proposed elements. We also report on a user study in which we evaluated the ability of humans to correctly perceive the aptitude of agents with different characteristics, including their capabilities and limitations, given explanations automatically generated by our framework. The results show that the diversity of aspects captured by the different interestingness elements is crucial to helping humans correctly identify the agents’ aptitude in the task and determine when they might need adjustments to improve their performance.
Procedure automation can relieve users of the burden of repetitive, time-consuming, or complex procedures and enable them to focus on more cognitively demanding tasks. Procedural learning is a method by which intelligent computational assistants can achieve procedure automation. This paper explores the use of filtering heuristics based on action models for automated planning to augment sequence mining techniques. Sequential pattern mining algorithms rely primarily on frequency of occurrence to identify patterns, leaving them susceptible to discovering patterns that make little sense from a cognitive perspective. In contrast, humans are able to form models of procedures from small numbers of observations, even without explicit instruction. We posit that humans are able to do so because of background knowledge about actions and procedures, which lets them effectively filter out meaningless sequential patterns. The action models foundational to artificial intelligence (AI) planning are one way to provide semantics to actions, supporting the design of heuristics for eliminating spurious patterns discovered from event logs. We present experiments with various filters derived from these action models, the results of which show the value of the filters in greatly reducing the number of sequential patterns discovered without sacrificing the number of correct patterns found, even with small, noisy event logs.
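As a rough illustration of how an action model can filter mined patterns, the sketch below keeps a pattern only if every action after the first consumes a fact produced by an earlier action in the same pattern. This "causal link" rule, the `Action` class, and the action names are all assumptions made for the example; the abstract does not specify which filters the paper actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical STRIPS-style action model: required facts and produced facts."""
    name: str
    preconditions: frozenset
    add_effects: frozenset


def is_causally_linked(pattern, models):
    """Return True if every action after the first has at least one
    precondition produced by an earlier action in the pattern."""
    produced = set()
    for index, name in enumerate(pattern):
        action = models[name]
        if index > 0 and not (action.preconditions & produced):
            return False
        produced |= action.add_effects
    return True


def filter_patterns(patterns, models):
    """Drop mined patterns that fail the (assumed) causal-link heuristic."""
    return [p for p in patterns if is_causally_linked(p, models)]


# Illustrative action models; names and facts are invented for this sketch.
MODELS = {
    "open_file": Action("open_file", frozenset(), frozenset({"file_open"})),
    "edit_file": Action("edit_file", frozenset({"file_open"}), frozenset({"file_modified"})),
    "save_file": Action("save_file", frozenset({"file_modified"}), frozenset({"file_saved"})),
    "check_mail": Action("check_mail", frozenset(), frozenset({"mail_read"})),
}

# Patterns as a frequency-based miner might return them from an event log.
mined = [
    ("open_file", "edit_file", "save_file"),
    ("open_file", "check_mail"),
    ("check_mail", "save_file"),
]
kept = filter_patterns(mined, MODELS)
```

In this toy log, the two patterns that pair unrelated actions are frequent enough to be mined but are discarded because no effect of one action supports a precondition of the next.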
Advice is a powerful tool for learning. But advice also presents the challenge of bridging the gap between the high-level representations that easily capture human advice and the low-level representations that systems must operate with when using that advice. Drawing inspiration from studies on human motor skills and memory systems, we present an approach that converts human advice into synthetic or imagined training experiences, serving to scaffold the low-level representations of simple, reactive learning systems such as reinforcement learners. Research on using mental imagery and directed attention in motor and perceptual skills motivates our approach. We introduce the concept of a cognitive advice template for generating scripted, synthetic experiences and use saliency masking to further conceal irrelevant portions of training observations. We present experimental results for a deep reinforcement learning agent in a Minecraft-based game environment that show how such synthetic experiences improve performance, enabling the agent to achieve faster learning and higher rates of success.
We propose a framework toward more explainable reinforcement learning (RL) agents. The framework uses introspective analysis of an agent’s history of interaction with its environment to extract several interestingness elements regarding its behavior. Introspection operates at three distinct levels, first analyzing characteristics of the task that the agent has to solve, then analyzing the behavior of the agent while interacting with the environment, and finally performing a meta-analysis that combines information gathered at the lower levels. The analyses rely on data that is already collected by standard RL algorithms. We propose that an RL agent can easily collect additional statistical data while learning, which helps extract more meaningful aspects. We provide insights on how an explanation framework can leverage the elements generated through introspection. Namely, they can help convey learned strategies to a human user, justify the agent’s decisions in relevant situations, denote its learned preferences and goals, and identify circumstances in which advice from the user might be needed.
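To make the idea of introspecting over already-collected RL data more concrete, the sketch below derives two simple statistics from a state-visit history and a tabular action-value function. Both quantities (relative visit frequency and the gap between the two best action values) and all function names are assumptions chosen for illustration; they are not necessarily the interestingness elements defined in the paper.

```python
from collections import Counter


def visit_frequency(trajectory):
    """Relative frequency of each state in a list of visited states."""
    counts = Counter(trajectory)
    total = len(trajectory)
    return {state: count / total for state, count in counts.items()}


def value_gap(q_values):
    """Gap between the best and second-best action value in each state.
    A small gap hints that the agent has no strong preference there."""
    gaps = {}
    for state, action_values in q_values.items():
        ranked = sorted(action_values.values(), reverse=True)
        gaps[state] = ranked[0] - ranked[1] if len(ranked) > 1 else 0.0
    return gaps


# Toy interaction history and Q-table; states and actions are invented.
trajectory = ["s0", "s1", "s0", "s2", "s0"]
q_table = {
    "s0": {"left": 1.0, "right": 0.5},
    "s1": {"left": 0.25, "right": 0.25},
}

frequencies = visit_frequency(trajectory)
gaps = value_gap(q_table)
```

Here `s0` stands out as the most frequently visited state, and `s1` stands out as a state where the agent is indifferent between its actions, the kind of situation in which user advice might plausibly be requested.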
Most explanation schemes are reactive and informational: explanations are provided in response to specific user queries and focus on making the system’s reasoning more transparent. In mixed autonomy settings that involve teams of humans and autonomous agents, proactive explanation that anticipates and preempts potential surprises can be particularly valuable. By providing timely, succinct, and context-sensitive explanations, autonomous agents can avoid perceived faulty behavior and the consequent erosion of trust, enabling more fluid collaboration. We present an explanation framework based on the notion of explanation drivers, i.e., the intent or purpose behind agent explanations. We focus on explanations meant to reconcile expectation violations and enumerate a set of triggers for proactive explanation. Most work on explainable AI focuses on intelligibility; investigating explanation in mixed autonomy settings helps illuminate other important explainability issues such as purpose, timing, and impact.
This paper presents an approach to automated assessment for online training based on approximate graph matching. The algorithm lies at the core of two prototype training systems that we have built in accord with U.S. Army training materials: one for the use of a collaborative visualization and planning tool, the other for rifle maintenance. The algorithm uses approximate graph-matching techniques to align a representation of a student response for a training exercise with a predefined solution model for the exercise. The approximate matching enables tolerance to learner mistakes, with deviations in the alignment providing the basis for feedback that is presented to the student. Given that graph matching is NP-complete, the algorithm uses a heuristic approach to balance computational performance with alignment quality. A comprehensive experimental evaluation shows that our technique scales well while retaining the ability to identify correct alignments for responses containing realistic types and numbers of learner mistakes. Keywords: Artificial Intelligence, Artificial Intelligence Center, AIC
The high cost of developing content has been a major impediment to the widespread deployment of intelligent training systems. To enable automated skill assessment, traditional approaches have required significant time investment by highly trained individuals to encode first-principles domain models for the training task. In contrast, approaches grounded in example-based methods have been shown to significantly reduce authoring time. This paper reports on an approach to creating solution models for automated skill assessment using an example-based methodology, specifically targeting domains for which solution models must support robustness to learner mistakes. With this approach, a content author creates a baseline solution model by demonstrating a solution instance and then specifies a set of annotations to generalize from that instance to a comprehensive solution model. Results from a user study show that domain experts are comfortable with the approach and capable of applying it to generate quality solution models. Keywords: Artificial Intelligence, Artificial Intelligence Center, AIC, intelligent tutoring, assessment, model authoring
Virtual environments (VEs) provide an appealing vehicle for training complex skills, particularly for domains where real-world practice incurs significant time, expense, or risk. Two impediments currently block widespread use of intelligent training tools for VEs. The first impediment is that techniques for assessing performance focus on algorithmic skills that force learners to follow rigid solution paths. The second impediment is the high cost of authoring the models that drive intelligent training capabilities.
This paper presents an approach to training in VEs that directly addresses these challenges and summarizes its application to a weapons maintenance task. With our approach, a learner’s actions are recorded as he completes training exercises in a semantically instrumented VE. An example-tracing methodology, in which the learner’s actions are compared to a predefined solution model, is used to generate assessment information with contextually relevant feedback. Novel graph-matching technology, grounded in edit-distance optimization, aligns student actions with solution models while tolerating significant deviation. With this robustness to learner mistakes, assessment can support exploratory learning processes rather than forcing learners down fixed solution paths.
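The edit-distance idea behind this kind of alignment can be sketched on plain action sequences. The code below is a simplified stand-in, assuming linear sequences and unit edit costs rather than the graph-structured solution models the paper describes; the function name, the operation labels, and the step names are hypothetical.

```python
def align(student, solution):
    """Return (cost, ops) for a minimum-cost edit script that turns the
    student's action sequence into the solution sequence (unit costs)."""
    n, m = len(student), len(solution)
    # cost[i][j] = edit distance between student[:i] and solution[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if student[i - 1] == solution[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub,  # match or wrong action
                cost[i - 1][j] + 1,        # extra student action
                cost[i][j - 1] + 1,        # missing solution step
            )

    # Walk back through the table to recover one optimal alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = student[i - 1] == solution[j - 1]
            if cost[i][j] == cost[i - 1][j - 1] + (0 if same else 1):
                ops.append(("match" if same else "wrong", student[i - 1], solution[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("extra", student[i - 1], None))
            i -= 1
        else:
            ops.append(("missing", None, solution[j - 1]))
            j -= 1
    ops.reverse()
    return cost[n][m], ops


# Invented step names for a maintenance-style exercise.
solution = ["clear_chamber", "remove_magazine", "pull_handle", "remove_bolt"]
student = ["remove_magazine", "pull_handle", "wipe_barrel", "remove_bolt"]

distance, ops = align(student, solution)
# Deviations from the solution are the raw material for feedback.
feedback = [op for op in ops if op[0] != "match"]
```

Because the alignment tolerates the skipped and the superfluous step instead of rejecting the whole response, the remaining three actions are still credited as correct, which is what allows assessment without forcing a single fixed solution path.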
Our approach to content creation leverages predefined ontologies, enabling authoring by domain experts rather than technology experts. A semantic mark-up framework supports authors in overlaying ontologies onto VE elements and in specifying actions with their effects. Drawing on these semantics, exercises and their solutions are created through end-user programming techniques: a domain expert demonstrates one or more solutions to a task and then annotates those solutions to define a generalized solution model. A concept validation study shows that users are comfortable with this approach and can apply it to create quality solution models.