Explainable Reinforcement Learning with Causality

Reinforcement Learning (RL) is a widely used technique that allows agents to learn behaviours through a reward/punishment mechanism [1]. In combination with methods from deep learning, RL is currently applied in a number of scenarios with a significant impact on society, including healthcare, transportation, finance, social networks, e-commerce, and education.

However, humans need to understand how algorithms make their decisions in order to trust them and take full advantage of their potential. As a result, explainability in RL is a field of growing importance. Many approaches for making algorithms explain their decisions are being studied. The purpose of this project is to study causality as a method to generate explanations, as, according to many experts, it mimics the way humans understand and model the world around them.

The overall goal of the project is to generate explanations from an RL agent about how it makes its decisions. The research hypothesis is that causality can be used to achieve this goal.

In particular, current approaches to explainability in AI [2, 3, 4] assume that the effects of system variables can be represented as action influence models (AIMs).

AIMs are sets of structural equations representing the values of variables in the environment in terms of the values of other variables that affect them, including the action taken.
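To make this concrete, the following is a minimal sketch of an AIM as a set of structural equations, using a toy resource-gathering scenario; the variable names and equations are purely illustrative assumptions, not part of the proposal.

```python
# Illustrative action influence model: each environment variable is
# given by a structural equation over its causal parents in the
# previous state, plus the action taken. All names and coefficients
# here are hypothetical.

def aim_step(state, action):
    """Evaluate the toy structural equations for one time-step."""
    new = {}
    # worker_count depends on its own previous value and the action.
    new["worker_count"] = state["worker_count"] + (1 if action == "build_worker" else 0)
    # minerals depends on the previous worker count and the action's cost.
    new["minerals"] = (state["minerals"] + 5 * state["worker_count"]
                       - (50 if action == "build_worker" else 0))
    return new

state = {"worker_count": 2, "minerals": 100}
print(aim_step(state, "build_worker"))  # {'worker_count': 3, 'minerals': 60}
```

Each equation reads off the parents of a variable in the causal graph, which is exactly the structure the project aims to discover automatically.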

AIMs were introduced in [2] with the purpose of generating explanations. The intuitive idea is that, within an RL framework, explanations are based on how different actions produce different effects in the environment.

These structural equations are then used to generate explanations based on counterfactual scenarios of the best action to take at each time-step.
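A counterfactual explanation of this kind contrasts the outcome of the chosen action with the outcome of a foil action under the same structural equations. The sketch below shows the general shape of such a contrast; the toy model is an assumption for illustration only.

```python
# Sketch of contrastive counterfactual explanation: evaluate the
# structural equations under the chosen action and under a foil
# (counterfactual) action, then report the variables whose values
# differ. The toy model below is hypothetical.

def toy_model(state, action):
    return {"x": state["x"] + (1 if action == "inc" else 0),
            "reward": 1 if action == "inc" else 0}

def contrast(state, chosen, foil, model=toy_model):
    """Return {variable: (value under chosen, value under foil)} for all differences."""
    a, b = model(state, chosen), model(state, foil)
    return {v: (a[v], b[v]) for v in a if a[v] != b[v]}

print(contrast({"x": 0}, "inc", "noop"))
# {'x': (1, 0), 'reward': (1, 0)}
```

The differing variables are the raw material of a "why action A rather than B?" explanation: they name exactly the effects that would have been lost under the foil.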

AIMs are extracted from action matrices and directed acyclic graphs (DAGs) representing the influence that variables have on each other.

In turn, such DAGs are typically obtained using causal discovery methods [6], which extract the causal relations between the different variables of a system.

Unfortunately, causal discovery is well developed for standard machine learning settings, but much less so for reinforcement learning. In particular, for RL systems the action matrices and DAGs are typically constructed by hand, a process which is prone to human error.

The purpose of the project is to research and identify methods to automatically infer the action matrix and the variable DAG while the agent is training and collecting observations from the environment, so as to then extract the action influence model.
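The basic data pipeline this would require can be sketched as follows: log (state, action, next state) transitions as the agent acts, then score candidate causal edges from the logged data. The edge-scoring rule below is a deliberately naive placeholder standing in for a proper causal discovery method of the kind surveyed in [6]; the environment and all names are hypothetical.

```python
import itertools

def collect_transitions(env_step, policy, init_state, n):
    """Roll out the policy for n steps, logging every (s, a, s') transition."""
    data, state = [], init_state
    for _ in range(n):
        action = policy(state)
        nxt = env_step(state, action)
        data.append((state, action, nxt))
        state = nxt
    return data

def edge_score(data, src, dst):
    """Naive score for a candidate edge src -> dst: the fraction of
    transitions in which a change in src co-occurs with a change in dst."""
    changed = [(s, n) for s, _, n in data if n[src] != s[src]]
    if not changed:
        return 0.0
    return sum(1 for s, n in changed if n[dst] != s[dst]) / len(changed)

# Toy environment where variable b is fully determined by variable a.
def env_step(state, action):
    a = state["a"] + (1 if action == "inc" else 0)
    return {"a": a, "b": 2 * a}

acts = itertools.cycle(["inc", "noop"])       # deterministic exploration policy
data = collect_transitions(env_step, lambda _s: next(acts), {"a": 0, "b": 0}, 50)
print(edge_score(data, "a", "b"))  # 1.0: every change in a changes b
```

A real implementation would replace the co-occurrence score with conditional-independence tests or score-based structure search, conditioned on the action, but the interface (transitions in, weighted edges out) would remain the same.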

There is a significant amount of research into causal discovery methods for classical machine learning (ML) problems. However, there is little research, although increasing interest, into how to infer the action influence model of an RL environment at training time. This project is designed to fill this gap.

We also anticipate conducting an experimental evaluation of the proposed approach to causal explanations. In particular, we will first evaluate the approach on a simplified map of the StarCraft II game [5], as a proof of concept. Part of the project will also be to identify further benchmarks for evaluation.

[1] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.

[2] P. Madumal, T. Miller, L. Sonenberg, and F. Vetere. Explainable reinforcement learning through a causal lens. CoRR, abs/1905.10958, 2019.

[3] S. Sloman. Causal Models: How People Think About the World and Its Alternatives. Oxford University Press, 2005.

[4] M. R. Waldmann, Y. Hagmayer, and A. P. Blaisdell. Beyond the information given: Causal models in learning and reasoning. Current Directions in Psychological Science, 15(6):307–311, 2006.

[5] O. Vinyals, T. Ewalds, S. Bartunov, P. Georgiev, A. S. Vezhnevets, M. Yeo, A. Makhzani, H. Küttler, J. Agapiou, J. Schrittwieser, et al. StarCraft II: A new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782, 2017.

[6] C. Glymour, K. Zhang, and P. Spirtes. Review of causal discovery methods based on graphical models. Frontiers in Genetics, 10:524, 2019.

Project ID

STAI-CDT-2022-IC-4

Supervisor

Francesco Belardinelli, https://www.doc.ic.ac.uk/~fbelard/

Category

Logic, Verification