Explainable AI has become increasingly relevant because in many domains, especially safety-critical ones, it is desirable to complement black-box machine learning (ML) models with comprehensible explanations of their predictions.
This project focuses on explanations for sequential decision-making processes. Such processes arise in AI planning, reinforcement learning, and control/cyber-physical systems, and they increasingly rely on ML models to represent, e.g., the policy or the environment’s dynamics. Unlike most explainability techniques, which deal with one-step input-output predictions, the challenge here is to handle sequence data arising from multiple, inter-dependent steps taken over time. Moreover, explanations need to account for the uncertain or probabilistic dynamics of the environment.
In particular, the focus will be on causal explanations building on the actual causality framework of Halpern and Pearl [1,2]. Given a realization of the sequential process under study, we seek a minimal set of units (e.g., observed steps, policy actions, agents) responsible for the observed outcome, i.e., a set such that the counterfactual model obtained by changing those units yields a different outcome.
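As a toy illustration (our own, and much simpler than the cited framework), the Python sketch below applies a but-for variant of this idea to a short deterministic trajectory: it searches for the smallest sets of steps whose alteration flips the outcome. The full Halpern-Pearl definition is richer, involving structural equations and contingencies, which the sketch omits.

```python
from itertools import combinations, product

# Toy deterministic walk on a line: "L" moves -1, "R" moves +1; the
# outcome of interest is whether the trajectory ends in the bad region.
ACTIONS = {"L": -1, "R": +1}

def bad_outcome(actions):
    return sum(ACTIONS[a] for a in actions) < 0

def counterfactual(actions, idxs, alts):
    """Replace the steps at positions idxs with the alternatives alts."""
    cf = list(actions)
    for i, a in zip(idxs, alts):
        cf[i] = a
    return cf

def minimal_but_for_causes(actions):
    """Smallest sets of steps whose change flips the bad outcome
    (a but-for simplification of Halpern-Pearl actual causality)."""
    assert bad_outcome(actions), "expected a trajectory with the bad outcome"
    for size in range(1, len(actions) + 1):
        found = [
            (idxs, alts)
            for idxs in combinations(range(len(actions)), size)
            for alts in product(ACTIONS, repeat=size)
            if all(actions[i] != a for i, a in zip(idxs, alts))
            and not bad_outcome(counterfactual(actions, idxs, alts))
        ]
        if found:  # stop at the smallest cardinality that suffices
            return found
    return []

observed = ["L", "L", "R"]              # ends at -1: bad outcome
print(minimal_but_for_causes(observed)) # flipping either "L" alone suffices
```

In a stochastic environment, "changing a unit" additionally requires inferring the exogenous noise behind the observed realization, which is the subject of the first topic below.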
More specifically, we are interested in any of the following topics that our group is currently investigating:
* Counterfactual Inference of Markov Decision Processes [3-6] (see the sketch after this list)
* Dealing with uncertain models, partial observability, and unobserved confounders [7,8]
* Combining counterfactuals with temporal logic reasoning for verification [9-11]
* Reliable counterfactual inference with data-driven models [12,13]
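To make the first topic concrete, here is a minimal sketch of one technique from the literature, the Gumbel-max structural causal model (Oberst and Sontag, 2019): the exogenous noise consistent with an observed categorical transition is sampled a posteriori and then replayed under an alternative action’s dynamics. The distributions and names below are illustrative assumptions, not taken from the cited references.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel(loc):
    """Sample Gumbel(loc) noise, elementwise."""
    loc = np.asarray(loc, dtype=float)
    return loc - np.log(-np.log(rng.uniform(size=loc.shape)))

def posterior_shifted_gumbels(p, observed):
    """Top-down sampling of h_j ~ Gumbel(log p_j) conditioned on
    argmax_j h_j == observed (p must be normalized, strictly positive)."""
    top = gumbel(0.0)  # the max of the h_j is Gumbel(0) for normalized p
    h = -np.log(np.exp(-top) + np.exp(-gumbel(np.log(p))))  # truncate at top
    h[observed] = top
    return h

def counterfactual_step(p, q, observed_next):
    """Infer noise consistent with seeing `observed_next` under transition
    probabilities p, then replay it under counterfactual dynamics q."""
    h = posterior_shifted_gumbels(p, observed_next)
    g = h - np.log(p)                  # posterior sample of g_j ~ Gumbel(0)
    return int(np.argmax(np.log(q) + g))

p = np.array([0.7, 0.2, 0.1])          # dynamics under the action taken
q = np.array([0.1, 0.2, 0.7])          # dynamics under an alternative action
cf = [counterfactual_step(p, q, observed_next=1) for _ in range(10_000)]
print(np.bincount(cf, minlength=3) / len(cf))  # counterfactual distribution
```

Note that when q equals p, the replay reproduces the observed transition exactly, a basic consistency requirement for counterfactual inference; the remaining topics concern the harder cases where the model itself is uncertain, partially observed, or learned from data.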
This project aims to enhance the trustworthiness of AI systems by providing rigorous explanations rooted in principled techniques from causal inference, with an emphasis on (probabilistic) guarantees on the correctness of the identified causes.