Causal Explanations for Sequential Decision Making

Explainable AI has become increasingly relevant because, in many domains, and especially safety-critical ones, it is desirable to complement black-box machine learning (ML) models with comprehensible explanations of their predictions.

This project focuses on explanations for sequential decision-making processes. Such processes arise in AI planning, reinforcement learning, and control/cyber-physical systems, and they increasingly rely on ML models to, e.g., represent the policy or the environment’s dynamics. Unlike most explainability techniques, which deal with single-step input-output predictions, the challenge here is to handle sequence data arising from multiple, inter-dependent steps taken over time. Moreover, explanations need to account for uncertain or probabilistic environment dynamics.
In particular, the focus will be on causal explanations building on the actual causality framework of Halpern and Pearl [1,2]. Given a realization of the sequential process under study, we seek the minimal set of units (e.g., observed steps, policy actions, agents) responsible for the observed outcome, i.e., such that the counterfactual model obtained by changing those units leads to a different outcome.
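As a minimal illustration of this but-for notion, the sketch below enumerates subsets of steps in a toy, fully deterministic run and returns the smallest interventions that flip the outcome. All names, the outcome function, and the single-alternative interventions are hypothetical simplifications; the actual-causality framework handles richer structural models and probabilistic dynamics.

```python
from itertools import combinations

def outcome(actions):
    # Hypothetical deterministic process: failure iff total "risk" exceeds 2
    return sum(actions) > 2

def minimal_causes(actions, alternatives):
    """Return the smallest sets of steps whose intervention flips the
    observed outcome (brute-force but-for check over all subsets)."""
    observed = outcome(actions)
    for size in range(1, len(actions) + 1):
        causes = []
        for subset in combinations(range(len(actions)), size):
            cf = list(actions)
            for i in subset:
                cf[i] = alternatives[i]  # intervene on the selected steps
            if outcome(cf) != observed:
                causes.append(subset)
        if causes:
            return causes  # minimal cardinality reached: stop here
    return []

# Observed run: risky actions at steps 1-3 led to failure; changing any
# one of them would have avoided it.
print(minimal_causes([0, 1, 1, 1], [0, 0, 0, 0]))  # → [(1,), (2,), (3,)]
```

The exhaustive search is exponential in the number of steps; the project topics below concern, among other things, making such queries tractable and sound under uncertainty.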

More specifically, we are interested in any of the following topics that our group is currently investigating:

* Counterfactual Inference of Markov Decision Processes [3-6]
* Dealing with uncertain models, partial observability, unobserved confounders [7,8]
* Combining counterfactuals with temporal logic reasoning for verification [9-11]
* Reliable counterfactual inference with data-driven models [12,13]
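On the first topic, the Gumbel-max structural causal model of [3] makes categorical MDP transitions counterfactually identifiable: one samples exogenous Gumbel noise consistent with the observed transition and replays it under alternative dynamics. The following NumPy sketch (a toy single-transition version; function names and the truncated-Gumbel construction are our illustrative choices, not code from [3]) shows the idea:

```python
import numpy as np

def posterior_gumbels(logits, observed, rng):
    """Sample Gumbel noise consistent with the observed argmax outcome,
    via the top-down truncated-Gumbel construction."""
    n = len(logits)
    # Maximum Gumbel value, distributed as Gumbel(logsumexp(logits))
    Z = rng.gumbel() + np.logaddexp.reduce(logits)
    g = np.empty(n)
    g[observed] = Z
    for j in range(n):
        if j != observed:
            gj = rng.gumbel() + logits[j]             # Gumbel(logits[j])
            g[j] = -np.log(np.exp(-gj) + np.exp(-Z))  # truncate below Z
    return g - logits  # exogenous (location-0) noise terms u_j

def counterfactual_next_state(logits_factual, logits_cf, observed, rng):
    """Posterior-sample the noise, check factual consistency, and replay
    it under the counterfactual transition logits."""
    u = posterior_gumbels(logits_factual, observed, rng)
    assert np.argmax(logits_factual + u) == observed
    return int(np.argmax(logits_cf + u))

rng = np.random.default_rng(0)
logits = np.array([0.2, 1.0, -0.5])       # toy transition distribution
cf_logits = np.array([0.2, -3.0, -0.5])   # intervention suppresses state 1
print(counterfactual_next_state(logits, cf_logits, observed=1, rng=rng))
```

By construction, replaying the sampled noise under the factual logits always reproduces the observed transition; chaining this per-step counterfactual along a trajectory yields counterfactual rollouts of the full MDP.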

This project aims to enhance the trustworthiness of AI systems through rigorous explanations rooted in principled techniques from causal inference, with an emphasis on (probabilistic) guarantees on the correctness of the identified causes.

[1] Halpern, Joseph Y., and Judea Pearl. “Causes and explanations: A structural-model approach. Part II: Explanations.” The British Journal for the Philosophy of Science 56.4 (2005): 889-911.

[2] Beckers, Sander. “Causal explanations and XAI.” Conference on Causal Learning and Reasoning. PMLR, 2022.

[3] Oberst, Michael, and David Sontag. “Counterfactual off-policy evaluation with Gumbel-max structural causal models.” International Conference on Machine Learning. PMLR, 2019.

[4] Tsirtsis, Stratis, Abir De, and Manuel Gomez Rodriguez. “Counterfactual explanations in sequential decision making under uncertainty.” Advances in Neural Information Processing Systems 34 (2021): 30127-30139.

[5] Kazemi, Milad, Jessica Lally, Ekaterina Tishchenko, Hana Chockler, and Nicola Paoletti. “Counterfactual Influence in Markov Decision Processes.” arXiv preprint arXiv:2402.08514 (2024).

[6] Triantafyllou, Stelios, Adish Singla, and Goran Radanovic. “Actual causality and responsibility attribution in decentralized partially observable Markov decision processes.” Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society. 2022.

[7] Lu, Chaochao, Bernhard Schölkopf, and José Miguel Hernández-Lobato. “Deconfounding reinforcement learning in observational settings.” arXiv preprint arXiv:1812.10576 (2018).

[8] Zhang, Junzhe, and Elias Bareinboim. Markov decision processes with unobserved confounders: A causal approach. Technical report, Technical Report R-23, Purdue AI Lab, 2016.

[9] Kazemi, Milad, and Nicola Paoletti. “Causal Temporal Reasoning for Markov Decision Processes.” arXiv preprint arXiv:2212.08712v2 (2023).

[10] Finkbeiner, Bernd, and Julian Siber. “Counterfactuals modulo temporal logics.” arXiv preprint arXiv:2306.08916 (2023).

[11] Coenen, Norine, et al. “Temporal causality in reactive systems.” International Symposium on Automated Technology for Verification and Analysis. Cham: Springer International Publishing, 2022.

[12] Chernozhukov, Victor, Kaspar Wüthrich, and Yinchu Zhu. “An exact and robust conformal inference method for counterfactual and synthetic controls.” Journal of the American Statistical Association 116.536 (2021): 1849-1864.

[13] Lei, Lihua, and Emmanuel J. Candès. “Conformal inference of counterfactuals and individual treatment effects.” Journal of the Royal Statistical Society Series B: Statistical Methodology 83.5 (2021): 911-938.