We developed a framework for causal explanations of image classifiers based on the principled approach of actual causality and responsibility, the latter pioneered by Dr Chockler.
Our framework has already resulted in a number of publications in top-ranked conferences. The framework answers the question “why is this image classified the way it is?” (e.g., “why is this image classified as a panda?”). It treats the AI image classifier (a neural network) as a black box, which makes it applicable to any AI system.
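To illustrate what treating the classifier as a black box means in practice, the following minimal sketch computes a sufficient set of pixels using only calls to the classifier, never its internals. The toy classifier, the helper names, and the greedy removal strategy are all assumptions made for illustration; this is not the framework's actual algorithm.

```python
from typing import Callable, List, Set, Tuple

Image = List[List[int]]
Pixel = Tuple[int, int]


def toy_classifier(image: Image) -> str:
    """Hypothetical stand-in for a neural network:
    answers 'panda' if at least 3 pixels are non-zero."""
    bright = sum(1 for row in image for v in row if v > 0)
    return "panda" if bright >= 3 else "other"


def mask_image(image: Image, keep: Set[Pixel]) -> Image:
    """Return a copy of the image with every pixel outside `keep` set to 0."""
    return [
        [v if (r, c) in keep else 0 for c, v in enumerate(row)]
        for r, row in enumerate(image)
    ]


def sufficient_subset(image: Image, classify: Callable[[Image], str]) -> Set[Pixel]:
    """Greedily drop pixels while the black-box label stays the same.

    Only `classify` is queried, so any classifier can be plugged in.
    """
    label = classify(image)
    keep = {(r, c) for r, row in enumerate(image) for c in range(len(row))}
    for pixel in sorted(keep):
        trial = keep - {pixel}
        if classify(mask_image(image, trial)) == label:
            keep = trial  # the pixel was not needed for this label
    return keep
```

The result depends on the order in which pixels are tried, which is one reason a single image can admit several different explanations.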
Explanations are crucial in the interaction between humans and AI systems and components, and are the subject of upcoming US, EU, and UK regulations (the right to explanation, the EU AI Act). In particular, explanations are useful for increasing users’ trust in the system and for finding errors in classifiers.
This project extends the current framework to produce multiple explanations. The goal is to provide several different explanations for the same image, bringing the output closer to what humans perceive as a good explanation and helping to find more subtle classification errors.
The implementation will use the tool DeepCover, developed by the supervisor [3,4], and will combine it with symbolic AI tools, namely constraint solvers and SMT solvers. Symbolic AI tools are needed to capture constraints on the set of multiple explanations, such as the maximum overlap between explanations and their relative location on the image.
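The kind of constraint involved can be sketched in a few lines. The example below selects k candidate explanations whose pairwise overlap stays under a threshold, using plain brute-force search as a stand-in for a constraint or SMT solver. The overlap measure, the function names, and the representation of an explanation as a set of pixel coordinates are assumptions for illustration, and relative-location constraints are not modelled here.

```python
from itertools import combinations
from typing import FrozenSet, List, Optional, Sequence, Tuple

Pixel = Tuple[int, int]
Explanation = FrozenSet[Pixel]


def overlap(a: Explanation, b: Explanation) -> float:
    """Fraction of the smaller explanation that is shared with the other."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def pick_explanations(
    candidates: Sequence[Explanation], k: int, max_overlap: float
) -> Optional[List[Explanation]]:
    """Return the first set of k candidates whose pairwise overlap
    never exceeds max_overlap, or None if no such set exists.

    Exhaustive search: fine for a handful of candidates, whereas a real
    solver would be needed for larger or richer constraint sets.
    """
    for subset in combinations(candidates, k):
        if all(overlap(a, b) <= max_overlap for a, b in combinations(subset, 2)):
            return list(subset)
    return None
```

With three candidates where the first two share a pixel, asking for two disjoint explanations (max_overlap of 0.0) skips the overlapping pair and returns the first and third candidates.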
The output of the project will increase the trustworthiness of black-box AI systems by augmenting the classification with human-like explanations, and the safety of these systems by providing fine-grained insight into the classification process. Papers resulting from the project will be submitted to the AAAI, NeurIPS, ICCV, and CVPR conferences.
While the use case for the project is black-box image classifiers, the underlying problem is general and applies to a variety of domains: accompanying the decisions of black-box systems with as much information as possible to increase safety and trustworthiness.