Causal reasoning is essential to decision-making in real-world problems. Observational data alone, however, are rarely sufficient to infer causal relationships or estimate treatment effects, because confounding can distort the associations they contain. Pearl (2009) proposes a sound and complete formal system for identifying quantities on all three rungs of the so-called “ladder of causality”: the observational, interventional, and counterfactual levels. These methods assume access to structural information that is often unavailable. Machine learning algorithms such as deep neural networks (DNNs) can help bridge the gap between theory and practice, finding patterns in large datasets that can be used to test causal hypotheses under some reasonable assumptions. Yet DNNs are often opaque and difficult to interpret.
The goal of this project is to combine the formal rigour of Pearl’s do-calculus with the flexibility of contemporary machine learning methods to model complex causal systems at variable levels of granularity. The main methodological toolkit will be probabilistic circuits (PCs), an expressive family of neural networks that provides tractable inference for queries involving conditioning, marginalisation, and/or maximisation. Existing work on PCs has focused almost exclusively on observational quantities, i.e. level one of Pearl’s hierarchy. This project aims to advance the state of the art by extending PCs to higher-order queries involving interventional and counterfactual probabilities. Primary objectives include characterising the graphical conditions that are necessary and sufficient for identifying such quantities, and implementing all methods in user-friendly open-source software packages.
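To make the notion of tractable inference concrete, the following is a minimal sketch of a toy probabilistic circuit over two binary variables, written in plain Python. It is an illustration only and not the project's method or software: the structure (one sum node over two product nodes with Bernoulli leaves), the parameter values, and all function names are hypothetical. It covers observational queries only, i.e. level one of the hierarchy. Marginalising a variable amounts to setting its leaves to 1, so a marginal costs a single bottom-up pass, and a conditional is a ratio of two such passes.

```python
from itertools import product

# Hypothetical parameters for a toy circuit over binary variables X and Y.
WEIGHTS = [0.3, 0.7]  # sum-node weights (mixture components)
P_X = [0.9, 0.2]      # P(X=1) within each component
P_Y = [0.8, 0.1]      # P(Y=1) within each component


def leaf(p_one, value):
    """Bernoulli leaf; value=None marginalises the variable out (returns 1)."""
    if value is None:
        return 1.0
    return p_one if value == 1 else 1.0 - p_one


def evaluate(x=None, y=None):
    """One bottom-up pass: a sum node over products of leaves."""
    return sum(w * leaf(px, x) * leaf(py, y)
               for w, px, py in zip(WEIGHTS, P_X, P_Y))


def conditional(x, y):
    """P(X=x | Y=y) as a ratio of two circuit evaluations."""
    return evaluate(x=x, y=y) / evaluate(y=y)


# The circuit defines a normalised joint distribution.
total = sum(evaluate(x, y) for x, y in product([0, 1], repeat=2))

# Marginal P(Y=1) from a single pass, checked against explicit summation.
p_y1 = evaluate(y=1)
p_y1_brute = sum(evaluate(x, 1) for x in [0, 1])

# Conditional P(X=1 | Y=1) from two passes.
p_x1_given_y1 = conditional(1, 1)
```

In this sketch the cost of a marginal or conditional query is linear in the size of the circuit, whereas the brute-force check sums over every configuration of the marginalised variables. Interventional and counterfactual queries are not represented here.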
Building on foundational work in causal representation learning, we will develop novel methods for structure discovery, approximate abstraction, and counterfactual reasoning in high-dimensional datasets. The resulting neuro-symbolic systems will be designed for use in sensitive domains such as healthcare, where algorithmic safety and trust are essential.