Temporal logic (TL) is arguably the primary language for formal specification and reasoning about system correctness and safety. It enables the specification and verification of properties such as "will the agent eventually reach the target while avoiding unsafe states?" or "will the system return to some desired service level after a fault?" TL has been successfully applied to the analysis of a wide range of systems, including hardware and software systems, cyber-physical systems (also known as control systems), and stochastic models like Markov decision processes (MDPs), one of the main models in reinforcement learning and planning.
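As an illustration, the two quoted properties can be written with standard temporal operators; the atomic propositions (target, unsafe, fault, ok) are placeholders chosen here for illustration:

```latex
% "eventually reach the target while avoiding unsafe states" (reach-avoid)
\varphi_{\mathrm{reach\text{-}avoid}} \;=\; \neg\,\mathit{unsafe} \;\,\mathcal{U}\;\, \mathit{target}

% "return to some desired service level after a fault" (recovery)
\varphi_{\mathrm{recover}} \;=\; \mathbf{G}\,\bigl(\mathit{fault} \rightarrow \mathbf{F}\,\mathit{ok}\bigr)
```

Here $\mathcal{U}$, $\mathbf{G}$, and $\mathbf{F}$ are the usual "until", "always", and "eventually" operators; in a probabilistic logic such as PCTL, these formulas would additionally be wrapped in a probability operator.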
A limitation of existing TLs is that TL specifications are evaluated on a fixed configuration of the system, e.g., a fixed choice of policy or process/environment dynamics. This project aims to enable formal reasoning across different system configurations (or conditions), expressing properties like "would the system remain safe if we switched to a high-performance controller?" or "what is the probability that a failure would have happened if we had applied a different configuration in the past?" These queries fall under causal inference (CI). Even though both CI and TL are very well established on their own, their combination has not been sufficiently explored in the literature.
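The distinction between the two quoted queries can be sketched on a toy structural causal model. This is a minimal illustration under invented assumptions: the mechanism `failure`, the configuration names, and the thresholds are all hypothetical, and the counterfactual is computed by reusing the same exogenous noise (abduction) before switching the configuration (action and prediction):

```python
import random

# Hypothetical SCM: exogenous disturbance u drives failure; the mechanism
# below is invented purely to illustrate the two query types.
def failure(config: str, u: float) -> bool:
    # the high-performance controller tolerates larger disturbances
    threshold = 0.9 if config == "high_perf" else 0.6
    return u > threshold

random.seed(0)
samples = [random.random() for _ in range(100_000)]

# Interventional query ("what will happen if ..."):
# P(failure | do(config = high_perf))
p_int = sum(failure("high_perf", u) for u in samples) / len(samples)

# Counterfactual query ("what would have happened if ..."):
# among runs where the standard configuration failed, would the
# high-performance one also have failed, under the SAME noise u?
failed_std = [u for u in samples if failure("standard", u)]
p_cf = sum(failure("high_perf", u) for u in failed_std) / len(failed_std)

print(f"interventional P(failure)  ~ {p_int:.2f}")
print(f"counterfactual P(failure)  ~ {p_cf:.2f}")
```

The two numbers differ (roughly 0.10 versus 0.25 for these thresholds) precisely because the counterfactual conditions on the factual observation, which is the kind of distinction the proposed logic is meant to capture at the level of temporal properties.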
With this project, we aim to develop a new formal verification framework combining temporal logic and causality. The new logic will include the usual temporal operators as well as causal ones, allowing one to express interventional ("what will happen if…") and counterfactual ("what would have happened if…") properties. To make the framework comprehensive and generally applicable, we aim to support different models, such as MDPs, control systems, and data-driven models like neural networks. This approach can be applied to reason about any kind of sequential decision-making process, but we expect to focus primarily on models of cyber-physical systems and autonomous and multi-agent systems. In particular, by enabling formal reasoning about the effects of interventions, our framework is relevant for safety and security analysis in healthcare and in epidemiology [4-11]. Safety will be an overarching theme, as the approach is designed for safety verification of AI systems under nominal and counterfactual scenarios.
This project will result in the first logic-based framework to combine temporal and causal reasoning, topics that are central to AI safety and interpretability. See  for a preliminary version of this framework. The project will require extending existing (probabilistic) temporal logics with causal operators, defining the formal semantics of the introduced operators, and implementing effective decision procedures (based on symbolic/numerical or Monte-Carlo techniques). A significant challenge stems from the fact that counterfactuals can be precisely identified only for a restricted class of models, a problem exacerbated by partial observability. We envision that our framework will allow us to derive bounds on the identification error for such problematic models, thereby bounding the satisfaction probability of the property. Another relevant direction will be investigating methods for the automated synthesis of policies that are optimal with respect to such logic specifications.
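To make the Monte-Carlo direction concrete, the sketch below estimates the satisfaction probability of a reach-avoid property on a toy MDP under two policies, a nominal one and an intervened one. Everything here is an invented illustration: the state space, the transition probabilities, the horizon, and the two policy names are assumptions, not part of the proposed framework:

```python
import random

# Toy MDP: states 0..4, where 4 is the target and -1 is an unsafe sink.
# Transition probabilities are invented for illustration.
def step(s: int, action: str, rng: random.Random) -> int:
    if action == "fast":
        # advances quickly but risks entering the unsafe state
        return s + 1 if rng.random() < 0.8 else -1
    # "slow": advances less often but never becomes unsafe
    return s + 1 if rng.random() < 0.6 else s

def reach_avoid(policy: str, rng: random.Random, horizon: int = 20) -> bool:
    # One sampled trace, checked against "eventually target, never unsafe"
    s = 0
    for _ in range(horizon):
        s = step(s, policy, rng)
        if s == -1:
            return False
        if s == 4:
            return True
    return False

rng = random.Random(1)

def estimate(policy: str, n: int = 20_000) -> float:
    # Monte-Carlo estimate of the satisfaction probability
    return sum(reach_avoid(policy, rng) for _ in range(n)) / n

p_fast = estimate("fast")  # nominal policy
p_slow = estimate("slow")  # intervened policy
print(f"P(reach-avoid | fast) ~ {p_fast:.3f}")
print(f"P(reach-avoid | slow) ~ {p_slow:.3f}")
```

A statistical model checker would additionally attach confidence bounds (e.g., Chernoff or sequential tests) to such estimates; the sketch only shows the sampling core that a Monte-Carlo decision procedure would build on.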