Run-time Verification for Safe and Verifiable AI

The growing societal impact of AI-based systems has brought with it a set of risks and concerns [1, 2].
Indeed, unintended and harmful behaviours may emerge from the application of machine learning (ML) algorithms, including negative side effects, reward hacking, and unsafe exploration [3].
Hence, the safe and trusted deployment of AI systems calls for their verification and certification. In particular, verifiable artificial intelligence (VAI) has been described as “the goal of designing AI-based systems that have strong, ideally provable, assurances of correctness with respect to mathematically-specified requirements.” [4]
Verifiable AI can be achieved by applying formal methods, which have already proved successful in the specification, design, and verification of reactive, distributed, and multi-agent systems.
However, the application of formal methods to AI raises a number of challenges, from modelling continuous behaviours and formalising quantitative requirements to handling high-dimensional input and parameter spaces, as well as online adaptation and evolution.

This project aims to contribute to verifiable AI by applying run-time verification techniques to monitor the learning process of agents in reinforcement learning (RL) [5].
Run-time verification is a well-developed family of techniques for monitoring system executions online [6]. Recently, (multi-valued) temporal logics [7] have been applied to run-time verification [8]: intuitively, a safety constraint is judged false as soon as it is violated; until then it is not (yet) false, but it is not necessarily true either.
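
To illustrate this three-valued view, the following minimal sketch (in Python; names such as SafetyMonitor and is_unsafe are purely illustrative and not taken from any cited tool) monitors a simple invariant of the form “the agent never enters an unsafe state”: the verdict becomes false as soon as a violation is observed, and otherwise remains inconclusive on a finite trace.

```python
from enum import Enum

class Verdict(Enum):
    TRUE = "true"              # requirement definitively satisfied
    FALSE = "false"            # requirement definitively violated
    INCONCLUSIVE = "?"         # not violated so far, but not yet satisfied either

class SafetyMonitor:
    """Monitors the invariant "the agent never reaches an unsafe state"
    over a trace of states observed at run-time."""

    def __init__(self, is_unsafe):
        self.is_unsafe = is_unsafe           # predicate marking "bad" states
        self.verdict = Verdict.INCONCLUSIVE

    def step(self, state):
        # A safety constraint becomes (irrevocably) false as soon as it is
        # violated; on a finite trace with no violation the verdict stays
        # inconclusive, since a future violation can never be ruled out.
        if self.verdict is not Verdict.FALSE and self.is_unsafe(state):
            self.verdict = Verdict.FALSE
        return self.verdict

# Example: a grid-world agent that must never enter cell (0, 0).
monitor = SafetyMonitor(is_unsafe=lambda s: s == (0, 0))
for state in [(2, 1), (1, 1), (1, 0)]:
    print(state, monitor.step(state))      # stays INCONCLUSIVE on this trace
```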

Your contribution: you will develop a framework to verify at run-time the learning process of agents in RL.
The standard RL feedback loop of actions and rewards will be integrated with a module that monitors the satisfaction of temporal requirements at run-time; these requirements might include both safety and reachability goals. We envisage that the monitor will influence the agent’s learning process, either by acting as a shield, preventing the agent from taking “unsafe” actions, or by shaping the reward so as to direct the agent’s behaviour, as illustrated in the sketch after this paragraph. You will implement the run-time monitor in a (possibly external) tool and evaluate its performance in detecting risky behaviours as well as in repairing them.
This framework might then be adapted to settings with multiple agents, acting either cooperatively or competitively, possibly with only partial observability on the actual state of the environment.
We anticipate that this project will contribute to the design of safer and more trustworthy RL mechanisms through run-time verification.
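
To make the envisaged architecture concrete, the following toy sketch (again in Python; the corridor environment, the monitor_allows predicate and all constants are hypothetical, not a prescribed design) shows how a run-time monitor could be plugged into a tabular Q-learning loop both as a shield, filtering out actions whose successor state is known to be unsafe, and as a reward shaper, penalising any violation that slips through.

```python
import random

# Toy one-dimensional corridor: states 0..5, where 5 is the goal and 0 is
# unsafe; the agent moves left (-1) or right (+1). All names are illustrative.
N_STATES, GOAL, UNSAFE = 6, 5, 0
ACTIONS = [-1, +1]

def step_env(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def monitor_allows(state, action):
    # Shield: block any action whose successor state the monitor can already
    # tell violates the safety requirement "never reach the unsafe state".
    nxt, _, _ = step_env(state, action)
    return nxt != UNSAFE

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps, penalty = 0.5, 0.9, 0.1, -10.0

for episode in range(200):
    state, done = 2, False
    while not done:
        # Shielding: restrict the agent to the actions the monitor allows
        # (falling back to the full action set if nothing is allowed).
        allowed = [a for a in ACTIONS if monitor_allows(state, a)] or ACTIONS
        if random.random() < eps:
            action = random.choice(allowed)
        else:
            action = max(allowed, key=lambda a: Q[(state, a)])
        nxt, reward, done = step_env(state, action)
        # Reward shaping: a violation detected by the monitor is folded back
        # into the learning signal as a penalty.
        if nxt == UNSAFE:
            reward += penalty
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = nxt
```

In the project itself, the hand-coded monitor_allows predicate and the fixed penalty would naturally be replaced by a monitor synthesised from temporal-logic requirements, and the toy corridor by standard RL benchmarks.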

[1] S. Russell, T. Dietterich, E. Horvitz, B. Selman, F. Rossi, D. Hassabis, S. Legg, M. Suleyman, D. George, and S. Phoenix. Letter to the Editor: Research Priorities for Robust and Beneficial Artificial Intelligence: An Open Letter. AI Magazine, 36:3–4, 2015.

[2] T. G. Dietterich and E. J. Horvitz. Rise of Concerns about AI: Reflections and Directions. Communications of the ACM, 58:38–40, 2015.

[3] D. Amodei, C. Olah, J. Steinhardt, P. F. Christiano, J. Schulman, and D. Mané. Concrete Problems in AI Safety. CoRR, abs/1606.06565, 2016.

[4] S. A. Seshia and D. Sadigh. Towards Verified Artificial Intelligence. CoRR, abs/1606.08514, 2016.

[5] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998. ISBN 0-262-19398-1.

[6] Y. Falcone, K. Havelund, and G. Reger. A Tutorial on Runtime Verification. In Engineering Dependable Software Systems, IOS Press, 2013.

[7] M. Huth and M. Ryan. Logic in Computer Science. Cambridge University Press, 2012.

[8] A. Bauer, M. Leucker, and C. Schallhart. The Good, the Bad, and the Ugly, But How Ugly Is Ugly? In Runtime Verification (RV), 2007.

Project ID

STAI-CDT-2021-IC-23

Supervisor

Francesco Belardinelli
https://www.doc.ic.ac.uk/~fbelard/

Category

AI Planning, Logic, Verification