I am currently investigating how to define an information-theoretic measure of the “effectiveness” of explanations in a communicative multiagent setting. This involves framing explanation generation as a two-player cooperative game in which rewards are given according to the effectiveness measure. Deep reinforcement learning is a natural fit for this setup, so I am using it to build agents that can play the game. My interest in explanatory AI is tied to my broader interest in scalable approaches to technical safety problems. More specifically, I expect that active learning approaches to the value alignment problem, such as reward modelling, would benefit from agents that are intrinsically motivated to explain their actions, motivations and reasoning faculties.
My other research interests include:
- The relationship between deep learning and symbolic reasoning (especially analogical reasoning)
- Model-based control and planning
- Probabilistic programming
The CDT has been a good fit for my interests. The professors and students in the group are exploring a diverse set of approaches to AI safety issues, so there are always interesting discussions to get involved with. Additionally, having access to resources at both King’s and Imperial has allowed me to explore options that would otherwise have been unavailable to me.
Prior Experience: After completing my undergraduate degree in Computer Science at the University of Bristol, I worked as a software engineer and data scientist.