Jared Swift

Exploration in Reinforcement Learning (RL) has been a well-researched topic since the inception of RL, with a plethora of methods proposed that aim to perform "good" exploration, which is often measured by "regret": the difference between the optimal return and the return actually received. Whilst many of these methods provide sound theoretical guarantees, such as bounds on the regret, they are not applicable in practice due to unrealistic assumptions made about the environment. In practice, exploration strategies that use heuristics predominantly based on randomness are ubiquitous. Whilst these are easy to implement and domain-independent, they often only offer theoretical guarantees in the limit and are inefficient in practice, due to the need for a large number of samples. Moreover, an agent that acts randomly cannot be trusted, or guaranteed to consider safety. The main aim of this project is to develop agents that utilise models, and subsequently planning, as a heuristic for exploration that is both efficient and intelligent, enabling humans to understand why and where the agent chose to explore.
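As a small illustration of the regret measure described above (not part of the project itself), the expected cumulative regret of a simple ε-greedy agent on a Bernoulli bandit can be sketched as follows; the arm means, ε, and step count are illustrative assumptions:

```python
import random

def epsilon_greedy_regret(means, epsilon=0.1, steps=1000, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit; return expected cumulative regret."""
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms        # pulls per arm
    values = [0.0] * n_arms      # running mean reward estimate per arm
    best = max(means)            # value of the optimal arm
    regret = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit: greedy arm
        reward = 1.0 if rng.random() < means[arm] else 0.0     # Bernoulli reward
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean update
        regret += best - means[arm]  # expected regret incurred by this pull
    return regret
```

Because ε stays fixed, the agent keeps pulling suboptimal arms at rate ε/n forever, so regret grows linearly with the number of steps — an example of the "guarantees only in the limit" behaviour of randomness-based heuristics.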

My research interests are in automated decision-making more generally, including Planning and RL, particularly in relation and application to autonomous service robots. I chose to apply to the CDT as I was excited by the opportunity to gain exposure to many different areas of AI.

Undergraduate Qualification: BSc Computer Science with Artificial Intelligence (Industrial)

Work Experience:

  • Scientific Software Developer, ISIS Disordered Materials, STFC
  • Total Scattering Software Developer, ISIS Disordered Materials, STFC

LinkedIn: https://www.linkedin.com/in/jared-swift-504727200/