I am a PhD student in Computer Science Research at Imperial and King’s College London. My research focuses mainly on mechanistic interpretability and robustness in LLMs. I have a broad interdisciplinary background in cognitive psychology, philosophy, and artificial intelligence. I believe that progress in mechanistic interpretability is crucial to creating trustworthy and safe AI systems.
–
Undergraduate Qualification: BSc Psychology and Philosophy, Erasmus University Rotterdam
Master's Qualification: MSc Artificial Intelligence, Utrecht University