Role: MRes Student in AI/ML
Topic: ML/AI Security, Privacy, and Safety
Email: m.-24@imperial.ac.uk
My research interests lie in advancing AI safety by developing methods that make machine learning models more controllable and better aligned with intended outcomes. Currently, I am building an adversarially robust image generation framework for text-to-image diffusion models that supports both conditional and unconditional safety controls through policy-driven mechanisms, without compromising generative quality. This work draws on concept localization and mechanistic interpretability to enable fine-grained control over model behavior. I have also investigated attribute leakage in diffusion models, showing how sensitive or unintended features can be inferred from generated outputs.