Role: MRes Student in AI/ML
Topic: ML/AI Security, Privacy, and Safety
Email: m.-24@imperial.ac.uk
My research interests lie in advancing AI safety by developing methods that make machine learning models more controllable and better aligned with intended outcomes. Currently, I am building an adversarially robust image generation framework for text-to-image diffusion models that supports both conditional and unconditional safety controls through policy-driven mechanisms, without compromising generative quality. This work draws on concept localization and mechanistic interpretability to enable fine-grained control over model behavior. I have also investigated attribute leakage in diffusion models, showing how sensitive or unintended features can be inferred from generated outputs.