My research interests lie at the intersection of deep learning theory, computer vision, and interpretability,
with a focus on causal representation learning and the emergence of shared structure across vision, language, sound, and thought.
I’m driven by a fundamental curiosity about how high-level abstractions arise in deep models,
and how such representations might reveal the computational principles underlying intelligence.
While my intellectual pursuits are grounded in theoretical inquiry, particularly into the mechanisms that enable deep models to generalize,
compose, and align, I’m guided by the conviction that such understanding can, over time, reshape how we approach real-world challenges
in high-stakes domains such as healthcare.
In parallel, I’m deeply committed to the long-term safety of AI systems. I’m especially interested in the problem of scalable oversight:
how to build models whose reasoning remains understandable and steerable even as their capabilities grow. Ultimately,
I aim to help develop AI systems that are not only powerful but also transparent, trustworthy, and aligned with human values.