
Mission Statement (of sorts)

Updated at 01:26 AM

Generally, language models keep getting more capable. To some that’s obvious; most people, I think, have no idea what’s coming.

Yet we understand less and less about how they actually work. That’s where much of my current research interest lies. I’m particularly interested in “bitter lesson-pilled” interpretability: building and testing mechanistic interpretability methods that scale cleanly with compute.

Right now I’m working under the supervision of Wendy Zheng at UVA, who is advised by Professor Chen Chen. I’ll leave out the specific details for now, but I have some exciting research results on introspection in LLMs to publish in the near future.

In a broader sense, my hope is that within 1.5 years I’ll be in a position to join a frontier AI lab (Anthropic, DeepMind, Goodfire, Apollo, Redwood, or anywhere else doing serious interpretability work) in a role between research and engineering.

Right now, I truly believe the most important thing I can do is deepen my foundations and get to the frontier of research, where the interpretability problems are most urgent and the tools to solve them are being built.

Whether or not I end up working at a frontier AI lab, nothing stops me from doing tremendously good research except willpower and effort.

Godspeed.
