LessWrong (30+ Karma) - “Open Source Automated Interpretability for Sparse Autoencoder Features” by kh4dien, SrGonao, jacob_drori, Nora Belrose