16 Jul 2023

LW - Robustness of Model-Graded Evaluations and Automated Interpretability by Simon Lermen

The Nonlinear Library
