AI Safety Fundamentals - Deceptively Aligned Mesa-Optimizers: It’s Not Funny if I Have to Explain It