LW - Red-teaming language models via activation engineering by Nina Rimsky | The Nonlinear Library | Podwise