High-Stakes Alignment via Adversarial Training [Redwood Research Report] | AI Safety Fundamentals | Podwise