“Measuring Coherence of Policies in Toy Environments” by dx26, Richard_Ngo, Martín Soto | LessWrong (30+ Karma)