LessWrong (30+ Karma) - “[Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF” by Leon Lang