26 Jun 2024
11m
RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition
Interconnects
Open in Podwise to generate AI notes
Sign in to process this episode and unlock summaries, transcripts, highlights and translations.
Shownotes are not generated by Podwise.

