26 Jun 2024
11m

RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition

Podcast cover

Interconnects

Open in Podwise to generate AI notes

Sign in to process this episode and unlock summaries, transcripts, highlights and translations.

Open in Podwise

Shownotes are not generated by Podwise.

RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition