YouTube · 11 Apr 2026
19m

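Breaking Through RLHF's Scaling Bottleneck | DeepMind Team Paper | Extremely Low Data Efficiency | Four RLHF Algorithms | off-policy | Online RLHF | Epistemic Neural Network (ENN) | Information-Directed Exploration | Affirmative Fine-Tuning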

最佳拍档
