Learning to summarize user information for personalized reinforcement learning from human feedback | Best AI papers explained | Podwise