SWEET-RL: Training LLM Agents for Collaborative Reasoning | Best AI papers explained | Podwise