Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL | Best AI papers explained | Podwise