26:["$","$L2f",null,{"data":{"isPreview":true,"seq":7375212,"episode":{"Id":"5c1a067551134ac863642e8c9149be0d89a69910761ddbd128dcbb5ff2867437","Seq":7375212,"PodId":"c2d6b50707f47c5b2af65a35314bc77065b579cc615d7f559bf53717cbc4938f","PodSeq":24594,"Title":"LLMs as Greedy Agents: RL Fine-tuning for Decision-Making","PodName":"Best AI papers explained","Description":"

Google DeepMind researchers investigated why large language models underperform in decision-making tasks and identified three failure modes: greediness, frequency bias, and the knowing-doing gap. They tested whether reinforcement learning fine-tuning on self-generated reasoning could mitigate these failures. Experiments across several decision-making scenarios showed that RL fine-tuning increased exploration and narrowed the knowing-doing gap. The study also examined how various exploration techniques affect fine-tuning, and how much reasoning and expert data matter for decision-making in LLMs.

\n","Url":"https://podcasters.spotify.com/pod/show/ehwkang/episodes/LLMs-as-Greedy-Agents-RL-Fine-tuning-for-Decision-Making-e322uj8","Link":"https://anchor.fm/s/1026675f8/podcast/play/101857320/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2025-3-27%2F399180556-44100-2-a8e6b8f57be3.m4a","LinkType":"m4a","PublishTime":"$D2025-04-27T21:07:53.000Z","Img":"https://d3t3ozftmdmh3i.cloudfront.net/staging/podcast_uploaded_nologo/43252366/43252366-1744500070152-e62b760188d8.jpg","EpImg":"https://d3t3ozftmdmh3i.cloudfront.net/staging/podcast_uploaded_nologo/43252366/43252366-1744500070152-e62b760188d8.jpg","Duration":"00:18:19","Language":null,"SampleDuration":null,"IsVBR":false,"Transcribed":false,"Indexed":1,"Deleted":false,"RedirectSeq":null,"Source":null,"Size":null},"prevAndNext":{"prevSeq":7375211,"nextSeq":7375213},"states":{"state":"not-login","extra":{"summary":"Best AI papers explained - LLMs as Greedy Agents: RL Fine-tuning for Decision-Making","previewContent":{"summary":"Best AI papers explained - LLMs as Greedy Agents: RL Fine-tuning for Decision-Making","chapters":[],"keywords":[],"highlights":[],"transcripts":[]}}}}}]