Ep 64: GPT-4.1 Lead at OpenAI Michelle Pokrass: RFT Launch, How OpenAI Improves Its Models & the State of AI Agents Today | Unsupervised Learning
This episode explores the development and real-world applications of GPT-4.1, focusing on its improved utility for developers compared with previous models. Michelle Pokrass, who led GPT-4.1's development at OpenAI, describes the shift from optimizing for benchmarks to prioritizing real-world usability, and the work of gathering developer feedback and translating it into actionable evaluations. The conversation then turns to the current state of AI agents: they are effective in well-defined domains, but bridging the gap to the messy real world remains difficult, particularly around context integration and handling ambiguity. On AI code generation, the model is proficient at locally scoped problems, while efforts continue to improve global context understanding and code style. Looking ahead, the discussion covers the future of OpenAI's model family, weighing purpose-built models against a more generalized approach, and the role of fine-tuning, especially RFT, in pushing the frontier of specific applications such as chip design and drug discovery. The episode closes with advice on how companies can stay ahead in the rapidly evolving AI landscape: invest in good evaluations, prompt engineering, and building for capabilities that are just out of reach.