
The podcast explores the state of Large Language Models (LLMs) in 2026, covering architectures, post-training techniques such as Reinforcement Learning with Verifiable Rewards (RLVR) and Group Relative Policy Optimization (GRPO), inference scaling, benchmarks, and tool use. Sebastian Raschka, an AI researcher, argues that while the transformer architecture remains dominant, improvements are now driven by post-training rather than architectural changes. He highlights the growing adoption of Mixture of Experts (MoE) models for efficiency and discusses the potential of RLVR to enhance reasoning capabilities. The conversation also touches on the challenges of benchmarking, the economic incentives driving model development, and the growing trend of companies training LLMs in-house on private data to gain a competitive edge.