06 Feb 2026
54m

A New Era of Context Memory with Val Bercovici from WEKA

Semi Doped

This episode explores the evolving landscape of AI inference, focusing on the role of context memory and storage. Val Bercovici, Chief AI Officer at WEKA, explains why the growing demand for context in AI agents calls for new memory tiering strategies. He describes the shift from prompt engineering to context engineering and argues that high-speed, low-latency storage is needed to manage the rapidly growing key-value (KV) cache. The conversation walks through the memory hierarchy from HBM to DRAM to NVMe and how WEKA's NeuralMesh technology optimizes performance across these tiers. Bercovici also introduces high-bandwidth flash, Axon, which uses the local SSDs in GPU racks, and Augmented Memory Grid, which scales memory over the network. All of these aim to improve tokenomics and make the unit economics of AI inference positive.

Outline

Part 1: AI Data Platforms, Context Evolution

Part 2: Memory Challenges, NVMe Solutions

Part 3: Networking, Software-Defined Memory

Part 4: Economics, Future Outlook
