12 Feb 2026
45m

Breaking the Memory Wall in the Age of Inference

The Data Exchange with Ben Lorica

This episode explores the AI hardware landscape, focusing on inference and the role of memory. Sid Sheth, founder and CEO of d-Matrix, discusses the limitations of SRAM and HBM for cloud inference and explains how d-Matrix uses digital in-memory compute (DIMC) to address them. DIMC integrates compute and memory to reduce latency and improve efficiency, especially during the decode phase of generative AI models. The conversation also covers the trade-off between latency and throughput in hardware design, the importance of the software stack, and d-Matrix's decision to work with ecosystem partners rather than build its own cloud.

Outline

Part 1: Introduction, Background

Part 2: Architecture, Memory Strategy

Part 3: DIMC Technology, Performance

Part 4: Software, Ecosystem, Users

Part 5: Future Trends, Scaling
