21 Jul 2025
55m

Warehouse Native Incremental Data Processing With Dynamic Tables And Delayed View Semantics

Data Engineering Podcast

In this episode of the Data Engineering Podcast, Tobias Macey interviews Dan Sotolongo, a principal engineer at Snowflake, about the challenges of incremental data processing in warehouse environments and how delayed view semantics help address the problem. Dan defines incremental data processing as efficiently updating results from continuously evolving data sources through extraction, loading, and transformation. They discuss the trade-offs between batch and streaming systems, highlighting Snowflake's dynamic tables feature as a micro-batch engine with a streaming programming model. Dan explains delayed view semantics as a theoretical framework for semantic guarantees in data pipelines, allowing for delays to improve efficiency without sacrificing self-consistency. He also touches on the limitations of view semantics, particularly regarding data deletion and GDPR compliance, and introduces Snowflake's immutability features to address these issues. The conversation also covers data validation, testing, and the future of stream processing, emphasizing the need for a unified approach to data management that reduces sprawl and simplifies the integration of core primitives.
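The idea of delayed view semantics summarized above can be pictured with a small sketch. This is an illustrative toy in Python, not Snowflake's implementation or API: the names `full_query`, `DelayedView`, and `refresh`, the append-only change log, and the sample data are all assumptions made for the example. The point it tries to show is that an incrementally maintained result may lag behind its source, yet at every refresh it equals what the defining query would return over the source as of some earlier point.

```python
from collections import defaultdict


def full_query(rows):
    """The 'view' definition (hypothetical): total amount per customer."""
    totals = defaultdict(int)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


class DelayedView:
    """Toy incrementally maintained result over an append-only change log.

    Assumption: the source only ever appends (customer, amount) rows;
    deletions and updates are not modeled here.
    """

    def __init__(self):
        self.totals = {}
        self.applied = 0  # number of source rows already reflected

    def refresh(self, change_log, up_to):
        """Apply only the rows not yet reflected, up to position `up_to`."""
        up_to = min(up_to, len(change_log))
        for customer, amount in change_log[self.applied:up_to]:
            self.totals[customer] = self.totals.get(customer, 0) + amount
        self.applied = max(self.applied, up_to)


log = [("a", 10), ("b", 5), ("a", 7), ("b", -5)]
view = DelayedView()

# The view lags the source: it reflects only the first two rows,
# but it matches the full query evaluated over that earlier state.
view.refresh(log, 2)
assert view.totals == full_query(log[:2])

# A later refresh processes just the two new rows and catches up.
view.refresh(log, 4)
assert view.totals == full_query(log)
```

Each refresh touches only the rows that arrived since the previous one, which is where the efficiency comes from, while the assertions check the self-consistency property: the stored result is never a mix of source states.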

Outline

Part 1: Introduction and Foundations

Part 2: Dynamic Tables and Delayed View Semantics

Part 3: Implementation and Integration

Part 4: Lessons, Challenges, and Future
