Apache Iceberg: What It Is and Why Everyone’s Talking About It.

In this solo episode, Tim Berglund discusses Apache Iceberg, an open table format, and traces the shift from data warehouses to data lakes that created the need for it. He explains why open table formats exist, focusing on how Iceberg addresses the consistency, transactionality, and schema management problems of data lakes. Berglund walks through Iceberg's logical architecture (data files, manifest files, manifest lists, metadata files, and catalogs), emphasizing its pluggable design and its use in modern streaming environments. He also introduces Confluent's Tableflow, which applies Iceberg semantics to Kafka topics so that topic data is accessible in real time as Iceberg tables.
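The layering named in the summary can be pictured with a small, purely illustrative Python model. This is a sketch under assumptions, not Iceberg's actual API or file format: every class, field, and path below is hypothetical and exists only to show how a catalog points to a metadata file, which points to a snapshot's manifest list, which points to manifest files, which in turn point to data files.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataFile:
    path: str          # hypothetical location of a data file, e.g. Parquet in object storage
    record_count: int


@dataclass
class ManifestFile:
    data_files: List[DataFile]        # a manifest tracks a group of data files


@dataclass
class ManifestList:
    manifests: List[ManifestFile]     # one manifest list per snapshot


@dataclass
class Snapshot:
    snapshot_id: int
    manifest_list: ManifestList


@dataclass
class MetadataFile:
    schema: Dict[str, str]            # simplified: column name -> type name
    snapshots: List[Snapshot]
    current_snapshot_id: int

    def current_snapshot(self) -> Snapshot:
        return next(s for s in self.snapshots
                    if s.snapshot_id == self.current_snapshot_id)


@dataclass
class Catalog:
    # Maps a table name to its current metadata file. Swapping this pointer
    # is what makes a new table state visible to readers (illustrative only).
    tables: Dict[str, MetadataFile] = field(default_factory=dict)

    def commit(self, name: str, metadata: MetadataFile) -> None:
        self.tables[name] = metadata

    def plan_scan(self, name: str) -> List[str]:
        # Walk catalog -> metadata -> snapshot -> manifest list -> manifests -> files.
        snapshot = self.tables[name].current_snapshot()
        return [f.path
                for manifest in snapshot.manifest_list.manifests
                for f in manifest.data_files]


# Hypothetical table "orders" with two successive snapshots.
f1 = DataFile("s3://bucket/orders/data/f1.parquet", 100)
f2 = DataFile("s3://bucket/orders/data/f2.parquet", 50)
snap1 = Snapshot(1, ManifestList([ManifestFile([f1])]))
snap2 = Snapshot(2, ManifestList([ManifestFile([f1]), ManifestFile([f2])]))

catalog = Catalog()
catalog.commit("orders", MetadataFile({"id": "long"}, [snap1], 1))
before = catalog.plan_scan("orders")

# A new commit writes a new metadata file and repoints the catalog at it.
catalog.commit("orders", MetadataFile({"id": "long"}, [snap1, snap2], 2))
after = catalog.plan_scan("orders")

print(before)
print(after)
```

In this toy model a reader only ever sees the files reachable from the metadata file the catalog currently points to, which is a rough way to picture how the layering supports consistent reads while new data is being added.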

Outlines

Confluent Developer

Introduction to Apache Iceberg and the Evolution of Data Storage

Iceberg's Logical Architecture: Data and Metadata Layers

Iceberg in the Streaming World and Confluent's Tableflow

00:00 Introduction to Apache Iceberg and the Evolution of Data Storage

03:50 Iceberg's Logical Architecture: Data and Metadata Layers

10:22 Iceberg in the Streaming World and Confluent's Tableflow