Duck Lake: Simplifying the Lakehouse Ecosystem

In this episode of the Data Engineering Podcast, Tobias Macey interviews Hannes Muehleisen and Mark Raasveldt about Duck Lake, a new entrant into the open lakehouse ecosystem. They discuss the motivations behind creating Duck Lake, its architecture, and how it simplifies the lakehouse format by unifying the catalog and table format using a SQL relational database for metadata management and object storage for data. The conversation covers Duck Lake's scalability, its relationship to DuckDB and other lakehouse formats like Iceberg and Delta, and its potential impact on data architectures by enabling local-first compute models. They also highlight features like data inlining, encryption, and the ability to handle numerous snapshots, as well as future plans for vector type support and integration with other engines like Trino and Spark.

Outlines

Part 1: Introduction to Duck Lake

Part 2: Use Cases and Ecosystem Integration

Part 3: Features and Impact

Part 4: Practical Considerations and Future

Data Engineering Podcast

Part 1: Introduction to Duck Lake

00:11 Introduction to Duck Lake and the Lakehouse Ecosystem

03:28 Duck Lake vs. MotherDuck: Architecture and Compute

11:56 Duck Lake's Scalability and Deployment Flexibility

Part 2: Use Cases and Ecosystem Integration

16:36 Use Cases and Scalability Considerations for Duck Lake

20:56 DuckDB Connectors and Ecosystem Integration

25:01 Interoperability and Integration Paths with Iceberg

29:38 Duck Lake's Integration with Existing Ecosystems

Part 3: Features and Impact

33:56 Unique Features of Duck Lake: Data Inlining, Encryption, and Snapshots

38:42 Access Control, Encryption, and Vector Type Support in Duck Lake

43:53 Impact of Duck Lake on Data Architecture and Implementation

47:05 Data Gravity and the Role of Newer Utilities

Part 4: Practical Considerations and Future

50:26 Getting Started with Duck Lake and its Current Status

53:31 Performance and Multiversion Concurrency Control

58:01 Innovative Applications of Duck Lake

1:01:10 Lessons Learned and the Simplicity of Duck Lake

1:04:08 When Duck Lake is Not the Right Choice

1:07:08 Future Plans and Closing Remarks