Your Data, Your Lake: How Observe Uses Iceberg and Streaming ETL for Observability

This episode explores applying data lakehouse architectures to observability to improve scalability and economics. Jacob Leverich, co-founder and CTO of Observe Inc., details Observe's observability solution, which is built on a lakehouse architecture, and shares the experience at Splunk and Google that led him to found Observe. Leverich emphasizes that a generic lakehouse setup will fail for observability because of its latency requirements. He highlights the importance of OpenTelemetry for data collection, Kafka for buffering, and a dynamic loader for balancing latency and efficiency. The discussion covers data curation, enrichment, and the abstraction of SQL to optimize query execution. The conversation also addresses the role of table formats like Iceberg and of AI-native workflows in enhancing observability data management.

Outlines

Part 1: Background, Origins

Part 2: Challenges, Lakehouse Architecture

Part 3: Curation, Lessons, Use Cases

Part 4: Conclusion

Data Engineering Podcast

Part 1: Background, Origins

00:11 Introducing Jacob Leverich and the Genesis of Observe Inc.

04:42 From Grad School to Splunk: Jacob's Journey into Data Management and Observability

Part 2: Challenges, Lakehouse Architecture

12:59 Observability Pain Points: Data Silos, High Costs, and Limited Access

18:54 Lakehouse Architecture for Observability: Addressing Latency and Data Volume Challenges

35:15 The Role of Table Formats and AI in Lakehouse Observability

Part 3: Curation, Lessons, Use Cases

47:55 Guarding Against Entropy: Curation, AI, and Quantified Business Value

59:55 Lessons Learned: Meeting Users Where They Are

1:04:21 When Not to Use a Lakehouse: Scale and Organizational Complexity

Part 4: Conclusion

1:11:33 Closing Remarks and Show Information