24 Jun 2026
1h 34m

Building a data warehouse from scratch with Jacob Baskin

Signals and Threads

Software engineering at a trading firm requires balancing complex economic incentives with high-performance distributed systems. Jacob Baskin, a software engineer at Jane Street, details the evolution of internal data infrastructure, specifically the transition from traditional transactional databases like PostgreSQL to custom, horizontally scalable solutions like Superstore. These systems prioritize analytical throughput and explainability over rigid transactional semantics, often utilizing asynchronous writes and columnar storage to manage massive data volumes. Beyond data storage, Baskin discusses the challenges of scheduling large-scale compute clusters, known as the Hive, where optimizing for job urgency and resource topology is critical. By treating infrastructure as a series of strategic bets and prioritizing domain-specific requirements over off-the-shelf solutions, engineering teams can build highly specialized tools that outperform generic cloud-based alternatives in high-stakes financial environments.
