14 Apr 2026
1h 0m

The world of voice AI, with Mati Staniszewski of ElevenLabs

Cheeky Pint

AI audio technology has moved from primitive, hard-coded signal replication to neural models that predict phonemes and context, which enables human-like emotional inflection. ElevenLabs, led by co-founder Mati Staniszewski, builds on this shift with foundational models that power both creative storytelling and complex agentic workflows. Speech-to-speech models offer lower latency, but cascaded systems, which chain transcription, an LLM, and text-to-speech, remain superior for enterprise reliability and task orchestration. The company’s rapid growth to over $450 million in ARR stems from a dual-track strategy: a self-serve platform that fosters developer innovation, and high-touch engineering partnerships for large-scale digital transformation. Internally, the organization keeps a flat structure of small, autonomous teams and emphasizes high agency and technical proficiency, so that it can rapidly deploy AI-native solutions across diverse sectors, including government services and customer support.
