In this episode of the Latent Space Podcast, Alessio and Swyx host Gorkem and Batuhan from Fal. They discuss Fal's journey from a Python runtime in the cloud to a generative media platform focused on inference for image, video, and audio models. They cover the company's pivot to specializing in diffusion and inference, the decision to focus on generative media rather than language models, and the technical challenges of optimizing performance with custom CUDA kernels. The conversation also covers the history of popular models such as Stable Diffusion and Veo 3, the importance of latency for users, and the impact of open-source models. Finally, they explore the potential of world models, the rise of video models, the role of LoRAs in customization, and requests for startups and engineers in the generative media space.