arXiv paper - Slow-Fast Architecture for Video Multi-Modal Large Language Models | AI Breakdown | Podwise