arXiv preprint - ViNT: A Foundation Model for Visual Navigation | AI Breakdown | Podwise