Leveraging Task-Specific Pre-Training To Reason Across Images and Videos | ComputerVisionFoundation Videos | Podwise