ComputerVisionFoundation Videos - ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection