Grounding Spatio-temporal Language with Transformers | JRC Workshop 2021 | Microsoft Research | Podwise