Tightly Connecting Vision and Language | Microsoft Research | Podwise