Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models | Arxiv Papers | Podwise