arXiv preprint - Evaluating Text-to-Visual Generation with Image-to-Text Generation | AI Breakdown | Podwise