How Do Multimodal Large Language Models Perform on Clinical Vignette Questions? | Healthy Dialogue | Podwise