VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text? | Qing’an Liu et al. | ResearchPod