VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text?
by Qing’an Liu et al.
Feb 11, 2026 • 07:10
VISTA-Bench • Vision-Language Models (VLMs) • Modality Gap • Visualized Text