arxiv
Published April 23, 2026 at 4:00 AM
Believing without Seeing: Quality Scores for Contextualizing Vision-Language Model Explanations
Publisher summary (verbatim)
arXiv:2509.25844v3 Announce Type: replace Abstract: When people query Vision-Language Models (VLMs) but cannot see the accompanying visual context (e.g. for blind and low-vision users), augmenting VLM predictions with natural language explanations can signal which model predictions are reliable. How
Originally published on arXiv