Researchers introduce VLM-DeflectionBench, a 2,775-sample benchmark that evaluates whether large vision-language models appropriately deflect, i.e., refuse to answer, when the available evidence is conflicting or incomplete. An evaluation of 20 state-of-the-art LVLMs shows that most fail to refuse under noisy or misleading conditions, exposing a critical reliability gap.
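To make the evaluation concrete, here is a minimal sketch of how a deflection benchmark of this kind could be scored. The article does not describe VLM-DeflectionBench's actual data format, model interface, or refusal detector, so the sample fields, the model(image, question) call, and the refusal phrases below are illustrative assumptions, not the paper's method.

# Hypothetical sketch of a deflection-style evaluation loop (Python).
# All field names, the model interface, and the refusal phrases are
# assumptions for illustration; the benchmark's real protocol may differ.
import re

# Surface patterns treated as a refusal/deflection; a real evaluator
# might instead use a trained classifier or an LLM judge.
REFUSAL_PATTERNS = [
    r"\bcannot (?:determine|answer)\b",
    r"\bnot enough (?:information|evidence)\b",
    r"\binsufficient evidence\b",
    r"\bI(?:'m| am) (?:not sure|unsure)\b",
]

def is_refusal(response: str) -> bool:
    """Return True if the response deflects rather than committing to an answer."""
    return any(re.search(p, response, re.IGNORECASE) for p in REFUSAL_PATTERNS)

def deflection_accuracy(model, samples) -> float:
    """Fraction of samples handled correctly: deflecting on unanswerable
    items and answering on answerable ones.

    `model(image, question)` is an assumed callable returning a string;
    each sample is assumed to be a dict with "image", "question", and a
    "label" of either "unanswerable" or "answerable".
    """
    correct = 0
    for s in samples:
        response = model(s["image"], s["question"])
        should_deflect = s["label"] == "unanswerable"
        if is_refusal(response) == should_deflect:
            correct += 1
    return correct / len(samples)

A scorer of this shape rewards calibrated refusal in both directions: a model that deflects on everything scores no better on answerable items than one that never deflects scores on unanswerable ones.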
Safety
Benchmarking Deflection and Hallucination in Large Vision-Language Models
VLM-DeflectionBench reveals that state-of-the-art vision-language models systematically fail to refuse answers when facing conflicting or incomplete evidence—a critical safety gap affecting 20 major LVLMs.
Wednesday, April 15, 2026, 12:00 PM UTC · 2 MIN READ · Source: arXiv cs.CL (Computation & Language)
Tags
safety