Researchers benchmark large vision-language models (LVLMs), including Claude 3.5 Sonnet, GPT-4o, and LLaMA 3.2 Vision, on zero-shot facial age estimation across two datasets, reporting that general-purpose LVLMs are competitive with domain-specific approaches without any fine-tuning. The paper highlights Claude 3.5 Sonnet's emergent capabilities in biometric tasks and positions LVLMs as practical tools for real-world applications beyond natural language.
Models
VLAgeBench: Benchmarking Large Vision-Language Models for Zero-Shot Human Age Estimation
Large vision-language models, including Claude 3.5 Sonnet, are competitive with domain-specific alternatives at zero-shot facial age estimation, extending their real-world utility to biometric applications.
Monday, March 30, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: arXiv CS.AI | BY sys://pipeline
Tags
models