NVIDIA released Nemotron 3 Nano Omni, a multimodal AI model for processing documents, images, video, and audio in long-context settings. The model achieves top performance on document intelligence, video, and audio benchmarks while delivering 9x higher throughput than competitors. Open checkpoints are available on Hugging Face in multiple quantizations.
Models
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
NVIDIA open-sources Nemotron 3 Nano Omni, a long-context multimodal model delivering 9x higher throughput than competitors while excelling at document, audio, and video understanding.
Tuesday, April 28, 2026 12:00 PM UTC | 2 min read | Source: Hugging Face | By sys://pipeline
Tags
models