NVIDIA Nemotron 3 Nano Omni — Specifications
| Specification | Value |
|---|---|
| Developer | NVIDIA |
| Type | Multimodal (omni) |
| Modalities | Text, Image, Audio, Video → Text |
| Parameters | 30B total / ~3B active (MoE) |
| Context window | 256K tokens |
| Max output | — |
| License | (Nemotron Open Model License) |
| Open weights | ✅ Yes |
| Release date | 2026 |
| Input price | — |
| Output price | — |
| API providers | Hugging Face, OpenRouter, NVIDIA NIM |
🖥️ Run it locally
| Requirement | Value |
|---|---|
| VRAM (FP16/BF16) | ~62 GB |
| VRAM (4-bit) | ~21 GB (NVFP4) |
| Minimum GPU | RTX 5090 32GB (NVFP4) / H100 80GB (BF16) |
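The VRAM figures above can be sanity-checked with simple arithmetic: weight storage is roughly the parameter count times the bytes per parameter. The sketch below assumes decimal gigabytes and weight-only storage. The helper names (`weight_gb`, `fits`) are illustrative, not part of any NVIDIA tooling. The gap between the computed and listed values presumably covers items such as quantization scales, layers kept at higher precision, and runtime buffers, although the source does not break this down.

```python
# Rough weight-only memory estimate: parameters x bits per parameter.
# TOTAL_PARAMS and the listed VRAM values come from the spec tables;
# everything else here is an illustrative assumption.

TOTAL_PARAMS = 30e9  # 30B total parameters


def weight_gb(params: float, bits_per_param: float) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


def fits(required_gb: float, gpu_gb: float) -> bool:
    """True if the listed requirement fits in the GPU's memory."""
    return required_gb <= gpu_gb


bf16_weights = weight_gb(TOTAL_PARAMS, 16)  # 16-bit weights
fp4_weights = weight_gb(TOTAL_PARAMS, 4)    # 4-bit weights, scales excluded

LISTED_GB = {"bf16": 62.0, "nvfp4": 21.0}   # values from the table

print(f"BF16 weights only: {bf16_weights:.0f} GB (listed: ~62 GB)")
print(f"4-bit weights only: {fp4_weights:.0f} GB (listed: ~21 GB)")
print("NVFP4 on a 32 GB card:", fits(LISTED_GB["nvfp4"], 32))
print("BF16 on a 32 GB card:", fits(LISTED_GB["bf16"], 32))
print("BF16 on an 80 GB card:", fits(LISTED_GB["bf16"], 80))
```

This matches the minimum-GPU row: the NVFP4 build fits a 32 GB card, while BF16 needs the 80 GB class.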
📊 Benchmarks
| Benchmark | Score |
|---|---|
| OCRBench V2 | 67.04 |
| Video-MME | 72.2 |
| OSWorld | 47.4 |
| Speech IF | 89.39 |
Nemotron 3 Nano Omni is NVIDIA's open omni-modal model: it accepts text, image, audio, and video input and produces text, all in a single 30B-A3B mixture-of-experts that activates only ~3B parameters per token. It is a Mamba-Transformer hybrid that runs on one high-end GPU, with open weights released under the NVIDIA Open Model Agreement (commercial use allowed).
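To make the "30B total, ~3B active" distinction concrete, here is a minimal sketch of what it implies per token. It uses the common rule of thumb of about 2 FLOPs per active parameter for a forward pass. That rule comes from dense transformer matmuls and is only a loose approximation for a Mamba-Transformer hybrid, so treat the numbers as order-of-magnitude estimates. The function name is hypothetical.

```python
# Active vs. total parameters in a mixture-of-experts model.
# Parameter counts come from the spec sheet; the 2-FLOPs-per-parameter
# rule is a generic approximation, not an NVIDIA-published figure.

TOTAL_PARAMS = 30e9   # all experts; must be resident in memory
ACTIVE_PARAMS = 3e9   # approximate parameters used per token


def approx_forward_flops(params: float) -> float:
    """Rule-of-thumb forward-pass FLOPs per token (~2 per parameter)."""
    return 2 * params


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
moe_flops = approx_forward_flops(ACTIVE_PARAMS)
dense_flops = approx_forward_flops(TOTAL_PARAMS)

print(f"Active fraction per token: {active_fraction:.0%}")
print(f"Approx. FLOPs/token (MoE, active only): {moe_flops:.1e}")
print(f"Approx. FLOPs/token (hypothetical dense 30B): {dense_flops:.1e}")
```

Compute per token scales with the ~3B active parameters, but memory scales with all 30B, which is why the VRAM requirements are sized for the full model.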