Llama 3.3 70B — Specifications
| Developer | Meta |
|---|---|
| Type | LLM (dense) |
| Modality | Text → Text |
| Parameters | 70B |
| Context window | 128K tokens |
| Max output | — |
| License | Llama 3.3 Community (open) |
| Open weights | ✅ Yes |
| Released | 2024 |
| Input price | $0.10 / 1M tokens |
| Output price | $0.32 / 1M tokens |
| API providers | Together, DeepInfra, OpenRouter, Ollama |
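To make the per-token prices concrete, here is a minimal sketch of the cost arithmetic. It assumes the listed prices are in USD per one million tokens and apply uniformly; the function name `estimate_cost` and the example token counts are illustrative, and actual provider pricing may differ.

```python
# Listed prices, assumed to be USD per 1M tokens.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.32


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a given token volume."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    cost = estimate_cost(2_000_000, 500_000)
    print(f"Estimated cost: ${cost:.2f}")
```

For this hypothetical workload the input side contributes $0.20 and the output side $0.16.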
🖥️ Run it locally
| VRAM (4-bit) | ~40 GB |
|---|---|
| Minimum GPU | 2× RTX 4090 / 1× 48 GB |
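Meta’s efficient 70B dense model, positioned as approaching 405B-class quality at a fraction of the size, with a 128K-token context window. It is a widely deployed open-weights model for self-hosting: at 4-bit quantization it needs roughly 40 GB of VRAM, so it fits on two 24 GB GPUs or a single 48 GB card.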
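The ~40 GB figure can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below computes the memory for the weights alone and adds a fixed allowance for KV cache and runtime buffers. The 5 GB allowance is an assumption chosen to match the listed figure, not a measured value, and the helper names are illustrative. Real usage depends on context length, batch size, and the inference runtime.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


def fits(total_gb: float, gpu_memory_gb: list[float]) -> bool:
    """Naive check: does the combined GPU memory cover the estimate?

    Ignores per-GPU overhead from splitting a model across cards.
    """
    return sum(gpu_memory_gb) >= total_gb


# Assumed allowance for KV cache, activations, and runtime buffers.
OVERHEAD_GB = 5.0

weights_gb = weight_memory_gb(70, 4)   # 70B parameters at 4 bits each
total_gb = weights_gb + OVERHEAD_GB

if __name__ == "__main__":
    print(f"Weights: {weights_gb:.0f} GB, estimated total: {total_gb:.0f} GB")
    print("2x 24 GB:", fits(total_gb, [24, 24]))
    print("1x 48 GB:", fits(total_gb, [48]))
    print("1x 24 GB:", fits(total_gb, [24]))
```

The weights account for about 35 GB of the total, which is why a single 24 GB card is not enough even before any cache or buffer overhead.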