Llama Nemotron Ultra 253B vs Nemotron 3 Ultra
Side-by-side API pricing comparison · NVIDIA vs NVIDIA
Llama Nemotron Ultra 253B
by NVIDIA
Current · Open weights
Input
$0.600/Mtok
Output
$3.60/Mtok
Same list price as Nemotron 3 Ultra
| Spec | Value |
|---|---|
| Blended avg | $2.10/Mtok |
| Context | 128K tokens |
| Modality | text |
| Parameters | 253B |
| Released | Jan 1, 2025 |
Nemotron 3 Ultra
by NVIDIA
Current · Mid tier · Open weights
Input
$0.600/Mtok
Output
$3.60/Mtok
| Spec | Value |
|---|---|
| Blended avg | $2.10/Mtok |
| Cached input | $0.120/Mtok |
| Context | 128K tokens |
| Modality | text |
| Parameters | Proprietary |
| Released | Jan 1, 2026 |
Cost at scale (50/50 input/output split)
| Volume | Llama Nemotron Ultra 253B | Nemotron 3 Ultra | Savings |
|---|---|---|---|
| 1M tokens | $2.10 | $2.10 | $0 (0%) |
| 10M tokens | $21 | $21 | $0 (0%) |
| 100M tokens | $210 | $210 | $0 (0%) |
| 1B tokens | $2,100 | $2,100 | $0 (0%) |
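The blended average and the cost-at-scale figures follow from simple arithmetic on the listed rates. The sketch below reproduces them, assuming the blended price is a plain weighted average of input and output rates; the function names `blended_price` and `cost_usd` are illustrative and not part of any provider API.

```python
def blended_price(input_per_mtok: float, output_per_mtok: float,
                  input_share: float = 0.5) -> float:
    """Weighted average price in USD per million tokens.

    input_share is the fraction of all tokens that are input tokens
    (0.5 matches the 50/50 split used in the table).
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_per_mtok + (1.0 - input_share) * output_per_mtok


def cost_usd(total_tokens: int, input_per_mtok: float, output_per_mtok: float,
             input_share: float = 0.5) -> float:
    """Total cost in USD for total_tokens at the given per-Mtok rates."""
    return total_tokens / 1_000_000 * blended_price(
        input_per_mtok, output_per_mtok, input_share
    )


if __name__ == "__main__":
    # Listed rates for both models: $0.600/Mtok input, $3.60/Mtok output.
    for volume in (1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
        print(f"{volume:>13,} tokens: ${cost_usd(volume, 0.60, 3.60):,.2f}")
```

Because both models list the same input and output rates, the same call yields the same cost for either one. Changing `input_share` shows how the bill shifts for input-heavy or output-heavy workloads.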
Summary
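Llama Nemotron Ultra 253B by NVIDIA costs $0.600/Mtok for input and $3.60/Mtok for output, with a 128K-token context window. It is a text-only model.
Nemotron 3 Ultra by NVIDIA lists the same rates, $0.600/Mtok for input and $3.60/Mtok for output, with the same 128K-token context window and text-only modality. The only pricing difference shown is a cached-input rate of $0.120/Mtok, which is listed for Nemotron 3 Ultra only.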
Note: Pricing is per million tokens. Actual costs vary with usage patterns, prompt caching, and batch discounts. Always verify against official provider pricing pages.
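To illustrate how prompt caching can change the effective input price, here is a minimal sketch using the cached-input rate listed for Nemotron 3 Ultra. It assumes cached input tokens are billed at the cached rate and all other input tokens at the standard rate; actual billing rules are not stated on this page and may differ. The function name and the `cache_hit_share` parameter are hypothetical.

```python
def effective_input_price(standard_per_mtok: float, cached_per_mtok: float,
                          cache_hit_share: float) -> float:
    """Average input price in USD per million tokens.

    cache_hit_share is the assumed fraction of input tokens served
    from the prompt cache (0.0 = no caching, 1.0 = everything cached).
    """
    if not 0.0 <= cache_hit_share <= 1.0:
        raise ValueError("cache_hit_share must be between 0 and 1")
    return (cache_hit_share * cached_per_mtok
            + (1.0 - cache_hit_share) * standard_per_mtok)


if __name__ == "__main__":
    # Listed rates: $0.600/Mtok standard input, $0.120/Mtok cached input.
    for share in (0.0, 0.5, 0.9):
        price = effective_input_price(0.60, 0.12, share)
        print(f"cache hit share {share:.0%}: ${price:.3f}/Mtok input")
```

Under these assumptions, a workload with half of its input tokens cached would pay an average of $0.360/Mtok for input instead of $0.600/Mtok.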