LIVE Cheapest: GLM-4.7-Flash $0.000/Mtok in 153 models tracked Updated Jun 25, 2026

Llama 3.3 70B vs Llama Nemotron Ultra 253B

Side-by-side API pricing comparison · Together vs NVIDIA

🏆 Llama 3.3 70B is 50.5% cheaper on blended cost ($1.04 vs $2.10/Mtok)

Llama 3.3 70B

by Together

Status: Current · Open weights
Input: $1.04/Mtok
Output: $1.04/Mtok
Blended avg: $1.04/Mtok (✓ Cheaper)
Context: 128K tokens
Modality: text
Parameters: 70B
Released: Dec 6, 2024

Llama Nemotron Ultra 253B

by NVIDIA

Status: Current · Open weights
Input: $0.60/Mtok
Output: $3.60/Mtok
Blended avg: $2.10/Mtok
Context: 128K tokens
Modality: text
Parameters: 253B
Released: Jan 1, 2025

Cost at scale (50/50 input/output split)

Volume          Llama 3.3 70B   Llama Nemotron Ultra 253B   Savings
1M tokens       $1.04           $2.10                       $1.06 (50.5%)
10M tokens      $10.40          $21.00                      $10.60 (50.5%)
100M tokens     $104.00         $210.00                     $106.00 (50.5%)
1,000M tokens   $1,040.00       $2,100.00                   $1,060.00 (50.5%)
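
The figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes the blended rate is a simple weighted average of the input and output prices; the names `PRICES`, `blended_rate`, and `cost_usd` are illustrative and not part of any provider API.

```python
# Prices in USD per million tokens, copied from the listings above.
PRICES = {
    "Llama 3.3 70B": {"input": 1.04, "output": 1.04},
    "Llama Nemotron Ultra 253B": {"input": 0.60, "output": 3.60},
}


def blended_rate(price, input_share=0.5):
    """Weighted-average $/Mtok for a given share of input tokens (assumed formula)."""
    return price["input"] * input_share + price["output"] * (1.0 - input_share)


def cost_usd(price, total_mtok, input_share=0.5):
    """Total cost in USD for `total_mtok` million tokens at the given mix."""
    return blended_rate(price, input_share) * total_mtok


if __name__ == "__main__":
    for mtok in (1, 10, 100, 1000):
        a = cost_usd(PRICES["Llama 3.3 70B"], mtok)
        b = cost_usd(PRICES["Llama Nemotron Ultra 253B"], mtok)
        saved = b - a
        print(f"{mtok:>5}M tokens  ${a:>9,.2f}  ${b:>9,.2f}  "
              f"saves ${saved:,.2f} ({saved / b:.1%})")
```

Changing `input_share` shows how the comparison shifts for workloads that are not split evenly between input and output tokens.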

Summary

Llama 3.3 70B by Together costs $1.04/Mtok input and $1.04/Mtok output, with a 128K-token context window. It supports text input.

Llama Nemotron Ultra 253B by NVIDIA costs $0.60/Mtok input and $3.60/Mtok output, with a 128K-token context window. It supports text input.

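On a blended cost basis (50/50 input/output), Llama 3.3 70B is 50.5% cheaper than Llama Nemotron Ultra 253B. Put the other way, Llama Nemotron Ultra 253B costs 101.9% more. Note that Llama Nemotron Ultra 253B has the lower input price, so very input-heavy workloads narrow the gap.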
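
The same price gap reads as either 50.5% or 101.9% depending on which model is the baseline, and the winner depends on the input/output mix. The sketch below works through both points using the listed prices; the function names are hypothetical, and the break-even calculation assumes cost is linear in the input share with no caching or batch discounts.

```python
LLAMA_RATE = 1.04                      # $/Mtok, same for input and output
NEMOTRON_IN, NEMOTRON_OUT = 0.60, 3.60  # $/Mtok


def pct_cheaper(low, high):
    """How much cheaper `low` is, relative to the more expensive price."""
    return (high - low) / high * 100


def pct_more_expensive(low, high):
    """How much more `high` costs, relative to the cheaper price."""
    return (high - low) / low * 100


def breakeven_input_share(flat_rate, input_rate, output_rate):
    """Input share f where input_rate*f + output_rate*(1-f) equals flat_rate."""
    return (output_rate - flat_rate) / (output_rate - input_rate)


if __name__ == "__main__":
    nemotron_blended = 0.5 * NEMOTRON_IN + 0.5 * NEMOTRON_OUT  # 50/50 mix
    print(f"Llama is {pct_cheaper(LLAMA_RATE, nemotron_blended):.1f}% cheaper")
    print(f"Nemotron is {pct_more_expensive(LLAMA_RATE, nemotron_blended):.1f}% more expensive")
    f = breakeven_input_share(LLAMA_RATE, NEMOTRON_IN, NEMOTRON_OUT)
    print(f"Costs are equal when about {f:.1%} of tokens are input")
```

Under these assumptions, Llama Nemotron Ultra 253B only becomes the cheaper option when roughly 85% or more of the tokens are input tokens.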

Note: Pricing is per million tokens. Actual costs vary with usage patterns, prompt caching, and batch discounts. Always verify against official provider pricing pages.