ModelPriceWatch
Updated Jul 1, 2026

DeepSeek V4 Pro vs Llama 3.3 70B

Side-by-side comparison of API pricing, specs, benchmarks, and capabilities

🏆 DeepSeek V4 Pro is about 5.4% cheaper on blended cost ($0.652 vs $0.690/Mtok)
 
DeepSeek V4 Pro by DeepSeek
3 providers (input/output per Mtok): DeepSeek $0.435/$0.870, Fireworks $1.74/$3.48, Together $1.74/$3.48

Llama 3.3 70B by Meta
2 providers (input/output per Mtok): Meta $0.590/$0.790, Together $1.04/$1.04

Overview

| | DeepSeek V4 Pro | Llama 3.3 70B |
|---|---|---|
| Status | Current, mid tier | Current, open weights |
| Released | Apr 24, 2026 | Dec 6, 2024 |

Pricing per million tokens

| | DeepSeek V4 Pro | Llama 3.3 70B |
|---|---|---|
| Input | $0.435/Mtok | $0.590/Mtok |
| Output | $0.870/Mtok | $0.790/Mtok |
| Blended avg | $0.652/Mtok | $0.690/Mtok |

Cached input: $0.004/Mtok
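
The blended average appears to be the mean of the input and output prices, which matches the 50/50 input/output split used elsewhere on this page. A minimal sketch of that calculation follows; the function name `blended_price` and the `input_share` parameter are illustrative, not part of any provider API.

```python
def blended_price(input_price: float, output_price: float, input_share: float = 0.5) -> float:
    """Weighted average of input and output prices, in $/Mtok.

    input_share is the fraction of tokens that are input tokens
    (assumed 0.5 here, matching the page's 50/50 split).
    """
    return input_share * input_price + (1.0 - input_share) * output_price


# Prices in $/Mtok, copied from the pricing rows above.
deepseek_blended = blended_price(0.435, 0.870)
llama_blended = blended_price(0.590, 0.790)

print(f"DeepSeek V4 Pro: ${deepseek_blended:.4f}/Mtok")
print(f"Llama 3.3 70B:   ${llama_blended:.4f}/Mtok")
```

The unrounded DeepSeek V4 Pro figure is $0.6525/Mtok, which the page displays as $0.652.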
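
Specifications

| | DeepSeek V4 Pro | Llama 3.3 70B |
|---|---|---|
| Context window | 1M tokens | 128K tokens |
| Parameters | Proprietary | 70B |
| Speed (TPS) | not listed | not listed |
| Input modalities | text | text |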
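
Benchmarks from Vellum LLM Leaderboard

| | DeepSeek V4 Pro | Llama 3.3 70B |
|---|---|---|
| Avg benchmark score | 64.1 | 43.6 |
| Perf / dollar | 98.2 | 63.2 |
| GPQA Diamond | 72 | 45 |
| SWE-Bench | 50 | 22 |
| HumanEval | 88 | 78 |
| MATH 500 | 80 | 60 |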
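
The page does not say how "Perf / dollar" is derived. The listed values are consistent with dividing the average benchmark score by the blended price in $/Mtok, so the sketch below assumes that definition; `perf_per_dollar` is a hypothetical helper name. Note that the average benchmark score is not the mean of the four benchmarks shown here, so it is taken as given rather than recomputed.

```python
def perf_per_dollar(avg_score: float, blended_price: float) -> float:
    """Average benchmark score per $/Mtok of blended price (assumed definition)."""
    if blended_price <= 0:
        raise ValueError("blended price must be positive")
    return avg_score / blended_price


# Average scores from the benchmark rows; blended prices are the
# unrounded 50/50 means of the listed input and output prices.
deepseek_ppd = perf_per_dollar(64.1, (0.435 + 0.870) / 2)
llama_ppd = perf_per_dollar(43.6, (0.590 + 0.790) / 2)

print(f"DeepSeek V4 Pro: {deepseek_ppd:.1f}")
print(f"Llama 3.3 70B:   {llama_ppd:.1f}")
```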
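
Providers

| Model | Provider | Input | Output |
|---|---|---|---|
| DeepSeek V4 Pro | DeepSeek | $0.435/Mtok | $0.870/Mtok |
| DeepSeek V4 Pro | Fireworks | $1.74/Mtok | $3.48/Mtok |
| DeepSeek V4 Pro | Together | $1.74/Mtok | $3.48/Mtok |
| Llama 3.3 70B | Meta | $0.590/Mtok | $0.790/Mtok |
| Llama 3.3 70B | Together | $1.04/Mtok | $1.04/Mtok |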
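
The headline comparison uses the cheapest provider for each model. As a rough sketch of how that choice could be made from the provider list above, the snippet below ranks providers by a 50/50 blended price; the `OFFERS` dictionary and `cheapest_provider` function are illustrative names, and the 50/50 weighting is an assumption.

```python
# (input, output) prices in $/Mtok, copied from the provider list above.
OFFERS = {
    "DeepSeek V4 Pro": {
        "DeepSeek": (0.435, 0.870),
        "Fireworks": (1.74, 3.48),
        "Together": (1.74, 3.48),
    },
    "Llama 3.3 70B": {
        "Meta": (0.590, 0.790),
        "Together": (1.04, 1.04),
    },
}


def cheapest_provider(offers: dict, input_share: float = 0.5) -> tuple:
    """Return (provider, blended $/Mtok) for the lowest blended price."""
    def blended(prices):
        input_price, output_price = prices
        return input_share * input_price + (1.0 - input_share) * output_price

    name = min(offers, key=lambda provider: blended(offers[provider]))
    return name, blended(offers[name])


cheapest = {model: cheapest_provider(offers) for model, offers in OFFERS.items()}
for model, (provider, price) in cheapest.items():
    print(f"{model}: {provider} at ${price:.4f}/Mtok blended")
```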

Cost at scale (50/50 input/output split)

| Volume | DeepSeek V4 Pro | Llama 3.3 70B | Savings |
|---|---|---|---|
| 1M tokens | $0.65 | $0.69 | $0.04 (5.4%) |
| 10M tokens | $6.53 | $6.90 | $0.38 (5.4%) |
| 100M tokens | $65.25 | $69.00 | $3.75 (5.4%) |
| 1,000M tokens | $652.50 | $690.00 | $37.50 (5.4%) |
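
The cost-at-scale figures can be reproduced from the per-Mtok prices. The sketch below assumes a 50/50 input/output token split, as the table heading states; `cost_at_volume` is an illustrative helper name.

```python
# (input, output) prices in $/Mtok for the cheapest provider of each model.
PRICES = {
    "DeepSeek V4 Pro": (0.435, 0.870),
    "Llama 3.3 70B": (0.590, 0.790),
}
VOLUMES_MTOK = [1, 10, 100, 1000]


def cost_at_volume(volume_mtok, input_price, output_price, input_share=0.5):
    """Total cost in dollars for volume_mtok million tokens."""
    per_mtok = input_share * input_price + (1.0 - input_share) * output_price
    return volume_mtok * per_mtok


rows = []
for volume in VOLUMES_MTOK:
    deepseek_cost = cost_at_volume(volume, *PRICES["DeepSeek V4 Pro"])
    llama_cost = cost_at_volume(volume, *PRICES["Llama 3.3 70B"])
    savings = llama_cost - deepseek_cost
    # Percentage is taken from unrounded costs, relative to the pricier model.
    savings_pct = 100.0 * savings / llama_cost
    rows.append((volume, deepseek_cost, llama_cost, savings, savings_pct))
    print(f"{volume:>5}M  ${deepseek_cost:>8.2f}  ${llama_cost:>8.2f}  "
          f"${savings:>6.2f}  ({savings_pct:.1f}%)")
```

Because cost scales linearly with volume, the percentage saved is the same at every volume when it is computed from unrounded costs. Dividing already-rounded dollar amounts (for example $0.04 / $0.69) would instead give about 5.8% at the smallest volume.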

Summary

DeepSeek V4 Pro by DeepSeek costs $0.435/Mtok input and $0.870/Mtok output, with a 1M-token context window. It supports text input and is available from 3 providers.

Llama 3.3 70B by Meta costs $0.590/Mtok input and $0.790/Mtok output, with a 128K-token context window. It supports text input and is available from 2 providers.

On a blended cost basis, DeepSeek V4 Pro is about 5.4% cheaper than Llama 3.3 70B ($0.652 vs $0.690/Mtok). It also has a larger context window (1M vs 128K tokens).
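
A percentage gap depends on which price serves as the baseline. The sketch below computes the blended-price gap both ways; the two function names are illustrative. "Cheaper than" is normally measured against the more expensive price, while "more expensive than" is measured against the cheaper one.

```python
def pct_cheaper(cheaper: float, pricier: float) -> float:
    """How much cheaper the low price is, relative to the high price."""
    return 100.0 * (pricier - cheaper) / pricier


def pct_more_expensive(cheaper: float, pricier: float) -> float:
    """How much more expensive the high price is, relative to the low price."""
    return 100.0 * (pricier - cheaper) / cheaper


# Unrounded 50/50 blended prices in $/Mtok.
deepseek_blended = (0.435 + 0.870) / 2
llama_blended = (0.590 + 0.790) / 2

cheaper_pct = pct_cheaper(deepseek_blended, llama_blended)
pricier_pct = pct_more_expensive(deepseek_blended, llama_blended)

print(f"DeepSeek V4 Pro is {cheaper_pct:.1f}% cheaper than Llama 3.3 70B")
print(f"Llama 3.3 70B is {pricier_pct:.1f}% more expensive than DeepSeek V4 Pro")
```

With these prices the same $0.0375/Mtok gap is roughly 5.4% on the first convention and roughly 5.7% on the second.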

On benchmarks, DeepSeek V4 Pro scores higher on average (64.1 vs 43.6). It also delivers better performance per dollar (98.2 vs 63.2).

Note: Pricing is per million tokens. Actual costs vary with usage patterns, prompt caching, and batch discounts. Always verify against official provider pricing pages.
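
To illustrate how usage patterns change the picture: DeepSeek V4 Pro has the lower input price but the higher output price, so the cheaper model depends on the input/output mix. The sketch below solves for the break-even input share using the listed prices only; it ignores caching and batch discounts, and `break_even_input_share` and `mix_cost` are illustrative names.

```python
def mix_cost(input_price: float, output_price: float, input_share: float) -> float:
    """Cost in $/Mtok when input_share of all tokens are input tokens."""
    return input_share * input_price + (1.0 - input_share) * output_price


def break_even_input_share(a_in, a_out, b_in, b_out):
    """Input share f where models A and B cost the same, or None.

    Solves f*a_in + (1-f)*a_out == f*b_in + (1-f)*b_out for f.
    """
    denom = (a_in - a_out) - (b_in - b_out)
    if denom == 0:
        return None  # cost lines are parallel: no single crossing point
    f = (b_out - a_out) / denom
    return f if 0.0 <= f <= 1.0 else None


DEEPSEEK = (0.435, 0.870)  # (input, output) in $/Mtok
LLAMA = (0.590, 0.790)

share = break_even_input_share(*DEEPSEEK, *LLAMA)
print(f"Break-even input share: {share:.1%}")

# Output-heavy workload (20% input): Llama 3.3 70B comes out cheaper.
print(f"20% input: DeepSeek ${mix_cost(*DEEPSEEK, 0.2):.3f}, Llama ${mix_cost(*LLAMA, 0.2):.3f}")
# Input-heavy workload (80% input): DeepSeek V4 Pro comes out cheaper.
print(f"80% input: DeepSeek ${mix_cost(*DEEPSEEK, 0.8):.3f}, Llama ${mix_cost(*LLAMA, 0.8):.3f}")
```

Under these assumptions the two models cost the same when about 34% of tokens are input. Workloads with a smaller input share favor Llama 3.3 70B, and workloads with a larger one favor DeepSeek V4 Pro.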