Updated Jul 1, 2026

Llama 3.3 70B vs GPT-5.4

Side-by-side comparison of API pricing, specs, benchmarks, and capabilities

🏆 Llama 3.3 70B is 92.1% cheaper on blended cost ($0.690 vs $8.75/Mtok); GPT-5.4 costs about 12.7x as much
 
Llama 3.3 70B by Meta
2 providers: Meta ($0.590/$0.790), Together ($1.04/$1.04)
Overview

| | Llama 3.3 70B | GPT-5.4 |
|---|---|---|
| Status | Current, open weights | Current flagship |
| Released | Dec 6, 2024 | Aug 15, 2025 |

Pricing per million tokens

| | Llama 3.3 70B | GPT-5.4 |
|---|---|---|
| Input | $0.590/Mtok | $2.50/Mtok |
| Output | $0.790/Mtok | $15.00/Mtok |
| Blended avg | $0.690/Mtok | $8.75/Mtok |

Cached input: $0.250/Mtok (shown for one model only)

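Specifications

| | Llama 3.3 70B | GPT-5.4 |
|---|---|---|
| Context window | 128K tokens | 1M tokens |
| Parameters | 70B | Proprietary |
| Speed (TPS) | not listed | not listed |
| Input modalities | text | text, image |
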
Benchmarks (from Vellum LLM Leaderboard)

| | Llama 3.3 70B | GPT-5.4 |
|---|---|---|
| Avg benchmark score | 43.6 | 78.7 |
| Perf / dollar | 63.2 | 9 |
| GPQA Diamond | 45 | 88 |
| SWE-Bench | 22 | 75 |
| HumanEval | 78 | 94.5 |
| MATH 500 | 60 | 92 |

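Providers

| Provider | Model | Input | Output |
|---|---|---|---|
| Meta | Llama 3.3 70B | $0.590/Mtok | $0.790/Mtok |
| Together | Llama 3.3 70B | $1.04/Mtok | $1.04/Mtok |
| OpenAI | GPT-5.4 | $2.50/Mtok | $15.00/Mtok |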
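
The blended averages listed for each model are consistent with a simple mean of the input and output prices, which corresponds to a 50/50 token mix. A minimal sketch under that assumption; the function name `blended_price` and the `input_share` parameter are illustrative and do not come from the page:

```python
def blended_price(input_price: float, output_price: float, input_share: float = 0.5) -> float:
    """Weighted $/Mtok price for a given share of input tokens (assumed 50/50 by default)."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_price + (1.0 - input_share) * output_price


# (input, output) prices in $/Mtok, as listed per provider.
providers = {
    "Meta (Llama 3.3 70B)": (0.590, 0.790),
    "Together (Llama 3.3 70B)": (1.04, 1.04),
    "OpenAI (GPT-5.4)": (2.50, 15.00),
}

# Round to 3 decimals to hide floating-point noise.
blended = {name: round(blended_price(i, o), 3) for name, (i, o) in providers.items()}
for name, price in blended.items():
    print(f"{name}: ${price:.3f}/Mtok blended")
```

For an input-heavy workload, passing `input_share=0.8` gives $0.63/Mtok for Llama 3.3 70B on Meta and $5.00/Mtok for GPT-5.4, so the gap depends on the token mix.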

Cost at scale (50/50 input/output split)

| Volume | Llama 3.3 70B | GPT-5.4 | Savings |
|---|---|---|---|
| 1M tokens | $0.69 | $8.75 | $8.06 (92.1%) |
| 10M tokens | $6.90 | $87.50 | $80.60 (92.1%) |
| 100M tokens | $69.00 | $875.00 | $806.00 (92.1%) |
| 1,000M tokens | $690.00 | $8,750.00 | $8,060.00 (92.1%) |
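
The cost-at-scale rows follow from multiplying each blended price by the volume in millions of tokens. A minimal sketch that reproduces them, assuming the 50/50 blended prices quoted above; `cost_at_volume` and the dictionary keys are illustrative names:

```python
# Blended $/Mtok prices at a 50/50 input/output mix, as quoted in the comparison.
BLENDED_PER_MTOK = {"Llama 3.3 70B": 0.69, "GPT-5.4": 8.75}


def cost_at_volume(mtok: float) -> dict:
    """Per-model cost and savings for `mtok` million tokens."""
    llama = BLENDED_PER_MTOK["Llama 3.3 70B"] * mtok
    gpt = BLENDED_PER_MTOK["GPT-5.4"] * mtok
    savings = gpt - llama
    return {
        "llama": round(llama, 2),
        "gpt": round(gpt, 2),
        "savings": round(savings, 2),
        # Savings are expressed relative to the more expensive model.
        "savings_pct": round(100 * savings / gpt, 1),
    }


for mtok in (1, 10, 100, 1000):
    row = cost_at_volume(mtok)
    print(f"{mtok}M tokens: ${row['llama']:,.2f} vs ${row['gpt']:,.2f}, "
          f"saves ${row['savings']:,.2f} ({row['savings_pct']}%)")
```

Because both costs scale linearly with volume, the percentage saved stays constant at every row; only the absolute dollar amount grows.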

Summary

Llama 3.3 70B by Meta costs $0.590/Mtok input and $0.790/Mtok output, with a 128K-token context window. It supports text input and is available from 2 providers.

GPT-5.4 by OpenAI costs $2.50/Mtok input and $15.00/Mtok output, with a 1M-token context window. It supports text and image input.

On a blended cost basis, Llama 3.3 70B is 92.1% cheaper than GPT-5.4; equivalently, GPT-5.4 costs about 12.7x as much (1168.1% more).
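
A price gap can be expressed against either baseline, and the two percentages are easy to confuse: savings relative to the more expensive price can never exceed 100%, while a premium relative to the cheaper price can. A short sketch of both calculations using the blended prices quoted above; the function names are illustrative:

```python
def percent_cheaper(cheap: float, expensive: float) -> float:
    """Savings relative to the more expensive price (at most 100%)."""
    return 100 * (expensive - cheap) / expensive


def percent_more_expensive(cheap: float, expensive: float) -> float:
    """Premium relative to the cheaper price (can exceed 100%)."""
    return 100 * (expensive - cheap) / cheap


llama, gpt = 0.69, 8.75  # blended $/Mtok

cheaper = round(percent_cheaper(llama, gpt), 1)
premium = round(percent_more_expensive(llama, gpt), 1)
multiple = round(gpt / llama, 1)

print(f"Llama 3.3 70B is {cheaper}% cheaper")   # 92.1
print(f"GPT-5.4 is {premium}% more expensive")  # 1168.1
print(f"GPT-5.4 costs {multiple}x as much")     # 12.7
```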

On benchmarks, GPT-5.4 scores higher (78.7 vs 43.6) on average. In terms of value, Llama 3.3 70B has better performance per dollar (63.2 vs 9).
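
The perf/dollar figures match the average benchmark score divided by the blended price in $/Mtok. That formula is inferred from the listed numbers rather than stated on the page, so treat the sketch below as an assumption; `perf_per_dollar` is an illustrative name:

```python
def perf_per_dollar(avg_score: float, blended_price: float) -> float:
    """Average benchmark score per blended $/Mtok (assumed definition)."""
    if blended_price <= 0:
        # A free model has no finite score-per-dollar ratio.
        raise ValueError("blended_price must be positive")
    return avg_score / blended_price


# (average benchmark score, blended $/Mtok) as quoted in the comparison.
models = {
    "Llama 3.3 70B": (43.6, 0.69),
    "GPT-5.4": (78.7, 8.75),
}

ratios = {name: round(perf_per_dollar(score, price), 1) for name, (score, price) in models.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio} points per $/Mtok")
```

A ratio like this rewards low prices heavily, so it says more about cost efficiency than about whether a model clears the quality bar a given task needs.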

Note: Pricing is per million tokens. Actual costs vary with usage patterns, prompt caching, and batch discounts. Always verify against official provider pricing pages.