LIVE Cheapest: GLM-4.7-Flash $0/Mtok in 154 models tracked Updated Jul 1, 2026
ModelPriceWatch$/Mtok

Llama 3.3 70B vs Claude Sonnet 4.6

Side-by-side comparison of API pricing, specs, benchmarks, and capabilities

🏆 Llama 3.3 70B is 92.3% cheaper on blended cost ($0.690 vs $9.00/Mtok)
 
by Meta
2 providers: Meta $0.590/$0.790 Together $1.04/$1.04
Overview
Status Current (open weights) Current (mid tier)
Released Dec 6, 2024 Jan 15, 2026
Pricing per million tokens
Input $0.590/Mtok $3.00/Mtok
Output $0.790/Mtok $15.00/Mtok
Blended avg $0.690/Mtok $9.00/Mtok
Cached input $0.300/Mtok
Specifications
Context window 128K tokens 200K tokens
Parameters 70B Proprietary
Speed (TPS)
Modalities
Input: text (Llama 3.3 70B); text, image (Claude Sonnet 4.6)
Benchmarks from Vellum LLM Leaderboard
Avg benchmark score 43.6 73.8
Perf / dollar 63.2 8.2
GPQA Diamond 45
SWE-Bench 22
HumanEval 78
MATH 500 60
Providers
Available from
Meta — $0.590/$0.790/Mtok
Together — $1.04/$1.04/Mtok
Anthropic — $3.00/$15.00/Mtok

Cost at scale (50/50 input/output split)

Volume | Llama 3.3 70B | Claude Sonnet 4.6 | Savings
1M tokens | $0.69 | $9.00 | $8.31 (92.3%)
10M tokens | $6.90 | $90.00 | $83.10 (92.3%)
100M tokens | $69.00 | $900.00 | $831.00 (92.3%)
1,000M tokens | $690.00 | $9,000.00 | $8,310.00 (92.3%)
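
The table figures can be reproduced from the listed per-token prices. The following is a minimal sketch, assuming a simple linear cost model with no caching or batch discounts. The function name `cost_usd` and its parameters are illustrative and not part of any provider API.

```python
def cost_usd(total_tokens: int, input_price: float, output_price: float,
             input_share: float = 0.5) -> float:
    """Cost in USD for total_tokens, with prices in USD per million tokens.

    input_share is the fraction of tokens that are input (0.5 is the 50/50
    split used in the table). Assumes linear pricing: no caching and no
    batch discounts.
    """
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Listed prices (input, output) in USD per million tokens.
LLAMA = (0.59, 0.79)
CLAUDE = (3.00, 15.00)

for volume in (1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
    llama = cost_usd(volume, *LLAMA)
    claude = cost_usd(volume, *CLAUDE)
    saved = claude - llama
    print(f"{volume // 1_000_000}M tokens: ${llama:,.2f} vs ${claude:,.2f}, "
          f"saves ${saved:,.2f} ({saved / claude:.1%})")
```

Changing `input_share` shows how the gap moves with workload. Claude Sonnet 4.6's output price is five times its input price, so an input-heavy workload narrows the difference, while an output-heavy one widens it.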

Summary

Llama 3.3 70B by Meta costs $0.590/Mtok input and $0.790/Mtok output, with a 128K-token context window. It supports text input and is available from 2 providers.

Claude Sonnet 4.6 by Anthropic costs $3.00/Mtok input and $15.00/Mtok output, with a 200K-token context window. It supports text and image input.

On a blended cost basis, Llama 3.3 70B is 92.3% cheaper than Claude Sonnet 4.6 ($0.690 vs $9.00/Mtok). Put the other way, Claude Sonnet 4.6 costs about 13 times as much.
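
Percentage comparisons depend on which price serves as the baseline. The sketch below works the difference in both directions from the blended prices listed above. The helper names are illustrative.

```python
def percent_cheaper(low: float, high: float) -> float:
    """How much cheaper `low` is, relative to the higher price."""
    return (high - low) / high * 100


def percent_more_expensive(low: float, high: float) -> float:
    """How much more expensive `high` is, relative to the lower price."""
    return (high - low) / low * 100


llama_blended, claude_blended = 0.69, 9.00  # USD per million tokens

print(f"{percent_cheaper(llama_blended, claude_blended):.1f}% cheaper")                 # 92.3% cheaper
print(f"{percent_more_expensive(llama_blended, claude_blended):.1f}% more expensive")   # 1204.3% more expensive
print(f"{claude_blended / llama_blended:.1f}x price ratio")                             # 13.0x price ratio
```

A price cannot be more than 100% cheaper than another, so 92.3% is the figure that describes the saving with Llama 3.3 70B. The 1204.3% figure describes how much more Claude Sonnet 4.6 costs.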

On benchmarks, Claude Sonnet 4.6 scores higher (73.8 vs 43.6) on average. In terms of value, Llama 3.3 70B has better performance per dollar (63.2 vs 8.2).
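
The performance-per-dollar figures are consistent with dividing the average benchmark score by the blended price. That formula is an assumption inferred from the numbers, not something the page states. A minimal sketch:

```python
def perf_per_dollar(avg_score: float, blended_price: float) -> float:
    """Benchmark points per USD of blended price (USD per million tokens).

    Assumed formula: the source does not document how it derives this metric.
    """
    return avg_score / blended_price


print(round(perf_per_dollar(43.6, 0.69), 1))  # Llama 3.3 70B: 63.2
print(round(perf_per_dollar(73.8, 9.00), 1))  # Claude Sonnet 4.6: 8.2
```

Under this reading, the metric rewards low price as much as high scores, so it says nothing about whether the cheaper model is accurate enough for a given task.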

Note: Pricing is per million tokens. Actual costs vary with usage patterns, prompt caching, and batch discounts. Always verify against official provider pricing pages.