Llama 3.1 8B Instant vs GPT-OSS 20B
Side-by-side API pricing comparison · both models served by Groq
🏆 Llama 3.1 8B Instant is 2.3% cheaper on blended cost ($0.525 vs $0.5375/Mtok)
Llama 3.1 8B Instant
by Groq
Current · fast · Open weights
Input
$0.050/Mtok
Output
$1.00/Mtok
✓ Cheaper
| Spec | Value |
|---|---|
| Blended avg | $0.525/Mtok |
| Context | 128K tokens |
| Modality | text |
| Parameters | 8B |
| Released | Jul 23, 2024 |
GPT-OSS 20B
by Groq
Current · fast · Open weights
Input
$0.075/Mtok
Output
$1.00/Mtok
| Spec | Value |
|---|---|
| Blended avg | $0.5375/Mtok |
| Context | 128K tokens |
| Modality | text |
| Parameters | 20B |
| Released | Aug 5, 2025 |
Cost at scale: 1M to 1,000M tokens (50/50 input/output split)
| Volume | Llama 3.1 8B Instant | GPT-OSS 20B | Savings |
|---|---|---|---|
| 1M tokens | $0.525 | $0.5375 | $0.0125 (2.3%) |
| 10M tokens | $5.25 | $5.375 | $0.125 (2.3%) |
| 100M tokens | $52.50 | $53.75 | $1.25 (2.3%) |
| 1,000M tokens | $525.00 | $537.50 | $12.50 (2.3%) |
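The figures in this table follow directly from the per-Mtok prices listed on the two cards. As a rough sketch (the `PRICES` dictionary and the `cost_usd` helper are names invented here for illustration, not part of any Groq API), the rows can be reproduced like this:

```python
# Prices in USD per million tokens, copied from the model cards above.
PRICES = {
    "Llama 3.1 8B Instant": {"input": 0.050, "output": 1.00},
    "GPT-OSS 20B": {"input": 0.075, "output": 1.00},
}


def cost_usd(model: str, total_tokens: int, input_share: float = 0.5) -> float:
    """Cost of total_tokens, with input_share of them billed as input tokens."""
    price = PRICES[model]
    mtok = total_tokens / 1_000_000
    per_mtok = input_share * price["input"] + (1 - input_share) * price["output"]
    return mtok * per_mtok


for volume in (1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
    llama = cost_usd("Llama 3.1 8B Instant", volume)
    gpt_oss = cost_usd("GPT-OSS 20B", volume)
    saving = gpt_oss - llama
    # Savings percentage is measured relative to the more expensive model.
    print(f"{volume // 1_000_000:>5}M  ${llama:,.4f}  ${gpt_oss:,.4f}  "
          f"${saving:,.4f} ({saving / gpt_oss:.1%})")
```

Because cost scales linearly with volume, the percentage saved is the same on every row; only the dollar amount grows.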
Summary
Llama 3.1 8B Instant by Groq costs $0.050/Mtok input and $1.00/Mtok output, with a 128K-token context window. It supports text input.
GPT-OSS 20B by Groq costs $0.075/Mtok input and $1.00/Mtok output, with a 128K-token context window. It supports text input.
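On a blended cost basis (50/50 input/output), Llama 3.1 8B Instant is about 2.3% cheaper than GPT-OSS 20B ($0.525 vs $0.5375/Mtok). Output prices are identical, so the entire gap comes from the lower input price.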
Note: Pricing is per million tokens. Actual costs vary with usage patterns, prompt caching, and batch discounts. Always verify against official provider pricing pages.
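To illustrate how usage patterns shift the comparison, the sketch below recomputes the relative saving for different input/output mixes. It assumes only the list prices quoted in the summary; the `blended` and `savings_pct` helpers are hypothetical names, and caching or batch discounts are deliberately left out because this page does not state their rates.

```python
# List prices in USD per million tokens, taken from the summary above.
LLAMA_INPUT = 0.050
GPT_OSS_INPUT = 0.075
OUTPUT = 1.00  # identical for both models


def blended(input_price: float, output_price: float, input_share: float) -> float:
    """Average price per Mtok when input_share of all tokens are input tokens."""
    return input_share * input_price + (1 - input_share) * output_price


def savings_pct(input_share: float) -> float:
    """Percent by which Llama 3.1 8B Instant undercuts GPT-OSS 20B."""
    llama = blended(LLAMA_INPUT, OUTPUT, input_share)
    gpt_oss = blended(GPT_OSS_INPUT, OUTPUT, input_share)
    return 100 * (gpt_oss - llama) / gpt_oss


for share in (0.1, 0.5, 0.9):
    print(f"{share:.0%} input tokens: {savings_pct(share):.1f}% cheaper")
```

Under these assumptions the gap is roughly 0.3% for an output-heavy workload (10% input) and about 13.4% for an input-heavy one (90% input), so the headline percentage depends heavily on the traffic mix.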