Best Open Source LLM APIs
Open-weight LLMs available through an API. Compare pricing for models such as Llama, Qwen, Mistral, and DeepSeek that you can self-host or use via a managed API.
Full ranking — top 15 models
| # | Model | Provider | Input ($/Mtok) | Output ($/Mtok) | Blended ($/Mtok) | Context |
|---|---|---|---|---|---|---|
| 1 | GLM-4.7-Flash | Zhipu | $0.000 | $0.000 | $0.000 | 128K |
| 2 | GLM-OCR | Zhipu | $0.030 | $0.030 | $0.030 | 128K |
| 3 | Granite 4.0 Micro | IBM | $0.017 | $0.112 | $0.065 | 128K |
| 4 | Llama 3.1 8B | Meta | $0.050 | $0.080 | $0.065 | 128K |
| 5 | Baichuan M2-32B | Baichuan | $0.070 | $0.070 | $0.070 | 33K |
| 6 | LFM2 24B A2B | Together | $0.030 | $0.120 | $0.075 | 128K |
| 7 | Ministral 3 3B | Mistral | $0.100 | $0.100 | $0.100 | 128K |
| 8 | Pixtral 12B | Mistral | $0.100 | $0.100 | $0.100 | 128K |
| 9 | Nemotron 70B Instruct | NVIDIA | $0.100 | $0.100 | $0.100 | 128K |
| 10 | GLM-4-32B-0414 | Zhipu | $0.100 | $0.100 | $0.100 | 128K |
| 11 | Granite Embedding 278M Multilingual | IBM | $0.106 | $0.106 | $0.106 | — |
| 12 | Mistral Small 3.2 24B | Mistral | $0.080 | $0.200 | $0.140 | 128K |
| 13 | Granite 4 H Small | IBM | $0.060 | $0.250 | $0.155 | 128K |
| 14 | GPT OSS 20B | Fireworks | $0.070 | $0.300 | $0.185 | 128K |
| 15 | Voxtral Small 24B | Mistral | $0.100 | $0.300 | $0.200 | 128K |
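To turn the per-Mtok prices in the table into a cost for a specific workload, multiply each token count (in millions) by the matching price and add the two results. The sketch below is illustrative only: the function name `workload_cost` and the token volumes are hypothetical, and the prices are the Llama 3.1 8B row from the table.

```python
def workload_cost(input_tokens: int, output_tokens: int,
                  input_price_per_mtok: float,
                  output_price_per_mtok: float) -> float:
    """Estimate cost in dollars from token counts and per-Mtok prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_mtok
    output_cost = (output_tokens / 1_000_000) * output_price_per_mtok
    return input_cost + output_cost


# Llama 3.1 8B row from the table: $0.050 input, $0.080 output per Mtok.
# Hypothetical workload: 10M input tokens and 2M output tokens.
cost = workload_cost(10_000_000, 2_000_000, 0.050, 0.080)
print(f"${cost:.3f}")  # prints "$0.660"
```

Because input and output are priced separately, a workload that is heavy on one side can cost noticeably more or less than the blended figure suggests.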
How models are selected
The ranking includes only models with open weights, sorted by blended cost from lowest to highest.
Prices are per million tokens (Mtok), sourced directly from official provider pricing pages and verified by our automated scraper pipeline, which runs three times a day. "Blended cost" is the arithmetic mean of the input and output prices, a quick proxy for workloads that split tokens roughly 50/50 between input and output.
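As a rough illustration of the blended-cost ranking, the sketch below averages input and output prices and sorts ascending. The `Model` type, the `blended` function, and the `input_share` parameter are hypothetical names introduced here, not part of the site's pipeline. `input_share` defaults to 0.5 to match the 50/50 definition; any other value is an assumption about your own traffic mix.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    input_price: float   # dollars per Mtok
    output_price: float  # dollars per Mtok


def blended(model: Model, input_share: float = 0.5) -> float:
    """Weighted average of input and output price per Mtok.

    input_share=0.5 reproduces the plain average used for the ranking.
    """
    return input_share * model.input_price + (1 - input_share) * model.output_price


# Three rows copied from the ranking table, deliberately out of order.
models = [
    Model("Mistral Small 3.2 24B", 0.080, 0.200),
    Model("Llama 3.1 8B", 0.050, 0.080),
    Model("Granite 4.0 Micro", 0.017, 0.112),
]

# Sort by blended cost, cheapest first.
ranked = sorted(models, key=blended)
for position, model in enumerate(ranked, start=1):
    print(position, model.name, round(blended(model), 4))
```

With these inputs the unrounded blended price of Granite 4.0 Micro is 0.0645, slightly below Llama 3.1 8B at 0.065. That is consistent with the two sharing a displayed value of $0.065 while Granite ranks one place higher. Passing an `input_share` such as 0.8 shows how the order can shift for input-heavy workloads.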