What is the best LLM API for open source?

Based on our verified pricing data, the cheapest qualifying model by blended cost is GLM-4.7-Flash by Zhipu, at $0.000/Mtok for both input and output. See the full ranking below for more options.

How often are prices updated?

Prices are verified against official provider pricing pages 3 times daily (8am, 2pm, 8pm UTC) by our automated scraper pipeline.


Best Open Source LLM APIs

Compare API pricing for open-weight LLMs such as Llama, Qwen, Mistral, and DeepSeek, which you can self-host or use via a managed API.

77 models qualify. Showing the top 15, sorted by blended cost.

GLM-4.7-Flash (Zhipu): $0.000 in, $0.000 out, $0.000/Mtok blended, 128K ctx

GLM-OCR (Zhipu): $0.030 in, $0.030 out, $0.030/Mtok blended, 128K ctx

Granite 4.0 Micro (IBM): $0.017 in, $0.112 out, $0.065/Mtok blended, 128K ctx

Cost calculator for this use case

Inputs: tokens per day and days per month. Input/output ratio: 70/30. Estimates cover GLM-4.7-Flash, GLM-OCR, and Granite 4.0 Micro.
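The page does not state the calculator's formula, so the following is only a minimal sketch of how such an estimate could be computed from tokens per day, days per month, and the 70/30 input/output ratio. The function name `monthly_cost`, its parameters, and the example volume of one million tokens per day are hypothetical; the prices come from the ranking on this page.

```python
def monthly_cost(tokens_per_day, days_per_month, input_price, output_price,
                 input_share=0.70):
    """Estimate monthly cost in USD.

    Prices are in USD per million tokens (Mtok). input_share is the
    fraction of tokens that are input tokens (0.70 for a 70/30 ratio).
    This formula is an assumption, not the site's documented method.
    """
    total_tokens = tokens_per_day * days_per_month
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens * (1 - input_share)
    # Convert per-Mtok prices to a total by dividing by one million.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 1M tokens per day for 30 days.
top_three = {
    "GLM-4.7-Flash": (0.000, 0.000),
    "GLM-OCR": (0.030, 0.030),
    "Granite 4.0 Micro": (0.017, 0.112),
}
estimates = {
    name: monthly_cost(1_000_000, 30, price_in, price_out)
    for name, (price_in, price_out) in top_three.items()
}
for name, cost in estimates.items():
    print(f"{name}: ${cost:.2f}/month")
```

Because output tokens are priced higher than input tokens for many models, an estimate weighted 70/30 can differ from one based on the 50/50 blended figure.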

Full ranking — top 15 models

#	Model	Provider	Input $/Mtok	Output $/Mtok	Blended $/Mtok	Context
1	GLM-4.7-Flash	Zhipu	$0.000	$0.000	$0.000	128K
2	GLM-OCR	Zhipu	$0.030	$0.030	$0.030	128K
3	Granite 4.0 Micro	IBM	$0.017	$0.112	$0.065	128K
4	Llama 3.1 8B	Meta	$0.050	$0.080	$0.065	128K
5	Baichuan M2-32B	Baichuan	$0.070	$0.070	$0.070	33K
6	LFM2 24B A2B	Together	$0.030	$0.120	$0.075	128K
7	Ministral 3 3B	Mistral	$0.100	$0.100	$0.100	128K
8	Pixtral 12B	Mistral	$0.100	$0.100	$0.100	128K
9	Nemotron 70B Instruct	NVIDIA	$0.100	$0.100	$0.100	128K
10	GLM-4-32B-0414	Zhipu	$0.100	$0.100	$0.100	128K
11	Granite Embedding 278M Multilingual	IBM	$0.106	$0.106	$0.106	—
12	Mistral Small 3.2 24B	Mistral	$0.080	$0.200	$0.140	128K
13	Granite 4 H Small	IBM	$0.060	$0.250	$0.155	128K
14	GPT OSS 20B	Fireworks	$0.070	$0.300	$0.185	128K
15	Voxtral Small 24B	Mistral	$0.100	$0.300	$0.200	128K

How models are selected

Models qualify if they have open weights. Qualifying models are sorted by blended cost.

Prices are per million tokens (Mtok), sourced directly from official provider pricing pages and verified by our automated scraper pipeline, which runs 3 times daily. "Blended cost" is the average of the input and output prices, a quick proxy for typical 50/50 usage patterns.
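As a rough illustration of the blended cost definition above, the sketch below averages the input and output prices and sorts by the result. The `Model` class and its field names are hypothetical; the three sample rows use prices from the ranking on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per million input tokens
    output_price: float  # USD per million output tokens

    @property
    def blended(self) -> float:
        # Simple average of input and output price (assumes 50/50 usage).
        return (self.input_price + self.output_price) / 2


models = [
    Model("Granite 4 H Small", 0.060, 0.250),
    Model("GLM-OCR", 0.030, 0.030),
    Model("Mistral Small 3.2 24B", 0.080, 0.200),
]

# Cheapest blended cost first, matching the sort order of the ranking.
ranked = sorted(models, key=lambda m: m.blended)
for position, model in enumerate(ranked, start=1):
    print(f"{position}. {model.name}: ${model.blended:.3f}/Mtok blended")
```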
