Cheapest: GLM-4.7-Flash at $0.000/Mtok among 153 models tracked. Updated Jun 25, 2026.
ModelPriceWatch$/Mtok

Best Multimodal LLM APIs

Compare multimodal LLM APIs that accept text, images, video, and audio. Find the best vision-capable model for your use case and budget.

53 models qualify. Showing the top 15, sorted by blended cost.
1. GLM-OCR (Zhipu): $0.030 in, $0.030 out, $0.030/Mtok blended, 128K ctx
2. Pixtral 12B (Mistral): $0.100 in, $0.100 out, $0.100/Mtok blended, 128K ctx
3. Reka Edge (Reka): $0.100 in, $0.100 out, $0.100/Mtok blended, 66K ctx

Full ranking — top 15 models

#   Model                  Provider   Input $/Mtok  Output $/Mtok  Blended $/Mtok  Context
1   GLM-OCR                Zhipu      $0.030        $0.030         $0.030          128K
2   Pixtral 12B            Mistral    $0.100        $0.100         $0.100          128K
3   Reka Edge              Reka       $0.100        $0.100         $0.100          66K
4   Embed 4                Cohere     $0.120        $0.120         $0.120
5   voyage-multimodal-3.5  Voyage AI  $0.120        $0.120         $0.120
6   Nova Lite              Amazon     $0.060        $0.240         $0.150          300K
7   Gemini 2.5 Flash       Google     $0.075        $0.300         $0.188          1M
8   Llama 4 Scout          Meta       $0.110        $0.340         $0.225          10M
9   Gemini 2.5 Flash-Lite  Google     $0.100        $0.400         $0.250          1M
10  GPT-4.1 nano           OpenAI     $0.100        $0.400         $0.250          1M
11  Grok 4.1 Fast          xAI        $0.200        $0.500         $0.350          2M
12  Llama 4 Scout          Groq       $0.110        $1.00          $0.555          10M
13  GLM-4.6V               Zhipu      $0.300        $0.900         $0.600          128K
14  MiniMax M3             Fireworks  $0.300        $1.20          $0.750          1M
15  MiniMax-M3             MiniMax    $0.300        $1.20          $0.750          1M

How models are selected

A model qualifies if it supports image input. Qualifying models are ranked by blended cost, lowest first.

Prices are per million tokens (Mtok), sourced directly from official provider pricing pages and verified by our automated scraper pipeline, which runs three times daily. "Blended cost" is the unweighted average of the input and output prices, a quick proxy for workloads that split tokens roughly 50/50 between input and output.
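
As a rough illustration of the blended-cost ranking described above, here is a minimal Python sketch. The `ModelPrice` class, the `blended_cost` and `rank` functions, and the `input_share` parameter are hypothetical names introduced for this example and are not part of the site's pipeline. With `input_share=0.5` the function reproduces the simple average used on this page; other values are an assumed extension for workloads that are not split 50/50.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical record for one row of the ranking table."""
    name: str
    provider: str
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


def blended_cost(price: ModelPrice, input_share: float = 0.5) -> float:
    """Weighted average of input and output price in USD/Mtok.

    input_share=0.5 gives the plain average used on this page.
    Any other value is an assumption about your own traffic mix.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return (input_share * price.input_per_mtok
            + (1.0 - input_share) * price.output_per_mtok)


def rank(prices: list[ModelPrice], input_share: float = 0.5) -> list[ModelPrice]:
    """Return models sorted by blended cost, cheapest first."""
    return sorted(prices, key=lambda p: blended_cost(p, input_share))


# Three rows copied from the table above, deliberately out of order.
PRICES = [
    ModelPrice("Gemini 2.5 Flash", "Google", 0.075, 0.300),
    ModelPrice("Nova Lite", "Amazon", 0.060, 0.240),
    ModelPrice("GLM-OCR", "Zhipu", 0.030, 0.030),
]

if __name__ == "__main__":
    for p in rank(PRICES):
        print(f"{p.name:<18} {p.provider:<8} ${blended_cost(p):.3f}/Mtok blended")
```

If your workload is input-heavy (for example, long image-bearing prompts with short answers), calling `rank(PRICES, input_share=0.8)` weights the input price more heavily and may give a different ordering than the 50/50 figure shown in the table.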