LIVE Cheapest: GLM-4.7-Flash $0/Mtok in 154 models tracked Updated Jul 1, 2026
Jul 1, 2026
ModelPriceWatch$/Mtok
Pricing / Compare / GPT-4.1 mini vs Gemini 2.5 Flash

GPT-4.1 mini vs Gemini 2.5 Flash

Side-by-side comparison of API pricing, specs, benchmarks, and capabilities

🏆 Gemini 2.5 Flash is 433.3% cheaper on blended cost ($0.188 vs $1.00/Mtok)
 
Overview
Status Current budget Current budget
Released Apr 14, 2025 Jun 17, 2025
Pricing per million tokens
Input $0.400/Mtok $0.075/Mtok
Output $1.60/Mtok $0.300/Mtok
Blended avg $1.00/Mtok $0.188/Mtok
Cached input $0.200/Mtok
Specifications
Context window 1M tokens 1M tokens
Parameters Proprietary Proprietary
Speed (TPS)
Modalities
Input
textimage
textimageaudiovideo
Benchmarks from Vellum LLM Leaderboard
Avg benchmark score 50.6 53.7
Perf / dollar 50.6 286.4
GPQA Diamond 55 55
SWE-Bench 30 38
HumanEval 85 82
MATH 500 70 72
Providers
Available from
OpenAI — $0.400/$1.60/Mtok
Google — $0.075/$0.300/Mtok

Cost at scale — 1M tokens (50/50 input/output)

VolumeGPT-4.1 miniGemini 2.5 FlashSavings
1M tokens $1 $0.19 $0.81 (81%)
10M tokens $10 $1.88 $8.13 (81.3%)
100M tokens $100 $18.75 $81.25 (81.3%)
1000M tokens $1000 $187.5 $812.5 (81.3%)

Summary

GPT-4.1 mini by OpenAI costs $0.400/Mtok input and $1.60/Mtok output, with a 1M-token context window. It supports text, image input.

Gemini 2.5 Flash by Google costs $0.075/Mtok input and $0.300/Mtok output, with a 1M-token context window. It supports text, image, audio, video input.

On a blended cost basis, Gemini 2.5 Flash is 433.3% cheaper than GPT-4.1 mini.

On benchmarks, Gemini 2.5 Flash scores higher (53.7 vs 50.6) on average. In terms of value, Gemini 2.5 Flash has better performance per dollar (286.4 vs 50.6).

Note: Pricing is per million tokens. Actual costs vary with usage patterns, prompt caching, and batch discounts. Always verify against official provider pricing pages.