1M context · Multimodal · Thinks by default

Gemini 2.5 Flash on QuickSilver Pro

Gemini 2.5 Flash is Google's workhorse model: a 1M-token context window, a multimodal foundation, and thinking on by default. On QuickSilver Pro it lists at $0.255 input / $2.125 output per 1M tokens, about 15% below Vertex retail and OpenRouter's $0.30/$2.50. Thinking is always on for 2.5 Flash through QuickSilver Pro: `reasoning: { enabled: false }` is silently dropped. For non-thinking Gemini, use 2.5 Flash Lite ($0.085/$0.34).

$0.26 input · $2.13 output per 1M tokens
By Raullen Chai · Updated

At a glance

Context: 1M tokens
Input / 1M: $0.26
Output / 1M: $2.13
Thinks by default: Yes

Coding, multimodal Q&A, and long-context analysis at 1M tokens: Google's general-purpose pick.

Pricing comparison ($/1M tokens)

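| Provider | Input | Output | QSP vs. provider |
| --- | --- | --- | --- |
| QuickSilver Pro | $0.255 | $2.125 | cheapest listed for 2.5 Flash |
| OpenRouter (google/gemini-2.5-flash) | $0.30 | $2.50 | QSP is 15% cheaper |
| OpenAI (GPT-4o-mini) | $0.15 | $0.60 | QSP is 254% more expensive on output (70% on input) |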
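
The percentages in the last column can be reproduced from the list prices. The following is a minimal sketch using only the per-1M figures quoted on this page; the function and dictionary names are illustrative, not part of any QuickSilver Pro API.

```python
# Per-1M-token list prices quoted on this page, in USD: (input, output).
PRICES = {
    "QuickSilver Pro": (0.255, 2.125),
    "OpenRouter": (0.30, 2.50),
    "OpenAI GPT-4o-mini": (0.15, 0.60),
}


def pct_diff(qsp_price: float, other_price: float) -> float:
    """Percent by which the QSP price differs from another price.

    Negative means QSP is cheaper, positive means QSP is more expensive.
    """
    return (qsp_price - other_price) / other_price * 100.0


def compare(provider: str) -> tuple[float, float]:
    """Return (input % difference, output % difference) of QSP vs. a provider."""
    qsp_in, qsp_out = PRICES["QuickSilver Pro"]
    other_in, other_out = PRICES[provider]
    return pct_diff(qsp_in, other_in), pct_diff(qsp_out, other_out)


for name in ("OpenRouter", "OpenAI GPT-4o-mini"):
    in_pct, out_pct = compare(name)
    print(f"{name}: input {in_pct:+.0f}%, output {out_pct:+.0f}%")
```

Against OpenRouter the difference is about 15% in QSP's favor on both input and output. Against GPT-4o-mini, a different and cheaper model, QSP's 2.5 Flash price is about 70% higher on input and about 254% higher on output.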

When to use

Default to 2.5 Flash when you need Gemini's strengths: multimodal input (images, audio-adjacent), a 1M context that fits a whole codebase or transcript, and thinking-by-default reasoning on the harder turns of an agentic loop. It is strong on coding evals (HumanEval, SWE-bench) and long-context retrieval, and a good drop-in for teams already on Gemini for other Google Cloud reasons.

When to use something else

For pure cheap high-volume chat, Gemini 2.5 Flash Lite ($0.085/$0.34) is about 3x cheaper on input and about 6x cheaper on output. For top-tier reasoning, escalate to Gemini 3.1 Pro Preview or DeepSeek R1. For non-Gemini coding at lower cost, DeepSeek V3 ($0.16/$0.616) or DeepSeek V4 Flash ($0.08/$0.16) beat 2.5 Flash on per-token economics.
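
To see how these per-token prices play out on a concrete workload, here is a rough sketch that prices a single request under each model. The prices are the ones quoted above; the dictionary keys are display names for illustration, not confirmed model IDs.

```python
# (input, output) USD per 1M tokens, as quoted on this page.
# Keys are display names chosen for this sketch, not API model IDs.
MODEL_PRICES = {
    "Gemini 2.5 Flash": (0.255, 2.125),
    "Gemini 2.5 Flash Lite": (0.085, 0.34),
    "DeepSeek V3": (0.16, 0.616),
    "DeepSeek V4 Flash": (0.08, 0.16),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at list price."""
    in_price, out_price = MODEL_PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Example workload: 10,000 input tokens and 1,000 output tokens per request.
for name in MODEL_PRICES:
    print(f"{name}: ${request_cost(name, 10_000, 1_000):.6f}")
```

For that workload, 2.5 Flash comes to about $0.0047 per request and Flash Lite to about $0.0012. Note that thinking models may emit reasoning tokens that count as output, so real output token counts can be higher than the visible answer suggests; how they are billed here is not stated on this page.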

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

The endpoint is OpenAI-compatible and serves the same model as OpenRouter, so migration is a one-line change to `base_url`.
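
The same request can be built in Python. This is a minimal sketch using only the standard library, so it makes no assumptions about any SDK; the helper names `build_chat_request` and `send` are hypothetical. The base URL, header names, and model ID are taken from the curl quickstart above.

```python
import json
import urllib.request

# Base URL taken from the curl quickstart on this page.
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"


def build_chat_request(base_url: str, api_key: str, model: str,
                       messages: list) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call would look like `send(build_chat_request(QSP_BASE_URL, os.environ["QSP_KEY"], "gemini-2.5-flash", [{"role": "user", "content": "Hello!"}]))`. Moving from another OpenAI-compatible provider then amounts to swapping the `base_url` argument. The OpenRouter row in the pricing table shows the ID `google/gemini-2.5-flash`, while the quickstart here uses `gemini-2.5-flash`, so the model string may need adjusting as well.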

FAQ

2.5 Flash accepts image inputs via the standard `content` array with `image_url` parts, the same as OpenAI's GPT-4o multimodal API. Multimodal requests are forwarded verbatim, with the same JSON shape on the way in and out. For image *generation* (not analysis), use gemini-2.5-flash-image instead; that is a different model SKU priced for image output.
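
As an illustration of that `content` array, the sketch below builds a user message with one text part and one `image_url` part in the OpenAI chat format. The helper names and the example URL are hypothetical. The base64 data-URL form is the convention OpenAI's API accepts; whether QuickSilver Pro accepts it for this model is an assumption, not something this page confirms.

```python
import base64


def image_message(prompt: str, image_url: str) -> dict:
    """User message with a text part followed by an image_url part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL (assumed to be accepted)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Request body for an image question; the URL is a placeholder.
payload = {
    "model": "gemini-2.5-flash",
    "messages": [
        image_message("What is in this picture?", "https://example.com/cat.png")
    ],
}
```

The resulting `payload` can be sent as the JSON body of the same `/v1/chat/completions` request shown in the quickstart.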

DeepSeek V4 Flash is cheaper than 2.5 Flash: $0.08/$0.16 per 1M tokens vs. $0.255/$2.125. Both think by default, but V4 Flash lets you suppress reasoning via `reasoning.enabled=false` (QuickSilver Pro honors it on DeepSeek models), whereas on 2.5 Flash that flag is silently dropped. So for cost-bound text workloads where you want a non-thinking path, V4 Flash wins. 2.5 Flash earns its premium for multimodal input and Google-ecosystem-aligned tasks.
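
Because the flag is dropped silently on 2.5 Flash, a client may want to catch that case before sending. The sketch below is one way to do it, assuming the request field is `reasoning: {"enabled": false}` as written on this page. The ID `deepseek-v4-flash` is a hypothetical placeholder, since this page does not give the DeepSeek model ID.

```python
import warnings

# Models where, per this page, reasoning.enabled=false is honored.
# "deepseek-v4-flash" is a placeholder ID, not confirmed by this page.
HONORS_REASONING_FLAG = {"deepseek-v4-flash"}


def build_payload(model: str, messages: list, thinking: bool = True) -> dict:
    """Build a chat payload, attaching the reasoning flag only where honored."""
    payload = {"model": model, "messages": messages}
    if not thinking:
        if model in HONORS_REASONING_FLAG:
            payload["reasoning"] = {"enabled": False}
        else:
            # The flag would be dropped silently, so surface it client-side.
            warnings.warn(
                f"{model} does not honor reasoning.enabled=false; "
                "the request will still use thinking."
            )
    return payload
```

Calling `build_payload("gemini-2.5-flash", messages, thinking=False)` emits a warning and leaves the flag out, which makes the always-on behavior visible instead of hidden.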

QuickSilver Pro lists 2.5 Flash at $0.255 input / $2.125 output per 1M tokens, about 15% below Vertex retail and OpenRouter's $0.30/$2.50. You get unified billing across the Gemini, DeepSeek, Qwen, and Kimi catalogs through one OpenAI-compatible key, plus referral rewards and `usage.cost` accounting.
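
For the `usage.cost` accounting, a client-side tally might look like the sketch below. It assumes the field appears as `response["usage"]["cost"]` in each decoded JSON response and is denominated in USD; neither the exact location nor the unit is confirmed on this page. The function name is hypothetical.

```python
def total_cost(responses: list) -> float:
    """Sum usage.cost across decoded responses, skipping any without it."""
    total = 0.0
    for response in responses:
        usage = response.get("usage") or {}
        cost = usage.get("cost")
        if cost is not None:
            total += float(cost)
    return total
```

Responses that lack a `usage` object or a `cost` field are skipped rather than treated as errors, so the tally is a lower bound if some responses omit the field.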

Try Gemini 2.5 Flash with double credits — up to $50 free
