GA · 1M context · Thinks by default

Gemini 3.5 Flash on QuickSilver Pro

Gemini 3.5 Flash is Google's next-generation Flash model: GA (not preview), with a 1M-token context window and thinking on by default. It sits between 3 Flash Preview and 3.1 Pro Preview on both capability and price. On QuickSilver Pro it lists at $1.275 input / $7.65 output per 1M tokens, about 15% below both Vertex retail and OpenRouter ($1.50/$9.00). Thinking is always on for 3.5 Flash through QuickSilver Pro; for a non-thinking Gemini, use 2.5 Flash Lite ($0.085/$0.34).

$1.275 input · $7.65 output per 1M tokens
By Raullen Chai · Updated

At a glance

Context: 1M tokens
Input / 1M: $1.275
Output / 1M: $7.65
Thinks by default: Yes

Production-grade reasoning Flash — GA stability with 3.x quality, between 3 Flash and 3.1 Pro on cost.

Pricing comparison ($/1M tokens)

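| Provider | Input | Output | QSP vs provider |
| --- | --- | --- | --- |
| QuickSilver Pro | $1.275 | $7.65 | cheapest |
| OpenRouter (google/gemini-3.5-flash) | $1.50 | $9.00 | QSP is 15% cheaper |
| OpenAI (GPT-4o) | $2.50 | $10.00 | QSP is 23.5% cheaper on output, 49% on input |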
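
To make the comparison concrete, here is a minimal sketch that turns the listed rates into a per-request cost. The prices are the ones quoted on this page; the 20k-input / 2k-output request is a hypothetical workload, and the helper names are illustrative only.

```python
# USD per 1M tokens as (input, output), copied from this page.
PRICES = {
    "QuickSilver Pro": (1.275, 7.65),
    "OpenRouter": (1.50, 9.00),
    "OpenAI (GPT-4o)": (2.50, 10.00),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed per-1M-token rates."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def savings_pct(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Percent saved by using QuickSilver Pro instead of `provider`."""
    qsp = request_cost("QuickSilver Pro", input_tokens, output_tokens)
    other = request_cost(provider, input_tokens, output_tokens)
    return (1 - qsp / other) * 100


# Hypothetical request: 20k input tokens, 2k output tokens.
for name in PRICES:
    cost = request_cost(name, 20_000, 2_000)
    saved = savings_pct(name, 20_000, 2_000)
    print(f"{name}: ${cost:.4f} per request ({saved:.1f}% saved with QSP)")
```

Against OpenRouter the saving is 15% for any input/output mix, because both rates are 15% lower. Against GPT-4o it depends on the mix: 49% on input tokens, 23.5% on output tokens.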

When to use

Pick 3.5 Flash when you need 3.x-quality reasoning on a GA model that won't shift semantics under you: production agentic loops with non-trivial reasoning per turn, coding agents that need stronger-than-2.5-Flash quality, and long-context analysis where the added cost of thinking on each turn is acceptable. The GA label means Google has committed to stable behavior and pricing, so it is fine to ship in revenue-critical paths, unlike 3 Flash Preview or 3.1 Pro Preview.

When to use something else

For routine high-volume chat, 2.5 Flash ($0.255/$2.125) or 2.5 Flash Lite ($0.085/$0.34) is much cheaper. For top-tier reasoning where 3.1 Pro's deeper thinking matters, the Pro preview is the right escalation. For non-thinking Gemini (predictable token budgets), only 2.5 Flash Lite ships without reasoning: 3.5 Flash thinks by default, and `reasoning.enabled=false` is silently dropped.
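
Because the flag is dropped silently, a client-side check can catch the mismatch before a request goes out. The sketch below is one possible guard, not part of any QuickSilver Pro SDK: it assumes `reasoning.enabled=false` maps to a nested `{"reasoning": {"enabled": false}}` field in the request body, and the function name is hypothetical.

```python
# Models that always think on QuickSilver Pro, per this page.
ALWAYS_THINKING = {"gemini-3.5-flash"}


def reasoning_flag_ignored(payload: dict) -> bool:
    """True when the payload asks to disable reasoning on an always-thinking model.

    Assumes the flag is sent as payload["reasoning"]["enabled"] (an assumption
    based on the `reasoning.enabled=false` notation above).
    """
    reasoning = payload.get("reasoning") or {}
    return payload.get("model") in ALWAYS_THINKING and reasoning.get("enabled") is False


payload = {
    "model": "gemini-3.5-flash",
    "reasoning": {"enabled": False},
    "messages": [{"role": "user", "content": "Hello!"}],
}
if reasoning_flag_ignored(payload):
    print("warning: reasoning.enabled=false has no effect on", payload["model"])
```

Failing loudly here keeps token budgets predictable: the caller can either accept the thinking cost or switch to a non-thinking model.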

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.5-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

The endpoint is OpenAI-compatible and serves the same model as OpenRouter, so migrating is a one-line change to base_url.
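
The same request can be sketched in Python with only the standard library. The base URL, model name, and `QSP_KEY` variable come from the curl quickstart; the `build_chat_request` helper is hypothetical, and the response parsing assumes the usual OpenAI-style `choices[0].message.content` shape. Nothing is sent unless `QSP_KEY` is set.

```python
import json
import os
import urllib.request

QSP_BASE_URL = "https://api.quicksilverpro.io/v1"  # from the curl quickstart


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    key = os.environ.get("QSP_KEY")
    if key:  # only call the API when a key is configured
        req = build_chat_request(QSP_BASE_URL, key, "gemini-3.5-flash", "Hello!")
        with urllib.request.urlopen(req, timeout=60) as resp:
            reply = json.load(resp)
        # Assumes an OpenAI-compatible response body.
        print(reply["choices"][0]["message"]["content"])
```

Switching providers then means passing a different `base_url`. Note that the pricing table lists the OpenRouter model ID as `google/gemini-3.5-flash`, so the model string may need to change as well.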

FAQ

3.5 Flash is GA: Google has committed to stable output formats, pricing, and behavior, whereas 3 Flash Preview can change without notice. 3.5 Flash is also 3x more expensive ($1.275/$7.65 per 1M vs 3 Flash Preview's $0.425/$2.55), reflecting the stronger reasoning quality and production stability. Run side-by-side evals on your task; Preview is fine for prototyping, but use 3.5 for shipping.

3.1 Pro Preview is the flagship reasoning tier — deeper thinking, harder problems, $2/$12 per 1M at Vertex retail. 3.5 Flash is the production Flash tier — strong reasoning, lower cost ($1.50/$9 Vertex retail), and GA-stable. Use 3.5 Flash for production traffic that needs reasoning; escalate to 3.1 Pro for evals on the hardest reasoning tasks. Both think by default; neither exposes a thinking-disable toggle through QuickSilver Pro yet.

QuickSilver Pro lists 3.5 Flash at $1.275 input / $7.65 output per 1M tokens — ~15% below Vertex retail and OpenRouter's $1.50/$9.00. One bill, one OpenAI-compatible key across 14 models, referral rewards, and `usage.cost` accounting per response.
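
As a rough illustration of per-response accounting, the sketch below sums `usage.cost` across responses and cross-checks it against the listed rates. Only the `usage.cost` field name comes from this page; the `prompt_tokens` and `completion_tokens` fields follow the usual OpenAI-style usage object, the cost is assumed to be in USD, and the sample responses are made up for the example.

```python
QSP_INPUT_PER_1M = 1.275   # USD per 1M input tokens, from this page
QSP_OUTPUT_PER_1M = 7.65   # USD per 1M output tokens, from this page


def expected_cost(usage: dict) -> float:
    """Cost implied by the token counts at the listed 3.5 Flash rates."""
    return (
        usage["prompt_tokens"] * QSP_INPUT_PER_1M
        + usage["completion_tokens"] * QSP_OUTPUT_PER_1M
    ) / 1_000_000


def total_reported_cost(responses: list) -> float:
    """Sum the usage.cost field across response bodies."""
    return sum(r["usage"]["cost"] for r in responses)


# Hypothetical response bodies, trimmed to the usage object.
responses = [
    {"usage": {"prompt_tokens": 1000, "completion_tokens": 500, "cost": 0.0051}},
    {"usage": {"prompt_tokens": 4000, "completion_tokens": 1000, "cost": 0.01275}},
]

print(f"reported total: ${total_reported_cost(responses):.5f}")
for r in responses:
    drift = abs(r["usage"]["cost"] - expected_cost(r["usage"]))
    print("matches listed rates" if drift < 1e-9 else f"differs by ${drift:.6f}")
```

Whether thinking tokens are counted inside `completion_tokens` is not stated here, so a mismatch in the cross-check is a prompt to look at the usage object rather than proof of a billing error.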

Try Gemini 3.5 Flash with double credits — up to $50 free
