Gemini 3.5 Flash on QuickSilver Pro
Gemini 3.5 Flash is Google's next-generation Flash model — GA (not preview), 1M-token context, thinks by default. Sits between 3 Flash Preview and 3.1 Pro Preview on both capability and price. On QuickSilver Pro it lists at $1.275 input / $7.65 output per 1M tokens, ~15% below Vertex retail and OpenRouter's $1.50/$9.00. Thinking is always-on for 3.5 Flash through QuickSilver Pro; for non-thinking Gemini use 2.5 Flash Lite ($0.085/$0.34).
At a glance
Production-grade reasoning Flash — GA stability with 3.x quality, between 3 Flash and 3.1 Pro on cost.
Pricing comparison ($/1M tokens)
| Provider | Input | Output | vs QSP |
|---|---|---|---|
| QuickSilver Pro | $1.27 | $7.65 | cheapest |
| OpenRouter (google/gemini-3.5-flash) | $1.50 | $9.00 | 15% cheaper |
| OpenAI (GPT-4o) | $2.50 | $10.00 | 23% cheaper |
When to use
Pick 3.5 Flash when you need 3.x-quality reasoning on a GA model that won't shift semantics under you: production agentic loops with non-trivial reasoning per turn, coding agents that need stronger-than-2.5-Flash quality, long-context analysis where a one-turn-thinking spend is acceptable. The GA label means Google has committed to stable behavior and pricing — fine to ship in revenue-critical paths, unlike 3 Flash Preview or 3.1 Pro Preview.
When to use something else
For routine high-volume chat, 2.5 Flash ($0.255/$2.125) or Flash Lite ($0.085/$0.34) are much cheaper. For top-tier reasoning where 3.1 Pro's deeper thinking matters, the Pro preview is the right escalation. For non-thinking Gemini (predictable token budgets), only Flash Lite ships without reasoning — 3.5 Flash thinks by default and `reasoning.enabled=false` is silently dropped.
Quickstart (curl)
curl https://api.quicksilverpro.io/v1/chat/completions \
-H "Authorization: Bearer $QSP_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3.5-flash",
"messages": [{"role": "user", "content": "Hello!"}]
}'OpenAI-compatible. Same model as OpenRouter; one-line migration via base_url.
FAQ
3.5 Flash is GA — Google has committed to stable output formats, pricing, and behavior. 3 Flash Preview can change without notice. 3.5 Flash is also ~2.5-3x more expensive ($1.275/$7.65 per 1M vs 3 Flash Preview's $0.425/$2.55), reflecting the stronger reasoning quality and production stability. Run side-by-side evals on your task; for prototyping, Preview is fine, for shipping use 3.5.
3.1 Pro Preview is the flagship reasoning tier — deeper thinking, harder problems, $2/$12 per 1M at Vertex retail. 3.5 Flash is the production Flash tier — strong reasoning, lower cost ($1.50/$9 Vertex retail), and GA-stable. Use 3.5 Flash for production traffic that needs reasoning; escalate to 3.1 Pro for evals on the hardest reasoning tasks. Both think by default; neither exposes a thinking-disable toggle through QuickSilver Pro yet.
QuickSilver Pro lists 3.5 Flash at $1.275 input / $7.65 output per 1M tokens — ~15% below Vertex retail and OpenRouter's $1.50/$9.00. One bill, one OpenAI-compatible key across 14 models, referral rewards, and `usage.cost` accounting per response.