Gemini 3.1 Pro Preview on QuickSilver Pro

Name: Gemini 3.1 Pro Preview on QuickSilver Pro
Brand: QuickSilver Pro
Price: 1.7 USD
Availability: InStock

Gemini 3.1 Pro Preview is Google's flagship reasoning model — 1M-token context window, deep thinking by default, multimodal foundation. On QuickSilver Pro it lists at $1.70 input / $10.20 output per 1M tokens, ~15% below Google's Vertex retail and OpenRouter's $2/$12. This is a preview API; semantics may shift before GA.

$1.70 input · $10.20 output per 1M tokens

ByRaullen Chai·Updated May 29, 2026

At a glance

Context

1M tokens

Input / 1M

$1.70

Output / 1M

$10.20

Thinks by default

Yes

Long-context reasoning, multi-step analysis, complex code review — at o1-class quality and ~6x lower output cost.

Pricing comparison ($/1M tokens)

Provider	Input	Output	vs QSP
QuickSilver Pro	$1.70	$10.20	cheapest
OpenRouter (google/gemini-3.1-pro-preview)	$2.00	$12.00	15% cheaper
OpenAI (o1)	$15.00	$60.00	83% cheaper

When to use

Reach for Gemini 3.1 Pro Preview when the task genuinely benefits from chain-of-thought over a long context: large-codebase review, multi-document legal/research synthesis, complex agentic planning with rich memory, mathematical reasoning over a long problem. The 1M context fits a substantial corpus without chunking, and the thinking trace adds answer quality on hard problems. Equivalent quality tier to OpenAI o1 at a fraction of the cost.

When to use something else

Don't default to 3.1 Pro Preview for routine chat or short-prompt classification — it thinks by default, so a one-line question can burn 100-200 reasoning tokens before the visible answer. For high-volume cheap chat, Gemini 2.5 Flash Lite ($0.085/$0.34) or DeepSeek V4 Flash ($0.08/$0.16) win on cost-per-task. For non-Gemini-flavored reasoning, DeepSeek R1 ($0.56/$2.00) is ~3× cheaper on input and ~5× cheaper on output.

Quickstart (curl)

shellGet an API key →

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.1-pro-preview",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

OpenAI-compatible. Same model as OpenRouter; one-line migration via base_url.

FAQ

Is Gemini 3.1 Pro Preview stable enough for production?

It's a preview API — Google labels it as such, and semantics may shift before GA. Output formats, thinkingConfig handling, and rate limits can change without notice. For prototyping and evals, ship today. For revenue-critical paths, pin to a Gemini GA model (e.g. Gemini 2.5 Flash) and treat 3.1 Pro as a behind-flag experiment until Google promotes it.

Why does a one-token question return 150+ tokens of usage?

Gemini 3.1 Pro Preview thinks by default — the model emits a long reasoning trace before its final answer, and usage.completion_tokens includes both reasoning_tokens and text_tokens. A simple "what is 7+8?" returned 160 reasoning + 3 text tokens in our smoke test. Budget max_tokens accordingly: anything under ~200 tokens risks the model running out of budget on the thinking phase and returning an empty answer.

Can I disable thinking to get cheaper output?

Not yet, on QuickSilver Pro — `reasoning: { enabled: false }` is silently dropped on Gemini 3.1 Pro. For non-thinking Gemini, use gemini-2.5-flash-lite ($0.085/$0.34) which is genuinely non-thinking. Note: 2.5 Flash thinks by default too — only Flash Lite ships without reasoning.

Why is QuickSilver Pro cheaper than OpenRouter or Vertex direct on Gemini 3.1 Pro?

QuickSilver Pro lists Gemini 3.1 Pro Preview at $1.70 input / $10.20 output per 1M tokens — about 15% below Google's Vertex retail and OpenRouter's $2/$12. You pay one bill across 14 models on a single OpenAI-compatible key, with `usage.cost` accounting per response. Switching from another provider is a base_url + key swap on the OpenAI SDK with `model="gemini-3.1-pro-preview"`.

Try Gemini 3.1 Pro Preview with double credits — up to $50 free

Get API Key