Gemini 3.1 Flash Lite on QuickSilver Pro

Gemini 3.1 Flash Lite is Google's newest cost-efficient model and the cheapest in the 3.x generation: 1M-token context, fast, and built for high-volume, latency-sensitive workloads. It is non-thinking by default, so token budgets stay predictable. On QuickSilver Pro it lists at $0.2125 input / $1.275 output per 1M tokens, 15% below Vertex retail ($0.25/$1.50).

$0.2125 input · $1.275 output per 1M tokens

By Raullen Chai · Updated May 29, 2026

At a glance

Context: 1M tokens
Input / 1M: $0.2125
Output / 1M: $1.275
Thinks by default: No

Newest low-cost workhorse — predictable non-thinking output, 1M context, built for volume.

Pricing comparison ($/1M tokens)

Provider	Input	Output	QSP vs. provider
QuickSilver Pro	$0.2125	$1.275	baseline
OpenRouter (google/gemini-3.1-flash-lite)	$0.25	$1.50	QSP is 15% cheaper
OpenAI (GPT-4o mini)	$0.15	$0.60	QSP is about 42% (input) / 112.5% (output) more expensive
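
The percentages in this comparison can be reproduced with a plain relative-difference calculation: QuickSilver Pro's price minus the other provider's price, divided by the other provider's price. The sketch below uses only the prices listed in the table; the function name is illustrative and not part of any API.

```python
# Prices in USD per 1M tokens, copied from the comparison table.
QSP = {"input": 0.2125, "output": 1.275}
OPENROUTER = {"input": 0.25, "output": 1.50}
GPT_4O_MINI = {"input": 0.15, "output": 0.60}


def qsp_percent_vs(other_price, qsp_price):
    """Percent by which the QSP price differs from other_price.

    Negative means QSP is cheaper; positive means QSP is more expensive.
    """
    return (qsp_price - other_price) / other_price * 100


for name, other in (("OpenRouter", OPENROUTER), ("GPT-4o mini", GPT_4O_MINI)):
    for kind in ("input", "output"):
        diff = qsp_percent_vs(other[kind], QSP[kind])
        print(f"QSP vs {name} ({kind}): {diff:+.1f}%")
```

A negative result means QuickSilver Pro is cheaper than the provider in that row; a positive result means it is more expensive.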

When to use

Use 3.1 Flash Lite for high-volume, cost-sensitive work where you don't need a reasoning trace: routing and classification, extraction, summarization, simple chat, and agent sub-tasks where latency and price beat raw reasoning depth. Non-thinking by default means output tokens are predictable — easy to budget at scale.
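
Because output length is predictable, a budget can be estimated from request volume and average token counts. The following is a minimal sketch using the listed QuickSilver Pro prices; the workload figures (1M requests per day, 500 input and 50 output tokens per request) are hypothetical, chosen only to illustrate the arithmetic.

```python
# Listed QuickSilver Pro prices for Gemini 3.1 Flash Lite, USD per 1M tokens.
INPUT_USD_PER_M = 0.2125
OUTPUT_USD_PER_M = 1.275


def estimate_cost_usd(input_tokens, output_tokens):
    """Cost in USD for the given total token counts."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


def monthly_budget_usd(requests_per_day, input_per_request,
                       output_per_request, days=30):
    """Projected spend over `days` for a steady workload."""
    total_input = requests_per_day * input_per_request * days
    total_output = requests_per_day * output_per_request * days
    return estimate_cost_usd(total_input, total_output)


# Hypothetical classification workload: 1M requests/day, short prompts and outputs.
budget = monthly_budget_usd(1_000_000, 500, 50)
print(f"Estimated 30-day spend: ${budget:,.2f}")
```

At these assumed volumes, input accounts for $106.25 per day and output for $63.75 per day, so even short outputs make up a sizable share of the bill.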

When to use something else

For multi-step reasoning, hard coding, or analysis, step up to 3.5 Flash ($1.275/$7.65) or a Pro tier — Flash Lite trades depth for cost. If you specifically want a thinking model with 1M context at low cost, 2.5 Flash ($0.255/$2.125) reasons by default. For image generation, use the Gemini image models or FLUX.

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.1-flash-lite",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

OpenAI-compatible. Same model as OpenRouter; one-line migration via base_url.
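
As a rough illustration of what switching base URLs involves, here is a standard-library Python sketch that builds the same request as the curl command. The base URL is inferred from the curl endpoint above; the helper names `build_chat_request` and `send` are hypothetical, and the sketch assumes the endpoint accepts the standard OpenAI chat-completions JSON body.

```python
import json
import urllib.request

# Assumption: base URL inferred from the curl example's endpoint.
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"


def build_chat_request(base_url, api_key, model, prompt):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (requires a real key in the QSP_KEY environment variable):
#   import os
#   req = build_chat_request(QSP_BASE_URL, os.environ["QSP_KEY"],
#                            "gemini-3.1-flash-lite", "Hello!")
#   print(send(req))
```

Pointing the same helper at another OpenAI-compatible provider would only require a different `base_url` and that provider's model ID, for example `google/gemini-3.1-flash-lite` on OpenRouter.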

FAQ

Does Gemini 3.1 Flash Lite think by default?

No — Flash Lite is the non-thinking tier, so it answers directly without a reasoning trace. That keeps output token counts (and cost) predictable, which is exactly what high-volume workloads want. If you need reasoning, 2.5 Flash or 3.5 Flash think by default.

How is 3.1 Flash Lite different from 2.5 Flash Lite?

3.1 Flash Lite is the newer generation, with improved quality in the same low-cost tier. 2.5 Flash Lite ($0.085/$0.34) is still the cheapest Gemini overall; 3.1 Flash Lite ($0.2125/$1.275) costs 2.5x as much on input and 3.75x as much on output but brings 3.x-generation improvements. Run both on your task; for the cheapest possible routing and classification, 2.5 Flash Lite still wins.
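
To put the price gap between the two Lite models in concrete terms, the sketch below computes the per-token price ratios and the cost of one workload on each model. Prices are the ones quoted above; the workload size (100M input and 10M output tokens) is a hypothetical example.

```python
# (input, output) prices in USD per 1M tokens, as quoted in this FAQ answer.
PRICES = {
    "2.5 Flash Lite": (0.085, 0.34),
    "3.1 Flash Lite": (0.2125, 1.275),
}


def price_ratios(model, baseline):
    """Return (input_ratio, output_ratio) of model prices over baseline prices."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return model_in / base_in, model_out / base_out


def workload_cost_usd(model, input_tokens, output_tokens):
    """Cost in USD of running the given token volume on one model."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


in_ratio, out_ratio = price_ratios("3.1 Flash Lite", "2.5 Flash Lite")
print(f"Input ratio: {in_ratio:.2f}x, output ratio: {out_ratio:.2f}x")

# Hypothetical workload: 100M input tokens, 10M output tokens.
for model in PRICES:
    cost = workload_cost_usd(model, 100_000_000, 10_000_000)
    print(f"{model}: ${cost:.2f}")
```

Whether the quality improvement justifies the higher price depends on the task, so this calculation is best paired with an evaluation on your own data.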

Why is QuickSilver Pro cheaper than Vertex direct?

QuickSilver Pro lists 3.1 Flash Lite at $0.2125 input / $1.275 output per 1M tokens — ~15% below Vertex retail's $0.25/$1.50. One OpenAI-compatible key across 18 models, one bill, and a `usage.cost` field on every response so you can reconcile spend per request.
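
One way to use the `usage.cost` field for reconciliation is to compare each reported cost with the cost implied by the token counts. The sketch below makes several assumptions: that `usage.cost` is denominated in USD, that token counts appear under the OpenAI-style `prompt_tokens` and `completion_tokens` keys, and that responses have already been parsed into dictionaries. The sample responses are made up for illustration.

```python
# Listed prices for Gemini 3.1 Flash Lite, USD per 1M tokens.
INPUT_USD_PER_M = 0.2125
OUTPUT_USD_PER_M = 1.275


def expected_cost_usd(usage):
    """Cost implied by token counts (assumes OpenAI-style usage keys)."""
    return (usage["prompt_tokens"] * INPUT_USD_PER_M
            + usage["completion_tokens"] * OUTPUT_USD_PER_M) / 1_000_000


def total_reported_cost(responses):
    """Sum of usage.cost across responses (assumed to be in USD)."""
    return sum(r["usage"]["cost"] for r in responses)


def find_mismatches(responses, tolerance=1e-9):
    """Indices of responses whose reported cost differs from the expected cost."""
    return [
        i for i, r in enumerate(responses)
        if abs(r["usage"]["cost"] - expected_cost_usd(r["usage"])) > tolerance
    ]


# Hypothetical parsed responses; the second has a deliberately wrong cost.
sample_responses = [
    {"usage": {"prompt_tokens": 1000, "completion_tokens": 200, "cost": 0.0004675}},
    {"usage": {"prompt_tokens": 2000, "completion_tokens": 100, "cost": 0.001}},
]

print("Total reported:", total_reported_cost(sample_responses))
print("Mismatched response indices:", find_mismatches(sample_responses))
```

In practice the tolerance would need to account for however the provider rounds `usage.cost`, which is not specified here.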

Try Gemini 3.1 Flash Lite with double credits — up to $50 free
