Qwen 3.6 Plus on QuickSilver Pro
Qwen 3.6 Plus is Alibaba's 1-trillion-parameter MoE model, positioned between 3.6-35B and 3.7 Max, with a 1M-token context window and thinking on by default. On QuickSilver Pro it costs $0.26 input / $1.56 output per 1M tokens: ~20% below OpenRouter and ~6× cheaper than GPT-4o on output (~10× on input). It is the sweet spot for teams that want 3.5/3.6-35B's price economics with more reasoning depth and a million-token context.
At a glance
Production reasoning + long-context RAG at ~6× lower output cost than GPT-4o (84% cheaper).
Pricing comparison ($/1M tokens)
| Provider | Input | Output | QSP saving (output) |
|---|---|---|---|
| QuickSilver Pro | $0.26 | $1.56 | baseline (cheapest) |
| OpenRouter (qwen/qwen3.6-plus) | $0.325 | $1.95 | QSP is 20% cheaper |
| OpenAI (GPT-4o) | $2.50 | $10.00 | QSP is 84% cheaper |
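The savings figures in the table follow directly from the listed prices. As a minimal sketch of that arithmetic (the function names and the example workload are illustrative assumptions, not part of any QuickSilver Pro SDK):

```python
# Prices in USD per 1M tokens as listed on this page
# (OpenRouter input uses the $0.325 figure quoted in the FAQ).
PRICES = {
    "QuickSilver Pro": (0.26, 1.56),
    "OpenRouter": (0.325, 1.95),
    "OpenAI GPT-4o": (2.50, 10.00),
}


def cost_usd(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one workload at a provider's listed per-1M-token prices."""
    price_in, price_out = PRICES[provider]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def output_saving_pct(baseline: str, provider: str = "QuickSilver Pro") -> float:
    """Percent by which `provider` undercuts `baseline` on output price."""
    return (1 - PRICES[provider][1] / PRICES[baseline][1]) * 100


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens, 0.5M output tokens.
    for name in PRICES:
        print(f"{name}: ${cost_usd(name, 2_000_000, 500_000):.2f}")
    print(f"vs OpenRouter: {output_saving_pct('OpenRouter'):.0f}% lower on output")
    print(f"vs GPT-4o: {output_saving_pct('OpenAI GPT-4o'):.0f}% lower on output")
```

Swap in your own token counts to see how the gap scales; output-heavy workloads benefit most because the output price dominates.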
When to use
3.6 Plus sits in the production-default slot: the price step up from 3.6-35B is too small to justify skipping it, and it is too cheap to ignore for reasoning-heavy workloads. Use it for multi-document RAG over 100K+ tokens, customer-facing chat where you want a noticeable answer-quality bump over 3.6-35B without 3.7 Max's price, agentic workflows where the per-call latency penalty of thinking is acceptable, and Chinese-language tasks where Qwen's training-data advantage compounds.
When to use something else
For deterministic, no-thinking production chat where token budgets must be tight, DeepSeek V3 ($0.16/$0.616) is the cheaper non-thinking choice. For frontier-level reasoning where 3.6 Plus visibly fails on your eval, step up to 3.7 Max or V4 Pro. Like the rest of the 3.6/3.7 family, the model itself thinks by default, but the QuickSilver Pro gateway sends `reasoning.enabled=false` unless told otherwise, which suppresses the trace. Pass `reasoning.enabled=true` to opt in.
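A minimal sketch of opting in to thinking from Python. The text names the flag as `reasoning.enabled`; the nested JSON shape used here is an assumption about how that dotted name maps onto the request body, so check the gateway's API reference before relying on it. The helper name `build_chat_payload` is hypothetical.

```python
import json


def build_chat_payload(prompt: str, thinking: bool = False) -> dict:
    """Build a chat-completions request body for qwen3.6-plus.

    ASSUMPTION: the dotted name `reasoning.enabled` maps to a nested
    {"reasoning": {"enabled": ...}} object in the JSON body.
    """
    payload = {
        "model": "qwen3.6-plus",
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking:
        # Opt in explicitly; per the text, the gateway otherwise sends false.
        payload["reasoning"] = {"enabled": True}
    return payload


if __name__ == "__main__":
    body = build_chat_payload("Is 221 prime? Explain briefly.", thinking=True)
    print(json.dumps(body, indent=2))
```

Leaving the field out when `thinking` is false keeps the request identical to the quickstart and relies on the gateway default described above.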
Quickstart (curl)
curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6-plus",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

OpenAI-compatible. Same model as OpenRouter; one-line migration via base_url.
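The same request can be sketched in Python with only the standard library. The endpoint, model ID, and headers come from the curl quickstart; the response parsing assumes the usual OpenAI chat-completions shape (`choices[0].message.content`), which "OpenAI-compatible" suggests but this page does not spell out. The helper names are hypothetical.

```python
import json
import os
import urllib.request

API_URL = "https://api.quicksilverpro.io/v1/chat/completions"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble the same POST the curl quickstart sends, without sending it."""
    body = json.dumps({
        "model": "qwen3.6-plus",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> str:
    """Send the request and return the first choice's text.

    ASSUMPTION: the response follows the OpenAI chat-completions shape.
    """
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]


# Only hits the network when run as a script with QSP_KEY set.
if __name__ == "__main__" and os.environ.get("QSP_KEY"):
    print(send(build_request("Hello!", os.environ["QSP_KEY"])))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without spending tokens.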
FAQ
3.6-35B is the cost leader (35B params, 3B active MoE, 262K context, $0.12/$0.80 per 1M). 3.6 Plus is the next step up: 1T-parameter MoE, 1M context, $0.26/$1.56 per 1M. Both share the same 'thinks by default' behavior. The price gap is meaningful (~2× on input, ~2× on output), but the answer-quality bump on hard reasoning and the ~4× larger context window justify it on workloads where 3.6-35B's 35B/3B-active footprint hits its ceiling.
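One way to act on this trade-off is to route each request to the cheapest tier that fits. The sketch below is illustrative only: `qwen3.6-35b` is a hypothetical model ID (only `qwen3.6-plus` appears in the quickstart), the context sizes are the rounded figures quoted above, and it assumes the context window must hold prompt plus output tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    model_id: str
    context_tokens: int  # context window as quoted in the text (rounded)
    price_in: float      # USD per 1M input tokens
    price_out: float     # USD per 1M output tokens


# Ordered cheapest first. "qwen3.6-35b" is a HYPOTHETICAL ID for illustration.
TIERS = [
    Tier("qwen3.6-35b", 262_000, 0.12, 0.80),
    Tier("qwen3.6-plus", 1_000_000, 0.26, 1.56),
]


def pick_tier(prompt_tokens: int, max_output_tokens: int,
              needs_hard_reasoning: bool = False) -> Optional[Tier]:
    """Return the cheapest tier that fits the request, or None if none fits."""
    needed = prompt_tokens + max_output_tokens
    for tier in TIERS:
        if needed > tier.context_tokens:
            continue  # request does not fit this tier's window
        if needs_hard_reasoning and tier.model_id != "qwen3.6-plus":
            continue  # caller's eval says the smaller model falls short
        return tier
    return None
```

For example, `pick_tier(400_000, 4_000)` skips the 262K tier and selects `qwen3.6-plus`; the `needs_hard_reasoning` flag stands in for whatever signal your own eval produces.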
Yes. Within the Qwen family, 3.6 Plus is the production default and 3.7 Max is the flagship. The price ratio is ~7.7× on input and ~3.8× on output (3.6 Plus $0.26/$1.56 vs 3.7 Max $2.0/$6.0). Most teams should start on 3.6 Plus, then A/B 3.7 Max on the subset of requests where 3.6 Plus's answers visibly degrade. Both share the same 1M context and OpenAI-compatible surface, so swapping is just a model-ID change.
OpenRouter lists Qwen 3.6 Plus at $0.325 input / $1.95 output per 1M tokens. QuickSilver Pro is $0.26 / $1.56, which is 20% below on both legs. Same OpenAI-compatible surface; migration is a base_url + key swap (drop the `qwen/` provider prefix from the model ID).
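The migration steps above can be sketched as a small config transform. The config keys (`base_url`, `api_key`, `model`) are assumed names mirroring common OpenAI-compatible client settings, and the QuickSilver Pro base URL is inferred from the quickstart endpoint by dropping `/chat/completions`; treat both as assumptions rather than documented API.

```python
# ASSUMPTION: base URL inferred from the quickstart endpoint.
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"


def to_qsp_model_id(openrouter_model_id: str) -> str:
    """Drop the `qwen/` provider prefix (qwen/qwen3.6-plus -> qwen3.6-plus)."""
    return openrouter_model_id.removeprefix("qwen/")  # Python 3.9+


def migrate_config(config: dict, qsp_key: str) -> dict:
    """Return a copy of an OpenRouter-style config pointed at QuickSilver Pro.

    Only base_url, api_key, and model change; other settings are kept.
    """
    return {
        **config,
        "base_url": QSP_BASE_URL,
        "api_key": qsp_key,
        "model": to_qsp_model_id(config["model"]),
    }


if __name__ == "__main__":
    old = {
        "base_url": "https://openrouter.ai/api/v1",
        "api_key": "<openrouter-key>",
        "model": "qwen/qwen3.6-plus",
        "temperature": 0.2,
    }
    print(migrate_config(old, "<qsp-key>"))
```

Returning a new dict instead of mutating the old one makes it easy to keep both configs around while comparing providers.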