New · 262K context · MoE upgrade

Qwen 3.6-35B-A3B on QuickSilver Pro

Qwen 3.6-35B-A3B is Alibaba's April 2026 MoE refresh — 35B params, 3B active per token, 262K context window, and a meaningful reasoning bump over 3.5 at the same architectural footprint. On QuickSilver Pro it's $0.12 input / $0.80 output per 1M tokens, ~20% below OpenRouter, and ~12–21x cheaper than GPT-4o on the long-context RAG tasks it's actually built for.

$0.12 input · $0.80 output per 1M tokens
By Raullen Chai

At a glance

Context: 262K tokens
Input / 1M: $0.12
Output / 1M: $0.80
Thinks by default: Yes

Long-context RAG, document summarization, drop-in 3.5 upgrade with reasoning.

Pricing comparison ($/1M tokens)

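| Provider | Input | Output | QSP vs provider |
| --- | --- | --- | --- |
| QuickSilver Pro | $0.12 | $0.80 | cheapest |
| OpenRouter (qwen/qwen3.6-35b-a3b) | $0.15 | $1.00 | QSP is 20% cheaper |
| OpenAI (GPT-4o) | $2.50 | $10.00 | QSP is 92% cheaper |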
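To see how the per-1M-token prices translate into the cost of a single request, here is a minimal Python sketch. The prices are copied from the table above; the workload (a 100K-token prompt with a 2K-token answer) and the function names are illustrative assumptions, not measurements.

```python
# Prices in USD per 1M tokens (input, output), copied from the comparison table.
PRICES = {
    "QuickSilver Pro": (0.12, 0.80),
    "OpenRouter": (0.15, 1.00),
    "OpenAI GPT-4o": (2.50, 10.00),
}


def request_cost(provider, input_tokens, output_tokens):
    """Cost in USD of one request at the listed per-1M-token prices."""
    in_price, out_price = PRICES[provider]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def savings_vs(provider, input_tokens, output_tokens):
    """Fraction saved by using QuickSilver Pro instead of `provider`."""
    qsp = request_cost("QuickSilver Pro", input_tokens, output_tokens)
    other = request_cost(provider, input_tokens, output_tokens)
    return 1 - qsp / other


# Hypothetical long-context request: 100K tokens in, 2K tokens out.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 100_000, 2_000):.4f}")
```

Because input and output are priced separately, the saving against GPT-4o depends on the input/output mix: input-heavy requests land nearer the 21x end of the range, output-heavy ones nearer 12x.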

When to use

Qwen 3.6 shines on long-document summarization, multi-document RAG over 100K+ token corpora, and Chinese-language tasks where its training data advantage is real. The 3B-active MoE means it serves cheaply at scale; the 262K context lets you concatenate a Confluence space or technical spec without chunking.
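Before concatenating a whole corpus into one prompt, it helps to check that it plausibly fits. The sketch below is a rough pre-flight check, assuming about 4 characters per token (a common rule of thumb for English text, not the model's actual tokenizer) and treating "262K" as 262,000 tokens; the function names and the output reserve are hypothetical.

```python
CONTEXT_WINDOW = 262_000  # "262K tokens" from this page; the exact limit may differ
CHARS_PER_TOKEN = 4       # assumption: rough average for English text


def estimate_tokens(text):
    """Crude token estimate; use a real tokenizer for anything close to the limit."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_context(documents, reserved_for_output=4_000):
    """Return (fits, estimated_prompt_tokens) for a list of document strings."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= CONTEXT_WINDOW, total


# Two ~400K-character documents fit; a third pushes the estimate over the window.
docs = ["a" * 400_000, "b" * 400_000]
print(fits_in_context(docs))
print(fits_in_context(docs + ["c" * 400_000]))
```

Chinese and code-heavy text tokenize at different ratios, so the estimate is only a first filter before sending a request.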

When to use something else

For coding-agent workloads, DeepSeek V3 / V4 Flash beats it on HumanEval-style benchmarks and is cheaper. For top-tier reasoning, use R1 or V4 Pro. Qwen 3.6 thinks by default, but the QuickSilver Pro gateway sends `reasoning.enabled=false` by default to suppress the thinking trace. Pass `reasoning.enabled=true` to opt back in.

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6-35b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

OpenAI-compatible. Same model as OpenRouter; one-line migration via base_url.
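The same request can be sketched in Python with only the standard library. The endpoint, headers, and model ID are taken from the curl quickstart above; the function names are hypothetical, and the response parsing assumes the usual OpenAI chat-completions shape (`choices[0].message.content`), which this page implies but does not show.

```python
import json
import urllib.request

BASE_URL = "https://api.quicksilverpro.io/v1"  # from the curl quickstart


def build_chat_request(prompt, api_key, model="qwen3.6-35b"):
    """Build (but do not send) the POST request shown in the quickstart."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(prompt, api_key):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, api_key)) as resp:
        data = json.load(resp)
    # Assumes an OpenAI-style response body.
    return data["choices"][0]["message"]["content"]
```

A call would look like `chat("Hello!", os.environ["QSP_KEY"])` after `import os`, with the key exported as in the curl example.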

FAQ

Compared with Qwen 3.5, Qwen 3.6 keeps the same architecture (35B params, 3B active MoE) and the same 262K context window, but posts stronger reasoning on the published evals (math, coding, long-context retrieval) and changes the reasoning behavior: Qwen 3.6 thinks by default. On QSP both sit in the same price tier: input is $0.12/M for 3.6 vs $0.111/M for 3.5, and output is $0.80/M for both, so output tokens cost the same per token even when 3.6's thinking trace is included.

Thinking can be turned off, and on QuickSilver Pro it is suppressed by default. The gateway sends `reasoning.enabled=false` on Qwen 3.6 requests by default, so you don't pay for a thinking trace on routine chat. To opt back into reasoning, pass `reasoning.enabled=true` in the request body.
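As an illustration of the opt-in, the sketch below builds a request body with the reasoning flag set. The page only gives the dotted form `reasoning.enabled=true`; the nested JSON shape `{"reasoning": {"enabled": true}}` used here is an assumption, as is the helper name, so check the gateway's API reference before relying on it.

```python
import json


def build_payload(prompt, thinking=None, model="qwen3.6-35b"):
    """Chat-completions body; `thinking=None` leaves the gateway default untouched."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking is not None:
        # Assumed wire format for `reasoning.enabled`; not confirmed by this page.
        payload["reasoning"] = {"enabled": bool(thinking)}
    return payload


# Opt back into the thinking trace for a harder prompt.
print(json.dumps(build_payload("Is 221 prime? Explain.", thinking=True), indent=2))
```

Leaving the field out entirely keeps the gateway default, which per this page is thinking suppressed.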

OpenRouter lists Qwen 3.6-35B-A3B at $0.15 input / $1.00 output per 1M tokens. QuickSilver Pro is $0.12 / $0.80 — ~20% below on both legs. Same OpenAI-compatible surface; migration is a base_url + key swap. Drop the `qwen/` provider prefix from the model ID.
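The migration described above amounts to three changes to a client configuration. The sketch below applies them to a plain dict; the config keys (`base_url`, `api_key`, `model`) and the helper name are hypothetical stand-ins for whatever your client uses. Note that stripping the prefix yields `qwen3.6-35b-a3b`, while the quickstart uses the shorter `qwen3.6-35b`, so confirm which ID the gateway accepts.

```python
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"  # from the curl quickstart


def migrate_config(config, qsp_key):
    """Return a copy of an OpenRouter-style config pointed at QuickSilver Pro."""
    model = config["model"]
    if model.startswith("qwen/"):
        # Drop the provider prefix, as the migration note describes.
        model = model[len("qwen/"):]
    return {**config, "base_url": QSP_BASE_URL, "api_key": qsp_key, "model": model}


old = {
    "base_url": OPENROUTER_BASE_URL,
    "api_key": "or-key",
    "model": "qwen/qwen3.6-35b-a3b",
    "temperature": 0.2,
}
print(migrate_config(old, "qsp-key"))
```

Other settings, such as `temperature` here, pass through unchanged, and the original dict is not modified.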

Try Qwen 3.6-35B-A3B with double credits — up to $50 free
