Qwen 3.6-35B-A3B on QuickSilver Pro


Qwen 3.6-35B-A3B is Alibaba's April 2026 MoE refresh: 35B parameters, 3B active per token, a 262K context window, and stronger reasoning than 3.5 at the same architectural footprint. On QuickSilver Pro it costs $0.12 input / $0.80 output per 1M tokens, about 20% below OpenRouter and roughly 12x (output) to 21x (input) cheaper than GPT-4o, which matters most on the long-context RAG tasks it is built for.

$0.12 input · $0.80 output per 1M tokens

By Raullen Chai · Updated May 29, 2026

At a glance

Context: 262K tokens

Input / 1M: $0.12

Output / 1M: $0.80

Thinks by default: Yes (the QuickSilver Pro gateway suppresses the thinking trace unless you opt in)

Long-context RAG, document summarization, drop-in 3.5 upgrade with reasoning.

Pricing comparison ($/1M tokens)

Provider	Input	Output	QSP savings
QuickSilver Pro	$0.12	$0.80	baseline (cheapest)
OpenRouter (qwen/qwen3.6-35b-a3b)	$0.15	$1.00	20% on input and output
OpenAI (GPT-4o)	$2.50	$10.00	~95% on input, 92% on output
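To see how the listed rates turn into per-request cost, here is a minimal sketch. The prices are copied from the table above; the function names and the 200K-in / 2K-out workload are hypothetical, chosen only to illustrate a long-context RAG call.

```python
# Per-1M-token prices in USD (input, output), copied from the pricing table.
PRICES = {
    "QuickSilver Pro": (0.12, 0.80),
    "OpenRouter": (0.15, 1.00),
    "OpenAI GPT-4o": (2.50, 10.00),
}


def request_cost(provider, input_tokens, output_tokens):
    """Return the USD cost of one request at the listed per-1M-token rates."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def savings_vs(provider, input_tokens, output_tokens):
    """Fraction saved by using QuickSilver Pro instead of `provider`."""
    qsp = request_cost("QuickSilver Pro", input_tokens, output_tokens)
    other = request_cost(provider, input_tokens, output_tokens)
    return 1 - qsp / other


# Hypothetical RAG request: 200K tokens of context in, 2K tokens out.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 200_000, 2_000):.4f}")
```

For that example workload the request costs about $0.026 on QuickSilver Pro, $0.032 on OpenRouter, and $0.52 on GPT-4o. Because the request is input-heavy, the GPT-4o saving lands near the 95% input figure rather than the 92% output figure.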

When to use

Qwen 3.6 shines on long-document summarization, multi-document RAG over 100K+ token corpora, and Chinese-language tasks where its training data advantage is real. The 3B-active MoE means it serves cheaply at scale; the 262K context lets you concatenate a Confluence space or technical spec without chunking.
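Before concatenating a whole corpus into one prompt, it helps to check that it plausibly fits. The sketch below is a rough budget check, not a tokenizer: the 4-characters-per-token ratio is an assumed heuristic for English text (Chinese text tokenizes very differently), 262,000 is taken from the listed "262K" and the exact limit may differ, and all function names are hypothetical.

```python
CONTEXT_WINDOW = 262_000  # "262K tokens" as listed; the exact limit may differ
CHARS_PER_TOKEN = 4       # assumed rough heuristic for English, not a real tokenizer


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_context(documents, reserved_for_output=4_000):
    """True if the documents plus an output reserve fit the context window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= CONTEXT_WINDOW


# A ~400K-character spec is about 100K estimated tokens, so it fits unchunked.
print(fits_in_context(["x" * 400_000]))
```

If the check fails, fall back to chunking or retrieval for the overflow. For production use, count tokens with the model's actual tokenizer instead of the character heuristic.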

When to use something else

For coding-agent workloads, DeepSeek V3 / V4 Flash beat it on HumanEval-style benchmarks and are cheaper. For top-tier reasoning, use R1 or V4 Pro. Note that although Qwen 3.6 thinks by default, the QuickSilver Pro gateway sends `reasoning.enabled=false` by default to suppress the thinking trace; pass `reasoning.enabled=true` to opt back in.

Quickstart (curl)


curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6-35b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

The API is OpenAI-compatible and serves the same model as OpenRouter, so migrating comes down to swapping base_url, the API key, and the model ID.
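For readers who prefer Python to curl, the same request can be sketched with only the standard library. The URL, headers, model ID, and `QSP_KEY` variable come from the curl command above; `make_request` is a hypothetical helper name, and the response is printed as-is because its exact shape is not documented here.

```python
import json
import os
import urllib.request

API_URL = "https://api.quicksilverpro.io/v1/chat/completions"


def make_request(prompt, api_key):
    """Build (but do not send) the same POST request as the curl quickstart."""
    body = {
        "model": "qwen3.6-35b",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Only touches the network when QSP_KEY is set in the environment.
if __name__ == "__main__" and os.environ.get("QSP_KEY"):
    request = make_request("Hello!", os.environ["QSP_KEY"])
    with urllib.request.urlopen(request) as response:
        print(json.load(response))
```

Separating request construction from sending keeps the helper easy to inspect and test without spending tokens.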

FAQ

How does Qwen 3.6 compare to Qwen 3.5?

Same architecture (35B params, 3B active MoE) and same 262K context window, but stronger reasoning on the published evals (math, coding, long-context retrieval) and a different reasoning default: Qwen 3.6 thinks by default. On QSP both sit in the same price tier. Input is $0.12/M for 3.6 versus $0.111/M for 3.5, and output is $0.80/M for both, so the per-output-token price is identical, including any thinking-trace tokens 3.6 emits.

Can I disable thinking like on V4 Flash?

Yes — thinking is suppressed by default. The QuickSilver Pro gateway sends `reasoning.enabled=false` on Qwen 3.6 requests by default, so you don't pay for a thinking trace on routine chat. To opt back into reasoning, pass `reasoning.enabled=true` in the request body.
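A minimal sketch of opting back in is shown below. It assumes the dotted name `reasoning.enabled` maps to a nested JSON object (`{"reasoning": {"enabled": true}}`) in the request body; that shape is an assumption, so confirm it against the gateway's API reference. `build_chat_request` is a hypothetical helper name.

```python
import json


def build_chat_request(prompt, thinking=False, model="qwen3.6-35b"):
    """Build a chat-completions request body as a plain dict."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking:
        # Assumption: `reasoning.enabled=true` is sent as a nested object.
        body["reasoning"] = {"enabled": True}
    return body


# Routine chat: omit the field and rely on the gateway default (no thinking).
print(json.dumps(build_chat_request("Summarize this memo."), indent=2))

# Harder problem: explicitly opt back into the thinking trace.
print(json.dumps(build_chat_request("Prove that 17 is prime.", thinking=True), indent=2))
```

Leaving the field out for routine requests keeps the default behavior described above, so the thinking trace is only paid for on requests that ask for it.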

Why is QSP cheaper than OpenRouter on Qwen 3.6?

OpenRouter lists Qwen 3.6-35B-A3B at $0.15 input / $1.00 output per 1M tokens. QuickSilver Pro is $0.12 / $0.80 — ~20% below on both legs. Same OpenAI-compatible surface; migration is a base_url + key swap. Drop the `qwen/` provider prefix from the model ID.
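The swap can be sketched as a small settings transform. The QuickSilver Pro base URL is taken from the quickstart; the settings keys (`base_url`, `api_key`, `model`) mirror common OpenAI-client configuration, and the helper names are hypothetical. Note that stripping the prefix from `qwen/qwen3.6-35b-a3b` yields `qwen3.6-35b-a3b`, while the quickstart uses `qwen3.6-35b`, so confirm which ID your account accepts.

```python
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"  # from the curl quickstart


def strip_provider_prefix(model_id):
    """Drop a leading `provider/` segment, e.g. `qwen/name` becomes `name`."""
    return model_id.split("/", 1)[1] if "/" in model_id else model_id


def migrate_settings(settings, qsp_key):
    """Return a copy of OpenAI-client-style settings pointed at QuickSilver Pro."""
    migrated = dict(settings)  # leave the caller's settings untouched
    migrated["base_url"] = QSP_BASE_URL
    migrated["api_key"] = qsp_key
    migrated["model"] = strip_provider_prefix(settings["model"])
    return migrated


openrouter_settings = {
    "base_url": "https://openrouter.ai/api/v1",
    "api_key": "OPENROUTER_KEY_PLACEHOLDER",
    "model": "qwen/qwen3.6-35b-a3b",
}
print(migrate_settings(openrouter_settings, "QSP_KEY_PLACEHOLDER"))
```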

Try Qwen 3.6-35B-A3B with double credits — up to $50 free
