Qwen 3.5-35B-A3B on QuickSilver Pro

Qwen 3.5-35B-A3B is Alibaba's 35B MoE model with 3B active parameters per token and a 262K context window — the QSP catalog's pick for long-document RAG, multi-document summarization, and Chinese-language tasks where Qwen's training data has the edge. On QuickSilver Pro it's $0.111 input / $0.80 output per 1M tokens, ~20% below OpenRouter and ~23× cheaper than GPT-4o on input.

$0.11 input · $0.80 output per 1M tokens

By Raullen Chai · Updated May 29, 2026

At a glance

Context	262K tokens
Input / 1M	$0.11
Output / 1M	$0.80

Thinks by default

Long-context RAG, document summarization, Chinese-language workloads — at a fraction of GPT-4o's cost.

Pricing comparison ($/1M tokens)

Provider	Input	Output	QSP savings
QuickSilver Pro	$0.11	$0.80	baseline (cheapest)
OpenRouter (qwen/qwen3.5-35b-a3b)	$0.14	$1.00	~20% on input and output
OpenAI (GPT-4o)	$2.50	$10.00	~96% on input, 92% on output
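
The savings figures can be re-derived from the listed prices. The sketch below uses the unrounded rates quoted elsewhere on this page ($0.111 for QuickSilver Pro input, $0.139 for OpenRouter input); the `PRICES` dictionary and the `savings_vs` helper are illustrative names, not part of any API.

```python
# USD per 1M tokens as (input, output), copied from this page.
PRICES = {
    "QuickSilver Pro": (0.111, 0.80),
    "OpenRouter": (0.139, 1.00),
    "OpenAI GPT-4o": (2.50, 10.00),
}


def savings_vs(baseline: str, provider: str = "QuickSilver Pro") -> dict:
    """Percent saved and price ratio for `provider` relative to `baseline`."""
    p_in, p_out = PRICES[provider]
    b_in, b_out = PRICES[baseline]
    return {
        "input_pct": round(100 * (1 - p_in / b_in), 1),
        "output_pct": round(100 * (1 - p_out / b_out), 1),
        "input_ratio": round(b_in / p_in, 1),
        "output_ratio": round(b_out / p_out, 1),
    }


for name in ("OpenRouter", "OpenAI GPT-4o"):
    print(name, savings_vs(name))
```

Against GPT-4o this gives roughly 95.6% on input and 92% on output (ratios of about 22.5 and 12.5, which the page rounds to 23× and 13×); against OpenRouter it gives about 20% on both.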

When to use

Qwen 3.5 shines on long-document workloads: multi-document RAG over 100K+ token corpora, technical-spec summarization, Confluence-space QA, transcript analysis. The 262K context fits a substantial corpus without chunking, and the 3B-active MoE serves cheaply at scale. Particularly strong on Chinese-language tasks where its training data advantage is real.
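
Whether a corpus fits without chunking is a simple budget check. The sketch below assumes you already have per-document token counts from your own tokenizer, treats 262,000 as the window size (the page says "262K"; the exact limit may differ), and assumes the output reservation counts against the same window. `fits_without_chunking` and its parameters are hypothetical names.

```python
CONTEXT_WINDOW = 262_000  # "262K tokens" as listed on this page; exact limit is an assumption


def fits_without_chunking(doc_token_counts, prompt_tokens=0, reserved_output_tokens=4_000):
    """Return True if all documents, the prompt, and the output reservation fit in one request."""
    total = sum(doc_token_counts) + prompt_tokens + reserved_output_tokens
    return total <= CONTEXT_WINDOW


# Two documents of 100K and 120K tokens plus a 500-token prompt fit in one call.
print(fits_without_chunking([100_000, 120_000], prompt_tokens=500))
# A 270K-token corpus does not, so it would need chunking or retrieval.
print(fits_without_chunking([200_000, 70_000]))
```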

When to use something else

For coding-agent workloads, DeepSeek V3 / V4 Flash beats Qwen 3.5 on HumanEval-style benchmarks. For top-tier reasoning, use R1 or V4 Pro. For a drop-in upgrade with stronger reasoning on the same architecture, see Qwen 3.6-35B-A3B (newer, same $0.80/M output price). Qwen 3.5 remains available for teams already running it in production or who specifically don't want the 3.6 reasoning behavior.

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-35b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

OpenAI-compatible. Same model as OpenRouter; migration is a base_url and API key swap.
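
The same request can be made from Python with only the standard library. This is a minimal sketch that mirrors the curl call above; `build_request` and `send` are illustrative helper names, and the response is assumed to follow the usual OpenAI chat-completions JSON shape, which this page does not spell out.

```python
import json
import urllib.request

# Endpoint and model ID are taken from the curl quickstart above.
QSP_URL = "https://api.quicksilverpro.io/v1/chat/completions"


def build_request(api_key: str, prompt: str, model: str = "qwen3.5-35b") -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        QSP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

To run it, read the key from the environment and call `send(build_request(os.environ["QSP_KEY"], "Hello!"))`. If the response matches the OpenAI format, the reply text would be under `choices[0].message.content`.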

FAQ

Should I migrate from Qwen 3.5 to Qwen 3.6?

Yes for most new deployments — Qwen 3.6-35B-A3B has the same architecture (35B/3B-active MoE, 262K context) but stronger reasoning on the published evals and the same $0.80/M output price (3.6 input is $0.12/M vs 3.5's $0.111/M, a hair more). The catch is 3.6 thinks by default, so output token counts go up. For teams already running 3.5 in prod with predictable token budgets, 3.5 keeps shipping unchanged.
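
To see what the switch does to a token budget, the per-request difference can be written out directly. The sketch below uses only the prices quoted in this answer; `extra_output_tokens` is a hypothetical number you would measure on your own traffic, since this page does not say how many additional tokens 3.6 emits.

```python
def extra_cost_of_3_6(input_tokens: int, extra_output_tokens: int) -> float:
    """Added USD per request when moving from Qwen 3.5 to 3.6 at the quoted prices."""
    input_delta = input_tokens * (0.12 - 0.111)   # 3.6 input is $0.009 per 1M more
    output_delta = extra_output_tokens * 0.80     # same output price, more output tokens
    return (input_delta + output_delta) / 1_000_000


# Example: 10K input tokens and an assumed 1,500 extra output tokens per request.
print(f"{extra_cost_of_3_6(10_000, 1_500):.5f} USD extra per request")
```

In this example the extra output accounts for most of the added cost, which is why the output token count matters more than the small input price difference.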

How does Qwen 3.5 compare to GPT-4o for long-context RAG?

On the long-context retrieval benchmarks Alibaba published with the 3.5 release, Qwen lands in the same quality tier as GPT-4o for retrieval and summarization tasks up to 200K tokens. The price gap is dramatic — $0.111 input / $0.80 output vs GPT-4o's $2.50 / $10.00 per 1M tokens, about 23× cheaper on input and 13× on output. For RAG pipelines where the dominant cost is feeding long context, that's the entire economics.
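
The point about input cost dominating can be checked with a quick calculation. This sketch prices a single long-context request at the rates above; the 200K-in / 1K-out request shape is an illustrative assumption, and `rag_cost` is a hypothetical helper.

```python
def rag_cost(input_tokens: int, output_tokens: int, input_price: float, output_price: float) -> dict:
    """Total USD for one request and the fraction of it spent on input tokens.

    Prices are in USD per 1M tokens.
    """
    input_cost = input_tokens * input_price / 1_000_000
    output_cost = output_tokens * output_price / 1_000_000
    total = input_cost + output_cost
    return {"total_usd": total, "input_share": input_cost / total}


# One request feeding 200K tokens of context and producing a 1K-token answer.
qsp = rag_cost(200_000, 1_000, 0.111, 0.80)
gpt4o = rag_cost(200_000, 1_000, 2.50, 10.00)
print(f"QSP:    ${qsp['total_usd']:.4f} ({qsp['input_share']:.0%} input)")
print(f"GPT-4o: ${gpt4o['total_usd']:.4f} ({gpt4o['input_share']:.0%} input)")
```

Under these assumptions the request costs about $0.023 on QuickSilver Pro versus about $0.51 on GPT-4o, with input tokens making up over 95% of the bill in both cases.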

Why is QuickSilver Pro cheaper than OpenRouter on Qwen 3.5?

QuickSilver Pro lists Qwen 3.5-35B-A3B at $0.111 input / $0.80 output per 1M tokens. OpenRouter's public per-token rate is $0.139 / $1.00, which puts QuickSilver Pro ~20% below on both input and output. Both expose the same OpenAI-compatible surface; migration is a base_url + key swap (drop the `qwen/` provider prefix from the model ID).
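
As a sketch of that swap, the snippet below rewrites a client configuration from OpenRouter to QuickSilver Pro. The config dictionary layout and the `to_quicksilver` helper are assumptions for illustration; the QuickSilver Pro base URL comes from the curl quickstart.

```python
OPENROUTER_CFG = {
    "base_url": "https://openrouter.ai/api/v1",
    "api_key": "<openrouter key>",
    "model": "qwen/qwen3.5-35b-a3b",
}


def to_quicksilver(cfg: dict, qsp_key: str) -> dict:
    """Return a copy of cfg pointed at QuickSilver Pro, with the provider prefix dropped."""
    return {
        **cfg,
        "base_url": "https://api.quicksilverpro.io/v1",
        "api_key": qsp_key,
        # "qwen/qwen3.5-35b-a3b" -> "qwen3.5-35b-a3b"
        "model": cfg["model"].split("/", 1)[-1],
    }


print(to_quicksilver(OPENROUTER_CFG, "<qsp key>")["model"])
```

Note that stripping the prefix yields `qwen3.5-35b-a3b`, while the quickstart uses `qwen3.5-35b`, so it is worth confirming which model ID the endpoint accepts before deploying.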

Try Qwen 3.5-35B-A3B with double credits — up to $50 free