New · 262K context · Agent flagship · Long-horizon coding

Qwen 3.7 Plus on QuickSilver Pro

Qwen 3.7 Plus is Alibaba's hosted agent flagship — built for long-running agent loops (the launch demo ran an 11-hour session across 1,000+ tool calls and 10,000+ lines of code), with a 262K-token context. On QuickSilver Pro it's $0.256 input / $1.024 output per 1M tokens, ~20% below OpenRouter's $0.32 / $1.28 and roughly a sixth of Qwen 3.7 Max's $6.0 output price. Alibaba reports it matches 3.7 Max on AIME 2025 at ~3× the speed — A/B it against Max on your own agentic evals.

$0.26 input · $1.02 output per 1M tokens
By Raullen Chai · Updated

At a glance

Context: 262K tokens
Input / 1M: $0.26
Output / 1M: $1.02
Thinks by default: Yes

Long-running coding/agent loops — near-flagship reasoning at ~6× lower output cost than 3.7 Max.

Pricing comparison ($/1M tokens)

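| Provider | Input | Output | QSP vs this provider |
|---|---|---|---|
| QuickSilver Pro | $0.26 | $1.02 | n/a (cheapest listed) |
| OpenRouter (qwen/qwen3.7-plus) | $0.32 | $1.28 | QSP ~20% cheaper |
| OpenAI (GPT-4o) | $2.50 | $10.00 | QSP ~90% cheaper |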
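The savings column can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it uses the unrounded QuickSilver Pro prices quoted in the page intro ($0.256 / $1.024), and the names `PRICES`, `savings_pct`, and `request_cost` are hypothetical helpers, not part of any SDK.

```python
# USD per 1M tokens as (input, output), taken from this page.
PRICES = {
    "QuickSilver Pro": (0.256, 1.024),
    "OpenRouter": (0.32, 1.28),
    "OpenAI GPT-4o": (2.50, 10.00),
}


def savings_pct(baseline: float, price: float) -> float:
    """Percent by which `price` undercuts `baseline`."""
    return (1 - price / baseline) * 100


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def savings_vs_qsp() -> dict:
    """Input/output savings of QuickSilver Pro against each other provider."""
    qsp_in, qsp_out = PRICES["QuickSilver Pro"]
    return {
        name: (savings_pct(p_in, qsp_in), savings_pct(p_out, qsp_out))
        for name, (p_in, p_out) in PRICES.items()
        if name != "QuickSilver Pro"
    }
```

Calling `savings_vs_qsp()` gives the per-provider savings on both the input and output price; `request_cost` can be pointed at your own token counts to compare a representative request.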

When to use

Reach for 3.7 Plus on agentic and coding workloads that run long: multi-hour agent sessions, coding agents (opencode, Cline, Aider) that loop across hundreds of tool calls, and long-context analysis up to 262K tokens. Alibaba positions it as matching 3.7 Max on AIME-class reasoning at ~3× the throughput, so it's the production-default for agent loops where Max's latency and per-token output cost compound across the trajectory.

When to use something else

For the hardest single-shot reasoning, or when you need a context window beyond 262K, Qwen 3.7 Max is the step up; A/B on your evals. For cheap non-agentic chat or bulk classification, DeepSeek V4 Flash ($0.08/$0.16) or V3 ($0.16/$0.616) are far cheaper. For pure math or theorem-style reasoning, consider DeepSeek R1 ($0.56/$2.00).

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.7-plus",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

The endpoint is OpenAI-compatible and serves the same model as OpenRouter. Migrating is a base_url and API key swap, plus changing the model ID to `qwen3.7-plus`.
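For readers who would rather call the endpoint from Python, here is a minimal standard-library sketch of the same request as the curl example. It assumes the response follows the usual OpenAI chat-completions shape (`choices[0].message.content`), which this page implies but does not show; `build_chat_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://api.quicksilverpro.io/v1"


def build_chat_request(prompt: str, api_key: str,
                       model: str = "qwen3.7-plus") -> urllib.request.Request:
    """Build the same POST request as the curl quickstart, without sending it."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant text.

    Assumes an OpenAI-style response body; adjust if the actual shape differs.
    """
    with urllib.request.urlopen(request, timeout=60) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

A typical call would be `send(build_chat_request("Hello!", os.environ["QSP_KEY"]))` after `import os`. Keeping the builder separate from the network call makes it easy to inspect the payload before spending tokens.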

FAQ

Qwen 3.7 Max and Plus sit at two ends of the Qwen 3.7 line: Max is the $2.0/$6.0 flagship with a 1M-token context; Plus is the $0.256/$1.024 agent model with 262K context. Alibaba reports Plus matches Max on AIME 2025 at ~3× the speed, which makes Plus the production-default for long-running agent loops where Max's cost and latency compound across hundreds of turns. Step up to Max for the hardest single-shot reasoning or when you need the full 1M window. A/B the two on your own agentic evals before committing.
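To see how cost compounds across turns, the following rough model may help. It assumes the whole conversation is re-sent as input on every turn and that no prompt-caching discount applies; both are simplifying assumptions, and the workload numbers in the usage note are invented for illustration. `session_cost` is a hypothetical helper.

```python
def session_cost(turns: int, base_context: int, added_per_turn: int,
                 output_per_turn: int, input_price: float,
                 output_price: float) -> float:
    """Estimate the USD cost of an agent session.

    Prices are USD per 1M tokens. Each turn re-sends the full context as
    input, then the context grows by the tool results added that turn plus
    the model's own output.
    """
    total_input = 0
    context = base_context
    for _ in range(turns):
        total_input += context
        context += added_per_turn + output_per_turn
    total_output = turns * output_per_turn
    return (total_input * input_price + total_output * output_price) / 1_000_000


# Hypothetical workload: 200 turns, 4,000-token starting prompt,
# 500 tokens of tool results and 300 tokens of model output per turn.
plus_cost = session_cost(200, 4_000, 500, 300, input_price=0.256, output_price=1.024)
max_cost = session_cost(200, 4_000, 500, 300, input_price=2.0, output_price=6.0)
```

In a loop like this, re-sent input tokens dominate the bill, so the gap between the two models tracks the input-price ratio (2.0 / 0.256, about 7.8×) more closely than the ~6× output-price ratio. Real sessions with caching or context trimming will behave differently.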

Long-horizon agent work is what 3.7 Plus is tuned for. Alibaba positions it for long-running agent loops (the launch demo was an 11-hour autonomous coding session), and it speaks the OpenAI-compatible chat/tools API, so it drops into opencode, Cline, Aider, or any agent that expects OpenAI tool-calling shapes. Point the agent at base_url=https://api.quicksilverpro.io/v1 with model="qwen3.7-plus". Run it against your own task suite; vendor benchmarks are a starting point, not a guarantee.
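As an illustration of the "OpenAI tool-calling shapes" mentioned above, this sketch builds a chat request that advertises one tool and parses tool calls out of a response. The `read_file` tool is hypothetical, and the request and response layouts follow the OpenAI chat-completions convention; whether every field behaves identically on QuickSilver Pro is an assumption worth verifying against your own calls.

```python
import json


def build_tool_request(user_message: str, model: str = "qwen3.7-plus") -> dict:
    """Build a chat-completions payload that offers one function tool."""
    read_file_tool = {
        "type": "function",
        "function": {
            "name": "read_file",  # hypothetical tool, for illustration only
            "description": "Return the contents of a file in the workspace.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Relative file path."},
                },
                "required": ["path"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [read_file_tool],
        "tool_choice": "auto",
    }


def extract_tool_calls(response: dict) -> list:
    """Return (tool_name, arguments) pairs from an OpenAI-style response."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        # In the OpenAI shape, arguments arrive as a JSON-encoded string.
        arguments = json.loads(call["function"]["arguments"])
        calls.append((call["function"]["name"], arguments))
    return calls
```

An agent loop would run each extracted call, append the result as a `tool` message, and send the conversation back until the model replies without tool calls.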

OpenRouter lists Qwen 3.7 Plus at $0.32 input / $1.28 output per 1M tokens; QuickSilver Pro is $0.256 / $1.024 — ~20% below on both legs. Same OpenAI-compatible chat completions surface; migration is a base_url + key swap, with the model ID changing from `qwen/qwen3.7-plus` to QSP's alias `qwen3.7-plus`.
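The migration described above amounts to changing three settings. A minimal sketch, assuming your client settings live in a plain dict (the dict layout and the `migrate_to_qsp` helper are hypothetical; the OpenRouter base URL is its public API endpoint and does not come from this page):

```python
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"

# OpenRouter model IDs mapped to their QuickSilver Pro aliases.
MODEL_ALIASES = {"qwen/qwen3.7-plus": "qwen3.7-plus"}

# Example starting point; the key is a placeholder.
OPENROUTER_CONFIG = {
    "base_url": "https://openrouter.ai/api/v1",
    "api_key": "<openrouter-key>",
    "model": "qwen/qwen3.7-plus",
}


def migrate_to_qsp(config: dict, qsp_key: str) -> dict:
    """Return a copy of `config` pointed at QuickSilver Pro.

    Swaps the base URL and key, and rewrites the model ID when an alias
    is known; unknown model IDs are passed through unchanged.
    """
    migrated = dict(config)
    migrated["base_url"] = QSP_BASE_URL
    migrated["api_key"] = qsp_key
    migrated["model"] = MODEL_ALIASES.get(config["model"], config["model"])
    return migrated
```

Returning a copy rather than mutating the input keeps the original settings around, which is handy for A/B runs against both providers.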

Try Qwen 3.7 Plus with double credits — up to $50 free
