New · 1M context · Reasoning · Agentic coding

GLM 5.2 on QuickSilver Pro

GLM 5.2 is Z.ai's large-scale reasoning flagship, with a 1M-token context window tuned for long-horizon agent workflows and project-level software engineering. On QuickSilver Pro it costs $0.80 input / $3.20 output per 1M tokens, 20% below OpenRouter's $1.00 / $4.00. The model reasons by default, but the QuickSilver Pro gateway sends `reasoning.enabled=false` unless you say otherwise, which suppresses the thinking trace on routine calls. Pass `reasoning.enabled=true` to opt back into reasoning.

$0.80 input · $3.20 output per 1M tokens

By Raullen Chai · Updated May 29, 2026

At a glance

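Context: 1M tokens

Input / 1M: $0.80

Output / 1M: $3.20

Thinks by default: Yes at the model level; the QuickSilver Pro gateway turns reasoning off unless you pass `reasoning.enabled=true`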

Long-horizon agents and project-level coding — Z.ai's reasoning flagship with a 1M-token context window.

Pricing comparison ($/1M tokens)

Provider	Input	Output	QuickSilver Pro saving
QuickSilver Pro	$0.80	$3.20	(cheapest)
OpenRouter (z-ai/glm-5.2)	$1.00	$4.00	20%
OpenAI (GPT-4o)	$2.50	$10.00	68%
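
To make the table concrete, here is a rough per-request cost estimate at the listed rates. It is a sketch only: `estimate_cost` is a hypothetical helper name, the token counts are illustrative, and it covers nothing beyond the flat input and output rates shown on this page. If you opt into reasoning, count the trace in the output figure.

```shell
#!/usr/bin/env bash
# Rough cost in USD for one request.
# Rates are USD per 1M tokens, taken from the pricing table above.
# estimate_cost is a hypothetical helper, not part of any QuickSilver Pro tooling.
estimate_cost() {
  local input_tokens=$1 output_tokens=$2 input_rate=$3 output_rate=$4
  awk -v it="$input_tokens" -v ot="$output_tokens" \
      -v ir="$input_rate" -v orate="$output_rate" \
      'BEGIN { printf "%.4f\n", (it * ir + ot * orate) / 1000000 }'
}

# Illustrative request: 200k input tokens, 5k output tokens.
estimate_cost 200000 5000 0.80 3.20   # QuickSilver Pro rates, prints 0.1760
estimate_cost 200000 5000 1.00 4.00   # OpenRouter rates, prints 0.2200
```

The 20% gap carries through to the total because both the input and output rates are discounted by the same ratio.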

When to use

Reach for GLM 5.2 on long-horizon agent loops and repo-scale software engineering: multi-step refactors, plan-then-act agents coordinating many tool calls, and tasks that need to hold a large working set in its 1M-token context. It's a reasoning model, so it's well suited to problems where an explicit reasoning trace improves the result — pass `reasoning.enabled=true` to opt into the trace (the gateway suppresses it by default to keep routine calls cheap).

When to use something else

For routine chat, short-context codegen, or single-shot tasks, GLM 5.2's per-token price and reasoning overhead are more than the job needs: DeepSeek V4 Flash ($0.08/$0.16) or V4 Pro ($0.348/$0.696) handle most of those for less. For pure mathematical reasoning, consider DeepSeek R1. For agentic coding specifically, A/B test against Kimi K2.7 Code.

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

OpenAI-compatible. Same model as OpenRouter; to migrate, change base_url, swap in your QSP key, and drop the `z-ai/` prefix from the model ID.

FAQ

Does GLM 5.2 think by default? Can I turn reasoning on?

GLM 5.2 is a reasoning model, but to keep routine calls from billing a hidden reasoning trace the QuickSilver Pro gateway sends `reasoning.enabled=false` by default on GLM 5.2 requests — so out of the box you get a direct reply. To opt into the reasoning trace, pass `reasoning: { enabled: true }` in the request body and budget output tokens accordingly.
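
The opt-in described above can be sketched as a variant of the quickstart request. It reuses the quickstart's endpoint and `$QSP_KEY` variable. The helper names `reasoning_payload` and `qsp_chat_reasoning` are hypothetical, and the `max_tokens` value is an illustrative budget that assumes the gateway honors the standard OpenAI `max_tokens` field.

```shell
#!/usr/bin/env bash
# Build a chat-completions body that opts into the reasoning trace.
# reasoning_payload and qsp_chat_reasoning are hypothetical helper names.
reasoning_payload() {
  local prompt=$1   # assumed to contain no characters that need JSON escaping
  printf '{"model": "glm-5.2", "reasoning": {"enabled": true}, "max_tokens": 4096, "messages": [{"role": "user", "content": "%s"}]}\n' "$prompt"
}

# Send the request. Nothing calls this at load time, so sourcing the file
# makes no network request.
qsp_chat_reasoning() {
  curl https://api.quicksilverpro.io/v1/chat/completions \
    -H "Authorization: Bearer $QSP_KEY" \
    -H "Content-Type: application/json" \
    -d "$(reasoning_payload "$1")"
}
```

With `QSP_KEY` exported, a call would look like `qsp_chat_reasoning "Plan a three-step refactor"`. For prompts containing quotes or newlines, build the body with a JSON-aware tool instead of `printf`.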

Does it work with the OpenAI SDK?

Yes. QuickSilver Pro serves GLM 5.2 through an OpenAI-compatible chat completions endpoint. Set base_url=https://api.quicksilverpro.io/v1, use your QSP key as the API key, and set model="glm-5.2". Streaming, tool calling, json_schema strict mode, and usage.cost accounting all work.
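
One way to point an existing OpenAI SDK app at the gateway without touching client construction is through environment variables. This sketch assumes an official OpenAI SDK (Python or Node), which reads `OPENAI_BASE_URL` and `OPENAI_API_KEY` when the client is created without explicit values. `QSP_KEY` is the same variable the quickstart uses.

```shell
# Assumes an official OpenAI SDK that reads these variables at client construction.
export OPENAI_BASE_URL="https://api.quicksilverpro.io/v1"
# Left empty if QSP_KEY is not set in the current shell.
export OPENAI_API_KEY="${QSP_KEY:-}"
# The model name is still chosen in application code: model="glm-5.2"
```

If your code passes base_url or the API key explicitly, those arguments take precedence and these variables have no effect.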

Why is QuickSilver Pro cheaper than OpenRouter on GLM 5.2?

OpenRouter lists GLM 5.2 at $1.00 input / $4.00 output per 1M tokens; QuickSilver Pro charges $0.80 / $3.20, which is 20% lower on both input and output. The OpenAI-compatible surface is the same, so migration means swapping base_url and the API key and dropping the `z-ai/` provider prefix from the model ID.
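
The model-ID part of that migration is a prefix strip. A minimal sketch, assuming the `z-ai/` prefix is the only difference between the two IDs, as stated above; `to_qsp_model` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Map an OpenRouter-style model ID to the QuickSilver Pro form by dropping
# a leading "z-ai/" provider prefix. IDs without the prefix pass through unchanged.
to_qsp_model() {
  local model=$1
  printf '%s\n' "${model#z-ai/}"
}

to_qsp_model "z-ai/glm-5.2"   # prints glm-5.2
to_qsp_model "glm-5.2"        # prints glm-5.2
```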

Try GLM 5.2 with double credits — up to $50 free
