New · 1M context · Reasoning · Agentic coding

GLM 5.2 on QuickSilver Pro

GLM 5.2 is Z.ai's large-scale reasoning flagship: a 1M-token context window tuned for long-horizon agent workflows and project-level software engineering. On QuickSilver Pro it costs $0.80 input / $3.20 output per 1M tokens, 20% below OpenRouter's $1.00 / $4.00. The model reasons by default, but the QuickSilver Pro gateway sends `reasoning.enabled=false` unless told otherwise, which suppresses the thinking trace on routine calls. Pass `reasoning.enabled=true` to opt back into reasoning.

$0.80 input · $3.20 output per 1M tokens
By Raullen Chai

At a glance

Context: 1M tokens
Input / 1M: $0.80
Output / 1M: $3.20
Thinks by default: Yes

Long-horizon agents and project-level coding — Z.ai's reasoning flagship with a 1M-token context window.

Pricing comparison ($/1M tokens)

| Provider | Input | Output | QSP vs. provider |
|---|---|---|---|
| QuickSilver Pro | $0.80 | $3.20 | cheapest |
| OpenRouter (z-ai/glm-5.2) | $1.00 | $4.00 | QSP is 20% cheaper |
| OpenAI (GPT-4o) | $2.50 | $10.00 | QSP is 68% cheaper |
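
To make the comparison concrete, the sketch below computes what one request would cost at each listed price and how much QuickSilver Pro saves relative to each provider. The prices are the ones listed above; the function names and the example token counts are illustrative assumptions, not part of any QuickSilver Pro SDK.

```python
from decimal import Decimal

# Prices in USD per 1M tokens (input, output), copied from the comparison above.
PRICES = {
    "QuickSilver Pro": (Decimal("0.80"), Decimal("3.20")),
    "OpenRouter (z-ai/glm-5.2)": (Decimal("1.00"), Decimal("4.00")),
    "OpenAI (GPT-4o)": (Decimal("2.50"), Decimal("10.00")),
}
MILLION = Decimal(1_000_000)


def request_cost(provider, input_tokens, output_tokens):
    """Return the USD cost of one request at the provider's listed prices."""
    price_in, price_out = PRICES[provider]
    return (price_in * input_tokens + price_out * output_tokens) / MILLION


def qsp_saving_percent(provider, input_tokens, output_tokens):
    """Return how much cheaper QuickSilver Pro is than `provider`, in percent."""
    other = request_cost(provider, input_tokens, output_tokens)
    qsp = request_cost("QuickSilver Pro", input_tokens, output_tokens)
    return (other - qsp) / other * 100


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 20k output tokens.
    for name in PRICES:
        cost = request_cost(name, 200_000, 20_000)
        saving = qsp_saving_percent(name, 200_000, 20_000)
        print(f"{name}: ${cost:.4f} (QSP saves {saving:.0f}%)")
```

Because every listed price is the same multiple of the QuickSilver Pro price on both input and output, the saving percentage does not depend on the input/output mix of the workload.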

When to use

Reach for GLM 5.2 on long-horizon agent loops and repo-scale software engineering: multi-step refactors, plan-then-act agents coordinating many tool calls, and tasks that need to hold a large working set in its 1M-token context. It's a reasoning model, so it's well suited to problems where an explicit reasoning trace improves the result — pass `reasoning.enabled=true` to opt into the trace (the gateway suppresses it by default to keep routine calls cheap).
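
As a minimal sketch of the opt-in, the helper below builds a request body that only carries the `reasoning` field when a trace is wanted. The helper name `build_chat_body` is hypothetical, and `max_tokens` is assumed to behave as the usual OpenAI-compatible output cap; neither is confirmed by this page.

```python
import json


def build_chat_body(prompt, reasoning=False, max_tokens=None):
    """Build a chat completions request body for GLM 5.2.

    With reasoning=False the field is omitted, which relies on the
    gateway default (reasoning suppressed) described above.
    """
    body = {
        "model": "glm-5.2",
        "messages": [{"role": "user", "content": prompt}],
    }
    if reasoning:
        # Opt back into the reasoning trace.
        body["reasoning"] = {"enabled": True}
    if max_tokens is not None:
        # Assumption: standard OpenAI-style cap on output tokens.
        body["max_tokens"] = max_tokens
    return body


if __name__ == "__main__":
    body = build_chat_body("Plan a multi-step refactor.", reasoning=True, max_tokens=8192)
    print(json.dumps(body, indent=2))
```

Keeping the toggle in one helper makes it easy to enable reasoning only for the agent steps that benefit from it and leave routine calls on the cheaper default.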

When to use something else

For routine chat, short-context codegen, or single-shot tasks, GLM 5.2's per-token price and reasoning overhead are overkill; DeepSeek V4 Flash ($0.08/$0.16) or V4 Pro ($0.348/$0.696) handle most of those workloads for less. For pure mathematical reasoning, consider DeepSeek R1. For agentic coding specifically, A/B test against Kimi K2.7 Code.

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

The endpoint is OpenAI-compatible and serves the same model as OpenRouter; migrating means changing base_url, swapping in your QSP key, and using the model ID `glm-5.2`.
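
The same call can be sketched in Python using only the standard library. The URL, headers, and body mirror the curl quickstart; the helper name `build_request` is hypothetical, and reading `choices[0].message.content` assumes the usual OpenAI chat completions response shape.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.quicksilverpro.io/v1"


def build_request(api_key, prompt):
    """Build (but do not send) the POST request from the curl quickstart."""
    body = {
        "model": "glm-5.2",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    key = os.environ.get("QSP_KEY")
    if key:
        # Only hits the network when a key is configured.
        with urllib.request.urlopen(build_request(key, "Hello!")) as resp:
            reply = json.load(resp)
        # Assumption: standard OpenAI-style response layout.
        print(reply["choices"][0]["message"]["content"])
    else:
        print("Set QSP_KEY to send the request.")
```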

FAQ

GLM 5.2 is a reasoning model, but the QuickSilver Pro gateway sends `reasoning.enabled=false` by default on GLM 5.2 requests so that routine calls are not billed for a hidden reasoning trace. Out of the box you get a direct reply. To opt into the reasoning trace, pass `reasoning: { enabled: true }` in the request body and budget output tokens accordingly.

GLM 5.2 is served through an OpenAI-compatible chat completions endpoint on QuickSilver Pro. Set base_url=https://api.quicksilverpro.io/v1, supply your QSP key, and use model="glm-5.2". Streaming, tool calling, json_schema strict mode, and usage.cost accounting all work.
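
For json_schema strict mode, a request body might look like the sketch below. It assumes the gateway accepts the OpenAI-style `response_format` shape, which OpenAI compatibility suggests but this page does not spell out; the helper name and the bug-report schema are hypothetical.

```python
import json

# Hypothetical schema used only for illustration.
BUG_REPORT_SCHEMA = {
    "type": "object",
    "properties": {
        "file": {"type": "string"},
        "severity": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["file", "severity"],
    "additionalProperties": False,
}


def build_structured_body(prompt, schema_name, schema):
    """Build a request body asking for output that matches `schema`."""
    return {
        "model": "glm-5.2",
        "messages": [{"role": "user", "content": prompt}],
        # OpenAI-style structured output request (assumed to be accepted as-is).
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": schema_name,
                "strict": True,
                "schema": schema,
            },
        },
    }


if __name__ == "__main__":
    body = build_structured_body(
        "Summarize the most severe bug in this diff.", "bug_report", BUG_REPORT_SCHEMA
    )
    print(json.dumps(body, indent=2))
```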

OpenRouter lists GLM 5.2 at $1.00 input / $4.00 output per 1M tokens; QuickSilver Pro charges $0.80 / $3.20, which is 20% lower on both input and output. The API surface is the same OpenAI-compatible one, so migration is a base_url and key swap plus dropping the `z-ai/` provider prefix from the model ID.
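
The migration can be sketched as a small config rewrite. The config keys (`base_url`, `api_key`, `model`) and both function names are hypothetical; only the QuickSilver Pro base URL and the `z-ai/` prefix rule come from this page.

```python
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"
OPENROUTER_PREFIX = "z-ai/"


def to_qsp_model_id(model_id):
    """Drop the `z-ai/` provider prefix if present, e.g. z-ai/glm-5.2 -> glm-5.2."""
    if model_id.startswith(OPENROUTER_PREFIX):
        return model_id[len(OPENROUTER_PREFIX):]
    return model_id


def migrate_config(config, qsp_key):
    """Return a copy of an OpenRouter-style client config pointed at QuickSilver Pro."""
    migrated = dict(config)  # leave the caller's config untouched
    migrated["base_url"] = QSP_BASE_URL
    migrated["api_key"] = qsp_key
    migrated["model"] = to_qsp_model_id(config["model"])
    return migrated


if __name__ == "__main__":
    old = {
        "base_url": "https://openrouter.ai/api/v1",
        "api_key": "<openrouter-key>",
        "model": "z-ai/glm-5.2",
    }
    print(migrate_config(old, "<qsp-key>"))
```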

Try GLM 5.2 with double credits — up to $50 free
