Kimi K2.7 Code on QuickSilver Pro
Kimi K2.7 Code is Moonshot's coding-tuned K2 — the same trillion-parameter architecture (32B active, 256K context) as K2.6, retuned for long-horizon agentic coding. On QuickSilver Pro it's $0.60 input / $2.80 output per 1M tokens, ~20% below OpenRouter's $0.75 / $3.50. Moonshot reports it uses ~30% fewer reasoning tokens per task than K2.6 while scoring higher on coding evals — and in an agent loop, where each turn's reasoning is re-read as input on the next, that token discipline compounds into lower cost per completed task.
At a glance
Long-horizon agentic coding — Moonshot's coding-tuned K2 with leaner reasoning-token overhead than K2.6.
Pricing comparison ($/1M tokens)
| Provider | Input | Output | vs QSP |
|---|---|---|---|
| QuickSilver Pro | $0.60 | $2.80 | cheapest |
| OpenRouter (moonshotai/kimi-k2.7-code) | $0.75 | $3.50 | QSP is 20% cheaper |
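To make the table concrete, here is a minimal sketch that prices one workload at both providers' listed rates. The workload size (2M input tokens, 300K output tokens) is a hypothetical figure chosen for illustration; only the per-1M-token prices come from the table.

```python
# Per-1M-token prices (USD) copied from the pricing table: (input, output).
PRICES = {
    "QuickSilver Pro": (0.60, 2.80),
    "OpenRouter": (0.75, 3.50),
}


def request_cost(provider, input_tokens, output_tokens):
    """Return the USD cost of a workload at the provider's listed rates."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 300K output tokens.
qsp = request_cost("QuickSilver Pro", 2_000_000, 300_000)
openrouter = request_cost("OpenRouter", 2_000_000, 300_000)
savings = 1 - qsp / openrouter

print(f"QSP ${qsp:.2f} vs OpenRouter ${openrouter:.2f} ({savings:.0%} lower)")
```

Because both the input and output prices are 20% lower, the saving should stay at about 20% whatever the input/output mix.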
When to use
Reach for K2.7 Code as the engine behind a coding agent (opencode, Cline, Aider, and Claude-Code-style loops): multi-step refactors, repo-wide changes, and plan-then-act agents that coordinate many tool calls across a 256K-token working set. It's tuned for exactly the long-horizon agentic-coding trajectory where verbose reasoning gets re-read every turn — Moonshot reports higher coding-eval scores at ~30% fewer reasoning tokens than K2.6.
When to use something else
For routine chat, short-context codegen, or non-agentic single-shot tasks, K2.7 Code is overkill at this per-token price: DeepSeek V4 Flash ($0.08/$0.16) or V4 Pro ($0.348/$0.696) handle most of those for less. For pure mathematical reasoning, use DeepSeek R1. For general (non-coding) Opus-class agentic planning, Kimi K2.6 is the sibling to A/B against.
Quickstart (curl)
curl https://api.quicksilverpro.io/v1/chat/completions \
-H "Authorization: Bearer $QSP_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "kimi-k2.7-code",
"messages": [{"role": "user", "content": "Hello!"}]
}'
OpenAI-compatible. Same model as OpenRouter; one-line migration via base_url.
FAQ
Same K2 architecture — a trillion parameters, 32B active, 256K context — but K2.7 Code is retuned specifically for long-horizon agentic coding. Moonshot's published numbers show higher coding-eval scores (Kimi Code Bench v2, Program Bench, MLS Bench Lite) while emitting ~30% fewer reasoning tokens per task than K2.6. In agent loops the reasoning from each turn is re-read as input on the next, so fewer reasoning tokens means lower cost per completed task, not just per call. K2.6 is still the better pick for general (non-coding) agentic planning; A/B the two on your workload.
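The compounding effect can be sketched with a toy cost model. It assumes the agent harness appends every turn's full output (reasoning plus visible answer) to the context and re-sends it as input on the next turn, with no prompt caching; real harnesses may trim or cache history. The turn count, prompt size, and per-turn token figures are hypothetical, and the 700-token case simply applies the ~30% reduction to a 1,000-token baseline.

```python
INPUT_PRICE = 0.60   # USD per 1M input tokens (QuickSilver Pro rate from this page)
OUTPUT_PRICE = 2.80  # USD per 1M output tokens (QuickSilver Pro rate from this page)


def task_cost(turns, base_prompt, reasoning_per_turn, answer_per_turn):
    """Cost of one multi-turn task when each turn's output is re-sent as input."""
    context = base_prompt
    input_tokens = 0
    output_tokens = 0
    for _ in range(turns):
        input_tokens += context            # the whole history is billed as input
        emitted = reasoning_per_turn + answer_per_turn
        output_tokens += emitted           # this turn's reasoning and answer
        context += emitted                 # and it is re-read on every later turn
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


# Hypothetical 20-turn task with an 8K-token starting prompt.
baseline = task_cost(20, 8_000, reasoning_per_turn=1_000, answer_per_turn=300)
leaner = task_cost(20, 8_000, reasoning_per_turn=700, answer_per_turn=300)

print(f"baseline ${baseline:.4f}, leaner ${leaner:.4f}, "
      f"saving {1 - leaner / baseline:.1%} per task")
```

Under these assumptions the saving shows up twice: once as fewer output tokens, and again as a smaller context on every later turn.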
Yes. K2.7 Code is served through an OpenAI-compatible chat completions endpoint. Point your agent at base_url=https://api.quicksilverpro.io/v1, paste your QSP key, and set model="kimi-k2.7-code". Streaming, tool calling, and json_schema strict mode all work, and `usage.cost` is reported per response so you can watch cost-per-task directly.
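As a rough Python equivalent of the curl quickstart, the sketch below builds the same request with only the standard library. The helper names `build_chat_request` and `send` are illustrative, not part of any SDK. The response fields read in the commented lines assume the usual OpenAI-style shape plus the `usage.cost` field described here.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.quicksilverpro.io/v1"
MODEL = "kimi-k2.7-code"


def build_chat_request(prompt, api_key):
    """Build (but do not send) a chat completions request."""
    body = {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# QSP_KEY is the same environment variable the curl example uses.
request = build_chat_request("Hello!", os.environ.get("QSP_KEY", "missing-key"))

# Uncomment to make the network call (requires a valid key):
# reply = send(request)
# print(reply["choices"][0]["message"]["content"])
# print(reply["usage"].get("cost"))  # assumed field name, per the text above
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any tokens are billed.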
OpenRouter lists K2.7 Code at $0.75 input / $3.50 output per 1M tokens; QuickSilver Pro is $0.60 / $2.80 — ~20% below on both legs. Same OpenAI-compatible surface; migration is a base_url + key swap, dropping the `moonshotai/` provider prefix from the model ID.
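The migration described here amounts to three changes in client configuration. In the sketch below, the config dictionary layout and the `migrate_config` helper are hypothetical; adapt the keys to whatever your agent or SDK actually uses.

```python
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"


def migrate_config(config, qsp_key):
    """Return a copy of an OpenRouter-style config pointed at QuickSilver Pro."""
    migrated = dict(config)
    migrated["base_url"] = QSP_BASE_URL
    migrated["api_key"] = qsp_key
    # Drop the provider prefix; IDs without the prefix pass through unchanged.
    migrated["model"] = config["model"].removeprefix("moonshotai/")
    return migrated


# Hypothetical existing config for OpenRouter.
openrouter_config = {
    "base_url": "https://openrouter.ai/api/v1",
    "api_key": "or-key-placeholder",
    "model": "moonshotai/kimi-k2.7-code",
}

qsp_config = migrate_config(openrouter_config, "qsp-key-placeholder")
print(qsp_config["model"])
```

`str.removeprefix` needs Python 3.9 or later; on older versions, a `startswith` check followed by slicing does the same job.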