New · 256K context · Agentic coding · Coding-tuned K2

Kimi K2.7 Code on QuickSilver Pro

Kimi K2.7 Code is Moonshot's coding-tuned K2 — the same trillion-parameter architecture (32B active, 256K context) as K2.6, retuned for long-horizon agentic coding. On QuickSilver Pro it's $0.60 input / $2.80 output per 1M tokens, ~20% below OpenRouter's $0.75 / $3.50. Moonshot reports it uses ~30% fewer reasoning tokens per task than K2.6 while scoring higher on coding evals — and in an agent loop, where each turn's reasoning is re-read as input on the next, that token discipline compounds into lower cost per completed task.

$0.60 input · $2.80 output per 1M tokens
By Raullen Chai · Updated

At a glance

Context: 256K tokens
Input / 1M: $0.60
Output / 1M: $2.80
Thinks by default: Yes

Long-horizon agentic coding — Moonshot's coding-tuned K2 with leaner reasoning-token overhead than K2.6.

Pricing comparison ($/1M tokens)

| Provider | Input | Output | vs QSP |
| --- | --- | --- | --- |
| QuickSilver Pro | $0.60 | $2.80 | cheapest |
| OpenRouter (moonshotai/kimi-k2.7-code) | $0.75 | $3.50 | QSP is 20% cheaper |
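The percentage in the comparison can be checked with a few lines of arithmetic. The sketch below uses only the four prices listed on this page; the function names and the 100K-input / 5K-output call size are illustrative assumptions, not figures from either provider.

```python
# Prices in USD per 1M tokens, copied from the pricing comparison on this page.
PRICES = {
    "QuickSilver Pro": {"input": 0.60, "output": 2.80},
    "OpenRouter": {"input": 0.75, "output": 3.50},
}


def discount_pct(cheaper: float, pricier: float) -> float:
    """Percent by which `cheaper` sits below `pricier`."""
    return (1 - cheaper / pricier) * 100


def call_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one call at the listed per-1M-token prices."""
    p = PRICES[provider]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


for leg in ("input", "output"):
    d = discount_pct(PRICES["QuickSilver Pro"][leg], PRICES["OpenRouter"][leg])
    print(f"{leg}: QuickSilver Pro is {d:.0f}% below OpenRouter")

# Hypothetical call size: 100K input tokens, 5K output tokens.
for name in PRICES:
    print(f"{name}: ${call_cost(name, 100_000, 5_000):.4f}")
```

Both legs come out at the same 20% discount, so the saving does not depend on the input/output mix of a given workload.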

When to use

Reach for K2.7 Code as the engine behind a coding agent (opencode, Cline, Aider, and Claude-Code-style loops): multi-step refactors, repo-wide changes, and plan-then-act agents that coordinate many tool calls across a 256K-token working set. It's tuned for exactly the long-horizon agentic-coding trajectory where verbose reasoning gets re-read every turn — Moonshot reports higher coding-eval scores at ~30% fewer reasoning tokens than K2.6.

When to use something else

For routine chat, short-context codegen, or non-agentic single-shot tasks, K2.7 Code's per-token price is overkill: DeepSeek V4 Flash ($0.08/$0.16) or V4 Pro ($0.348/$0.696) handle most of those for less. For pure mathematical reasoning, use DeepSeek R1. For general (non-coding) Opus-class agentic planning, Kimi K2.6 is the sibling to A/B against.

Quickstart (curl)

curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.7-code",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

OpenAI-compatible. Same model as OpenRouter; one-line migration via base_url.
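For callers who prefer Python over curl, the same request can be sketched with the standard library alone. The URL, headers, model ID, and message body mirror the curl quickstart; the helper names `build_request` and `send` are hypothetical, and the response shape (`choices[0].message.content`) is assumed from the OpenAI chat completions convention rather than confirmed by this page.

```python
import json
import urllib.request

BASE_URL = "https://api.quicksilverpro.io/v1"


def build_request(prompt: str, api_key: str,
                  model: str = "kimi-k2.7-code") -> urllib.request.Request:
    """Build (but do not send) the same POST the curl quickstart makes."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


# Usage (needs a real key, read here from the QSP_KEY environment variable):
#   import os
#   reply = send(build_request("Hello!", os.environ["QSP_KEY"]))
#   print(reply["choices"][0]["message"]["content"])  # assumed OpenAI-style shape
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any tokens are billed.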

FAQ

K2.7 Code shares the K2 architecture with K2.6 (a trillion parameters, 32B active, 256K context) but is retuned specifically for long-horizon agentic coding. Moonshot's published numbers show higher coding-eval scores (Kimi Code Bench v2, Program Bench, MLS Bench Lite) while emitting ~30% fewer reasoning tokens per task than K2.6. In agent loops the reasoning from each turn is re-read as input on the next, so fewer reasoning tokens means lower cost per completed task, not just per call. K2.6 is still the better pick for general (non-coding) agentic planning; A/B the two on your workload.
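To see why per-task cost falls faster than per-call cost, here is a toy cost model. It follows this page's premise that each turn's full output, reasoning included, is re-sent as input on every later turn, and it assumes no prompt caching. The turn count, token sizes, and the function name `task_cost` are illustrative assumptions, not measurements; only the $0.60 / $2.80 prices and the ~30% reduction come from the text.

```python
def task_cost(turns: int, base_input: int, reasoning_per_turn: int,
              answer_per_turn: int, input_price: float = 0.60,
              output_price: float = 2.80) -> float:
    """USD cost of one agent task, prices given per 1M tokens.

    Assumes every turn's output (reasoning + answer) is appended to the
    context and read again as input on all later turns.
    """
    per_turn_output = reasoning_per_turn + answer_per_turn
    input_tokens = 0
    output_tokens = 0
    context = base_input
    for _ in range(turns):
        input_tokens += context        # the whole history is read again
        output_tokens += per_turn_output
        context += per_turn_output     # this turn's output joins the history
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical task: 20 turns, 10K-token starting context, 500 answer tokens per turn.
verbose = task_cost(20, 10_000, reasoning_per_turn=2_000, answer_per_turn=500)
lean = task_cost(20, 10_000, reasoning_per_turn=1_400, answer_per_turn=500)  # 30% fewer

print(f"verbose reasoning: ${verbose:.3f} per task")
print(f"lean reasoning:    ${lean:.3f} per task")
print(f"saving:            {(1 - lean / verbose) * 100:.1f}%")
```

Under these assumptions the task costs about $0.545 with verbose reasoning and about $0.443 with lean reasoning. Roughly two thirds of that saving comes from the input side, where the shorter reasoning is re-read on later turns, rather than from the output tokens themselves.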

K2.7 Code is served through an OpenAI-compatible chat completions endpoint. Point your agent at base_url=https://api.quicksilverpro.io/v1, paste your QSP key, and set model="kimi-k2.7-code". Streaming, tool calling, and json_schema strict mode all work, and `usage.cost` is reported per response so you can watch cost-per-task directly.
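One way to watch cost-per-task is to sum `usage.cost` across every response in an agent run. The sketch below assumes each parsed response is a dict with a `usage` object containing a numeric `cost` field in USD; the page names the field but not its unit or exact position, so treat that shape as an assumption. The helper name and the sample responses are hypothetical.

```python
def total_task_cost(responses: list) -> float:
    """Sum usage.cost over the parsed JSON responses of one agent task.

    Responses with no usage block or no cost field count as zero.
    """
    total = 0.0
    for resp in responses:
        usage = resp.get("usage") or {}
        total += float(usage.get("cost", 0.0))
    return total


# Hypothetical responses from a three-turn task (values are made up).
sample_run = [
    {"usage": {"prompt_tokens": 1200, "completion_tokens": 300, "cost": 0.0015}},
    {"usage": {"prompt_tokens": 1600, "completion_tokens": 250, "cost": 0.0017}},
    {"usage": None},  # e.g. a response where usage was not returned
]

print(f"task cost: ${total_task_cost(sample_run):.4f}")
```

Summing per task rather than per call matches how agent workloads are actually billed: one user request can fan out into many completions.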

OpenRouter lists K2.7 Code at $0.75 input / $3.50 output per 1M tokens; QuickSilver Pro is $0.60 / $2.80 — ~20% below on both legs. Same OpenAI-compatible surface; migration is a base_url + key swap, dropping the `moonshotai/` provider prefix from the model ID.
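The migration described above can be sketched as a small config transform. The QuickSilver Pro base URL and both model IDs come from this page; the OpenRouter base URL is its public API endpoint. The config dict layout and the function name `to_quicksilver` are hypothetical, so adapt the keys to whatever your client or agent actually expects.

```python
QSP_BASE_URL = "https://api.quicksilverpro.io/v1"

# Hypothetical client config as it might look when pointed at OpenRouter.
openrouter_config = {
    "base_url": "https://openrouter.ai/api/v1",
    "api_key": "<openrouter-key>",
    "model": "moonshotai/kimi-k2.7-code",
}


def to_quicksilver(config: dict, qsp_key: str) -> dict:
    """Return a copy of `config` retargeted at QuickSilver Pro.

    Swaps base_url and key, and drops the `moonshotai/` provider prefix
    from the model ID. Uses str.removeprefix, so Python 3.9+ is required.
    """
    migrated = dict(config)
    migrated["base_url"] = QSP_BASE_URL
    migrated["api_key"] = qsp_key
    migrated["model"] = config["model"].removeprefix("moonshotai/")
    return migrated


qsp_config = to_quicksilver(openrouter_config, "<qsp-key>")
print(qsp_config["base_url"], qsp_config["model"])
```

Returning a new dict instead of mutating the original keeps the OpenRouter config available for side-by-side A/B runs.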

Try Kimi K2.7 Code with double credits — up to $50 free
