DeepSeek R1 on QuickSilver Pro
DeepSeek R1 is the open-source counterpart to o1: a dedicated reasoning model that emits a long chain-of-thought trace before its final answer. On QuickSilver Pro it costs $0.56 input / $2.00 output per 1M tokens, ~27× cheaper than o1 on input and ~30× on output. Reasoning cannot be switched off on R1: the `reasoning.enabled=false` flag is silently stripped on the way to the model. If you want non-thinking chat, use V3 or V4 Flash.
At a glance
Math, multi-step proofs, competition coding, and theorem-style reasoning at near-o1 quality, ~27-30× cheaper per token.
Pricing comparison ($/1M tokens)
| Provider | Input | Output | QSP price vs provider |
|---|---|---|---|
| QuickSilver Pro | $0.56 | $2.00 | baseline |
| OpenRouter (deepseek/deepseek-r1) | $0.70 | $2.50 | QSP is 20% cheaper |
| OpenAI (o1) | $15.00 | $60.00 | QSP is ~96-97% cheaper |
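The ratios quoted on this page can be rechecked from the listed prices. The sketch below is only arithmetic over the table values; the `PRICES` dictionary and the `savings_vs` helper are illustrative names, not part of any QuickSilver Pro API.

```python
# Prices in USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "QuickSilver Pro": (0.56, 2.00),
    "OpenRouter": (0.70, 2.50),
    "OpenAI o1": (15.00, 60.00),
}


def savings_vs(provider: str, baseline: str = "QuickSilver Pro") -> dict:
    """Compare `baseline` against `provider` for input and output prices."""
    base_in, base_out = PRICES[baseline]
    other_in, other_out = PRICES[provider]
    return {
        # How many times more expensive the other provider is.
        "input_ratio": other_in / base_in,
        "output_ratio": other_out / base_out,
        # How much cheaper the baseline is, as a percentage.
        "input_savings_pct": 100 * (1 - base_in / other_in),
        "output_savings_pct": 100 * (1 - base_out / other_out),
    }


for name in ("OpenRouter", "OpenAI o1"):
    s = savings_vs(name)
    print(
        f"{name}: {s['input_ratio']:.1f}x input, {s['output_ratio']:.1f}x output, "
        f"{s['input_savings_pct']:.0f}% / {s['output_savings_pct']:.0f}% cheaper on QSP"
    )
```

Against o1 the input ratio works out to roughly 27× and the output ratio to 30×, which is where the headline figures come from.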
When to use
Reach for R1 when the answer quality genuinely benefits from chain-of-thought: competition math (MATH, AIME, IMO), tricky concurrency / debugging puzzles, theorem-style proofs, multi-hop logical analysis, novel algorithm derivation. R1 typically produces 3-5× as many output tokens as V3 because the thinking trace is part of the output, so budget your `max_tokens` accordingly.
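One way to turn the 3-5× guidance into a `max_tokens` value is to scale a typical V3 answer length by the upper end of that range. This is a rough sketch: the `r1_max_tokens` helper, the default multiplier of 5, and the 1024-token floor are assumptions chosen for illustration, not documented recommendations.

```python
import math


def r1_max_tokens(v3_answer_tokens: int,
                  trace_multiplier: float = 5.0,
                  floor: int = 1024) -> int:
    """Estimate an R1 max_tokens budget from a typical V3 answer length.

    trace_multiplier: assumed ratio of R1 output (trace + answer) to V3 output.
    floor: assumed minimum budget so short answers still leave room for the trace.
    """
    if v3_answer_tokens <= 0:
        raise ValueError("v3_answer_tokens must be positive")
    if trace_multiplier < 1:
        raise ValueError("trace_multiplier must be at least 1")
    return max(floor, math.ceil(v3_answer_tokens * trace_multiplier))


# Request body with the budget applied; the prompt is a placeholder.
payload = {
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    "max_tokens": r1_max_tokens(800),
}
```

If responses come back truncated, raising the multiplier is usually a simpler fix than shortening the prompt, since the trace length is driven by the problem rather than by the input size.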
When to use something else
Don't default to R1 for routine chat, codegen, or production agentic workflows, where V3's ~3× cheaper output and predictable latency win on cost-per-task. For premium reasoning at a much lower cost, V4 Pro at $0.348/$0.696 with 1M context is the better trade-off on most real workloads.
Quickstart (curl)
    curl https://api.quicksilverpro.io/v1/chat/completions \
      -H "Authorization: Bearer $QSP_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "deepseek-r1",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'

OpenAI-compatible. Same model as OpenRouter; one-line migration via base_url.
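The curl request can be mirrored in Python using only the standard library. The sketch below keeps the base URL in a single constant, which is the one value that would change when pointing an existing OpenAI-compatible client at QuickSilver Pro. The helper names are illustrative, and the response parsing in `send` assumes the usual OpenAI-style `choices[0].message.content` shape, which this page does not spell out.

```python
import json
import os
import urllib.request

# Base URL taken from the curl quickstart.
BASE_URL = "https://api.quicksilverpro.io/v1"


def build_chat_request(prompt: str,
                       model: str = "deepseek-r1",
                       base_url: str = BASE_URL,
                       api_key=None) -> urllib.request.Request:
    """Build (but do not send) a chat completions request."""
    # Fall back to the QSP_KEY environment variable used in the curl example.
    key = api_key if api_key is not None else os.environ.get("QSP_KEY", "")
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the reply text (assumed response shape)."""
    with urllib.request.urlopen(request) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request("Hello!"))` with `QSP_KEY` set would perform the same call as the curl command.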
FAQ
Compared with o1 on the published benchmarks (MATH, AIME 2024, GPQA Diamond, Codeforces), R1 is within a few points on most. Real-world performance varies by task, so validate on your own evals before committing. For text-only reasoning where o1 would be the pick and the price is the blocker, R1 is the open-source alternative most likely to land at the same answer at ~30× lower per-token cost.
R1 costs more per call than V3 because it is a dedicated reasoning model: it emits a long chain-of-thought trace before the final answer. The higher per-token output rate ($2.00/M vs V3's $0.616/M, about 3.2×) combined with the typical 3-5× longer output means a single R1 call often costs 10-16× a V3 call. The trade-off: on tasks that benefit from reasoning, the extra tokens buy better answers; on routine chat, the extra cost is wasted.
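The 10-16× figure follows from multiplying the output price ratio by the output length ratio. A minimal sketch of that arithmetic, using the two output prices quoted above; it deliberately ignores input-token cost, and the function names are illustrative.

```python
# Output prices in USD per 1M tokens, as quoted on this page.
R1_OUTPUT_PRICE = 2.00
V3_OUTPUT_PRICE = 0.616


def output_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` output tokens at the given per-million price."""
    return tokens * price_per_million / 1_000_000


def r1_vs_v3_cost_ratio(length_multiplier: float) -> float:
    """Output-cost ratio of one R1 call to one V3 call.

    length_multiplier: how many times longer the R1 output is (3-5 per the text).
    """
    return (R1_OUTPUT_PRICE * length_multiplier) / V3_OUTPUT_PRICE


for multiplier in (3, 5):
    print(f"{multiplier}x longer output -> "
          f"{r1_vs_v3_cost_ratio(multiplier):.1f}x the V3 output cost")

# Example: an 800-token V3 answer vs a 4,000-token R1 answer (5x longer).
v3_cost = output_cost_usd(800, V3_OUTPUT_PRICE)
r1_cost = output_cost_usd(4000, R1_OUTPUT_PRICE)
```

With a 3× longer output the ratio is just under 10, and with 5× it is a little over 16, which matches the quoted range.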
Reasoning cannot be disabled on R1. The model is built around its chain-of-thought and rejects requests with `reasoning: { enabled: false }`. QuickSilver Pro silently strips that field on R1 requests so a generic client doesn't get a 400, but you'll always get the reasoning trace. If you want non-thinking chat at low cost, use deepseek-v3 (no thinking) or deepseek-v4-flash with `reasoning.enabled=false`.
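Since the flag has no effect on R1, the choice between thinking and non-thinking output is better made by picking the model on the client side. The sketch below shows one way to do that; `build_payload` is a hypothetical helper, and the routing rule (R1 for thinking, deepseek-v4-flash with reasoning disabled otherwise) is just one of the two non-thinking options named above.

```python
def build_payload(prompt: str, thinking: bool) -> dict:
    """Pick a model and reasoning setting based on whether a trace is wanted."""
    payload = {"messages": [{"role": "user", "content": prompt}]}
    if thinking:
        # R1 always reasons; sending a reasoning flag here would be stripped.
        payload["model"] = "deepseek-r1"
    else:
        # Non-thinking chat: route away from R1 and disable reasoning explicitly.
        payload["model"] = "deepseek-v4-flash"
        payload["reasoning"] = {"enabled": False}
    return payload
```

Routing this way keeps the request honest about what it will get back, instead of relying on the gateway dropping a field the model cannot honor.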