DeepSeek V4 Pro on QuickSilver Pro
DeepSeek V4 Pro is the V4 wave's flagship for premium reasoning: it has a 1M-token context, thinks by default, and produces output in the same quality tier as o3-mini at a fraction of the price. On QuickSilver Pro it costs $0.348 input / $0.696 output per 1M tokens, about 20% below OpenRouter's $0.435 / $0.87. It is the most direct open-source alternative to o3-mini for long-context premium reasoning workloads.
At a glance
Premium reasoning + 1M context, at o3-mini quality and ~6x lower output cost.
Pricing comparison ($/1M tokens)
| Provider | Input | Output | QSP price vs this provider |
|---|---|---|---|
| QuickSilver Pro | $0.348 | $0.696 | baseline (cheapest) |
| OpenRouter (deepseek/deepseek-v4-pro) | $0.435 | $0.87 | QSP is ~20% cheaper |
| OpenAI (o3-mini) | $1.10 | $4.40 | QSP is ~68% cheaper on input, ~84% on output |
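The savings figures follow directly from the list prices quoted on this page. A minimal Python sketch of that arithmetic is below; the `PRICES` dictionary and the function names are illustrative, not part of any QuickSilver Pro tooling.

```python
# Per-1M-token list prices quoted on this page, in USD: (input, output).
PRICES = {
    "QuickSilver Pro": (0.348, 0.696),
    "OpenRouter": (0.435, 0.87),
    "OpenAI o3-mini": (1.10, 4.40),
}


def savings_pct(ours: float, theirs: float) -> float:
    """Percent by which the price `ours` undercuts the price `theirs`."""
    return (1 - ours / theirs) * 100


def savings_table(baseline: str = "QuickSilver Pro") -> dict:
    """Map each other provider to (input savings %, output savings %), rounded."""
    base_in, base_out = PRICES[baseline]
    return {
        name: (round(savings_pct(base_in, p_in)), round(savings_pct(base_out, p_out)))
        for name, (p_in, p_out) in PRICES.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (s_in, s_out) in savings_table().items():
        print(f"vs {name}: {s_in}% cheaper input, {s_out}% cheaper output")
```

Note that the headline 84% figure against o3-mini is the output-token saving; the input-token saving is smaller.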
When to use
V4 Pro is the right pick when V4 Flash isn't smart enough but V3 chain-of-thought ($0.696/M output) starts adding up. Good fits include multi-step coding agents, refactor planners, large-document summarization with reasoning, and any workload where you'd consider o3-mini but price is the blocker. The 1M context window also extends well past o3-mini's 200K.
When to use something else
For top-tier reasoning where R1 still wins on benchmarks (competition math, theorem proving), use deepseek-r1: it costs $2.00 per 1M output tokens, but its reasoning trace is more thorough. For closed-model capabilities, stay on OpenAI's o-series. For Opus-class agentic and planning work, use Kimi K2.6.
Quickstart (curl)
curl https://api.quicksilverpro.io/v1/chat/completions \
  -H "Authorization: Bearer $QSP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
The API is OpenAI-compatible and serves the same model as OpenRouter, so migration is a one-line change to base_url.
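For readers who prefer Python, the same request can be sketched with only the standard library. This is a minimal illustration, not an official client: the helper names `build_request` and `send` are made up here, and reading the reply from `choices[0].message.content` assumes the endpoint returns the standard OpenAI chat-completions response shape.

```python
import json
import urllib.request

# Endpoint taken from the curl quickstart.
API_URL = "https://api.quicksilverpro.io/v1/chat/completions"


def build_request(prompt: str, api_key: str,
                  model: str = "deepseek-v4-pro") -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Usage (requires a real key in the QSP_KEY environment variable):
#   import os
#   reply = send(build_request("Hello!", os.environ["QSP_KEY"]))
#   # Assumed OpenAI-style response shape:
#   print(reply["choices"][0]["message"]["content"])
```

If you already use the official OpenAI Python SDK, pointing its `base_url` client argument at `https://api.quicksilverpro.io/v1` should be the equivalent one-line change.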
FAQ
Is DeepSeek V4 Pro a good alternative to o3-mini?
On premium-reasoning workloads with long context, yes. V4 Pro thinks by default, supports a 1M-token context vs o3-mini's 200K, and lists at $0.348 input / $0.696 output vs o3-mini's $1.10 / $4.40 per 1M tokens, which is about 3x cheaper on input and 6x cheaper on output. Compare the two on your own eval set before switching; for closed-model fine-tunes or vision, stay on OpenAI.
How does V4 Pro compare to DeepSeek R1?
V4 Pro is positioned a tier below R1 on raw reasoning depth, but it has a much longer context window (1M vs 128K) and a lower per-token output cost ($0.696/M vs $2.00/M). R1 produces a longer chain-of-thought trace and tends to win on competition-math and theorem-proving benchmarks. V4 Pro is the better default for production reasoning workloads where R1's verbosity is wasteful.
Can I fit an entire codebase in the 1M-token context?
Roughly, yes. 1M tokens is about 2.5–3 million characters of code in modern languages, which is enough for most monorepos. Keep in mind that cost scales linearly with input tokens (~$0.348 per 1M input), so sending a full 1M-token context costs $0.348 on input alone, before any reasoning output. For frequently repeated context, consider RAG, or partial-prompt caching once we ship it.
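The linear cost scaling can be made concrete with a small estimator. The sketch below uses the QuickSilver Pro list prices quoted on this page; the function names are illustrative, and treating reasoning tokens as ordinary output tokens for billing is an assumption, not something this page states.

```python
# QuickSilver Pro list prices from this page, in USD per 1M tokens.
INPUT_USD_PER_M = 0.348
OUTPUT_USD_PER_M = 0.696


def request_cost(input_tokens: int, output_tokens: int = 0) -> float:
    """Estimated USD cost of one request.

    Assumption: reasoning tokens are billed at the output rate and are
    already included in `output_tokens`.
    """
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


def daily_cost(input_tokens: int, output_tokens: int, calls_per_day: int) -> float:
    """Estimated USD cost per day when the same context is resent on every call."""
    return request_cost(input_tokens, output_tokens) * calls_per_day


if __name__ == "__main__":
    # One full 1M-token context, input only.
    print(f"single call: ${request_cost(1_000_000):.3f}")
    # The same context resent 100 times a day, with 2,000 output tokens each.
    print(f"100 calls/day: ${daily_cost(1_000_000, 2_000, 100):.2f}")
```

Resending a large context on every call multiplies the input cost by the call count, which is why retrieval or caching tends to pay off for repeated context.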