QuickSilver Pro vs OpenRouter
For 9 popular open-source LLMs (DeepSeek V4 Flash + Pro, V3, R1, Qwen 3.7 Max + 3.6 Plus + 3.6 + 3.5-35B-A3B, Kimi K2.6), QuickSilver Pro lists the same models at ~20% below OpenRouter's public per-token rates, behind the same OpenAI-compatible API and with a two-line migration. For closed models (GPT-4, Claude) or the long tail, OpenRouter is still the right tool.
At a glance
| Feature | QuickSilver Pro | OpenRouter |
|---|---|---|
| Models in catalog | 9 (DeepSeek V4 Flash + Pro, V3, R1, Qwen 3.7 Max + 3.6 Plus + 3.6 + 3.5-35B-A3B, Kimi K2.6) | 300+ |
| Pricing on shared models | 20% below OpenRouter | Baseline |
| OpenAI-compatible surface | Yes | Yes |
| Streaming / tools / json_schema | Yes | Yes |
| usage.cost on responses | Yes (synthetic) | Yes |
| Per-key monthly spend limits | Yes | Yes |
| Closed models (GPT-4, Claude) | No | Yes |
| Launch bonus | First deposit matched 100%, up to $50 | Limited free models |
| Minimum top-up | $5 | $10 |
Pricing (per million tokens, USD)
Public list prices as of May 2026. QSP stands for QuickSilver Pro.
| Model | QSP input | QSP output | OpenRouter input | OpenRouter output | Savings |
|---|---|---|---|---|---|
| DeepSeek V4 Flash | $0.08 | $0.16 | $0.10 | $0.20 | ~20% |
| DeepSeek V4 Pro | $0.348 | $0.696 | $0.435 | $0.87 | ~20% |
| DeepSeek V3 | $0.16 | $0.616 | $0.20 | $0.77 | ~20% |
| DeepSeek R1 | $0.56 | $2.00 | $0.70 | $2.50 | ~20% |
| Qwen3.6-35B-A3B | $0.12 | $0.80 | $0.15 | $1.00 | ~20% |
| Qwen3.5-35B-A3B | $0.111 | $0.80 | $0.139 | $1.00 | ~20% |
| Kimi K2.6 | $0.584 | $2.79 | $0.73 | $3.49 | ~20% |
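To see what the table means for a real bill, the rates can be plugged into a small calculation. The sketch below copies three rows from the table; the workload (500M input and 100M output tokens per month) and the function names are illustrative assumptions, not figures from either provider.

```python
# Rates copied from the pricing table (USD per million tokens):
# (QSP input, QSP output, OpenRouter input, OpenRouter output)
RATES = {
    "DeepSeek V4 Flash": (0.08, 0.16, 0.10, 0.20),
    "DeepSeek V3": (0.16, 0.616, 0.20, 0.77),
    "Kimi K2.6": (0.584, 2.79, 0.73, 3.49),
}


def volume_cost(input_rate, output_rate, input_mtok, output_mtok):
    """Cost in USD for a volume given in millions of tokens."""
    return input_rate * input_mtok + output_rate * output_mtok


def savings(model, input_mtok, output_mtok):
    """Return (qsp_cost, openrouter_cost, fractional_saving) for one model."""
    qsp_in, qsp_out, or_in, or_out = RATES[model]
    qsp = volume_cost(qsp_in, qsp_out, input_mtok, output_mtok)
    openrouter = volume_cost(or_in, or_out, input_mtok, output_mtok)
    return qsp, openrouter, 1 - qsp / openrouter


if __name__ == "__main__":
    # Hypothetical workload: 500M input and 100M output tokens per month.
    for name in RATES:
        qsp, openrouter, saved = savings(name, 500, 100)
        print(f"{name}: ${qsp:.2f} vs ${openrouter:.2f} ({saved:.1%} lower)")
```

Because both the input and output rates are discounted by roughly the same fraction, the saving stays near 20% whatever the input/output mix.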
Migration - two lines
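```python
import os

from openai import OpenAI

# The two changes: base_url now points at QuickSilver Pro (instead of
# openrouter.ai/api/v1), and api_key is your QuickSilver Pro key.
client = OpenAI(
    base_url="https://api.quicksilverpro.io/v1",
    api_key=os.environ["QSP_KEY"],
)
r = client.chat.completions.create(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Hi"}],
)
```
FAQ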
QuickSilver Pro is cheaper on the 9 shared open-source models (DeepSeek V4 Flash + Pro, V3, R1, Qwen 3.7 Max + 3.6 Plus + 3.6 + 3.5-35B-A3B, Kimi K2.6): ~20% below OpenRouter's public per-token rates. See the pricing table above for exact numbers.
Migration takes two lines in your OpenAI SDK setup: change base_url from openrouter.ai/api/v1 to api.quicksilverpro.io/v1 and swap the API key. Model IDs change as well: drop the provider/ prefix, and note that some IDs are shortened further (e.g. deepseek/deepseek-v4-flash → deepseek-v4-flash, qwen/qwen3.6-35b-a3b → qwen3.6-35b, moonshotai/kimi-k2.6 → kimi-k2.6).
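If model IDs are scattered through a codebase, a small translation helper can keep the rename in one place. The sketch below hard-codes only the three mappings given above; the function name and the fallback rule (strip the provider prefix and nothing else) are assumptions, so any ID not in the table should be checked against the QuickSilver Pro catalog.

```python
# The three explicit renames are the examples from the migration notes.
KNOWN_IDS = {
    "deepseek/deepseek-v4-flash": "deepseek-v4-flash",
    "qwen/qwen3.6-35b-a3b": "qwen3.6-35b",
    "moonshotai/kimi-k2.6": "kimi-k2.6",
}


def to_qsp_model_id(openrouter_id: str) -> str:
    """Translate an OpenRouter model ID to its QuickSilver Pro form."""
    if openrouter_id in KNOWN_IDS:
        return KNOWN_IDS[openrouter_id]
    # Assumption: for IDs not listed above, only the "provider/" prefix is
    # dropped. Verify the result against the catalog before relying on it.
    return openrouter_id.split("/", 1)[-1]
```

An explicit table is used rather than a pure string rule because at least one ID (the Qwen 3.6 model) loses more than its prefix.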
Stay with OpenRouter if your workload needs closed models (GPT-4, Claude, Gemini), Llama, Mistral, or the long tail. QuickSilver Pro serves 9 open-source models; OpenRouter serves 300+.
The shared models expose the same OpenAI-compatible surface. Streaming, tool / function calling, json_schema strict mode, and standard usage accounting all work through the official OpenAI SDK. Each response also returns a synthetic usage.cost computed from the public per-token rate.
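Since usage.cost is described as computed from the public per-token rate, it can be cross-checked against the token counts in the same response. The sketch below assumes the cost is reported in USD and equals input tokens times the input rate plus output tokens times the output rate; the helper names and the tolerance are hypothetical, and the exact rounding the gateway applies is not documented here.

```python
def expected_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """USD cost from token counts and per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def cost_matches(usage, input_rate, output_rate, tol=1e-6):
    """Compare a response's reported usage["cost"] with the recomputed value.

    `usage` is a plain dict with prompt_tokens, completion_tokens and cost.
    """
    recomputed = expected_cost(
        usage["prompt_tokens"], usage["completion_tokens"], input_rate, output_rate
    )
    return abs(usage["cost"] - recomputed) <= tol


if __name__ == "__main__":
    # Made-up usage block for a DeepSeek V3 call at $0.16 / $0.616 per million.
    sample = {"prompt_tokens": 1200, "completion_tokens": 300, "cost": 0.0003768}
    print(cost_matches(sample, 0.16, 0.616))
```

With the OpenAI Python SDK, a dict of this shape can usually be obtained from `r.usage.model_dump()`, though whether the non-standard cost field is preserved may depend on the SDK version.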
DeepSeek V4 Flash + Pro, Qwen 3.6/3.7, and Kimi K2.6 all default to chain-of-thought reasoning on OpenRouter, so a one-token "Hi" can return hundreds of reasoning tokens. For DeepSeek V4 and Kimi K2.6 we pass requests through unchanged: set `reasoning: { enabled: false }` to get V3-style cheap chat without the thinking overhead. For the Qwen 3.6/3.7 models the gateway already sends `reasoning: { enabled: false }` by default; pass `reasoning: { enabled: true }` to opt back into reasoning. DeepSeek V3 keeps its existing non-thinking behavior.
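With the OpenAI Python SDK, a non-standard field such as `reasoning` can be sent through the `extra_body` argument of `chat.completions.create`. The sketch below assumes the gateway reads `reasoning` as a top-level field of the JSON request body; `chat_kwargs` is a hypothetical helper, not part of either provider's SDK.

```python
from typing import Optional


def chat_kwargs(model: str, prompt: str, reasoning: Optional[bool] = None) -> dict:
    """Build keyword arguments for client.chat.completions.create().

    reasoning=None leaves the gateway default untouched; True or False
    sends an explicit `reasoning: { enabled: ... }` override.
    """
    kwargs = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if reasoning is not None:
        # extra_body merges additional JSON fields into the request body.
        kwargs["extra_body"] = {"reasoning": {"enabled": reasoning}}
    return kwargs


# Usage, given a client configured as in the migration example:
#   client.chat.completions.create(**chat_kwargs("deepseek-v4-flash", "Hi", reasoning=False))
#   client.chat.completions.create(**chat_kwargs("qwen3.6-35b", "Hi", reasoning=True))
```

Leaving `reasoning` unset keeps whichever default applies to the model, which matters here because the defaults differ between the DeepSeek V4 / Kimi K2.6 models and the Qwen 3.6/3.7 models.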