What is DeepSeek R1 good for?

DeepSeek R1 is a reasoning model trained with reinforcement learning to produce explicit chain-of-thought before answering. It excels at math (AIME, MATH), competitive programming (Codeforces), logic puzzles, formal proofs, and multi-step planning. For tasks where the answer's quality depends on the reasoning process, R1 outperforms non-reasoning models like DeepSeek V3 at the cost of 3-5x more output tokens.

How does DeepSeek R1 pricing compare to OpenAI o1?

OpenAI o1 costs $15 per million input tokens and $60 per million output tokens. DeepSeek R1 on QuickSilver Pro costs $0.56 input and $2.00 output per million tokens. For the same workload, R1 is ~27x cheaper on input and 30x cheaper on output, with comparable math and coding benchmark performance.
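The ratios quoted above follow directly from the list prices. A minimal sketch of the arithmetic, using only the four prices stated in this answer (the dictionary and function names are illustrative):

```python
# Per-million-token prices in USD, as quoted above.
O1_PRICE = {"input": 15.00, "output": 60.00}
R1_PRICE = {"input": 0.56, "output": 2.00}


def price_ratio(kind: str) -> float:
    """Return how many times more o1 costs than R1 for 'input' or 'output' tokens."""
    return O1_PRICE[kind] / R1_PRICE[kind]


for kind in ("input", "output"):
    print(f"{kind}: o1 costs about {price_ratio(kind):.1f}x as much as R1")
```

The real ratio for a given workload falls between the two figures and depends on its input/output mix. Reasoning workloads are output-heavy, so it usually lands closer to the output ratio.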

How do I access the reasoning trace?

DeepSeek R1 returns a reasoning_content field in the message object alongside content. reasoning_content holds the chain-of-thought trace; content holds the final answer. Both are billed as output tokens, so discarding reasoning_content when you only need the answer does not reduce the cost.

Is R1 overkill for simple questions?

Yes. R1 generates a long chain-of-thought even for trivial questions, which is wasted output cost. For factual Q&A, simple summarization, or casual chat, use DeepSeek V3 ($0.616 per 1M output) instead of R1 ($2.00 per 1M output). Reserve R1 for problems where the reasoning step materially changes the answer quality.

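Use case

DeepSeek R1 for Reasoning

DeepSeek R1 is an open-source reasoning model trained with reinforcement learning that emits an explicit chain-of-thought. It is competitive with OpenAI o1 on benchmarks such as AIME and MATH at roughly 1/30 of the cost: on QuickSilver Pro it is $0.56 input and $2.00 output per 1M tokens, versus $15 / $60 for o1. For math, coding challenges, and logic-heavy agent loops, R1 is the default choice among open-source models.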

$0.56 / $2.00 per 1M tokens

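What R1 is good at

Math: strong on AIME-2024, MATH-500, and olympiad-level problems. The reasoning trace works through the derivation step by step, and the final answer appears in content.

Algorithms: code generation close to competitive-programming level. Its LiveCodeBench and Codeforces results are close to o1's. On novel algorithmic tasks it is usually stronger than V3, but the chain-of-thought also makes it slower.

Multi-step planning: well suited to the planner role in an agent loop, where the model decomposes the task before acting. Each planning call carries explicit reasoning, which usually improves tool-use decisions.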

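When R1 is worth the extra tokens

Tasks that suit R1: math word problems, novel algorithm design, logic puzzles, theorem proving, multi-step tool planning, and hard debugging. In other words, cases where the reasoning process itself determines the quality of the answer.

Tasks that do not suit R1: factual Q&A, code completion, summarization, entity extraction, simple classification, and translation. On these non-reasoning tasks V3 is cheaper and faster, and quality is usually equivalent.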
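One way to apply the two lists above is a small routing function in front of the API. This is a sketch under assumptions: the task labels are hypothetical names for the categories listed, and "deepseek-v3" is an assumed model ID (only "deepseek-r1" appears in the quick-start code on this page).

```python
# Hypothetical labels for the task types that suit R1, taken from the list above.
REASONING_TASKS = {
    "math_word_problem",
    "algorithm_design",
    "logic_puzzle",
    "theorem_proving",
    "tool_planning",
    "hard_debugging",
}


def pick_model(task_type: str) -> str:
    """Route reasoning-heavy tasks to R1 and everything else to V3."""
    # "deepseek-v3" is an assumed model ID; check your provider's model list.
    return "deepseek-r1" if task_type in REASONING_TASKS else "deepseek-v3"


print(pick_model("logic_puzzle"))
print(pick_model("translation"))
```

Defaulting unknown task types to V3 keeps the cheap path as the fallback, so a task only pays for reasoning when it has been explicitly marked as needing it.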

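Cost calibration: for a 2,000-word article, V3 generates roughly 600 output tokens (about $0.37 in output cost per 1,000 articles); on the same task R1, including the reasoning trace, may generate 2,500 output tokens (about $5.00 per 1,000 articles). That is roughly 13x more expensive, so use R1 only when the extra cost buys a real improvement.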
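The calibration above can be reproduced with a few lines of arithmetic. A minimal sketch, assuming the output prices quoted on this page ($0.616 per 1M for V3, $2.00 per 1M for R1) and ignoring input tokens, as the example does:

```python
def output_cost_usd(tokens_per_call: int, calls: int, price_per_million: float) -> float:
    """Output-token cost in USD for a batch of calls."""
    return tokens_per_call * calls * price_per_million / 1_000_000


# Token counts per article are the rough figures from the paragraph above.
v3_cost = output_cost_usd(600, 1000, 0.616)
r1_cost = output_cost_usd(2500, 1000, 2.00)

print(f"V3: ${v3_cost:.2f} per 1,000 articles")
print(f"R1: ${r1_cost:.2f} per 1,000 articles")
print(f"R1 / V3: {r1_cost / v3_cost:.1f}x")
```

Substitute your own measured token counts (for example, from usage.completion_tokens) before relying on the ratio.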

Quick-start code

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.quicksilverpro.io/v1",
    api_key="sk-qsp-...",
)

resp = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{
        "role": "user",
        "content": "A box has 12 red and 8 blue balls. Three drawn without replacement. Probability exactly two are red?",
    }],
)

# Chain-of-thought reasoning (reasoning_content is an R1-specific extra field):
print(resp.choices[0].message.reasoning_content)

# Final answer:
print(resp.choices[0].message.content)

# completion_tokens includes the reasoning tokens.
print(f"Output tokens: {resp.usage.completion_tokens}")
# usage.cost is not a standard OpenAI usage field; it is read here as returned by this endpoint.
print(f"Cost: ${resp.usage.cost:.6f}")
```

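FAQ

Is DeepSeek R1 as good as o1?

On public math (AIME-2024, MATH-500), coding (LiveCodeBench, Codeforces), and reasoning (GPQA Diamond) benchmarks, DeepSeek R1 is usually within a few points of o1 and beats o1-mini in most cases. At roughly 1/30 of the cost in production, it is the open-source equivalent.

How long is a typical reasoning trace?

The typical range is 500-3,000 tokens. On very hard problems (IMO-level math, for example) the trace can exceed 5,000 tokens. All reasoning tokens are billed as output tokens, so they must be included in any cost estimate.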
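For budgeting, the trace length range translates into a per-call cost range. A rough sketch, assuming the $2.00 per 1M output price quoted on this page; the 200-token answer length in the usage lines is an illustrative assumption, not a measured figure.

```python
R1_OUTPUT_PRICE = 2.00  # USD per 1M output tokens, as quoted on this page


def call_cost_range(answer_tokens: int, trace_low: int = 500, trace_high: int = 3000):
    """Return (low, high) output cost in USD for one call, reasoning trace included."""
    low = (trace_low + answer_tokens) * R1_OUTPUT_PRICE / 1_000_000
    high = (trace_high + answer_tokens) * R1_OUTPUT_PRICE / 1_000_000
    return low, high


# answer_tokens=200 is an assumed typical final-answer length.
low, high = call_cost_range(answer_tokens=200)
print(f"Per-call output cost: ${low:.4f} to ${high:.4f}")
```

For hard problems, pass a larger trace_high (for example 5000) to cover the long-trace case described above.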

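Does R1 support tool calling?

R1 accepts the OpenAI tools array, but its tool calling is less reliable than V3's. For agent loops, a better pattern is usually to use V3 as the tool-calling executor and call R1 only for hard planning subproblems. This hybrid setup usually balances quality and cost.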
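The hybrid pattern can be sketched as a single step function. Everything here is illustrative: call_model is a hypothetical wrapper around your chat-completions client, "deepseek-v3" is an assumed model ID, and how you decide that a step is hard is left to the caller.

```python
from typing import Callable

# Hypothetical signature: call_model(model, prompt) -> response text.
CallModel = Callable[[str, str], str]


def run_step(
    task: str,
    is_hard: bool,
    call_model: CallModel,
    planner: str = "deepseek-r1",
    executor: str = "deepseek-v3",  # assumed model ID
) -> str:
    """Plan with R1 only when the step is hard; always execute with V3."""
    prompt = task
    if is_hard:
        # Pay for a reasoning trace only on the planning subproblem.
        plan = call_model(planner, f"Break this task into concrete steps:\n{task}")
        prompt = f"Task: {task}\nPlan:\n{plan}\nCarry out the plan, calling tools as needed."
    # The executor model handles the actual tool calls.
    return call_model(executor, prompt)
```

Injecting call_model keeps the routing logic testable without network access; in production it would wrap client.chat.completions.create and pass the tools array only on executor calls.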

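Can I hide the reasoning trace from users?

Yes. Drop reasoning_content on the server and return only content to the user. The cost does not change, though: R1 still has to generate those reasoning tokens to reach the answer, and there is no cheap "skip the thinking" mode.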
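Server-side filtering can be as small as the sketch below. It assumes the assistant message has already been converted to a plain dict; the function name is illustrative.

```python
def public_message(message: dict) -> dict:
    """Return a copy of a chat message with the reasoning trace removed."""
    return {key: value for key, value in message.items() if key != "reasoning_content"}


# Example message shape; the field names follow the ones described on this page.
raw = {
    "role": "assistant",
    "reasoning_content": "Step 1: ... Step 2: ...",
    "content": "42",
}
print(public_message(raw))
```

Returning a copy leaves the original message intact, so the trace can still be logged internally for debugging.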
