Function calling
Pass tools to the chat-completions endpoint and the model can request that you call them. The wire format matches OpenAI's: the assistant message carries the same tool_calls array, and you send results back as role: tool messages.
Best models for tool use
- DeepSeek V3: production default for tool-calling agents. Reliable JSON args, low latency, no reasoning trace cluttering the response.
- DeepSeek V4 Flash: also strong; thinks by default, so you may want reasoning.enabled=false to keep tool selection fast.
- Kimi K2.6: agentic / planning workloads where tool chaining benefits from longer deliberation.
- Avoid R1 for pure tool calling: the chain-of-thought trace means you pay for tokens you don't need.
Python — single tool
import os, json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.quicksilverpro.io/v1",
    api_key=os.environ["QSP_KEY"],
)

# One tool, described with a JSON schema for its arguments.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
resp = client.chat.completions.create(
    model="deepseek-v3",
    messages=messages,
    tools=tools,
)
msg = resp.choices[0].message

if msg.tool_calls:
    call = msg.tool_calls[0]
    # Arguments arrive as a JSON string, not a dict.
    args = json.loads(call.function.arguments)
    # ... call your tool ...
    result = {"city": args["city"], "temp_c": 22, "conditions": "clear"}

    # Echo the assistant message that requested the call, then answer it.
    messages.append(msg)
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result),
    })

    # Second request: the model turns the tool result into a reply.
    final = client.chat.completions.create(model="deepseek-v3", messages=messages, tools=tools)
    print(final.choices[0].message.content)

Parallel tool calls
When the model wants to call multiple tools in one turn, it returns multiple entries in tool_calls. Resolve them in any order; reply with one role: tool message per call, each referencing the matching tool_call_id.
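A minimal sketch of that loop, assuming each entry in tool_calls exposes id, function.name, and function.arguments as in the OpenAI Python SDK. The get_time tool, the TOOL_REGISTRY dict, and the resolve_tool_calls helper are hypothetical names introduced for illustration, and the fake calls stand in for a real msg.tool_calls so the example runs offline.

```python
import json
from types import SimpleNamespace

# Hypothetical local tools, keyed by the name the model will request.
def get_weather(city):
    return {"city": city, "temp_c": 22, "conditions": "clear"}

def get_time(city):
    return {"city": city, "local_time": "09:00"}

TOOL_REGISTRY = {"get_weather": get_weather, "get_time": get_time}

def resolve_tool_calls(tool_calls, registry):
    """Run every requested tool and build one role: tool reply per call."""
    replies = []
    for call in tool_calls:
        args = json.loads(call.function.arguments)  # arguments is a JSON string
        result = registry[call.function.name](**args)
        replies.append({
            "role": "tool",
            "tool_call_id": call.id,  # ties this reply to the call it answers
            "content": json.dumps(result),
        })
    return replies

# Stand-in for msg.tool_calls (assumed shape), so no network call is needed.
fake_calls = [
    SimpleNamespace(id="call_1", function=SimpleNamespace(
        name="get_weather", arguments='{"city": "Tokyo"}')),
    SimpleNamespace(id="call_2", function=SimpleNamespace(
        name="get_time", arguments='{"city": "Tokyo"}')),
]
replies = resolve_tool_calls(fake_calls, TOOL_REGISTRY)
```

In a real turn you would append the assistant message first, then messages.extend(resolve_tool_calls(msg.tool_calls, TOOL_REGISTRY)), and send the follow-up request.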
Streaming tool calls
With stream=true, tool calls arrive as deltas just like content. The function name is streamed once; arguments accumulate across chunks as JSON fragments. You typically buffer until finish_reason=tool_calls before executing.
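The buffering step might look like the sketch below. It assumes the chunks follow the OpenAI streaming shape (choices[0].delta.tool_calls, where each delta carries an index, an optional id, and partial function.name / function.arguments); that shape is an assumption based on the stated wire-format compatibility. accumulate_tool_calls and _chunk are hypothetical helpers, and the fake stream replaces a real stream=True response so the example runs offline.

```python
import json
from types import SimpleNamespace

def accumulate_tool_calls(chunks):
    """Merge streamed tool-call deltas into complete calls, keyed by index."""
    calls = {}
    finish_reason = None
    for chunk in chunks:
        if not chunk.choices:
            continue
        choice = chunk.choices[0]
        for delta in (choice.delta.tool_calls or []):
            entry = calls.setdefault(
                delta.index, {"id": None, "name": None, "arguments": ""})
            if delta.id:
                entry["id"] = delta.id
            if delta.function and delta.function.name:
                entry["name"] = delta.function.name  # sent once
            if delta.function and delta.function.arguments:
                entry["arguments"] += delta.function.arguments  # JSON fragment
        if choice.finish_reason:
            finish_reason = choice.finish_reason
    return [calls[i] for i in sorted(calls)], finish_reason

def _chunk(index=None, call_id=None, name=None, arguments=None, finish_reason=None):
    """Build a fake stream chunk in the assumed shape."""
    tool_calls = None
    if index is not None:
        tool_calls = [SimpleNamespace(
            index=index, id=call_id,
            function=SimpleNamespace(name=name, arguments=arguments))]
    delta = SimpleNamespace(content=None, tool_calls=tool_calls)
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=delta, finish_reason=finish_reason)])

fake_stream = [
    _chunk(0, call_id="call_1", name="get_weather", arguments=""),
    _chunk(0, arguments='{"ci'),
    _chunk(0, arguments='ty": "Tokyo"}'),
    _chunk(finish_reason="tool_calls"),
]

calls, finish_reason = accumulate_tool_calls(fake_stream)
if finish_reason == "tool_calls":
    # Only now is the argument string guaranteed to be complete JSON.
    args = json.loads(calls[0]["arguments"])
```

Parsing the arguments only after the finish reason arrives avoids json.loads failing on a half-received fragment.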
Strict mode
For tighter schemas, set strict: true inside the function object. The model is constrained to produce arguments that match your JSON schema exactly — including refusing unknown fields. See Structured output for the equivalent on plain replies.
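As an illustration, the get_weather tool from the single-tool example could be declared with strict mode like this. The placement of strict inside the function object follows the text above; adding "additionalProperties": False is an assumption carried over from OpenAI's strict-mode schema rules and may not be required here.

```python
# Sketch of a strict tool definition; pass it as tools=strict_tools.
strict_tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "strict": True,  # constrain arguments to the schema below
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
            # Assumption: mirrors OpenAI's strict-mode requirement.
            "additionalProperties": False,
        },
    },
}]
```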