Audio transcription
Transcribe audio
QuickSilver Pro speaks the OpenAI audio transcriptions API. Point the OpenAI SDK at our base URL and call /v1/audio/transcriptions — the same key and prepaid balance you use for chat and images work for speech-to-text too.
Model
whisper-large-v3-turbo: fast speech-to-text. Transcribes uploaded audio files in 99+ languages. $0.0004 per audio minute.
Billing is based on audio duration, not text tokens. The meter rounds to the second, so the effective price is about $0.0000066667 per second ($0.0004 / 60).
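To budget a job, the per-minute price can be converted into a per-second estimate. The sketch below is a rough estimator, not the billing logic itself: the helper name estimate_cost is hypothetical, and rounding partial seconds up is an assumption chosen to give a conservative figure, since the text above only says the meter rounds to the second.

```python
import math

PRICE_PER_MINUTE_USD = 0.0004  # whisper-large-v3-turbo price quoted above


def estimate_cost(duration_seconds: float) -> float:
    """Estimate the transcription cost in USD for one audio file.

    Assumption: partial seconds are rounded up, giving a conservative
    estimate. The actual meter may round differently.
    """
    billed_seconds = math.ceil(duration_seconds)
    return billed_seconds * PRICE_PER_MINUTE_USD / 60


# A one-hour recording: 3600 s * $0.0004 / 60 s = $0.024
print(f"${estimate_cost(3600):.4f}")
```

At this price a full hour of audio costs a few cents, so the rounding rule only matters when processing very large numbers of short clips.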
Python — OpenAI SDK
python
import os

from openai import OpenAI

# Point the OpenAI SDK at the QuickSilver Pro base URL.
client = OpenAI(
    base_url="https://api.quicksilverpro.io/v1",
    api_key=os.environ["QSP_KEY"],
)

# Open the audio in binary mode; the SDK uploads it as multipart form data.
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-large-v3-turbo",
        file=audio_file,
    )

print(transcript.text)
curl
shell
curl https://api.quicksilverpro.io/v1/audio/transcriptions \
  -H "Authorization: Bearer $QSP_KEY" \
  -F "file=@meeting.mp3" \
  -F "model=whisper-large-v3-turbo"
The response matches OpenAI's transcription shape: JSON with a top-level text field by default.
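As an illustration of that default shape, the sketch below parses a response body with Python's standard json module. The sample body is a made-up placeholder, not real output from the service, and the helper name extract_text is hypothetical. No fields beyond text are assumed.

```python
import json

# Placeholder body in the default shape described above (not real output).
sample_body = '{"text": "Thanks everyone for joining the call."}'


def extract_text(body: str) -> str:
    """Return the transcript from a default JSON transcription response."""
    payload = json.loads(body)
    return payload["text"]


print(extract_text(sample_body))
```

This is mainly useful with raw HTTP clients such as curl; the OpenAI SDK already exposes the same field as transcript.text.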
Options
- Pass language when you already know the source language; it can improve consistency on short clips.
- Pass response_format if you want formats like verbose_json, vtt, or srt.
- The endpoint uses multipart form data, not a JSON request body. That matches the official OpenAI SDKs and raw HTTP examples.
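The two optional parameters can be combined in a small wrapper. This is a sketch under assumptions: transcribe is a hypothetical helper, client is expected to be an OpenAI client configured with the base URL shown earlier, and the return type for non-JSON formats is assumed to follow the OpenAI Python SDK's behavior (plain text for vtt and srt).

```python
from typing import Any, Optional


def transcribe(
    client: Any,
    path: str,
    language: Optional[str] = None,
    response_format: Optional[str] = None,
) -> Any:
    """Transcribe one audio file, forwarding only the options that were set.

    `client` is assumed to be an openai.OpenAI instance pointed at the
    QuickSilver Pro base URL. This helper is illustrative, not part of any SDK.
    """
    options = {}
    if language is not None:
        options["language"] = language  # e.g. "en" for English
    if response_format is not None:
        options["response_format"] = response_format  # e.g. "verbose_json", "vtt", "srt"

    # The SDK sends the file as multipart form data.
    with open(path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            model="whisper-large-v3-turbo",
            file=audio_file,
            **options,
        )
```

For example, transcribe(client, "meeting.mp3", language="en", response_format="srt") would request English subtitles in SRT form, while omitting both options falls back to the default JSON response.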
Notes
- This is transcription, not text-to-speech. Use whisper-large-v3-turbo when you already have an audio file and want text out.
- One key covers chat, image generation, and transcription, with no separate billing relationship or auth system.