Audio transcription

Transcribe audio

QuickSilver Pro implements the OpenAI audio transcriptions API. Point the OpenAI SDK at our base URL and call /v1/audio/transcriptions. The same key and prepaid balance you use for chat and images also cover speech-to-text.

Model

  • whisper-large-v3-turbo: fast speech-to-text. Transcribes uploaded audio files across 99+ languages. $0.0004 per audio minute.

Billing is based on audio duration, not text tokens. Usage is metered to the second, so the effective price is about $0.0000066667 per second ($0.0004 / 60).
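To make the per-second arithmetic concrete, here is a minimal sketch of a cost estimator. The function names are illustrative, and the rounding rule (half up to the nearest second) is an assumption, since only "to the second" is stated; treat the result as an estimate rather than an exact invoice figure.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate from the model listing: $0.0004 per audio minute.
PRICE_PER_MINUTE = Decimal("0.0004")


def billed_seconds(duration_seconds: float) -> int:
    """Round a clip length to whole seconds.

    Assumption: half-up rounding to the nearest second. The actual meter
    may round differently (for example, always up).
    """
    exact = Decimal(str(duration_seconds))
    return int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def estimate_cost(duration_seconds: float) -> Decimal:
    """Estimate the charge in USD for one clip."""
    # Multiply before dividing so whole-minute clips stay exact.
    return PRICE_PER_MINUTE * billed_seconds(duration_seconds) / 60


print(estimate_cost(3600))  # a one-hour recording
```

At this rate a one-hour recording (3,600 seconds) comes to $0.024.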

Python — OpenAI SDK

python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.quicksilverpro.io/v1",
    api_key=os.environ["QSP_KEY"],
)

with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-large-v3-turbo",
        file=audio_file,
    )

print(transcript.text)

curl

shell
curl https://api.quicksilverpro.io/v1/audio/transcriptions \
  -H "Authorization: Bearer $QSP_KEY" \
  -F "file=@meeting.mp3" \
  -F "model=whisper-large-v3-turbo"

The response matches OpenAI's transcription shape: JSON with a top-level text field by default.
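If you call the endpoint over raw HTTP instead of through the SDK, the default response can be read with any JSON parser. The sketch below assumes only the top-level text field described above; the sample body is illustrative and is not real service output.

```python
import json


def extract_text(raw_body: str) -> str:
    """Return the transcript from a default (JSON) transcription response."""
    payload = json.loads(raw_body)
    # "text" is the only field relied on here; other fields may or may
    # not be present depending on the requested response format.
    return payload["text"]


# Illustrative payload, made up for this example.
sample_body = '{"text": "Hello from the meeting."}'
print(extract_text(sample_body))
```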

Options

  • Pass language when you already know the source language; it can improve consistency on short clips.
  • Pass response_format if you want formats like verbose_json, vtt, or srt.
  • The endpoint uses multipart form data, not a JSON request body. The official OpenAI SDKs build the multipart request for you; with raw HTTP, send each field as a form part, as the curl example does with -F.
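The options above can be passed through the SDK as keyword arguments. The following is a sketch, not a definitive implementation: build_options and transcribe are hypothetical helper names, and client is assumed to be an openai.OpenAI instance configured as in the Python example earlier on this page.

```python
from typing import Any, Optional


def build_options(
    language: Optional[str] = None,
    response_format: Optional[str] = None,
) -> dict[str, Any]:
    """Collect only the optional fields that were actually set."""
    options: dict[str, Any] = {}
    if language is not None:
        # OpenAI's API expects an ISO 639-1 code such as "en"; that this
        # carries over unchanged here is an assumption.
        options["language"] = language
    if response_format is not None:
        # For example "verbose_json", "vtt", or "srt".
        options["response_format"] = response_format
    return options


def transcribe(client: Any, path: str, **options: Any) -> Any:
    """Send one audio file through an OpenAI-compatible client.

    The SDK encodes the request as multipart form data, so no manual
    encoding is needed.
    """
    with open(path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            model="whisper-large-v3-turbo",
            file=audio_file,
            **options,
        )
```

A call such as transcribe(client, "meeting.mp3", **build_options(language="en", response_format="srt")) would request subtitles for an English recording. In recent versions of the OpenAI Python SDK, the srt and vtt formats come back as a plain string rather than an object with a text attribute; verify this against the SDK version you use.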

Notes

  • This is transcription, not text-to-speech. Use whisper-large-v3-turbo when you already have an audio file and want text out.
  • One key covers chat, image generation, and transcription, with no separate billing relationship or auth system.