Whisper STT
WhisperSTT is a buffered HTTP transcription adapter for OpenAI’s POST /v1/audio/transcriptions endpoint. It buffers ~1 s of incoming PCM audio (16 kHz, 16-bit mono), wraps it as a WAV blob, and submits it to Whisper for transcription. It is drop-in compatible with the streaming STTProvider interface, so it can be swapped for Deepgram or Soniox without changes to the calling code.
For ~10x lower latency, see the GPT-4o transcribe family below. It is a strict subclass that hits the same endpoint with gpt-4o-transcribe / gpt-4o-mini-transcribe.
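The "wraps it as a WAV blob" step can be illustrated with a minimal sketch. It assumes the standard 44-byte RIFF/WAVE header for uncompressed PCM; `pcmToWav` is a hypothetical helper written for this example, not a function exported by the library, and the adapter's real internals may differ.

```typescript
// Sketch: prepend a minimal 44-byte RIFF/WAVE header to raw PCM samples.
// `pcmToWav` is a hypothetical helper, not part of the getpatter API.
function pcmToWav(
  pcm: Uint8Array,
  sampleRate: number = 16000,
  channels: number = 1,
  bitsPerSample: number = 16,
): Uint8Array {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign; // bytes per second of audio
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) out[offset + i] = text.charCodeAt(i);
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true); // size of the sample data
  out.set(pcm, 44);
  return out;
}

// One second of silence at 16 kHz, 16-bit mono is 32000 bytes of PCM.
const oneSecond = new Uint8Array(16000 * 2);
const wav = pcmToWav(oneSecond);
console.log(wav.length); // 32044
```

All multi-byte header fields are little-endian, which is why every `DataView` write passes `true` as the final argument.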
Install
whisper ships in the base install.
Usage
getpatter/stt/whisper and getpatter/stt/openai-transcribe both auto-resolve OPENAI_API_KEY from the environment when apiKey is omitted.
Models and rates
Per minute of audio (defaults from getpatter/pricing):
| Provider key | Model | Rate / min |
|---|---|---|
| whisper | whisper-1 (default) | $0.006 |
| whisper | gpt-4o-transcribe | $0.006 |
| whisper | gpt-4o-mini-transcribe | $0.003 |
| openai_transcribe | gpt-4o-transcribe (default) | $0.006 |
| openai_transcribe | gpt-4o-mini-transcribe | $0.003 |
| openai_transcribe | whisper-1 | $0.006 |
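As a rough illustration of how the per-minute rates translate into cost, the sketch below multiplies audio duration by the rate for the chosen model. The rate values are copied from the table above; `RATE_PER_MINUTE` and `estimateCostUsd` are hypothetical names for this example and are not the getpatter/pricing API.

```typescript
// USD per minute of audio, copied from the rates table above.
// Hypothetical lookup; not the shape of getpatter/pricing.
const RATE_PER_MINUTE: Record<string, number> = {
  "whisper-1": 0.006,
  "gpt-4o-transcribe": 0.006,
  "gpt-4o-mini-transcribe": 0.003,
};

// Hypothetical helper: estimated cost in USD for a given audio duration.
function estimateCostUsd(model: string, audioSeconds: number): number {
  const rate = RATE_PER_MINUTE[model];
  if (rate === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return (audioSeconds / 60) * rate;
}

// Ten minutes of audio on whisper-1.
console.log(estimateCostUsd("whisper-1", 600).toFixed(3)); // 0.060
```

Because the rate depends only on the model, the same model costs the same under either provider key.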
Languages
language: "en" by default. Whisper-1 and the GPT-4o transcribe family auto-detect the spoken language but accept an explicit BCP-47 hint (e.g. "it", "fr", "es", "de", "pt", "ja", "zh") for higher accuracy on short utterances. See the OpenAI language coverage list.
Options
| Option | Default | Notes |
|---|---|---|
| apiKey | — | Reads from OPENAI_API_KEY when omitted. |
| model | "whisper-1" (Whisper) / "gpt-4o-transcribe" (Transcribe) | Restricted to the family’s allowed model set; misconfigured calls throw. |
| language | "en" | BCP-47 code. |
| bufferSize | ~1 s of 16 kHz PCM | Bytes buffered before each transcription request. |
| responseFormat | "json" | Pass "verbose_json" to surface per-segment confidence and timestamps. |
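To make the bufferSize default concrete: one second of 16 kHz, 16-bit mono PCM is 16000 × 2 × 1 = 32000 bytes. The sketch below shows one way a buffer of that size could gate transcription requests. `PcmAccumulator` is a hypothetical class for illustration only, and the exact default and flush behavior of the real adapter are assumptions here.

```typescript
// Bytes in ~1 s of 16 kHz, 16-bit mono PCM: samples/s * bytes/sample * channels.
const DEFAULT_BUFFER_SIZE = 16000 * 2 * 1; // 32000

// Hypothetical accumulator; not the adapter's actual internals.
class PcmAccumulator {
  private chunks: Uint8Array[] = [];
  private length = 0;
  private readonly bufferSize: number;

  constructor(bufferSize: number = DEFAULT_BUFFER_SIZE) {
    this.bufferSize = bufferSize;
  }

  // Returns all buffered audio once at least bufferSize bytes have arrived,
  // otherwise null. The internal buffer is emptied after each flush.
  push(chunk: Uint8Array): Uint8Array | null {
    this.chunks.push(chunk);
    this.length += chunk.length;
    if (this.length < this.bufferSize) return null;

    const out = new Uint8Array(this.length);
    let offset = 0;
    for (const c of this.chunks) {
      out.set(c, offset);
      offset += c.length;
    }
    this.chunks = [];
    this.length = 0;
    return out;
  }
}

const acc = new PcmAccumulator();
const first = acc.push(new Uint8Array(20000)); // below the threshold
const flushed = acc.push(new Uint8Array(20000)); // crosses the threshold
console.log(first); // null
console.log(flushed?.length); // 40000
```

In this sketch a smaller bufferSize would mean more frequent, shorter requests, while a larger one would trade latency for fewer requests.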

