OpenAI Realtime 2
OpenAIRealtime2 is the engine marker for OpenAI’s GA Realtime API (the production endpoint that replaces the beta OpenAI-Beta: realtime=v1 channel). It targets gpt-realtime-2 by default and routes through OpenAIRealtime2Adapter — a dedicated adapter that speaks the GA session.update wire shape and performs bidirectional audio transcoding (mulaw 8 kHz ↔ PCM 24 kHz) required by the GA audio engine.
For the legacy beta endpoint and the lower-cost gpt-realtime-mini model, keep using OpenAIRealtime. The two engines coexist — pick OpenAIRealtime2 only when you specifically want the GA endpoint or the gpt-realtime-2 model.
The GA endpoint rejects the legacy OpenAI-Beta: realtime=v1 header and expects output_modalities, nested audio.{input,output} blocks with MIME-type strings, and session.type = "realtime". These wire-shape differences are why GA needs its own adapter: the beta OpenAIRealtimeAdapter cannot reach gpt-realtime-2 reliably.
When to use
| Use OpenAIRealtime2 when… | Stick with OpenAIRealtime when… |
|---|---|
| You want gpt-realtime-2: the strongest instruction following, 128K context, and configurable reasoning_effort. | You’re on gpt-realtime-mini for cost / latency reasons. |
| You’re hitting the GA endpoint and the beta channel is being deprecated for your account. | You don’t need the GA wire shape and want to keep the existing adapter path. |
| You want the bidirectional PCM 24 kHz transcoding handled by the SDK rather than the model silently dropping mulaw frames. | Your audio is already PCM 24 kHz end-to-end and beta works for you. |
Quickstart
reasoning_effort="low" is OpenAI’s recommended production tier for live voice — it gives the best instruction following without measurable per-turn latency.
Constructor
api_key falls back to the OPENAI_API_KEY environment variable.
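The fallback described above can be pictured with a minimal sketch. The helper name resolve_api_key is hypothetical and not part of the SDK, and raising an error when no key is found is an assumption; the page does not say what the constructor does in that case.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Return the explicit key if given, else the OPENAI_API_KEY env var.

    Hypothetical helper for illustration only; the SDK's own behavior
    when neither source is set is not documented on this page.
    """
    key = api_key or os.environ.get("OPENAI_API_KEY")
    if not key:
        # Assumption: fail loudly rather than connect without credentials.
        raise ValueError("Pass api_key or set OPENAI_API_KEY")
    return key
```

An explicit argument takes precedence over the environment, so a per-call key can override a process-wide default.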
Reasoning effort
| Value | When to use |
|---|---|
| "minimal" | Snappy turn-taking. Skips most reasoning. |
| "low" | Recommended for production voice. Good instruction following without measurable per-turn latency. |
| "medium" | Multi-step tool flows where the model should plan. Adds latency. |
| "high" | Complex reasoning. Not recommended for live phone calls. |
When set, reasoning_effort is sent as session.reasoning = { effort: ... } in the GA session.update payload. When omitted, the field is not sent and OpenAI’s server default applies.
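As a rough illustration of the GA wire shape, the sketch below assembles a session.update payload and adds the reasoning block only when an effort is supplied. The function build_session_update is hypothetical, and the exact layout of the audio format objects is an assumption based on the description on this page (nested audio.input / audio.output with MIME-type strings); it is not the adapter's actual code.

```python
from typing import Optional

VALID_EFFORTS = ("minimal", "low", "medium", "high")


def build_session_update(
    instructions: str,
    reasoning_effort: Optional[str] = None,
    model: str = "gpt-realtime-2",
) -> dict:
    """Build a GA-style session.update event (illustrative sketch)."""
    if reasoning_effort is not None and reasoning_effort not in VALID_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {VALID_EFFORTS}")

    session = {
        "type": "realtime",  # GA requires session.type = "realtime"
        "model": model,
        "instructions": instructions,
        "output_modalities": ["audio"],
        "audio": {
            # Assumed format layout: MIME-type string plus sample rate.
            "input": {"format": {"type": "audio/pcm", "rate": 24000}},
            "output": {"format": {"type": "audio/pcm", "rate": 24000}},
        },
    }
    if reasoning_effort is not None:
        # Omitted entirely when unset, so the server default applies.
        session["reasoning"] = {"effort": reasoning_effort}
    return {"type": "session.update", "session": session}
```

Leaving the key out, rather than sending a null value, keeps the payload identical to one built by a client that has no notion of reasoning effort.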
Streaming transcription
Set input_audio_transcription_model to override audio.input.transcription.model. The same identifiers as the beta endpoint apply; see the streaming-transcription table on the OpenAI Realtime page for the full list (whisper-1, gpt-4o-mini-transcribe, gpt-4o-transcribe, gpt-realtime-whisper).
Audio path
The GA audio engine speaks PCM 24 kHz and silently drops mulaw frames. Patter handles the conversion transparently inside OpenAIRealtime2Adapter:
- Inbound (Twilio/Telnyx → model): mulaw 8 kHz → PCM 24 kHz
- Outbound (model → Twilio/Telnyx): PCM 24 kHz → mulaw 8 kHz
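To make the two directions concrete, here is a minimal pure-Python sketch of G.711 mu-law transcoding with a 3x rate change. It is not Patter's implementation: the function names are hypothetical, upsampling uses simple linear interpolation, and downsampling keeps every third sample without a low-pass filter, which a production resampler would normally add.

```python
import struct

BIAS = 0x84   # G.711 mu-law bias
CLIP = 32635  # largest magnitude that can be encoded after biasing


def ulaw_to_linear(byte: int) -> int:
    """Decode one mu-law byte to a signed 16-bit PCM sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    sample = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -sample if sign else sample


def linear_to_ulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample to a mu-law byte."""
    sign = 0x80 if sample < 0 else 0
    magnitude = min(abs(sample), CLIP) + BIAS
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def mulaw8k_to_pcm24k(frame: bytes) -> bytes:
    """Inbound: mulaw 8 kHz -> little-endian PCM16 at 24 kHz."""
    samples = [ulaw_to_linear(b) for b in frame]
    out = []
    for i, cur in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else cur
        for k in range(3):  # emit 3 output samples per input sample
            out.append(cur + (nxt - cur) * k // 3)
    return struct.pack(f"<{len(out)}h", *out)


def pcm24k_to_mulaw8k(pcm: bytes) -> bytes:
    """Outbound: little-endian PCM16 at 24 kHz -> mulaw 8 kHz."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return bytes(linear_to_ulaw(s) for s in samples[::3])  # naive decimation
```

With this scheme a 160-byte mulaw frame (20 ms at 8 kHz) expands to 960 bytes of PCM16 at 24 kHz, since each input sample becomes three 2-byte output samples.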
Direct adapter use
OpenAIRealtime2Adapter is exported and may be constructed directly when you need to share connection state across calls or override low-level fields.
The adapter subclasses OpenAIRealtimeAdapter and overrides connect(), send_audio(), receive_events(), and send_first_message() for the GA wire shape.
Backward compatibility
- Existing OpenAIRealtime(...) callers are unaffected. The legacy engine continues to target the beta endpoint with gpt-realtime-mini as the default.
- OpenAIRealtime2 ships as an additive engine; no migration is required. Pick it when you want the GA endpoint; otherwise stay where you are.
- Pricing for gpt-realtime-2 is auto-resolved per model from DEFAULT_PRICING["openai_realtime"].models["gpt-realtime-2"] (see Metrics).
What’s Next
OpenAI Realtime (beta)
The legacy engine for gpt-realtime-mini and earlier preview models.
Engines
All engine classes side by side.
Agents
Configure system prompts, tools, and first messages.
Tools
Function calling inside a Realtime session.

