OpenAI Realtime 2

OpenAIRealtime2 is the engine marker for OpenAI’s GA Realtime API, the production endpoint that replaces the beta OpenAI-Beta: realtime=v1 channel. It targets gpt-realtime-2 by default and routes through OpenAIRealtime2Adapter, a dedicated adapter that speaks the GA session.update wire shape and performs the bidirectional audio transcoding (mulaw 8 kHz ↔ PCM 24 kHz) that the GA audio engine requires.

For the legacy beta endpoint and the lower-cost gpt-realtime-mini model, keep using OpenAIRealtime. The two engines coexist: pick OpenAIRealtime2 only when you specifically want the GA endpoint or the gpt-realtime-2 model.

The GA endpoint rejects the legacy OpenAI-Beta: realtime=v1 header and expects output_modalities, nested audio.{input,output} blocks with MIME-type strings, and session.type = "realtime". These wire-shape differences are why GA needs its own adapter — the beta OpenAIRealtimeAdapter cannot reach gpt-realtime-2 reliably.
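
To make the wire-shape difference concrete, here is a minimal sketch of what a GA-style session.update payload could look like. It is not Patter's actual implementation: the helper name build_session_update is hypothetical, and the exact nesting of the format and voice fields is an assumption based on the field names mentioned above (session.type, output_modalities, audio.input, audio.output with MIME-type strings).

```python
import json


def build_session_update(voice: str = "alloy", instructions: str = "") -> dict:
    """Hypothetical helper: assemble a GA-shaped session.update event.

    Field layout is an assumption for illustration, not Patter's code.
    """
    return {
        "type": "session.update",
        "session": {
            "type": "realtime",            # GA expects session.type = "realtime"
            "instructions": instructions,
            "output_modalities": ["audio"],
            "audio": {
                # MIME-type strings instead of the beta's flat format names
                "input": {"format": {"type": "audio/pcm", "rate": 24000}},
                "output": {
                    "format": {"type": "audio/pcm", "rate": 24000},
                    "voice": voice,
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_session_update(instructions="Be brief."), indent=2))
```

The point of the sketch is the nesting: audio settings live under audio.input and audio.output rather than at the top level of the session.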

When to use

| Use OpenAIRealtime2 when… | Stick with OpenAIRealtime when… |
| --- | --- |
| You want gpt-realtime-2: strongest instruction following, 128K context, and configurable reasoning_effort. | You’re on gpt-realtime-mini for cost / latency reasons. |
| You’re hitting the GA endpoint and the beta channel is being deprecated for your account. | You don’t need the GA wire shape and want to keep the existing adapter path. |
| You want the bidirectional PCM 24 kHz transcoding handled by the SDK rather than the model silently dropping mulaw frames. | Your audio is already PCM 24 kHz end-to-end and beta works for you. |

Quickstart

import asyncio

from getpatter import Patter, Twilio, OpenAIRealtime2

phone = Patter(carrier=Twilio(), phone_number="+15555550100")  # TWILIO_* from env

agent = phone.agent(
    engine=OpenAIRealtime2(reasoning_effort="low"),
    system_prompt="You are a friendly receptionist.",
    first_message="Hello! How can I help today?",
)

async def main() -> None:
    await phone.serve(agent)

asyncio.run(main())

reasoning_effort="low" is OpenAI’s recommended production tier for live voice: it gives good instruction following without measurable added per-turn latency.

Constructor

from getpatter import OpenAIRealtime2

OpenAIRealtime2(
    api_key: str = "",                               # reads OPENAI_API_KEY
    voice: str = "alloy",
    model: str = "gpt-realtime-2",
    reasoning_effort: Literal["minimal", "low", "medium", "high"] | None = None,
    input_audio_transcription_model: str | None = None,  # default: whisper-1
)

All fields are optional with safe defaults. api_key falls back to the OPENAI_API_KEY environment variable.
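
The api_key fallback can be pictured with a small sketch. The function resolve_api_key is hypothetical and not part of getpatter; raising when no key is found is also an assumption, since the page does not say what happens in that case.

```python
import os


def resolve_api_key(api_key: str = "") -> str:
    """Hypothetical illustration of the documented fallback order."""
    # An explicit argument wins; an empty string falls through to the environment.
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    if not key:
        # Assumption: failing loudly here is this sketch's choice, not documented behavior.
        raise ValueError("No API key passed and OPENAI_API_KEY is not set")
    return key
```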

Reasoning effort

| Value | When to use |
| --- | --- |
| "minimal" | Snappy turn-taking. Skips most reasoning. |
| "low" | Recommended for production voice. Good instruction following without measurable per-turn latency. |
| "medium" | Multi-step tool flows where the model should plan. Adds latency. |
| "high" | Complex reasoning. Not recommended for live phone calls. |

When set, Patter injects session.reasoning = { effort: ... } into the GA session.update payload. When omitted, the field is not sent and OpenAI’s server default applies.
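
The set-or-omit behavior described above can be sketched as follows. The helper with_reasoning and the validation step are hypothetical, written only to illustrate the rule; Patter's internal code may look different.

```python
import copy
from typing import Optional

VALID_EFFORTS = ("minimal", "low", "medium", "high")


def with_reasoning(session: dict, effort: Optional[str] = None) -> dict:
    """Hypothetical helper: return a copy of `session` with reasoning applied."""
    out = copy.deepcopy(session)  # leave the caller's dict untouched
    if effort is None:
        # Field omitted entirely, so the server default applies.
        return out
    if effort not in VALID_EFFORTS:
        raise ValueError(f"unsupported reasoning_effort: {effort!r}")
    out["reasoning"] = {"effort": effort}
    return out
```

For example, with_reasoning({"type": "realtime"}, "low") adds a reasoning block, while with_reasoning({"type": "realtime"}) returns the session without one.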

Streaming transcription

Set input_audio_transcription_model to override audio.input.transcription.model. The same identifiers as the beta endpoint apply — see the streaming-transcription table on the OpenAI Realtime page for the full list (whisper-1, gpt-4o-mini-transcribe, gpt-4o-transcribe, gpt-realtime-whisper).
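
As a rough illustration of where the override lands in the payload, the sketch below writes a model identifier to audio.input.transcription.model in a session dict. The helper name with_transcription_model is hypothetical, and how Patter applies its own default when the argument is None is not shown here.

```python
import copy
from typing import Optional


def with_transcription_model(session: dict, model: Optional[str]) -> dict:
    """Hypothetical helper: set audio.input.transcription.model on a copy."""
    out = copy.deepcopy(session)
    if model is None:
        return out  # nothing to override in this sketch
    # Create the nested blocks if they are missing, then set the model.
    audio_input = out.setdefault("audio", {}).setdefault("input", {})
    audio_input.setdefault("transcription", {})["model"] = model
    return out
```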

Audio path

The GA audio engine speaks PCM 24 kHz and silently drops mulaw frames. Patter handles the conversion transparently inside OpenAIRealtime2Adapter:
  • Inbound (Twilio/Telnyx → model): mulaw 8 kHz → PCM 24 kHz
  • Outbound (model → Twilio/Telnyx): PCM 24 kHz → mulaw 8 kHz
No caller-side change is required — both Twilio Media Streams (mulaw 8 kHz) and Telnyx Call Control (PCM 16 kHz / mulaw 8 kHz) work out of the box.
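
For readers curious what the two conversions involve, here is a self-contained sketch using standard G.711 mu-law coding plus a deliberately naive 3x resampler (linear interpolation up, plain decimation down). This is an assumption-laden illustration, not the code inside OpenAIRealtime2Adapter: the function names are hypothetical, and a production resampler would normally add a low-pass filter before decimating.

```python
import struct

BIAS = 0x84    # G.711 mu-law bias
CLIP = 32635   # largest magnitude that survives the bias without overflow


def ulaw_decode(u: int) -> int:
    """Decode one mu-law byte to a signed 16-bit PCM sample."""
    u = ~u & 0xFF
    magnitude = ((((u & 0x0F) << 3) + BIAS) << ((u >> 4) & 0x07)) - BIAS
    return -magnitude if u & 0x80 else magnitude


def ulaw_encode(sample: int) -> int:
    """Encode one signed 16-bit PCM sample to a mu-law byte."""
    sign = 0x80 if sample < 0 else 0
    magnitude = min(abs(sample), CLIP) + BIAS
    exponent, mask = 7, 0x4000
    while exponent > 0 and not magnitude & mask:
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def mulaw8k_to_pcm24k(frame: bytes) -> bytes:
    """Inbound path: mu-law 8 kHz -> little-endian PCM16 at 24 kHz."""
    samples = [ulaw_decode(b) for b in frame]
    out = []
    for i, a in enumerate(samples):
        b = samples[i + 1] if i + 1 < len(samples) else a
        # Three output samples per input sample, linearly interpolated.
        out.extend(a + (b - a) * k // 3 for k in range(3))
    return struct.pack(f"<{len(out)}h", *out)


def pcm24k_to_mulaw8k(pcm: bytes) -> bytes:
    """Outbound path: little-endian PCM16 at 24 kHz -> mu-law 8 kHz."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    # Naive decimation: keep every third sample (no anti-alias filter).
    return bytes(ulaw_encode(s) for s in samples[::3])
```

Each inbound mu-law byte becomes three 16-bit samples, so a 160-byte Twilio frame (20 ms at 8 kHz) expands to 960 bytes of PCM in this sketch.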

Direct adapter use

OpenAIRealtime2Adapter is exported and may be constructed directly when you need to share connection state across calls or override low-level fields:

from getpatter import OpenAIRealtime2Adapter

adapter = OpenAIRealtime2Adapter(
    api_key="",                          # reads OPENAI_API_KEY
    model="gpt-realtime-2",
    voice="nova",
    instructions="You are a helpful assistant.",
    reasoning_effort="low",
    input_audio_transcription_model="gpt-realtime-whisper",
)

# `phone` is the Patter instance created in the Quickstart above.
agent = phone.agent(engine=adapter, system_prompt="...", first_message="...")

The adapter subclasses OpenAIRealtimeAdapter and overrides connect(), send_audio(), receive_events(), and send_first_message() for the GA wire shape.

Backward compatibility

  • Existing OpenAIRealtime(...) callers are unaffected. The legacy engine continues to target the beta endpoint with gpt-realtime-mini as the default.
  • OpenAIRealtime2 ships as an additive engine — no migration required. Pick it when you want the GA endpoint; otherwise stay where you are.
  • Pricing for gpt-realtime-2 is auto-resolved per model from DEFAULT_PRICING["openai_realtime"].models["gpt-realtime-2"] — see Metrics.

What’s Next

OpenAI Realtime (beta)

The legacy engine for gpt-realtime-mini and earlier preview models.

Engines

All engine classes side by side.

Agents

Configure system prompts, tools, and first messages.

Tools

Function calling inside a Realtime session.