Ultravox Realtime
UltravoxRealtimeAdapter bridges a bidirectional audio stream to an Ultravox managed-agent call. The transport is plain WebSocket plus one REST call to create the session — no vendor SDK is required.
It speaks the same connect / send_audio / receive_events / close surface as OpenAIRealtimeAdapter, so you can swap engines without touching the call handler.
Install
The TypeScript adapter requires the ws package.
Set ULTRAVOX_API_KEY in your environment.
Constructor
Usage
Pass the adapter as the engine on phone.agent(...).
Declare tools with phone.agent(tools=[Tool(...)]) in Python or tools: [...] in TypeScript. Patter translates the OpenAI-style JSON schema into Ultravox dynamicParameters automatically.
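To make the schema translation concrete, here is a minimal sketch of how an OpenAI-style object schema could be flattened into one entry per parameter. The helper name to_dynamic_parameters is hypothetical, and the output field names (name, location, schema, required) and the PARAMETER_LOCATION_BODY value are assumptions about the Ultravox tool format; this is not Patter's actual implementation.

```python
from typing import Any


def to_dynamic_parameters(json_schema: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten an OpenAI-style object schema into one entry per property.

    Hypothetical helper: the output field names are assumed, not confirmed.
    """
    properties = json_schema.get("properties", {})
    required = set(json_schema.get("required", []))
    return [
        {
            "name": name,
            "location": "PARAMETER_LOCATION_BODY",  # assumed enum value
            "schema": prop_schema,
            "required": name in required,
        }
        for name, prop_schema in properties.items()
    ]


# Illustrative tool schema in the OpenAI function-calling style.
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "City name"},
        "units": {"type": "string", "enum": ["metric", "imperial"]},
    },
    "required": ["city"],
}

params = to_dynamic_parameters(weather_schema)
```

Each property becomes its own entry, and the schema-level required list is turned into a per-parameter flag.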
How it works
- The adapter POSTs to https://api.ultravox.ai/api/calls to create a call. The response includes a single-use joinUrl.
- The adapter opens that URL as a WebSocket. From here:
  - Binary frames are PCM16 mono audio (both directions).
  - Text frames are JSON control events (transcripts, tool invocations, state).
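The framing rule above (binary means audio, text means JSON control event) can be sketched as a small dispatcher. The function name classify_frame and the "control" label are hypothetical, and the example event payload is illustrative only; the adapter's internal handling may differ.

```python
import json
from typing import Any, Union


def classify_frame(frame: Union[bytes, bytearray, str]) -> tuple[str, Any]:
    """Split incoming WebSocket frames into audio and control events.

    Binary frames are treated as raw PCM16 mono audio; text frames are
    parsed as JSON. Hypothetical helper, not the adapter's real code.
    """
    if isinstance(frame, (bytes, bytearray)):
        return ("audio", bytes(frame))
    return ("control", json.loads(frame))


# Two PCM16 samples (4 bytes) arriving as a binary frame.
audio_kind, audio_payload = classify_frame(b"\x00\x01\x02\x03")

# An illustrative control event arriving as a text frame.
control_kind, control_payload = classify_frame('{"type": "state", "state": "listening"}')
```

Keeping this split in one place means the rest of the call handler never has to inspect frame types.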
firstSpeaker and initialMessages are mutually exclusive on the Ultravox API. When first_message / firstMessage is set, Patter sends an initialMessages agent turn; otherwise the user speaks first.
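A minimal sketch of that either/or choice when building the call-creation body is shown below. The helper name first_turn_fields is hypothetical, and the enum strings MESSAGE_ROLE_AGENT and FIRST_SPEAKER_USER are assumptions about the Ultravox API values, not something this page confirms.

```python
from typing import Any, Optional


def first_turn_fields(first_message: Optional[str]) -> dict[str, Any]:
    """Return exactly one of initialMessages or firstSpeaker, never both."""
    if first_message:
        # Agent opens the call with a scripted turn.
        return {
            "initialMessages": [
                {"role": "MESSAGE_ROLE_AGENT", "text": first_message}  # assumed enum
            ]
        }
    # No greeting configured: let the caller speak first.
    return {"firstSpeaker": "FIRST_SPEAKER_USER"}  # assumed enum


with_greeting = first_turn_fields("Hi, thanks for calling!")
without_greeting = first_turn_fields(None)
```

Because the function returns one dictionary or the other, the two fields cannot end up in the same request.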
Sample rates
UltravoxSampleRate accepts 8000, 16000, 24000, or 48000 Hz. Default is 16000 — what Patter’s pipeline-mode audio bus uses internally.
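As a rough guide to what those rates mean for buffer sizes, the sketch below validates a rate and computes the size of one PCM16 mono frame. The names ALLOWED_SAMPLE_RATES, validate_sample_rate, and frame_bytes are hypothetical, and the 20 ms frame length is an illustrative assumption rather than a documented Patter setting.

```python
ALLOWED_SAMPLE_RATES = (8000, 16000, 24000, 48000)
DEFAULT_SAMPLE_RATE = 16000
BYTES_PER_SAMPLE = 2  # PCM16 mono: one 16-bit sample per tick


def validate_sample_rate(rate: int = DEFAULT_SAMPLE_RATE) -> int:
    """Reject rates outside the set accepted by UltravoxSampleRate."""
    if rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(
            f"Unsupported sample rate {rate}; expected one of {ALLOWED_SAMPLE_RATES}"
        )
    return rate


def frame_bytes(rate: int = DEFAULT_SAMPLE_RATE, frame_ms: int = 20) -> int:
    """Bytes in one PCM16 mono frame of frame_ms milliseconds."""
    return validate_sample_rate(rate) * BYTES_PER_SAMPLE * frame_ms // 1000


default_frame = frame_bytes()          # 16000 Hz * 2 bytes * 0.020 s = 640
telephony_frame = frame_bytes(8000)    # 8000 Hz * 2 bytes * 0.020 s = 320
```

At the 16000 Hz default, a 20 ms frame is 640 bytes; doubling the rate doubles the frame size.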
When to use Ultravox vs alternatives
| Use Ultravox when… | Use OpenAI Realtime when… | Use Gemini Live when… |
|---|---|---|
| You want a managed agent with its own voice library and no vendor SDK in your dependency tree. | You need the broadest tool-calling ecosystem and reasoning tiers. | You want native-audio Gemini voices and 1M+ context. |
Notes
- The adapter implements cancel_response() / cancelResponse() by sending playback_clear_buffer, which interrupts the agent’s current turn for clean barge-in.
- The Python adapter uses aiohttp; the TS adapter uses ws plus the platform fetch.
- The receive loop yields ("audio", bytes) for agent audio, ("transcript_input", str) / ("transcript_output", str) for transcripts, and ("function_call", {...}) for tool calls.
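To show how a caller might consume the tuples described in the last note, here is a small sketch that runs against a stand-in event source. fake_receive_events is a hypothetical replacement for the adapter's receive_events(), and the contents of the function_call dictionary are invented for illustration, since this page does not specify its shape.

```python
import asyncio
from typing import Any, AsyncIterator


async def fake_receive_events() -> AsyncIterator[tuple[str, Any]]:
    """Hypothetical stand-in for the adapter's receive_events()."""
    yield ("transcript_input", "What's the weather in Oslo?")
    yield ("function_call", {"name": "get_weather"})  # payload shape is assumed
    yield ("audio", b"\x00\x00" * 160)  # 160 silent PCM16 samples
    yield ("transcript_output", "It is 4 degrees in Oslo.")


async def consume(events: AsyncIterator[tuple[str, Any]]) -> dict[str, Any]:
    """Route each (kind, payload) tuple by its kind."""
    seen: dict[str, Any] = {"audio_bytes": 0, "user": [], "agent": [], "calls": []}
    async for kind, payload in events:
        if kind == "audio":
            seen["audio_bytes"] += len(payload)
        elif kind == "transcript_input":
            seen["user"].append(payload)
        elif kind == "transcript_output":
            seen["agent"].append(payload)
        elif kind == "function_call":
            seen["calls"].append(payload)
    return seen


summary = asyncio.run(consume(fake_receive_events()))
```

In a real call handler the audio branch would forward bytes to the caller and the function_call branch would invoke the matching tool; the tallying here only keeps the sketch self-contained.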
What’s Next
Engines
All engines side by side.
OpenAI Realtime
The default engine.
Gemini Live
Google’s native-audio realtime API.
Tools
Function calling inside a realtime session.

