Providers
Patter supports three provider modes that control how audio is processed and how AI responses are generated.
OpenAI Realtime (Default)
The default provider uses OpenAI’s Realtime API for speech-to-speech processing. Audio streams directly between the phone and OpenAI with minimal latency.
Available voices: alloy, ash, ballad, coral, echo, sage, shimmer, verse
OpenAI Realtime is a speech-to-speech model. It handles STT, reasoning, and TTS in a single API call for the lowest latency.
ElevenLabs Conversational AI
Uses ElevenLabs’ Conversational AI platform. Requires an ElevenLabs agent created in their dashboard.
| Parameter | Description |
|---|---|
| elevenlabsKey | Your ElevenLabs API key. |
| elevenlabsAgentId | The agent ID from the ElevenLabs dashboard. |
| voice | ElevenLabs voice ID. |
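Before starting a call it can help to check that the required values are present. The sketch below is not part of the Patter SDK: the `ElevenLabsConvAIConfig` interface and the `missingElevenLabsFields` helper are hypothetical, and it assumes that `elevenlabsKey` and `elevenlabsAgentId` are required while `voice` is optional.

```typescript
// Hypothetical config shape built from the parameter names in the table above.
// Assumption: `voice` is optional; the source does not say so explicitly.
interface ElevenLabsConvAIConfig {
  elevenlabsKey: string;
  elevenlabsAgentId: string;
  voice?: string;
}

// Returns the names of required fields that are missing or blank.
function missingElevenLabsFields(
  config: Partial<ElevenLabsConvAIConfig>,
): string[] {
  const required: Array<keyof ElevenLabsConvAIConfig> = [
    "elevenlabsKey",
    "elevenlabsAgentId",
  ];
  return required.filter((name) => {
    const value = config[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Usage: an empty result means both required values are set.
const missing = missingElevenLabsFields({ elevenlabsKey: "placeholder-key" });
if (missing.length > 0) {
  console.log(`Missing ElevenLabs settings: ${missing.join(", ")}`);
}
```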
Pipeline Mode
Pipeline mode gives you full control over the STT, LLM, and TTS stages independently. Use it when you want to mix providers or add custom processing between stages.
The onMessage callback receives the user’s transcript and must return the text for TTS.
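As a rough illustration of the callback contract, here is a minimal handler that maps a transcript to reply text. The `OnMessage` type and the `buildReply` helper are hypothetical names for this sketch; the exact signature the SDK expects (for example, whether the callback may be async) is an assumption.

```typescript
// Hypothetical callback shape: the docs only state that onMessage receives
// the user's transcript and must return the text for TTS.
type OnMessage = (transcript: string) => string | Promise<string>;

// Pure helper so the reply logic can be tested without any telephony.
function buildReply(transcript: string): string {
  const text = transcript.trim().toLowerCase();
  if (text.length === 0) {
    return "Sorry, I didn't catch that. Could you repeat it?";
  }
  if (text.includes("hours")) {
    return "We are open from 9 AM to 5 PM, Monday to Friday.";
  }
  // Fallback: echo the transcript back to the caller.
  return `You said: ${transcript.trim()}`;
}

// This is the point where a call to a custom LLM would go in a real pipeline.
const onMessage: OnMessage = async (transcript) => buildReply(transcript);
```

Keeping the reply logic in a plain function separates it from the transport, so it can be swapped for an LLM call later without touching the STT or TTS configuration.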
STT Factory Functions
Create STT configurations using static factory methods:
Patter.deepgram()
Patter.whisper()
TTS Factory Functions
Create TTS configurations using static factory methods:
Patter.elevenlabs()
Patter.openaiTts()
OpenAI TTS returns 24kHz PCM audio, which the SDK automatically resamples to 16kHz for telephony.
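To make the 24 kHz to 16 kHz conversion concrete, the sketch below resamples 16-bit PCM with linear interpolation. It is not the SDK's resampler, whose algorithm is not described here, and `resampleLinear` is a hypothetical name. Downsampling without a low-pass filter can introduce aliasing, so a production resampler would normally filter first.

```typescript
// Resample 16-bit PCM by linear interpolation between neighbouring samples.
// Illustrative only: the SDK's actual resampling method is not documented here.
function resampleLinear(
  input: Int16Array,
  fromRate: number,
  toRate: number,
): Int16Array {
  if (fromRate <= 0 || toRate <= 0) {
    throw new RangeError("sample rates must be positive");
  }
  if (input.length === 0) {
    return new Int16Array(0);
  }
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Int16Array(outLength);
  const step = fromRate / toRate; // 1.5 input samples per output sample for 24k -> 16k
  for (let i = 0; i < outLength; i++) {
    const pos = i * step; // fractional position in the input
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    output[i] = Math.round(input[left] * (1 - frac) + input[right] * frac);
  }
  return output;
}

// Usage: six samples at 24 kHz become four samples at 16 kHz.
const pcm24k = Int16Array.from([0, 300, 600, 900, 1200, 1500]);
const pcm16k = resampleLinear(pcm24k, 24000, 16000);
```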
Provider Comparison
| Feature | OpenAI Realtime | ElevenLabs ConvAI | Pipeline |
|---|---|---|---|
| Latency | Lowest | Low | Variable |
| Voice quality | High | Very high | Depends on TTS |
| Custom LLM | No | No | Yes |
| Function calling | Yes | Limited | Via onMessage |
| Languages | Multi | Multi | Depends on STT/TTS |
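The table can be read as a simple decision rule. The helper below is one possible encoding of it; the `ProviderMode` identifiers, the `Requirements` fields, and the priority order of the checks are all assumptions made for this sketch and are not part of the SDK.

```typescript
// Hypothetical identifiers for the three modes compared in the table.
type ProviderMode = "openai_realtime" | "elevenlabs_convai" | "pipeline";

interface Requirements {
  customLlm: boolean;              // needs a custom LLM
  fullFunctionCalling: boolean;    // needs more than "Limited" function calling
  prioritizeVoiceQuality: boolean; // voice quality matters more than latency
}

function suggestMode(req: Requirements): ProviderMode {
  // Only Pipeline lists "Custom LLM: Yes".
  if (req.customLlm) return "pipeline";
  // ElevenLabs ConvAI lists function calling as "Limited".
  if (req.fullFunctionCalling) return "openai_realtime";
  // ElevenLabs ConvAI lists voice quality as "Very high".
  if (req.prioritizeVoiceQuality) return "elevenlabs_convai";
  // Otherwise fall back to the default mode with the lowest latency.
  return "openai_realtime";
}
```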

