Providers

Patter supports three provider modes that control how audio is processed and AI responses are generated.

OpenAI Realtime (Default)

The default provider uses OpenAI’s Realtime API for speech-to-speech processing. Audio streams directly between the phone and OpenAI with minimal latency.

const agent = phone.agent({
  systemPrompt: "You are a helpful assistant.",
  provider: "openai_realtime",
  model: "gpt-4o-mini-realtime-preview",
  voice: "alloy",
});

Supported voices: alloy, ash, ballad, coral, echo, sage, shimmer, verse

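OpenAI Realtime is a speech-to-speech model. It handles speech recognition (STT), reasoning, and speech synthesis (TTS) inside one model over a single streaming session instead of chaining three separate services, which is why it has the lowest latency of the three modes.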
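If the voice name comes from configuration or user input, it can help to check it against the supported list before building the agent. The following is a minimal sketch, not part of the Patter SDK: `REALTIME_VOICES`, `isRealtimeVoice`, and `resolveVoice` are hypothetical names, and the voice list is copied from the list above.

```typescript
// Voice names copied from the "Supported voices" list in this section.
const REALTIME_VOICES = [
  "alloy", "ash", "ballad", "coral", "echo", "sage", "shimmer", "verse",
] as const;

type RealtimeVoice = (typeof REALTIME_VOICES)[number];

// Hypothetical helper (not an SDK function): type guard for supported voices.
function isRealtimeVoice(value: string): value is RealtimeVoice {
  return (REALTIME_VOICES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: return the requested voice if it is supported,
// otherwise fall back to a known-good default.
function resolveVoice(
  requested: string | undefined,
  fallback: RealtimeVoice = "alloy",
): RealtimeVoice {
  return requested !== undefined && isRealtimeVoice(requested)
    ? requested
    : fallback;
}
```

The result of `resolveVoice(...)` could then be passed as the `voice` option of `phone.agent`, so a misspelled voice name falls back to a default rather than reaching the provider.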

ElevenLabs Conversational AI

This provider uses ElevenLabs’ Conversational AI platform. It requires an ElevenLabs agent that you have already created in the ElevenLabs dashboard.

const agent = phone.agent({
  systemPrompt: "You are a helpful assistant.",
  provider: "elevenlabs_convai",
  elevenlabsKey: process.env.ELEVENLABS_KEY,
  elevenlabsAgentId: process.env.ELEVENLABS_AGENT_ID,
  voice: "21m00Tcm4TlvDq8ikWAM",
  language: "en",
});

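Parameter	Description
`elevenlabsKey`	Your ElevenLabs API key.
`elevenlabsAgentId`	The agent ID from the ElevenLabs dashboard.
`voice`	ElevenLabs voice ID (for example, `"21m00Tcm4TlvDq8ikWAM"`).
`language`	Language code for the conversation (for example, `"en"`).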
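Because both `elevenlabsKey` and `elevenlabsAgentId` are read from the environment in the example above, a missing variable would otherwise surface only once a call is placed. A small fail-fast check at startup is one way to catch that earlier. This is a sketch under assumptions: `readElevenLabsSettings` and the `Env` type are hypothetical and not part of the SDK; only the environment variable names come from the example.

```typescript
// A plain record stands in for process.env so the helper is easy to test.
type Env = Record<string, string | undefined>;

interface ElevenLabsSettings {
  elevenlabsKey: string;
  elevenlabsAgentId: string;
}

// Hypothetical helper (not an SDK function): read both settings or throw
// one error that names every missing variable.
function readElevenLabsSettings(env: Env): ElevenLabsSettings {
  const key = env["ELEVENLABS_KEY"];
  const agentId = env["ELEVENLABS_AGENT_ID"];

  const missing: string[] = [];
  if (!key) missing.push("ELEVENLABS_KEY");
  if (!agentId) missing.push("ELEVENLABS_AGENT_ID");

  if (!key || !agentId) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { elevenlabsKey: key, elevenlabsAgentId: agentId };
}

// Possible usage in a Node.js process:
//   const settings = readElevenLabsSettings(process.env);
//   const agent = phone.agent({ ...settings, provider: "elevenlabs_convai" });
```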

Pipeline Mode

Pipeline mode lets you configure the STT, LLM, and TTS stages independently. Use it when you want to mix providers or add custom processing between stages.

const agent = phone.agent({
  systemPrompt: "You are a helpful assistant.",
  provider: "pipeline",
  stt: Patter.deepgram({ apiKey: process.env.DEEPGRAM_KEY! }),
  tts: Patter.elevenlabs({ apiKey: process.env.ELEVENLABS_KEY!, voice: "rachel" }),
});

await phone.serve({
  agent,
  onMessage: async (data) => {
    // Custom LLM logic — return the text to be spoken
    const transcript = data.text as string;
    const response = await myCustomLLM(transcript);
    return response;
  },
});

In pipeline mode, the `onMessage` callback receives the user’s transcript (`data.text`) and must return the reply text, which the SDK then passes to the TTS stage.
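Since the callback's return value is what the caller hears, it is worth guarding against an empty transcript, an empty reply, or an LLM call that throws. The wrapper below is a minimal sketch of that idea. `makeOnMessage`, `cleanReply`, and the `MessageData` shape are hypothetical and only assume that the transcript arrives as `data.text`, as in the example above.

```typescript
// Assumed shape of the callback argument, based on `data.text` in the example.
type MessageData = { text?: unknown };
type ReplyFn = (transcript: string) => Promise<string>;

// Return a trimmed, non-empty reply, or the fallback if the reply is unusable.
function cleanReply(reply: unknown, fallback: string): string {
  if (typeof reply !== "string") return fallback;
  const trimmed = reply.trim();
  return trimmed.length > 0 ? trimmed : fallback;
}

// Hypothetical helper (not an SDK function): wrap a custom LLM function so
// the resulting callback always resolves to speakable text.
function makeOnMessage(
  llm: ReplyFn,
  fallback: string = "Sorry, could you say that again?",
) {
  return async (data: MessageData): Promise<string> => {
    const transcript = typeof data.text === "string" ? data.text.trim() : "";
    if (transcript.length === 0) return fallback;
    try {
      return cleanReply(await llm(transcript), fallback);
    } catch {
      // The LLM call failed; speak the fallback instead of going silent.
      return fallback;
    }
  };
}
```

With this in place, the `onMessage` option in the example could be written as `onMessage: makeOnMessage(myCustomLLM)`, assuming `myCustomLLM` takes a transcript string and resolves to a string.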

STT Factory Functions

Create STT configurations using static factory methods:

Patter.deepgram()

const stt = Patter.deepgram({
  apiKey: process.env.DEEPGRAM_KEY!,
  language: "en", // optional, defaults to "en"
});

Patter.whisper()

const stt = Patter.whisper({
  apiKey: process.env.OPENAI_KEY!,
  language: "en", // optional, defaults to "en"
});

TTS Factory Functions

Create TTS configurations using static factory methods:

Patter.elevenlabs()

const tts = Patter.elevenlabs({
  apiKey: process.env.ELEVENLABS_KEY!,
  voice: "rachel", // optional, defaults to "rachel"
});

Patter.openaiTts()

const tts = Patter.openaiTts({
  apiKey: process.env.OPENAI_KEY!,
  voice: "alloy", // optional, defaults to "alloy"
});

OpenAI TTS returns 24 kHz PCM audio, which the SDK automatically resamples to 16 kHz for telephony.
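To make the 24 kHz to 16 kHz step concrete, here is a sketch of sample-rate conversion by linear interpolation on 16-bit PCM. It is an illustration only: the SDK's actual resampler is not documented here and may use a different method, and `resampleLinear` is a hypothetical function name.

```typescript
// Illustrative linear-interpolation resampler for 16-bit mono PCM.
// Not the SDK's implementation; shown only to explain the 3:2 rate change.
function resampleLinear(
  input: Int16Array,
  fromRate: number,
  toRate: number,
): Int16Array {
  if (fromRate <= 0 || toRate <= 0) {
    throw new RangeError("sample rates must be positive");
  }
  if (input.length === 0) return new Int16Array(0);

  const outLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Int16Array(outLength);
  const step = fromRate / toRate; // 1.5 input samples per output sample for 24k -> 16k

  for (let i = 0; i < outLength; i++) {
    const pos = i * step;                               // fractional position in the input
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1); // clamp at the last sample
    const frac = pos - left;
    output[i] = Math.round(input[left] * (1 - frac) + input[right] * frac);
  }
  return output;
}
```

For a 24 kHz input, every three input samples become two output samples. Plain linear interpolation applies no low-pass filter, so a production resampler would normally filter first to limit aliasing.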

Provider Comparison

Feature	OpenAI Realtime	ElevenLabs ConvAI	Pipeline
Latency	Lowest	Low	Variable
Voice quality	High	Very high	Depends on TTS
Custom LLM	No	No	Yes
Function calling	Yes	Limited	Via `onMessage`
Languages	Multi	Multi	Depends on STT/TTS
