
TTS (Text-to-Speech)

TTS is used in pipeline mode to synthesize the agent's response audio. If you use an engine such as OpenAIRealtime or ElevenLabsConvAI, the engine handles speech synthesis internally and no separate TTS is needed.

Each TTS provider ships as both a namespaced class (import * as elevenlabs from "getpatter/tts/elevenlabs", then new elevenlabs.TTS()) and a flat alias (import { ElevenLabsTTS } from "getpatter"). The two are equivalent: the flat aliases are convenient for short examples, while the namespaced form avoids name collisions when mixing providers.

Quickstart

// npx tsx example.ts
import { Patter, Twilio, DeepgramSTT, ElevenLabsTTS } from "getpatter";

const phone = new Patter({ carrier: new Twilio(), phoneNumber: "+15550001234" });

const agent = phone.agent({
  stt: new DeepgramSTT(),                            // DEEPGRAM_API_KEY from env
  tts: new ElevenLabsTTS({ voiceId: "rachel" }),     // ELEVENLABS_API_KEY from env
  systemPrompt: "You are a helpful assistant.",
});

await phone.serve({ agent });
The same agent using namespaced imports:
import * as deepgram from "getpatter/stt/deepgram";
import * as elevenlabs from "getpatter/tts/elevenlabs";

const agent = phone.agent({
  stt: new deepgram.STT(),
  tts: new elevenlabs.TTS({ voiceId: "rachel" }),
  systemPrompt: "You are a helpful assistant.",
});

Supported providers

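| Flat import   | Namespaced import (module, class) | Env var            |
|---------------|-----------------------------------|--------------------|
| ElevenLabsTTS | getpatter/tts/elevenlabs, TTS     | ELEVENLABS_API_KEY |
| OpenAITTS     | getpatter/tts/openai, TTS         | OPENAI_API_KEY     |
| CartesiaTTS   | getpatter/tts/cartesia, TTS       | CARTESIA_API_KEY   |
| RimeTTS       | getpatter/tts/rime, TTS           | RIME_API_KEY       |
| LMNTTTS       | getpatter/tts/lmnt, TTS           | LMNT_API_KEY       |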
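
Because each provider reads its key from a fixed environment variable, a startup check can report which providers are usable before any agent is built. The following is a minimal sketch, not part of the Patter API: the helper `configuredTtsProviders` and the `TTS_ENV_VARS` map are hypothetical names introduced here, and only the class names and environment variable names come from the table.

```typescript
// Hypothetical preflight helper; not exported by getpatter.
// Class names and env var names are copied from the provider table.
const TTS_ENV_VARS = {
  ElevenLabsTTS: "ELEVENLABS_API_KEY",
  OpenAITTS: "OPENAI_API_KEY",
  CartesiaTTS: "CARTESIA_API_KEY",
  RimeTTS: "RIME_API_KEY",
  LMNTTTS: "LMNT_API_KEY",
} as const;

type TtsProvider = keyof typeof TTS_ENV_VARS;
type Env = Record<string, string | undefined>;

// Returns the providers whose API key is present and non-blank in `env`,
// in the same order as the table.
function configuredTtsProviders(env: Env): TtsProvider[] {
  return (Object.keys(TTS_ENV_VARS) as TtsProvider[]).filter((name) => {
    const value = env[TTS_ENV_VARS[name]];
    return typeof value === "string" && value.trim() !== "";
  });
}
```

In a Node.js process the helper could be called as configuredTtsProviders(process.env). Taking the environment as a parameter keeps the function easy to test.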

ElevenLabs

Streaming HTTP TTS via ElevenLabs eleven_turbo_v2_5.
import { ElevenLabsTTS } from "getpatter";

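// Three alternative configurations; distinct names so the snippet compiles as one file.
const tts = new ElevenLabsTTS();                                  // reads ELEVENLABS_API_KEY
const ttsRachel = new ElevenLabsTTS({ voiceId: "rachel" });       // voice selected by name
const ttsExplicit = new ElevenLabsTTS({ apiKey: "...", voiceId: "21m00Tcm4TlvDq8ikWAM", modelId: "eleven_turbo_v2_5" });  // explicit key, voice ID, and model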
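
| Parameter    | Type   | Default                | Description                                          |
|--------------|--------|------------------------|------------------------------------------------------|
| apiKey       | string |                        | API key. Read from ELEVENLABS_API_KEY if omitted.    |
| voiceId      | string | "21m00Tcm4TlvDq8ikWAM" | ElevenLabs voice ID (or name).                       |
| modelId      | string | "eleven_turbo_v2_5"    | Model preset.                                        |
| outputFormat | string | "pcm_16000"            | ElevenLabs output format.                            |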
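
To make the default output format concrete, the arithmetic below works out how many bytes a given duration of audio occupies and splits a buffer into fixed-length frames. It is an illustrative sketch only: it assumes "pcm_16000" means 16 kHz, 16-bit, mono PCM, the 20 ms frame length is an arbitrary choice, and `bytesForDuration` and `splitIntoFrames` are hypothetical helpers rather than Patter functions.

```typescript
const SAMPLE_RATE_HZ = 16000; // from the "pcm_16000" format name
const BYTES_PER_SAMPLE = 2;   // assumption: 16-bit mono samples

// Number of bytes occupied by `ms` milliseconds of audio.
function bytesForDuration(ms: number): number {
  return (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * ms) / 1000;
}

// Splits raw PCM into whole frames of `frameMs` milliseconds.
// A trailing partial frame is dropped rather than padded.
function splitIntoFrames(pcm: Uint8Array, frameMs: number = 20): Uint8Array[] {
  const frameBytes = bytesForDuration(frameMs);
  const frames: Uint8Array[] = [];
  for (let offset = 0; offset + frameBytes <= pcm.length; offset += frameBytes) {
    frames.push(pcm.subarray(offset, offset + frameBytes));
  }
  return frames;
}
```

Under these assumptions a 20 ms frame is 16000 × 2 × 0.02 = 640 bytes, so one second of audio (32000 bytes) yields 50 frames.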

OpenAI

import { OpenAITTS } from "getpatter";

// Two alternative configurations; distinct names so the snippet compiles as one file.
const tts = new OpenAITTS();                                      // reads OPENAI_API_KEY
const ttsNova = new OpenAITTS({ voice: "nova" });
OpenAI TTS returns audio at 24 kHz — Patter automatically resamples to 16 kHz for telephony.
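
The 24 kHz to 16 kHz conversion can be pictured with a small linear-interpolation resampler. This is a sketch for intuition, not Patter's implementation, which this page does not describe; `resampleLinear` is a hypothetical function, and production resamplers usually apply a low-pass filter before downsampling to limit aliasing.

```typescript
// Hypothetical helper: resamples 16-bit PCM by linear interpolation.
// For 24000 -> 16000 Hz, every output sample advances 1.5 input samples.
function resampleLinear(input: Int16Array, fromHz: number, toHz: number): Int16Array {
  if (input.length === 0 || fromHz === toHz) return input.slice();
  const outLength = Math.floor((input.length * toHz) / fromHz);
  const out = new Int16Array(outLength);
  const step = fromHz / toHz;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;                                // fractional position in the input
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);  // clamp at the last sample
    const frac = pos - left;
    out[i] = Math.round(input[left] * (1 - frac) + input[right] * frac);
  }
  return out;
}
```

For example, six input samples at 24 kHz become four output samples at 16 kHz, taken at input positions 0, 1.5, 3, and 4.5.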

Cartesia

Raw PCM streaming via Cartesia’s sonic-2 bytes endpoint. See Cartesia setup.
import { CartesiaTTS } from "getpatter";

// Two alternative configurations; distinct names so the snippet compiles as one file.
const tts = new CartesiaTTS();                                    // reads CARTESIA_API_KEY
const ttsKatie = new CartesiaTTS({ voice: "f786b574-daa5-4673-aa0c-cbe3e8534c02" });  // Katie

Rime

Arcana (high fidelity) and Mist (low latency) via Rime’s HTTP endpoint. See Rime setup.
import { RimeTTS } from "getpatter";

// Three alternative configurations; distinct names so the snippet compiles as one file.
const tts = new RimeTTS();                                        // reads RIME_API_KEY
const ttsArcana = new RimeTTS({ model: "arcana", speaker: "astra" });
const ttsMist = new RimeTTS({ model: "mistv2", speaker: "cove", speedAlpha: 1.1, reduceLatency: true });

LMNT

Blizzard and Aurora via the LMNT HTTP API. See LMNT setup.
import { LMNTTTS } from "getpatter";

// Two alternative configurations; distinct names so the snippet compiles as one file.
const tts = new LMNTTTS();                                        // reads LMNT_API_KEY
const ttsBlizzard = new LMNTTTS({ model: "blizzard", voice: "leah" });

Missing credentials

Each class throws at construction time if no API key is resolved:
Error: ElevenLabs TTS requires an apiKey. Pass { apiKey: '...' } or
set ELEVENLABS_API_KEY in the environment.
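
The resolution order described on this page (an explicit apiKey first, then the environment variable, otherwise an error at construction) could look roughly like the sketch below. `resolveApiKey` is a hypothetical helper written for illustration and is not taken from the Patter source; the error text mirrors the message shown above.

```typescript
type EnvMap = Record<string, string | undefined>;

// Hypothetical helper: returns the explicit key if given, otherwise the value
// of `envVar` in `env`, and throws if neither yields a non-empty string.
function resolveApiKey(
  provider: string,
  envVar: string,
  explicit: string | undefined,
  env: EnvMap,
): string {
  const key = explicit ?? env[envVar];
  if (!key) {
    throw new Error(
      `${provider} requires an apiKey. Pass { apiKey: '...' } or set ${envVar} in the environment.`,
    );
  }
  return key;
}
```

Failing in the constructor rather than on the first synthesis request means a missing key surfaces when the server starts, not in the middle of a call.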

What’s Next

STT

Speech-to-text providers.

LLM

Language model providers.