TTS (Text-to-Speech)

TTS is used in pipeline mode to synthesize the agent’s response audio. If you use an engine such as OpenAIRealtime or ElevenLabsConvAI, speech synthesis is handled internally by the engine. Each TTS provider ships as both a namespaced class (from getpatter.tts import elevenlabs, then elevenlabs.TTS()) and a flat alias (from getpatter import ElevenLabsTTS). The two are equivalent: the flat aliases are convenient for short examples, while the namespaced form avoids name collisions when mixing providers.
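
As a rough illustration of why the two import styles can coexist, the sketch below uses stand-in objects rather than the real getpatter package. It assumes each provider module exposes a class named TTS and that the flat alias is a plain re-export of that same class; this page does not confirm either detail, and `_ElevenLabsTTS` and `_OpenAITTS` are hypothetical names.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for provider classes; this is not getpatter code.
class _ElevenLabsTTS:
    provider = "elevenlabs"

class _OpenAITTS:
    provider = "openai"

# Namespaced form: every provider module exposes its class under the same name, TTS.
elevenlabs = SimpleNamespace(TTS=_ElevenLabsTTS)
openai = SimpleNamespace(TTS=_OpenAITTS)

# Flat form: a provider-specific alias bound to the same class object (assumed).
ElevenLabsTTS = elevenlabs.TTS
OpenAITTS = openai.TTS

# Both spellings refer to one class, so they can be used interchangeably.
same_class = ElevenLabsTTS is elevenlabs.TTS

# The module prefix keeps two classes called TTS apart when providers are mixed.
providers = [elevenlabs.TTS().provider, openai.TTS().provider]
```

Under these assumptions, importing TTS directly from two provider modules would bind the same name twice, which is the collision the namespaced form avoids.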

Quickstart

import asyncio
from getpatter import Patter, Twilio, DeepgramSTT, ElevenLabsTTS

phone = Patter(carrier=Twilio(), phone_number="+15550001234")  # TWILIO_* from env

agent = phone.agent(
    stt=DeepgramSTT(),                            # DEEPGRAM_API_KEY from env
    tts=ElevenLabsTTS(voice_id="rachel"),         # ELEVENLABS_API_KEY from env
    system_prompt="You are a helpful assistant.",
)

async def main():
    await phone.serve(agent)

asyncio.run(main())
The same agent using namespaced imports:
from getpatter.stt import deepgram
from getpatter.tts import elevenlabs

agent = phone.agent(
    stt=deepgram.STT(),
    tts=elevenlabs.TTS(voice_id="rachel"),
    system_prompt="You are a helpful assistant.",
)

Supported providers

| Flat import | Namespaced import | Env var | Install extra |
| --- | --- | --- | --- |
| ElevenLabsTTS | getpatter.tts.elevenlabs.TTS | ELEVENLABS_API_KEY | included |
| OpenAITTS | getpatter.tts.openai.TTS | OPENAI_API_KEY | included |
| CartesiaTTS | getpatter.tts.cartesia.TTS | CARTESIA_API_KEY | getpatter[cartesia] |
| RimeTTS | getpatter.tts.rime.TTS | RIME_API_KEY | getpatter[rime] |
| LMNTTTS | getpatter.tts.lmnt.TTS | LMNT_API_KEY | getpatter[lmnt] |

ElevenLabs

Streaming HTTP TTS via ElevenLabs eleven_turbo_v2_5.
from getpatter import ElevenLabsTTS

tts = ElevenLabsTTS()                        # reads ELEVENLABS_API_KEY
tts = ElevenLabsTTS(voice_id="rachel")
tts = ElevenLabsTTS(api_key="...", voice_id="21m00Tcm4TlvDq8ikWAM", model_id="eleven_turbo_v2_5")

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str \| None | None | API key; read from ELEVENLABS_API_KEY if omitted. |
| voice_id | str | "21m00Tcm4TlvDq8ikWAM" | ElevenLabs voice ID (or name). |
| model_id | str | "eleven_turbo_v2_5" | Model preset. |
| output_format | str | "pcm_16000" | ElevenLabs output format. |
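
The output_format default matters if you ever handle the synthesized audio yourself. The sketch below assumes "pcm_16000" means mono, 16-bit signed PCM at 16 kHz (32,000 bytes per second) and splits a buffer into 20 ms frames, a frame length commonly used in telephony. `frame_pcm`, `duration_seconds`, and the constants are illustrative names, not part of Patter.

```python
SAMPLE_RATE_HZ = 16_000   # assumed from the "pcm_16000" format name
BYTES_PER_SAMPLE = 2      # assumed 16-bit mono samples
FRAME_MS = 20             # illustrative frame length

def frame_pcm(pcm: bytes, frame_ms: int = FRAME_MS) -> list[bytes]:
    """Split raw PCM into fixed-size frames, dropping a trailing partial frame."""
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    usable = len(pcm) - len(pcm) % frame_bytes
    return [pcm[i:i + frame_bytes] for i in range(0, usable, frame_bytes)]

def duration_seconds(pcm: bytes) -> float:
    """Playback time of a raw PCM buffer under the assumptions above."""
    return len(pcm) / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)

one_second = bytes(32_000)        # one second of silence
frames = frame_pcm(one_second)    # fifty frames of 640 bytes each
```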

OpenAI

from getpatter import OpenAITTS

tts = OpenAITTS()                            # reads OPENAI_API_KEY
tts = OpenAITTS(voice="nova")

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str \| None | None | API key; read from OPENAI_API_KEY if omitted. |
| voice | str | "alloy" | One of "alloy", "echo", "fable", "onyx", "nova", "shimmer". |
| model | str | "tts-1" | OpenAI TTS model ID. |

OpenAI TTS returns audio at 24 kHz; Patter automatically resamples it to 16 kHz for telephony.
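
To make the 24 kHz to 16 kHz step concrete, here is a minimal resampling sketch that uses linear interpolation over 16-bit little-endian mono PCM. It is not Patter's resampler, which this page does not show; a production resampler would typically low-pass filter the signal first to limit aliasing. `resample_pcm16` is a hypothetical helper.

```python
import struct

def resample_pcm16(pcm: bytes, src_rate: int = 24_000, dst_rate: int = 16_000) -> bytes:
    """Resample mono 16-bit little-endian PCM by linear interpolation."""
    n_in = len(pcm) // 2
    if n_in == 0:
        return b""
    src = struct.unpack(f"<{n_in}h", pcm[: n_in * 2])
    n_out = n_in * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # fractional index into the source
        left = int(pos)
        right = min(left + 1, n_in - 1)        # clamp at the final sample
        frac = pos - left
        out.append(round(src[left] * (1 - frac) + src[right] * frac))
    return struct.pack(f"<{n_out}h", *out)

# 100 ms of a constant signal at 24 kHz (2400 samples) becomes 1600 samples.
tone_24k = struct.pack("<2400h", *([1000] * 2400))
tone_16k = resample_pcm16(tone_24k)
```

Every three input samples map to two output samples, so the output buffer is two thirds the size of the input.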

Cartesia

Raw PCM streaming via Cartesia’s sonic-2 bytes endpoint. See Cartesia setup.
from getpatter import CartesiaTTS

tts = CartesiaTTS()                          # reads CARTESIA_API_KEY
tts = CartesiaTTS(voice="f786b574-daa5-4673-aa0c-cbe3e8534c02")  # Katie

Rime

Arcana (high fidelity) and Mist (low latency) via Rime’s HTTP endpoint. See Rime setup.
from getpatter import RimeTTS

tts = RimeTTS()                              # reads RIME_API_KEY
tts = RimeTTS(model="arcana", speaker="astra")
tts = RimeTTS(model="mistv2", speaker="cove", speed_alpha=1.1, reduce_latency=True)

LMNT

Blizzard and Aurora via the LMNT HTTP API. See LMNT setup.
from getpatter import LMNTTTS

tts = LMNTTTS()                              # reads LMNT_API_KEY
tts = LMNTTTS(model="blizzard", voice="leah")

Missing credentials

Each class raises ValueError at construction time if no API key is resolved:
ValueError: ElevenLabs TTS requires an api_key. Pass api_key='...' or
set ELEVENLABS_API_KEY in the environment.
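
The lookup order described in the provider sections (explicit argument first, then the environment variable, then an error) can be sketched as follows. `resolve_api_key` is a hypothetical helper written for illustration, not Patter's internal code; the error text mirrors the ElevenLabs message shown above.

```python
import os
from typing import Optional

def resolve_api_key(api_key: Optional[str], env_var: str, provider: str) -> str:
    """Return the explicit key if given, else the environment value, else raise."""
    key = api_key or os.environ.get(env_var)
    if not key:
        raise ValueError(
            f"{provider} TTS requires an api_key. Pass api_key='...' or "
            f"set {env_var} in the environment."
        )
    return key

# An explicit argument wins over whatever is in the environment.
explicit = resolve_api_key("sk-example", "ELEVENLABS_API_KEY", "ElevenLabs")
```

Because the check happens at construction time, a missing key surfaces when the process starts rather than in the middle of a call.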

What’s Next

STT

Speech-to-text providers.

LLM

Language model providers.