Cartesia STT
Streaming speech-to-text using Cartesia’s ink-whisper model. Ported from LiveKit Agents (Apache 2.0). The transport is pure aiohttp, so no vendor SDK is required.
Quickstart
Install with pip install getpatter[cartesia]. Supported sample rates: 8000, 16000, 24000, 44100, and 48000 Hz.
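Because only a fixed set of sample rates is supported, it can help to reject a bad value before opening a stream. The following is a minimal sketch, not part of the Patter API: the names SUPPORTED_SAMPLE_RATES and check_sample_rate are hypothetical, and only the list of rates comes from this page.

```python
# Hypothetical helper, not part of Patter: the rates are the ones listed above.
SUPPORTED_SAMPLE_RATES = (8000, 16000, 24000, 44100, 48000)


def check_sample_rate(rate: int) -> int:
    """Return the rate unchanged if it is supported, otherwise raise ValueError."""
    if rate not in SUPPORTED_SAMPLE_RATES:
        allowed = ", ".join(str(r) for r in SUPPORTED_SAMPLE_RATES)
        raise ValueError(f"Unsupported sample rate {rate} Hz; expected one of: {allowed}")
    return rate
```

Calling check_sample_rate(16000) returns 16000, while check_sample_rate(22050) raises ValueError.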
Cartesia TTS
CartesiaTTS is a Patter TTSProvider backed by Cartesia’s bytes endpoint. It streams raw PCM_S16LE chunks that drop directly into Patter’s pipeline with no transcoding.
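To make the PCM_S16LE format concrete: each sample is a signed 16-bit little-endian integer, so two bytes per sample. The sketch below uses only the Python standard library and is independent of Patter. The function names are hypothetical, and mono audio at the default 16000 Hz is an assumption for illustration.

```python
import struct

BYTES_PER_SAMPLE = 2  # signed 16-bit samples


def decode_pcm_s16le(chunk: bytes) -> list[int]:
    """Decode a PCM_S16LE chunk into a list of integer samples.

    A trailing odd byte is dropped here for simplicity; a real streaming
    consumer would carry it over to the next chunk instead.
    """
    usable = len(chunk) - (len(chunk) % BYTES_PER_SAMPLE)
    count = usable // BYTES_PER_SAMPLE
    return list(struct.unpack(f"<{count}h", chunk[:usable]))


def chunk_duration_seconds(chunk: bytes, sample_rate: int = 16000, channels: int = 1) -> float:
    """Playback duration of a chunk. Mono (channels=1) is an assumption."""
    return len(chunk) / (BYTES_PER_SAMPLE * channels * sample_rate)
```

For example, 32000 bytes of mono audio at 16000 Hz correspond to one second of playback.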
Install
Usage
Options
| Option | Default | Notes |
|---|---|---|
| model | "sonic-2" | Any Cartesia TTS model id (e.g. "sonic-3"). |
| voice | "f786b574-..." | Cartesia voice id. |
| language | "en" | ISO 639-1 code. |
| sample_rate | 16000 | Hz. |
| speed | None | "fastest" ... "slowest" or float in [0.6, 2.0]. |
| emotion | None | See Cartesia’s emotion list. |
| volume | None | Float in [0.5, 2.0] for sonic-3. |
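The numeric ranges in the table can be checked before a request is sent. This is a rough sketch under stated assumptions: check_speed and check_volume are hypothetical helpers, not Patter functions, and named speed presets are passed through unvalidated because the full preset list is not given here.

```python
from typing import Optional, Union

# Ranges taken from the options table above.
SPEED_MIN, SPEED_MAX = 0.6, 2.0
VOLUME_MIN, VOLUME_MAX = 0.5, 2.0


def check_speed(speed: Optional[Union[str, float]]) -> Optional[Union[str, float]]:
    """Validate a numeric speed; None and named presets pass through unchanged."""
    if speed is None or isinstance(speed, str):
        return speed
    if not SPEED_MIN <= speed <= SPEED_MAX:
        raise ValueError(f"speed must be in [{SPEED_MIN}, {SPEED_MAX}], got {speed}")
    return float(speed)


def check_volume(volume: Optional[float]) -> Optional[float]:
    """Validate volume against the range documented for sonic-3; None passes through."""
    if volume is None:
        return None
    if not VOLUME_MIN <= volume <= VOLUME_MAX:
        raise ValueError(f"volume must be in [{VOLUME_MIN}, {VOLUME_MAX}], got {volume}")
    return float(volume)
```

Both helpers return the accepted value so they can wrap an argument inline, and they raise ValueError for anything outside the documented range.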
Attribution
Ported from LiveKit Agents (livekit-plugins-cartesia, Apache License 2.0).
