Inworld TTS

InworldTTS targets the Inworld TTS HTTP endpoint (POST https://api.inworld.ai/tts/v1/voice:stream). The response is NDJSON — one JSON object per line of the form {"result": {"audioContent": "<base64>", "timestampInfo": ...}} — and the provider yields the base64-decoded audio chunks as they arrive. The default model is inworld-tts-2 (sub-200 ms time-to-first-audio, 100+ languages with mid-utterance switching, natural-language voice steering). Pass model: "inworld-tts-1.5-max" to fall back to the prior generation when you need temperature control. The default audio output is PCM_S16LE @ 16 kHz so chunks drop straight into the Patter pipeline without transcoding.
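To make the response format concrete, here is a minimal sketch of decoding such an NDJSON payload into audio chunks. It is not the provider's actual implementation: `decodeNdjsonAudio` and `base64ToBytes` are hypothetical helper names, and the sketch assumes the whole payload is already in memory, whereas a real streaming reader would also need to buffer partial lines between network reads.

```typescript
// Shape of one NDJSON line, following the format described above.
// Only audioContent is used in this sketch.
interface InworldStreamLine {
  result?: { audioContent?: string; timestampInfo?: unknown };
}

// Decode a base64 string into raw bytes (atob is a global on Node 18+).
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical helper: split a complete NDJSON payload into decoded audio chunks.
function decodeNdjsonAudio(payload: string): Uint8Array[] {
  const chunks: Uint8Array[] = [];
  for (const line of payload.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines between objects
    const parsed = JSON.parse(trimmed) as InworldStreamLine;
    const audio = parsed.result?.audioContent;
    if (audio) chunks.push(base64ToBytes(audio));
  }
  return chunks;
}

// Two made-up lines carrying the bytes [0, 1, 2] and [3, 4].
const sample =
  '{"result":{"audioContent":"AAEC"}}\n' +
  '{"result":{"audioContent":"AwQ="}}\n';
const decoded = decodeNdjsonAudio(sample);
console.log(decoded.map((chunk) => Array.from(chunk)));
```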

Install

npm install getpatter
No vendor SDK is required — InworldTTS uses the platform fetch, so it works on Node 18+.

Authentication

The Inworld dashboard issues a Base64 token that is already in the form expected by the Authorization: Basic <token> header, so paste it into INWORLD_API_KEY as-is. Do not re-encode it. If you only have the raw API key string, base64-encode `${apiKey}:` (note the trailing colon) yourself before passing it in.
export INWORLD_API_KEY="<base64-token-from-inworld-dashboard>"
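
For the raw-key case, the encoding step might look like the sketch below. `toBasicToken` is a hypothetical helper name, not part of getpatter, and the sketch assumes the key is plain ASCII, since `btoa` only accepts Latin-1 input.

```typescript
// Hypothetical helper: turn a raw API key into the Basic-auth token form.
// Assumes an ASCII key; btoa throws on characters outside Latin-1.
function toBasicToken(rawApiKey: string): string {
  // The trailing colon is part of the encoded value, as noted above.
  return btoa(`${rawApiKey}:`);
}

// "my-raw-key" is a made-up value for illustration only.
const token = toBasicToken("my-raw-key");
console.log(token);
```

The resulting string is what would go into INWORLD_API_KEY (or the apiKey option) in place of a dashboard-issued token.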

Usage

// Namespaced import (pipeline mode)
import * as inworld from "getpatter/tts/inworld";

const tts = new inworld.TTS();                            // reads INWORLD_API_KEY
const tts2 = new inworld.TTS({ apiKey: "...", voice: "Olivia", language: "en" });

// Flat alias (equivalent)
import { InworldTTS } from "getpatter";

const tts3 = new InworldTTS();
Plug it into an agent:
// npx tsx example.ts
import { Patter, Twilio, DeepgramSTT, InworldTTS } from "getpatter";

const phone = new Patter({ carrier: new Twilio(), phoneNumber: "+15550001234" });

const agent = phone.agent({
  stt: new DeepgramSTT(),                                 // DEEPGRAM_API_KEY from env
  tts: new InworldTTS({ voice: "Ashley" }),               // INWORLD_API_KEY from env
  systemPrompt: "You are a helpful assistant.",
});

await phone.serve({ agent });

Customising delivery mode (TTS-2)

deliveryMode controls how expressive the TTS-2 voice is. Use EXPRESSIVE for warm conversational agents, STABLE when you want consistent, predictable prosody (e.g. for IVR-style read-backs), and BALANCED for the middle ground.
import { InworldTTS } from "getpatter";

const tts = new InworldTTS({
  voice: "Ashley",
  deliveryMode: "EXPRESSIVE",                             // EXPRESSIVE | BALANCED | STABLE
  speakingRate: 1.05,                                     // 0.5–1.5
  language: "en",                                         // BCP-47
});
deliveryMode is TTS-2 only: it is silently ignored by the TTS-1.5 family. Conversely, temperature is TTS-1.5 only and is ignored by TTS-2.
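
Because mismatched options are dropped silently, a small pre-flight check can surface them before a call goes out. The sketch below is illustrative only: `findIgnoredOptions` and `InworldOptionsSketch` are hypothetical names, and the family detection by model id prefix is an assumption based on the model ids listed on this page.

```typescript
// Subset of the constructor options relevant to this check (hypothetical type).
interface InworldOptionsSketch {
  model?: string;
  deliveryMode?: "EXPRESSIVE" | "BALANCED" | "STABLE";
  temperature?: number;
}

// Hypothetical helper: list the options the selected model would ignore.
function findIgnoredOptions(opts: InworldOptionsSketch): string[] {
  const model = opts.model ?? "inworld-tts-2"; // documented default model
  const isTts2 = model === "inworld-tts-2";
  const isTts15 = model.startsWith("inworld-tts-1.5"); // assumed family prefix
  const ignored: string[] = [];
  if (opts.deliveryMode !== undefined && !isTts2) ignored.push("deliveryMode");
  if (opts.temperature !== undefined && !isTts15) ignored.push("temperature");
  return ignored;
}

// temperature on the default (TTS-2) model is reported as ignored.
console.log(findIgnoredOptions({ temperature: 0.7 }));
// deliveryMode on a TTS-1.5 model is reported as ignored.
console.log(findIgnoredOptions({ model: "inworld-tts-1.5-max", deliveryMode: "STABLE" }));
```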

Switching to TTS-1.5 for temperature control

When you need sampling-temperature control (e.g. for more variation across multi-turn conversations), drop down to inworld-tts-1.5-max:
import { InworldTTS } from "getpatter";

const tts = new InworldTTS({
  model: "inworld-tts-1.5-max",
  voice: "Olivia",
  temperature: 0.7,                                       // TTS-1.5 only
  language: "it",
});

Models

| Model id | Family | Notes |
| --- | --- | --- |
| inworld-tts-2 | TTS-2 (default) | Sub-200 ms TTFA, 100+ languages, natural-language voice steering, deliveryMode support. |
| inworld-tts-1.5-max | TTS-1.5 | Higher fidelity legacy model, temperature support. |
| inworld-tts-1.5-mini | TTS-1.5 | Lower latency legacy model. |
| inworld-tts-1-max | TTS-1 | Original generation. |
| inworld-tts-1 | TTS-1 | Original generation. |

Options

| TypeScript | Python | Default | Notes |
| --- | --- | --- | --- |
| apiKey | api_key | | Reads INWORLD_API_KEY when omitted. Base64 token from the Inworld dashboard. |
| model | model | "inworld-tts-2" | One of the model ids above. |
| voice | voice | "Ashley" | Inworld voice name (e.g. "Ashley", "Olivia", "Craig", "Remy"). |
| language | language | | BCP-47 tag ("en", "it", "es", …). |
| audioEncoding | audio_encoding | "PCM" | PCM / LINEAR16 / OGG_OPUS / MP3. |
| sampleRate | sample_rate | 16000 | Hz. Pipeline default; carrier transcoding handles 8 kHz mulaw automatically. |
| bitrate | bitrate | 64000 | Used for OGG_OPUS / MP3 encodings. |
| temperature | temperature | | TTS-1.5 only. Sampling temperature. |
| speakingRate | speaking_rate | 1.0 | Multiplier in [0.5, 1.5]. |
| deliveryMode | delivery_mode | | TTS-2 only. "EXPRESSIVE" / "BALANCED" / "STABLE". |
| baseUrl | base_url | Inworld /tts/v1/voice:stream | Override for proxying or on-prem deployments. |

Low-level usage

If you want the streaming generator without going through the pipeline-mode wrapper, instantiate the underlying provider directly with authToken as the first argument:
import { InworldTTS as LowLevelTTS } from "getpatter/providers/inworld-tts";

const tts = new LowLevelTTS("<auth-token>", {
  model: "inworld-tts-2",
  voice: "Ashley",
  sampleRate: 16000,
});

for await (const chunk of tts.synthesizeStream("Hello from the Patter pipeline.")) {
  // raw PCM_S16LE @ 16 kHz
}
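
As a rough illustration of what can be done with the streamed chunks, the sketch below concatenates them and estimates playback duration. It uses a stand-in generator rather than the real provider, and it assumes mono audio and that chunks behave like `Uint8Array` (a Node `Buffer` would satisfy this); neither detail is confirmed by this page. `collectChunks`, `pcmDurationMs`, and `fakeStream` are hypothetical names.

```typescript
const BYTES_PER_SAMPLE = 2; // signed 16-bit PCM; mono is assumed

// Duration in milliseconds of a raw PCM buffer at the given sample rate.
function pcmDurationMs(byteLength: number, sampleRate: number = 16000): number {
  return (byteLength * 1000) / (BYTES_PER_SAMPLE * sampleRate);
}

// Hypothetical helper: drain an async stream of chunks into one buffer.
async function collectChunks(stream: AsyncIterable<Uint8Array>): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of stream) {
    parts.push(chunk);
    total += chunk.byteLength;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.byteLength;
  }
  return out;
}

// Stand-in for tts.synthesizeStream(...): two chunks of silence, 3200 bytes each.
async function* fakeStream(): AsyncGenerator<Uint8Array> {
  yield new Uint8Array(3200);
  yield new Uint8Array(3200);
}

collectChunks(fakeStream()).then((audio) => {
  // 6400 bytes of 16-bit mono PCM at 16 kHz is 200 ms of audio.
  console.log(`${audio.byteLength} bytes, ${pcmDurationMs(audio.byteLength)} ms`);
});
```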

Pricing

The default rate in pricing.ts is $0.020 / 1k characters for inworld-tts-2, with $0.025 / 1k for the TTS-1.5 family. These are placeholder defaults — verify against your current Inworld platform tier and override per-project via new Patter({ pricing: {...} }) if needed.
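
The per-character arithmetic can be sketched as follows, using the placeholder rates quoted above. `estimateCostUsd` is a hypothetical helper, not a getpatter API, and how characters are actually counted for billing is an assumption here.

```typescript
// Placeholder rates quoted above, in USD per 1,000 characters.
const TTS2_RATE_PER_1K = 0.02;
const TTS15_RATE_PER_1K = 0.025;

// Hypothetical helper: estimate synthesis cost for a given character count.
// Returns null for models with no rate quoted on this page.
function estimateCostUsd(charCount: number, model: string = "inworld-tts-2"): number | null {
  let rate: number | null = null;
  if (model === "inworld-tts-2") {
    rate = TTS2_RATE_PER_1K;
  } else if (model.startsWith("inworld-tts-1.5")) {
    rate = TTS15_RATE_PER_1K;
  }
  if (rate === null) return null;
  return (charCount / 1000) * rate;
}

// 1,500 characters on the default model: 1.5 * $0.020, roughly $0.03.
console.log(estimateCostUsd(1500));
// 2,000 characters on a TTS-1.5 model: 2 * $0.025, roughly $0.05.
console.log(estimateCostUsd(2000, "inworld-tts-1.5-max"));
```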