

Anthropic LLM

AnthropicLLM plugs Anthropic’s Claude models into Patter’s pipeline mode. It speaks the Messages API natively (streaming + tool_use blocks) and normalises every event into Patter’s unified {type: "text" | "tool_call" | "done"} chunk protocol, so tools defined once run across every LLM provider. Prompt caching is enabled by default. The system prompt and the last tool block are tagged with cache_control: { type: "ephemeral" }, which cuts time-to-first-token by ~100-400 ms and ~90% of input-token cost on every cached turn.
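The unified chunk protocol can be sketched as a discriminated union. This is an illustrative model only: the `{type: "text" | "tool_call" | "done"}` shape comes from the docs, while the type name, field names on each variant, and the `describeChunk` helper are assumptions for the sketch.

```typescript
// Sketch of Patter's unified chunk protocol as a discriminated union.
// Only the three "type" tags are documented; the per-variant fields
// (text, name, arguments) are assumed for illustration.
type Chunk =
  | { type: "text"; text: string }
  | { type: "tool_call"; name: string; arguments: unknown }
  | { type: "done" };

// A provider-agnostic consumer: because every LLM provider is
// normalised to this shape, one switch handles them all.
function describeChunk(chunk: Chunk): string {
  switch (chunk.type) {
    case "text":
      return `text: ${chunk.text}`;
    case "tool_call":
      return `tool call: ${chunk.name}`;
    case "done":
      return "stream finished";
  }
}
```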

Install

npm install getpatter
pip install "getpatter[anthropic]"

Usage

// Namespaced import
import * as anthropic from "getpatter/llm/anthropic";

const llm = new anthropic.LLM();                            // reads ANTHROPIC_API_KEY
const llm = new anthropic.LLM({ apiKey: "sk-ant-...", model: "claude-haiku-4-5-20251001" });
const llm = new anthropic.LLM({ promptCaching: false });    // opt out of caching

// Flat alias (equivalent)
import { AnthropicLLM } from "getpatter";

const llm2 = new AnthropicLLM();

The namespaced import (import * as anthropic from "getpatter/llm/anthropic" / from getpatter.llm import anthropic) auto-resolves the API key from ANTHROPIC_API_KEY and exposes a uniform LLM class — the same pattern Patter uses for STT and TTS namespaces.
Plug it into an agent:
import { Patter, Twilio, DeepgramSTT, AnthropicLLM, ElevenLabsTTS } from "getpatter";

const phone = new Patter({ carrier: new Twilio(), phoneNumber: "+15550001234" });

const agent = phone.agent({
  stt: new DeepgramSTT(),
  llm: new AnthropicLLM(),                                  // ANTHROPIC_API_KEY from env
  tts: new ElevenLabsTTS({ voiceId: "rachel" }),
  systemPrompt: "You are a helpful assistant.",
  firstMessage: "Hi, how can I help?",
});

await phone.serve(agent);

Supported models

Pricing in USD per 1M tokens. cache_read is billed at ~10% of full input; cache_write at ~125%. Versioned snapshots (e.g. claude-haiku-4-5-20251001) resolve against the base entry via longest-prefix match in pricing.ts.
| Model | Input | Output | Cache read | Cache write |
| --- | --- | --- | --- | --- |
| claude-opus-4-7 | $15.00 | $75.00 | $1.50 | $18.75 |
| claude-sonnet-4-6 | $3.00 | $15.00 | $0.30 | $3.75 |
| claude-haiku-4-5 (default) | $1.00 | $5.00 | $0.10 | $1.25 |
Aliases that route to the latest snapshot: claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-7, claude-3-5-sonnet-latest, claude-3-5-haiku-latest. Pinned snapshots include claude-haiku-4-5-20251001, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022.
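The longest-prefix resolution described above can be sketched as follows. The real logic lives in pricing.ts; this standalone version mirrors the base entries from the table (input/output only, per 1M tokens in USD) and the function name is an assumption.

```typescript
// Minimal sketch of longest-prefix pricing resolution. Base entries
// mirror the pricing table above; the resolution function itself is
// an illustration of the behaviour attributed to pricing.ts.
const PRICING: Record<string, { input: number; output: number }> = {
  "claude-opus-4-7": { input: 15.0, output: 75.0 },
  "claude-sonnet-4-6": { input: 3.0, output: 15.0 },
  "claude-haiku-4-5": { input: 1.0, output: 5.0 },
};

// Pick the longest known prefix of the model id, so a pinned snapshot
// like "claude-haiku-4-5-20251001" resolves to the "claude-haiku-4-5"
// base entry. Unknown ids resolve to undefined.
function resolvePricing(model: string) {
  const match = Object.keys(PRICING)
    .filter((base) => model.startsWith(base))
    .sort((a, b) => b.length - a.length)[0];
  return match ? PRICING[match] : undefined;
}
```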

Environment variables

| Variable | Required | Notes |
| --- | --- | --- |
| ANTHROPIC_API_KEY | yes | Auto-loaded when apiKey / api_key is omitted. |

Options

| Option | Default | Notes |
| --- | --- | --- |
| apiKey / api_key | undefined | Reads from ANTHROPIC_API_KEY when omitted. |
| model | "claude-haiku-4-5-20251001" | Any Anthropic Claude model id or alias. |
| maxTokens / max_tokens | 1024 | Required by the Messages API on every request. |
| temperature | unset | Optional sampling temperature. |
| baseUrl / base_url | unset | Override the Messages API endpoint (rarely needed). |
| anthropicVersion | unset | Override the anthropic-version header (TS only). |
| promptCaching / prompt_caching | true | Tags the system prompt and last tool block with cache_control: ephemeral. Disable when the system prompt plus tools fall below Anthropic's minimum cacheable size (~1024 tokens for Sonnet/Opus, ~2048 for Haiku), since caching has no effect below that threshold. |

Prompt caching

For voice agents with long instruction-dense system prompts and large tool catalogs, prompt caching is the single biggest TTFT win Anthropic ships. Patter applies the recommended pattern automatically:
  • The system prompt becomes a single text block tagged cache_control: ephemeral.
  • The last tool definition is tagged cache_control: ephemeral, which caches the entire tool array (Anthropic caches everything up to and including a marked block).
  • The anthropic-beta: prompt-caching-2024-07-31 header is sent on every request for consistent behaviour across model snapshots.
The cache lives ~5 minutes — the first request writes it, subsequent requests within that window hit it for ~90% input-token savings on the cached portion.
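The cache_control placement described above can be sketched as a Messages API request body. The system, tools, and cache_control field names follow Anthropic's public Messages API schema; the lookup_order tool and the prompt text are hypothetical examples, not part of Patter.

```typescript
// Sketch of a Messages API body with the cache_control placement
// described above. The "lookup_order" tool is a made-up example.
const body = {
  model: "claude-haiku-4-5-20251001",
  max_tokens: 1024,
  // System prompt as a single text block tagged for caching.
  system: [
    {
      type: "text",
      text: "You are a helpful assistant.",
      cache_control: { type: "ephemeral" },
    },
  ],
  tools: [
    {
      name: "lookup_order",
      description: "Look up an order by id.",
      input_schema: {
        type: "object",
        properties: { id: { type: "string" } },
      },
      // Marking the LAST tool caches the entire tool array, because
      // Anthropic caches everything up to and including this block.
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{ role: "user", content: "Where is my order?" }],
};
```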