# Anthropic LLM
AnthropicLLM plugs Anthropic's Claude models into Patter's pipeline mode. It speaks the Messages API natively (streaming + `tool_use` blocks) and normalises every event into Patter's unified `{type: "text" | "tool_call" | "done"}` chunk protocol, so tools defined once run across every LLM provider.
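The normalisation described above can be sketched as follows. The event shapes are simplified mocks of Messages API stream events, and the `normalise` function is illustrative; Patter's internal adapter may differ in detail.

```python
# Sketch: normalising Anthropic Messages API stream events into Patter's
# unified chunk protocol. Event dicts below are simplified mocks.

def normalise(events):
    """Yield {type: "text" | "tool_call" | "done"} chunks from raw events."""
    for ev in events:
        if ev["type"] == "content_block_delta" and ev["delta"]["type"] == "text_delta":
            yield {"type": "text", "text": ev["delta"]["text"]}
        elif ev["type"] == "content_block_start" and ev["content_block"]["type"] == "tool_use":
            yield {"type": "tool_call", "name": ev["content_block"]["name"]}
        elif ev["type"] == "message_stop":
            yield {"type": "done"}

events = [
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hi"}},
    {"type": "content_block_start", "content_block": {"type": "tool_use", "name": "get_weather"}},
    {"type": "message_stop"},
]
chunks = list(normalise(events))
print([c["type"] for c in chunks])  # → ['text', 'tool_call', 'done']
```

Because every provider adapter emits the same three chunk types, downstream tool execution never needs provider-specific branching.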
Prompt caching is enabled by default. The system prompt and the last tool block are tagged with `cache_control: { type: "ephemeral" }`, which cuts time-to-first-token by roughly 100-400 ms and reduces input-token cost by about 90% on every cached turn.
## Install
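The source shows no install command here. Assuming the package is published under the same name as its import namespace (`getpatter`), installation would look like:

```shell
pip install getpatter     # Python
npm install getpatter     # TypeScript / Node
```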
## Usage
The namespaced import (`from getpatter.llm import anthropic` in Python, `import * as anthropic from "getpatter/llm/anthropic"` in TypeScript) auto-resolves the API key from `ANTHROPIC_API_KEY` and exposes a uniform `LLM` class, the same pattern Patter uses for the STT and TTS namespaces.

## Supported models
Pricing is in USD per 1M tokens. `cache_read` is billed at ~10% of the full input rate; `cache_write` at ~125%. Versioned snapshots (e.g. `claude-haiku-4-5-20251001`) resolve against the base entry via longest-prefix match in `pricing.py`.
| Model | Input | Output | Cache read | Cache write |
|---|---|---|---|---|
| `claude-opus-4-7` | $15.00 | $75.00 | $1.50 | $18.75 |
| `claude-sonnet-4-6` | $3.00 | $15.00 | $0.30 | $3.75 |
| `claude-haiku-4-5` (default) | $1.00 | $5.00 | $0.10 | $1.25 |
Supported aliases: `claude-haiku-4-5`, `claude-sonnet-4-6`, `claude-opus-4-7`, `claude-3-5-sonnet-latest`, `claude-3-5-haiku-latest`. Pinned snapshots include `claude-haiku-4-5-20251001`, `claude-3-5-sonnet-20241022`, `claude-3-5-haiku-20241022`.
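The longest-prefix snapshot resolution described above can be sketched like this. The dict mirrors the pricing table; the function name and structure are illustrative, not Patter's actual `pricing.py` API.

```python
# Sketch of longest-prefix model pricing lookup (illustrative only).
# Prices are USD per 1M tokens, matching the table above.
PRICING = {
    "claude-opus-4-7":   {"input": 15.00, "output": 75.00, "cache_read": 1.50, "cache_write": 18.75},
    "claude-sonnet-4-6": {"input": 3.00,  "output": 15.00, "cache_read": 0.30, "cache_write": 3.75},
    "claude-haiku-4-5":  {"input": 1.00,  "output": 5.00,  "cache_read": 0.10, "cache_write": 1.25},
}

def resolve_pricing(model: str) -> dict:
    """Return the entry whose key is the longest prefix of `model`."""
    matches = [key for key in PRICING if model.startswith(key)]
    if not matches:
        raise KeyError(f"no pricing entry for {model!r}")
    return PRICING[max(matches, key=len)]

# A versioned snapshot resolves to its base entry:
entry = resolve_pricing("claude-haiku-4-5-20251001")
print(entry["input"])  # → 1.0
```

Longest-prefix matching means a new dated snapshot inherits its base model's pricing without a table update.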
## Environment variables
| Variable | Required | Notes |
|---|---|---|
| `ANTHROPIC_API_KEY` | yes | Auto-loaded when `api_key` / `apiKey` is omitted. |
## Options
| Option | Default | Notes |
|---|---|---|
| `api_key` / `apiKey` | `None` | Reads from `ANTHROPIC_API_KEY` when omitted. |
| `model` | `"claude-haiku-4-5-20251001"` | Any Anthropic Claude model id or alias. |
| `max_tokens` / `maxTokens` | `1024` | Required by the Messages API on every request. |
| `temperature` | unset | Optional sampling temperature. |
| `base_url` / `baseUrl` | unset | Override the Messages API endpoint (rarely needed). |
| `prompt_caching` / `promptCaching` | `True` | Tags the system prompt and last tool block with `cache_control: ephemeral`. Disable when the system prompt + tools are below Anthropic's minimum cacheable size (~1024 tokens for Sonnet/Opus, ~2048 for Haiku); caching has no effect below that threshold. |
## Prompt caching
For voice agents with long, instruction-dense system prompts and large tool catalogs, prompt caching is the single biggest TTFT win Anthropic ships. Patter applies the recommended pattern automatically:

- The system prompt becomes a single `text` block tagged `cache_control: ephemeral`.
- The last tool definition is tagged `cache_control: ephemeral`, which caches the entire tool array (Anthropic caches everything up to and including a marked block).
- The `anthropic-beta: prompt-caching-2024-07-31` header is sent on every request for consistent behaviour across model snapshots.
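Put together, the request body looks roughly like the sketch below. Field values and the tool names are illustrative placeholders, not Patter's actual output.

```python
# Rough shape of a Messages API request with Patter-style prompt caching.
# The system prompt and the *last* tool carry cache_control markers;
# Anthropic caches everything up to and including a marked block.
headers = {
    "anthropic-beta": "prompt-caching-2024-07-31",
}
body = {
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You are a voice agent.",  # placeholder system prompt
            "cache_control": {"type": "ephemeral"},
        },
    ],
    "tools": [
        {"name": "get_weather", "input_schema": {"type": "object"}},  # placeholder tool
        {
            "name": "hang_up",  # placeholder tool
            "input_schema": {"type": "object"},
            "cache_control": {"type": "ephemeral"},  # last tool marked: caches whole array
        },
    ],
    "messages": [{"role": "user", "content": "Hello"}],
}
```

Only the final tool needs the marker; tagging every tool would waste cache-breakpoint slots without caching anything extra.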

