
Metrics & Cost Tracking

Patter automatically tracks cost and latency for every call, broken down by provider component (STT, TTS, LLM, telephony).

How It Works

Metrics are collected automatically during calls. When a call ends, the on_call_end callback receives an event whose "metrics" entry holds a CallMetrics object with the full breakdown:
async def on_call_end(event):
    metrics = event.get("metrics")
    if metrics:
        print(f"Duration: {metrics.duration_seconds}s")
        print(f"Total cost: ${metrics.cost.total:.4f}")
        print(f"  STT: ${metrics.cost.stt:.4f}")
        print(f"  TTS: ${metrics.cost.tts:.4f}")
        print(f"  LLM: ${metrics.cost.llm:.4f}")
        print(f"  Telephony: ${metrics.cost.telephony:.4f}")
        print(f"Avg latency: {metrics.latency_avg.total_ms}ms")
        print(f"P95 latency: {metrics.latency_p95.total_ms}ms")

Cost Breakdown

The CostBreakdown object provides per-component costs in USD:
| Field | Description |
| --- | --- |
| stt | Speech-to-text cost (Deepgram, Whisper). |
| tts | Text-to-speech cost (ElevenLabs, OpenAI TTS). |
| llm | LLM cost (OpenAI Realtime tokens). |
| telephony | Telephony cost (Twilio, Telnyx per-minute). |
| total | Sum of all components. |
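
To see which component dominates a bill, the per-component fields can be turned into shares of the total. The sketch below does not import Patter: CostBreakdownStandIn is a stand-in dataclass that only mirrors the field names in the table above, and cost_shares is a hypothetical helper, not part of the library.

```python
from dataclasses import dataclass

COMPONENTS = ("stt", "tts", "llm", "telephony")


@dataclass(frozen=True)
class CostBreakdownStandIn:
    """Stand-in mirroring the documented CostBreakdown fields (USD)."""
    stt: float
    tts: float
    llm: float
    telephony: float

    @property
    def total(self) -> float:
        # The docs describe total as the sum of all components.
        return self.stt + self.tts + self.llm + self.telephony


def cost_shares(cost) -> dict[str, float]:
    """Return each component's share of the total as a fraction in [0, 1]."""
    if cost.total <= 0:
        # Avoid dividing by zero for calls with no recorded cost.
        return {name: 0.0 for name in COMPONENTS}
    return {name: getattr(cost, name) / cost.total for name in COMPONENTS}


example = CostBreakdownStandIn(stt=0.01, tts=0.20, llm=0.05, telephony=0.04)
shares = cost_shares(example)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Because cost_shares only reads attributes, it should also accept a real CostBreakdown as long as that object exposes the same field names.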

Latency Breakdown

The LatencyBreakdown object provides per-component latency in milliseconds:
| Field | Description |
| --- | --- |
| stt_ms | Time from user speech to transcript. |
| llm_ms | Time from transcript to LLM response. |
| tts_ms | Time from LLM response to first audio byte. |
| total_ms | End-to-end latency (user speech to first audio). |

Both latency_avg and latency_p95 are available on CallMetrics.
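
The difference between an average and a 95th percentile is easiest to see on a small sample. The sketch below computes both from a list of per-turn total latencies. It uses the nearest-rank percentile definition, which is an assumption: this page does not say which method Patter uses internally, and percentile_nearest_rank is a hypothetical helper.

```python
import math
from statistics import mean


def percentile_nearest_rank(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))  # 1-based rank
    return ordered[rank - 1]


# Hypothetical per-turn end-to-end latencies in milliseconds.
turn_totals_ms = [420.0, 380.0, 510.0, 450.0, 1200.0]

avg_ms = mean(turn_totals_ms)
p95_ms = percentile_nearest_rank(turn_totals_ms, 95)
print(f"avg={avg_ms:.0f}ms p95={p95_ms:.0f}ms")
```

A single slow turn moves the average only partway but sets the P95 outright on a short call, which is why the two figures are worth reading together.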

Per-Turn Metrics

Each conversation turn is tracked individually:
async def on_call_end(event):
    metrics = event.get("metrics")
    if metrics:
        for turn in metrics.turns:
            print(f"Turn {turn.turn_index}:")
            print(f"  User: {turn.user_text}")
            print(f"  Agent: {turn.agent_text}")
            print(f"  Latency: {turn.latency.total_ms}ms")
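
Per-turn data makes it possible to single out the turns that felt slow to the caller. As a minimal sketch, the hypothetical helper below returns the turns whose end-to-end latency exceeds a threshold. The sample objects are SimpleNamespace stand-ins that carry only the turn_index and latency.total_ms attributes used above, not real TurnMetrics instances.

```python
from types import SimpleNamespace


def slow_turns(turns, threshold_ms: float) -> list[tuple[int, float]]:
    """Return (turn_index, total_ms) for turns whose latency exceeds threshold_ms."""
    return [
        (turn.turn_index, turn.latency.total_ms)
        for turn in turns
        if turn.latency.total_ms > threshold_ms
    ]


# Stand-in objects shaped like the documented per-turn fields.
sample_turns = (
    SimpleNamespace(turn_index=0, latency=SimpleNamespace(total_ms=410.0)),
    SimpleNamespace(turn_index=1, latency=SimpleNamespace(total_ms=1350.0)),
    SimpleNamespace(turn_index=2, latency=SimpleNamespace(total_ms=620.0)),
)

flagged = slow_turns(sample_turns, threshold_ms=1000.0)
for index, total_ms in flagged:
    print(f"Turn {index} took {total_ms:.0f}ms")
```

Inside on_call_end, the same helper could be called as slow_turns(metrics.turns, 1000.0).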

Custom Pricing

Override the default pricing estimates by passing a pricing dictionary keyed by provider name:
from patter import Patter

phone = Patter(
    openai_key="sk-...",
    mode="local",
    pricing={
        "deepgram": {"price": 0.005},    # Override STT price per minute
        "elevenlabs": {"price": 0.15},   # Override TTS price per 1k chars
        "twilio": {"price": 0.015},      # Override telephony price per minute
    },
)

Default Pricing

| Provider | Unit | Default Price |
| --- | --- | --- |
| Deepgram | per minute | $0.0043 |
| Whisper | per minute | $0.006 |
| ElevenLabs | per 1k chars | $0.18 |
| OpenAI TTS | per 1k chars | $0.015 |
| OpenAI Realtime | per token | varies by type |
| Twilio | per minute | $0.013 |
| Telnyx | per minute | $0.007 |

Default pricing is based on publicly listed provider rates and may become stale. Pass your own overrides for accurate cost tracking, or check the provider’s pricing page.
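
As a worked example of how these units combine, the sketch below estimates the cost of a call from the default rates for Deepgram, ElevenLabs, and Twilio. It rests on several simplifying assumptions that may not match Patter's own accounting: STT and telephony are billed for the full call duration, minutes are not rounded up, and LLM token cost is left out. estimate_call_cost is a hypothetical helper, not a library function.

```python
# Rates copied from the default pricing table (USD).
DEEPGRAM_PER_MINUTE = 0.0043
ELEVENLABS_PER_1K_CHARS = 0.18
TWILIO_PER_MINUTE = 0.013


def estimate_call_cost(duration_seconds: float, tts_characters: int) -> dict[str, float]:
    """Rough per-component estimate; ignores LLM tokens and per-minute rounding."""
    minutes = duration_seconds / 60
    stt = minutes * DEEPGRAM_PER_MINUTE
    tts = tts_characters / 1000 * ELEVENLABS_PER_1K_CHARS
    telephony = minutes * TWILIO_PER_MINUTE
    return {
        "stt": stt,
        "tts": tts,
        "telephony": telephony,
        "total": stt + tts + telephony,
    }


estimate = estimate_call_cost(duration_seconds=180, tts_characters=1200)
for name, value in estimate.items():
    print(f"{name}: ${value:.4f}")
```

For a 3-minute call with 1,200 synthesized characters this comes to roughly $0.27, most of it TTS, which shows how sensitive the total is to the per-1k-character rate.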

Real-Time Metrics

Use the on_metrics callback for live cost updates during a call:
async def on_metrics(data):
    cost = data.get("cost_so_far")
    if cost:
        print(f"Running cost: ${cost['total']:.4f}")

await phone.serve(
    agent,
    port=8000,
    on_metrics=on_metrics,
)
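
One use for live updates is a spending alert. The sketch below builds an on_metrics-style handler that records the first time the running total passes a budget. It relies only on the cost_so_far["total"] value shown above; make_budget_watcher and the state dictionary are hypothetical, and the demo feeds the handler hand-made payloads rather than running a real call.

```python
import asyncio


def make_budget_watcher(budget_usd: float):
    """Build a handler that flags when cost_so_far['total'] first exceeds budget_usd."""
    state = {"exceeded": False, "last_total": 0.0}

    async def on_metrics(data):
        cost = data.get("cost_so_far")
        if not cost:
            return  # No cost information in this update.
        state["last_total"] = cost["total"]
        if not state["exceeded"] and cost["total"] > budget_usd:
            state["exceeded"] = True
            print(f"Budget of ${budget_usd:.2f} exceeded: ${cost['total']:.4f}")

    return on_metrics, state


async def _demo():
    # Simulated updates; a real call would deliver these through phone.serve(...).
    handler, state = make_budget_watcher(budget_usd=0.50)
    for total in (0.10, 0.35, 0.62):
        await handler({"cost_so_far": {"total": total}})
    return state


demo_state = asyncio.run(_demo())
```

The returned handler can be passed as on_metrics=handler in place of the function above. What to do once the budget is exceeded is left open here, since this page does not document a way to end a call from the callback.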

Data Types

from patter import CallMetrics, CostBreakdown, LatencyBreakdown, TurnMetrics

CallMetrics

| Field | Type | Description |
| --- | --- | --- |
| call_id | str | Unique call identifier. |
| duration_seconds | float | Total call duration. |
| turns | tuple[TurnMetrics, ...] | Per-turn metrics. |
| cost | CostBreakdown | Cost breakdown. |
| latency_avg | LatencyBreakdown | Average latency. |
| latency_p95 | LatencyBreakdown | 95th percentile latency. |
| provider_mode | str | Voice mode used. |
| stt_provider | str | STT provider name. |
| tts_provider | str | TTS provider name. |
| telephony_provider | str | Telephony provider name. |

TurnMetrics

| Field | Type | Description |
| --- | --- | --- |
| turn_index | int | Zero-based turn index. |
| user_text | str | What the user said. |
| agent_text | str | What the agent replied. |
| latency | LatencyBreakdown | Latency for this turn. |
| stt_audio_seconds | float | Audio duration processed by STT. |
| tts_characters | int | Characters synthesized by TTS. |
| timestamp | float | Unix timestamp. |
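
For logging or analytics, the fields listed above can be flattened into a single JSON line per call. The sketch below is one possible shape, not a built-in export: summarize_call is a hypothetical helper that reads only attribute names documented in the tables above, and the sample object is a SimpleNamespace stand-in with made-up values rather than a real CallMetrics.

```python
import json
from types import SimpleNamespace


def summarize_call(metrics) -> dict:
    """Flatten documented CallMetrics fields into a JSON-friendly dict."""
    return {
        "call_id": metrics.call_id,
        "duration_seconds": metrics.duration_seconds,
        "turn_count": len(metrics.turns),
        "cost_total_usd": metrics.cost.total,
        "latency_avg_ms": metrics.latency_avg.total_ms,
        "latency_p95_ms": metrics.latency_p95.total_ms,
        "stt_provider": metrics.stt_provider,
        "tts_provider": metrics.tts_provider,
        "telephony_provider": metrics.telephony_provider,
    }


# Stand-in with made-up values, shaped like the documented fields.
sample_metrics = SimpleNamespace(
    call_id="call-123",
    duration_seconds=95.5,
    turns=(SimpleNamespace(turn_index=0), SimpleNamespace(turn_index=1)),
    cost=SimpleNamespace(total=0.1234),
    latency_avg=SimpleNamespace(total_ms=480.0),
    latency_p95=SimpleNamespace(total_ms=910.0),
    stt_provider="deepgram",
    tts_provider="elevenlabs",
    telephony_provider="twilio",
)

summary_line = json.dumps(summarize_call(sample_metrics), sort_keys=True)
print(summary_line)
```

Calling summarize_call(metrics) inside on_call_end and appending the result to a log file would give one record per call that is easy to aggregate later.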