Guardrails

Guardrails let you intercept and filter AI responses before they are converted to speech. Use them to block inappropriate content, enforce compliance, or replace sensitive responses.

Basic Usage

const agent = phone.agent({
  systemPrompt: "You are a helpful assistant.",
  guardrails: [
    guardrail({
      name: "profanity_filter",
      blockedTerms: ["badword1", "badword2"],
      replacement: "I apologize, but I cannot respond to that.",
    }),
  ],
});

Guardrail Interface

interface Guardrail {
  /** Name for logging when triggered */
  name: string;
  /** List of terms that trigger the guardrail (case-insensitive) */
  blockedTerms?: string[];
  /** Custom check function — return true to block the response */
  check?: (text: string) => boolean;
  /** Replacement text spoken when guardrail triggers */
  replacement?: string;
}

Creating Guardrails

Use the guardrail() helper function to create a guardrail:

const guard = guardrail({
  name: "compliance",
  blockedTerms: ["guarantee", "promise"],
  replacement: "I need to be careful with my wording. Let me rephrase that.",
});

If replacement is not specified, the default "I'm sorry, I can't respond to that." is spoken.

Blocked Terms

Blocked terms are matched case-insensitively against the AI response text:

guardrail({
  name: "competitor_mentions",
  blockedTerms: ["competitor_a", "competitor_b", "rival_corp"],
  replacement: "I can only speak about our own products and services.",
});

Custom Check Function

For more complex filtering, use a check function that receives the full response text and returns true to block it:

guardrail({
  name: "length_limit",
  check: (text) => text.length > 500,
  replacement: "Let me give you a shorter answer.",
});
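Because check is an ordinary predicate, it can also express matching rules that a plain term list cannot. The sketch below builds a whole-word, case-insensitive check from a list of terms. Both wholeWordCheck and escapeRegExp are hypothetical helpers written for this illustration; they are not part of the library.

```typescript
// Hypothetical helper: escape regex metacharacters so terms are matched literally.
function escapeRegExp(term: string): string {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a check function that returns true only when
// one of the terms appears as a whole word (case-insensitive).
function wholeWordCheck(terms: string[]): (text: string) => boolean {
  if (terms.length === 0) return () => false;
  const alternatives = terms.map(escapeRegExp).join("|");
  // No "g" flag, so the regex keeps no state between calls to test().
  const pattern = new RegExp(`\\b(?:${alternatives})\\b`, "i");
  return (text) => pattern.test(text);
}

const check = wholeWordCheck(["guarantee", "promise"]);
```

The resulting function could be passed as the check option of a guardrail. With this sketch, "We PROMISE results" is blocked, while "compromised" is not, because the term must stand as its own word.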

Combining Blocked Terms and Check

A guardrail triggers if either a blocked term matches or the check function returns true. Blocked terms are evaluated first:

guardrail({
  name: "content_policy",
  blockedTerms: ["prohibited_term"],
  check: (text) => {
    // Block responses that contain phone numbers
    return /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/.test(text);
  },
  replacement: "I cannot share that information over the phone.",
});
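A regex-based check is easy to get subtly wrong, so it can help to exercise the predicate on sample strings before deploying it. The snippet below is a minimal sketch that extracts the phone-number pattern from the example above into a standalone function; the name containsPhoneNumber and the sample strings are made up for illustration.

```typescript
// Same pattern as the check function above, extracted so it can be tested directly.
const containsPhoneNumber = (text: string): boolean =>
  /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/.test(text);

// Each sample pairs an input with the result the pattern is expected to give.
const samples: Array<[string, boolean]> = [
  ["Call us at 555-123-4567.", true],
  ["Our number is 555.123.4567", true],
  ["Dial 5551234567 now", true],
  ["Dial 555 123 4567 now", false], // spaces are not accepted as separators
  ["Your order number is 12345", false],
];

for (const [text, expected] of samples) {
  const status = containsPhoneNumber(text) === expected ? "ok  " : "FAIL";
  console.log(status, text);
}
```

Note that this pattern only allows a hyphen or a dot between digit groups, so a number written with spaces would pass through unblocked. Widen the separator class if that matters for your use case.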

Multiple Guardrails

You can stack multiple guardrails. They are evaluated in order, and the first one that matches triggers; its replacement is spoken:

const agent = phone.agent({
  systemPrompt: "You are a helpful assistant.",
  guardrails: [
    guardrail({
      name: "profanity",
      blockedTerms: ["badword"],
    }),
    guardrail({
      name: "pii_protection",
      check: (text) => /\b\d{3}-\d{2}-\d{4}\b/.test(text), // SSN pattern
      replacement: "I cannot share personal identification numbers.",
    }),
    guardrail({
      name: "compliance",
      blockedTerms: ["guarantee", "warranty"],
      replacement: "I need to be careful about making guarantees.",
    }),
  ],
});

How It Works

1. The AI model generates a text response.
2. Before TTS conversion, each guardrail is checked in order.
3. If a guardrail triggers, the original response is discarded and the replacement text is spoken instead.
4. If no guardrail triggers, the original response is spoken normally.

The guardrail name is logged when triggered, so you can monitor which guardrails are firing in production.
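The flow described above can be sketched roughly as follows. This is an illustration of the documented behavior, not the library's actual implementation: applyGuardrails and isTriggered are hypothetical names, the log format is invented, and blocked terms are assumed to match as case-insensitive substrings.

```typescript
// Mirrors the Guardrail interface documented above.
interface Guardrail {
  name: string;
  blockedTerms?: string[];
  check?: (text: string) => boolean;
  replacement?: string;
}

// Default replacement text stated in this document.
const DEFAULT_REPLACEMENT = "I'm sorry, I can't respond to that.";

// Assumption: a blocked term matches when it occurs anywhere in the text.
function isTriggered(g: Guardrail, text: string): boolean {
  const lower = text.toLowerCase();
  const termHit = (g.blockedTerms ?? []).some((t) => lower.includes(t.toLowerCase()));
  // Blocked terms are evaluated first; check runs only if no term matched.
  return termHit || (g.check?.(text) ?? false);
}

// Hypothetical helper: returns the text to speak and, if a guardrail fired, its name.
function applyGuardrails(
  text: string,
  guardrails: Guardrail[],
): { spoken: string; triggeredBy?: string } {
  for (const g of guardrails) {
    if (isTriggered(g, text)) {
      console.log(`guardrail triggered: ${g.name}`); // invented log format
      return { spoken: g.replacement ?? DEFAULT_REPLACEMENT, triggeredBy: g.name };
    }
  }
  // No guardrail matched, so the original response is spoken unchanged.
  return { spoken: text };
}
```

Returning on the first match is what makes ordering significant: a guardrail later in the array never sees a response that an earlier one has already replaced.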
