Voice AI Agents

Build and deploy AI-powered voice agents

Build conversational AI agents that handle phone calls autonomously. Agents can answer inbound calls, place outbound calls, invoke HTTP tools mid-conversation, and retrieve information from knowledge bases — all powered by Ultravox real-time voice AI.

What are Voice AI Agents?

A Voice AI Agent is a persistent, reusable configuration that defines how an AI handles a phone call. Each agent bundles:

  • System prompt — instructions that shape the agent's behavior and personality
  • Voice — the TTS voice used for speech output
  • Tools — HTTP endpoints the agent can call during a conversation (e.g., look up a CRM record, check inventory)
  • Knowledge bases — document corpora the agent can search via RAG for factual answers
  • Call settings — temperature, max duration, first speaker, inactivity handling

Under the hood, Trunx routes calls through SIP to Ultravox (fixie-ai/ultravox-70B), which handles real-time speech-to-text, LLM reasoning, and text-to-speech in a single low-latency pipeline.

You can make one-off AI calls without creating a reusable agent — just pass a systemPrompt to the voice API. Reusable agents are better when you need consistent behavior, attached tools, or knowledge bases.

Making an AI Call

The fastest way to get started is a direct AI call. Include a systemPrompt in your POST /api/voice request and Trunx automatically routes it through the AI pipeline instead of a standard SIP call.

curl -X POST https://api.trunx.io/api/voice \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "from": "+14155559876",
    "to": "+14155551234",
    "systemPrompt": "You are a friendly dental office assistant calling to confirm an appointment for tomorrow at 2pm. If the patient wants to reschedule, collect their preferred date and time. Be concise and professional.",
    "voice": "terrence",
    "firstSpeaker": "agent",
    "temperature": 0.4
  }'

The same request in JavaScript:

const response = await fetch("https://api.trunx.io/api/voice", {
  method: "POST",
  headers: {
    Authorization: "Bearer tk_live_...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    from: "+14155559876",
    to: "+14155551234",
    systemPrompt:
      "You are a friendly dental office assistant calling to confirm an appointment for tomorrow at 2pm. If the patient wants to reschedule, collect their preferred date and time. Be concise and professional.",
    voice: "terrence",
    firstSpeaker: "agent",
    temperature: 0.4,
  }),
});

const call = await response.json();
console.log(call.id, call.status); // "abc-123" "initiated"

Response:

{
  "id": "abc-123",
  "callId": "uv_call_xyz",
  "status": "initiated"
}

Request Parameters

Parameter     Type               Required      Description
from          string             Yes           Caller ID in E.164 format (+1XXXXXXXXXX)
to            string             Yes           Destination number in E.164 format
systemPrompt  string             Yes (for AI)  Agent instructions; triggers the AI pipeline
voice         string             No            Voice ID for TTS output
firstSpeaker  "agent" or "user"  No            Who speaks first (default: "agent")
temperature   number             No            LLM creativity, 0-1 (default: 0.4)
maxDuration   string             No            Call time limit, e.g. "300s" (default: "300s")
tools         array              No            Inline tool definitions (see Inline Tools below)
model         string             No            Ultravox model (default: fixie-ai/ultravox-70B)
metadata      object             No            Custom key-value metadata for the call
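
As a rough sketch, the parameter table above can be turned into a small client-side check that runs before the request is sent. The helper name buildAiCallRequest is hypothetical and not part of any Trunx SDK; the defaults come from the table, and the assumption that maxDuration is always whole seconds followed by "s" is ours.

```javascript
// Hypothetical helper: validates fields against the parameter table and
// fills in the documented defaults. Not part of any Trunx SDK.
const E164 = /^\+[1-9]\d{1,14}$/; // "+" followed by up to 15 digits

function buildAiCallRequest({ from, to, systemPrompt, ...options }) {
  for (const [name, value] of [["from", from], ["to", to]]) {
    if (typeof value !== "string" || !E164.test(value)) {
      throw new Error(`${name} must be an E.164 number, got: ${value}`);
    }
  }
  if (typeof systemPrompt !== "string" || systemPrompt.trim() === "") {
    throw new Error("systemPrompt is required to trigger the AI pipeline");
  }

  const request = {
    from,
    to,
    systemPrompt,
    firstSpeaker: "agent", // documented default
    temperature: 0.4, // documented default
    maxDuration: "300s", // documented default
    ...options, // caller-supplied values override the defaults
  };

  if (!["agent", "user"].includes(request.firstSpeaker)) {
    throw new Error('firstSpeaker must be "agent" or "user"');
  }
  if (
    !Number.isFinite(request.temperature) ||
    request.temperature < 0 ||
    request.temperature > 1
  ) {
    throw new Error("temperature must be a number between 0 and 1");
  }
  // Assumption: maxDuration is always whole seconds followed by "s".
  if (!/^\d+s$/.test(request.maxDuration)) {
    throw new Error('maxDuration must look like "300s"');
  }
  return request;
}
```

The returned object can be passed straight to JSON.stringify as the body of the POST /api/voice request shown earlier.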

Creating Reusable Agents

For production use, create a named agent in the dashboard or via MCP tools. A reusable agent stores the system prompt, voice, tools, and knowledge base references so every call is consistent.

To set one up in the dashboard:

Define the agent

Navigate to Dashboard > Agents > New Agent and configure:

  • Name — a descriptive identifier (e.g., "Appointment Scheduler")
  • System Prompt — detailed instructions for the agent's behavior
  • Voice — select from available voices
  • First Message — what the agent says when it starts speaking (for outbound calls)
  • Language — language code (default: en)
  • Max Duration — call time limit in seconds

Attach tools (optional)

Go to Agents > Tools to create HTTP tools, then attach them to the agent during creation. Tools let the agent call external APIs mid-conversation — for example, to look up an appointment in your CRM.

Attach knowledge bases (optional)

Go to Agents > Knowledge to create a knowledge base, add source URLs, then attach it to the agent. The agent can search the knowledge base during calls to provide accurate, factual answers.

Agent Configuration

System Prompt

The system prompt is the most important setting. It defines who the agent is, what it should do, and how it should behave.

You are Sarah, a scheduling assistant for Acme Dental.

Your job:
- Confirm the patient's appointment for tomorrow at 2pm
- If they want to reschedule, collect their preferred date and time
- If they want to cancel, ask for the reason

Rules:
- Be friendly and professional
- Keep responses under 2 sentences
- Never discuss pricing or insurance — transfer to a human if asked
- End the call with "Have a great day!"

Keep prompts focused and specific. In a reusable agent, avoid hard-coding per-call data (like the appointment time in the example above) directly in the prompt; use tools to fetch that data at call time instead.

Voice

Trunx provides access to Ultravox's voice library. You can list available voices through the dashboard or API. If you need a custom voice, you can clone one from an audio sample.

First Speaker

Value    Use case
"agent"  Outbound calls: the agent speaks first with the configured first message
"user"   Inbound calls: the agent waits for the caller to speak

Temperature

Controls how creative vs. deterministic the agent's responses are:

  • 0.0 — very deterministic, sticks closely to the prompt
  • 0.4 — default, balanced (recommended for most use cases)
  • 1.0 — more creative and varied responses

Built-in Behaviors

Every agent automatically includes:

  • Inactivity handling — asks "Are you still there?" after 8 seconds of silence, hangs up after 15 seconds
  • End-call phrases — recognizes "goodbye", "have a good day", "bye bye" to end calls naturally
  • Voice activity detection — tuned for natural turn-taking with 400ms endpoint delay
  • Recording — all calls are recorded by default
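
To reason about how the first two behaviors play out in a transcript, a simplified model can help. The sketch below is illustrative only: the real logic runs inside the voice pipeline, the function names are made up, and it assumes the 15-second hangup is measured from the start of the silence, which the list above does not spell out.

```javascript
// Illustrative model only; not how the pipeline is implemented.
const INACTIVITY_PROMPT_AFTER_S = 8; // from the list above
const INACTIVITY_HANGUP_AFTER_S = 15; // assumption: counted from the start of the silence
const END_CALL_PHRASES = ["goodbye", "have a good day", "bye bye"];

// Returns what the agent would do after the given stretch of silence.
function inactivityAction(silenceSeconds) {
  if (silenceSeconds >= INACTIVITY_HANGUP_AFTER_S) return "hangup";
  if (silenceSeconds >= INACTIVITY_PROMPT_AFTER_S) return "prompt";
  return "none";
}

// Checks whether an utterance contains one of the end-call phrases,
// ignoring case and punctuation.
function containsEndCallPhrase(utterance) {
  const normalized = utterance
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  return END_CALL_PHRASES.some((phrase) => normalized.includes(phrase));
}
```

Running containsEndCallPhrase over the user turns of a transcript is one way to check whether a call ended because the caller said goodbye or for some other reason.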

Tools

Tools let agents call HTTP endpoints during a live conversation. When the agent decides it needs external data or wants to trigger an action, it invokes the tool and incorporates the response into the conversation.

Creating a Tool

curl -X POST https://api.trunx.io/api/voice/tools \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "check_appointment",
    "description": "Look up a patient appointment by phone number",
    "parameters": {
      "phone_number": { "type": "string", "description": "Patient phone number" }
    },
    "http_url": "https://your-api.com/appointments/lookup",
    "http_method": "POST"
  }'

The same request in JavaScript:

const tool = await fetch("https://api.trunx.io/api/voice/tools", {
  method: "POST",
  headers: {
    Authorization: "Bearer tk_live_...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "check_appointment",
    description: "Look up a patient appointment by phone number",
    parameters: {
      phone_number: { type: "string", description: "Patient phone number" },
    },
    http_url: "https://your-api.com/appointments/lookup",
    http_method: "POST",
  }),
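}).then((r) => r.json()); // parse the JSON body so `tool` holds the created tool, not the raw Response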

Tool Definition Fields

Field         Type    Required  Description
name          string  Yes       Tool name (used by the LLM to decide when to call it)
description   string  Yes       What the tool does (helps the LLM understand when to use it)
parameters    object  Yes       JSON Schema describing the tool's input parameters
http_url      string  Yes       The HTTP endpoint to call
http_method   string  Yes       HTTP method (GET, POST, PUT, DELETE)
http_headers  object  No        Custom headers to include in the request
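
On the receiving side, your endpoint at http_url needs to accept the tool call and return something the agent can speak from. The following is a minimal sketch of such a handler for the check_appointment tool. It assumes the request arrives as a JSON body keyed by parameter name, which this guide does not confirm; the in-memory data and the function name are hypothetical.

```javascript
// Hypothetical in-memory data standing in for a CRM lookup.
const APPOINTMENTS = new Map([
  ["+14155551234", { date: "2026-03-12", time: "14:00" }],
]);

// Pure handler: takes the parsed JSON body and returns { status, body }.
// Assumption: the body is shaped like { "phone_number": "..." }.
function handleAppointmentLookup(requestBody) {
  const phone = requestBody && requestBody.phone_number;
  if (typeof phone !== "string" || phone === "") {
    return { status: 400, body: { error: "phone_number is required" } };
  }
  const appointment = APPOINTMENTS.get(phone);
  if (!appointment) {
    // A normal 200 with found: false gives the agent an answer it can
    // relay, rather than an HTTP error it has to recover from.
    return { status: 200, body: { found: false } };
  }
  return { status: 200, body: { found: true, ...appointment } };
}
```

Keeping the handler a pure function makes it easy to test without a live call; wiring it into an HTTP framework is then a matter of parsing the request body and writing out status and body.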

Inline Tools

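For one-off calls, you can pass tools inline without creating them first. Inline tools use different field names from stored tools (modelToolName, dynamicParameters, and an http object), as in this example: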

curl -X POST https://api.trunx.io/api/voice \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "from": "+14155559876",
    "to": "+14155551234",
    "systemPrompt": "You are a weather assistant. Use the get_weather tool to answer questions about the weather.",
    "tools": [
      {
        "modelToolName": "get_weather",
        "description": "Get current weather for a city",
        "dynamicParameters": [
          {
            "name": "city",
            "location": "PARAMETER_LOCATION_BODY",
            "schema": { "type": "string", "description": "City name" },
            "required": true
          }
        ],
        "http": {
          "baseUrlPattern": "https://your-api.com/weather",
          "httpMethod": "POST"
        }
      }
    ]
  }'

Write clear, specific tool descriptions. The LLM uses the description and parameters to decide when and how to invoke the tool. Vague descriptions lead to unreliable tool use.
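
If you already have a tool written with the stored-tool fields (name, parameters, http_url, http_method), a small converter can produce the inline shape used in the request above. This is a sketch based only on the two examples in this guide: toInlineTool is a hypothetical helper, and it assumes every parameter is required and sent in the request body.

```javascript
// Hypothetical converter from the stored-tool fields to the inline format.
function toInlineTool(tool) {
  return {
    modelToolName: tool.name,
    description: tool.description,
    dynamicParameters: Object.entries(tool.parameters).map(([name, schema]) => ({
      name,
      location: "PARAMETER_LOCATION_BODY", // assumption: all parameters go in the body
      schema,
      required: true, // assumption: all parameters are required
    })),
    http: {
      baseUrlPattern: tool.http_url,
      httpMethod: tool.http_method,
    },
  };
}

const inlineTool = toInlineTool({
  name: "check_appointment",
  description: "Look up a patient appointment by phone number",
  parameters: {
    phone_number: { type: "string", description: "Patient phone number" },
  },
  http_url: "https://your-api.com/appointments/lookup",
  http_method: "POST",
});
```

The result can be placed in the tools array of a POST /api/voice request.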

Knowledge Bases

Knowledge bases give agents access to your documents through retrieval-augmented generation (RAG). During a call, the agent searches the knowledge base and uses relevant passages to answer questions accurately.

Creating a Knowledge Base

Create the knowledge base

curl -X POST https://api.trunx.io/api/voice/knowledge \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Product FAQ",
    "description": "Frequently asked questions about our products and services"
  }'

The same request in JavaScript:

const kb = await fetch("https://api.trunx.io/api/voice/knowledge", {
  method: "POST",
  headers: {
    Authorization: "Bearer tk_live_...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Product FAQ",
    description: "Frequently asked questions about our products and services",
  }),
}).then((r) => r.json());

Add source documents

Ingest content by providing a URL. The system crawls the URL and indexes the content for retrieval.

curl -X POST https://api.trunx.io/api/voice/knowledge/{id}/sources \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-site.com/faq"
  }'

You can also upload content directly with a MIME type:

curl -X POST https://api.trunx.io/api/voice/knowledge/{id}/sources \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Our return policy allows returns within 30 days of purchase...",
    "mimeType": "text/plain"
  }'
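
The two request shapes above (a URL to crawl, or inline content with a MIME type) can be built by one small helper. The sketch below assumes a source is one or the other but never both, and falls back to text/plain when no MIME type is given; neither rule is stated in this guide, and buildSourcePayload is a hypothetical name.

```javascript
// Hypothetical helper that builds the body for POST .../knowledge/{id}/sources.
function buildSourcePayload(source) {
  const hasUrl = typeof source.url === "string" && source.url !== "";
  const hasContent = typeof source.content === "string" && source.content !== "";
  // Assumption: exactly one of url or content is allowed per source.
  if (hasUrl === hasContent) {
    throw new Error("Provide exactly one of url or content");
  }
  if (hasUrl) {
    new URL(source.url); // throws a TypeError if the URL is malformed
    return { url: source.url };
  }
  return {
    content: source.content,
    mimeType: source.mimeType || "text/plain", // assumption: plain text by default
  };
}
```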

Attach to an agent

When creating or updating an agent, include the knowledge base ID in the knowledgeIds array. The agent will automatically search the knowledge base when it needs factual information.

Monitoring Calls

Get Call Details

After a call completes, retrieve the full details including the AI transcript:

curl https://api.trunx.io/api/voice/{id} \
  -H "Authorization: Bearer $TRUNX_API_KEY"

The same request in JavaScript:

const call = await fetch("https://api.trunx.io/api/voice/{id}", {
  headers: { Authorization: "Bearer tk_live_..." },
}).then((r) => r.json());

console.log(call.ai.transcript);
console.log(call.ai.endReason);

Response for an AI call:

{
  "id": "abc-123",
  "from": "+14155559876",
  "to": "+14155551234",
  "status": "completed",
  "provider": "ultravox",
  "direction": "outbound",
  "duration": 47,
  "createdAt": "2026-03-11T10:00:00.000Z",
  "ai": {
    "status": "ended",
    "transcript": [
      { "role": "agent", "text": "Hi, this is Sarah from Acme Dental..." },
      { "role": "user", "text": "Oh hi, yes I have an appointment tomorrow." },
      { "role": "agent", "text": "Great! I'm calling to confirm your 2pm appointment..." }
    ],
    "endReason": "hangup",
    "duration": "45s"
  }
}
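
For transcript review, it can be convenient to flatten this response into readable lines. The sketch below works against the response shape shown above; summarizeCall is a hypothetical helper, and it assumes that non-AI calls simply have no ai field and that ai.duration is always a number of seconds followed by "s".

```javascript
// Hypothetical helper: turns a call-details response into a review-friendly summary.
function summarizeCall(call) {
  // Assumption: standard (non-AI) calls come back without an "ai" object.
  if (!call.ai) {
    return { isAiCall: false, endReason: null, aiSeconds: null, lines: [] };
  }
  const lines = (call.ai.transcript || []).map(
    (turn) => `${turn.role === "agent" ? "Agent" : "Caller"}: ${turn.text}`
  );
  // Assumption: ai.duration looks like "45s".
  const match = /^(\d+(?:\.\d+)?)s$/.exec(call.ai.duration || "");
  return {
    isAiCall: true,
    endReason: call.ai.endReason,
    aiSeconds: match ? Number(match[1]) : null,
    lines,
  };
}
```

Calling summarizeCall(call).lines.join("\n") on the response above gives a plain-text transcript with one turn per line.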

End a Call

Terminate an active call programmatically:

curl -X DELETE https://api.trunx.io/api/voice/{id} \
  -H "Authorization: Bearer $TRUNX_API_KEY"

Real-Time Events

Subscribe to voice events via SSE to track call progress in real time:

curl -N "https://api.trunx.io/api/events?channels=voice" \
  -H "Authorization: Bearer $TRUNX_API_KEY"

AI call events:

event: voice.ai.initiated
data: {"id":"abc-123","from":"+14155559876","to":"+14155551234","status":"initiated","ai":true}

event: voice.ai.completed
data: {"id":"abc-123","status":"completed","duration":47}
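
If you consume the stream without an SSE client library, the event blocks above need to be parsed by hand. The sketch below parses a complete chunk of SSE text into objects. It assumes every data payload is JSON, as in the examples above, and that the text ends on an event boundary; a real streaming client would need to buffer partial chunks until it sees a blank line. The function name is our own.

```javascript
// Parses complete SSE text into [{ event, data }] objects.
// Assumption: each event's data field is a JSON document.
function parseSseEvents(text) {
  const events = [];
  // Events are separated by a blank line.
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no "event:" field is present
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    if (dataLines.length > 0) {
      events.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}
```

A caller can then filter on the event name, for example reacting only to voice.ai.completed and fetching the full call details by id.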

Best Practices

Prompt design:

  • Keep system prompts concise and specific; prompts under 500 words tend to perform best
  • Define the agent's role, goals, and constraints clearly
  • Use numbered lists for multi-step workflows
  • Include explicit instructions for edge cases ("If the caller asks about pricing, say you'll transfer them")

Tools over static data:

  • Use tools to fetch dynamic information (appointments, inventory, account status) rather than embedding it in the system prompt
  • This keeps prompts short and ensures the agent always has current data

First speaker settings:

  • Use firstSpeaker: "agent" for outbound calls so the agent introduces itself immediately
  • Use firstSpeaker: "user" for inbound calls so the agent waits for the caller to speak

Call duration:

  • Set maxDuration to suit the use case: a confirmation call typically needs 60-120 seconds, while a support call may need 300 seconds or more
  • The default is 300 seconds (5 minutes)

Temperature:

  • Use 0.3-0.4 for task-oriented calls (scheduling, confirmations, data collection)
  • Use 0.6-0.8 for conversational calls (customer service, sales) where variety helps
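
The first speaker and temperature guidance above can be captured in a small lookup so every call site picks the same starting values. This is only a sketch: recommendedSettings and the "task" / "conversational" labels are hypothetical, and 0.4 and 0.7 are simply points chosen from inside the recommended ranges.

```javascript
// Starting values taken from the guidance above; tune them per agent.
const FIRST_SPEAKER_BY_DIRECTION = new Map([
  ["outbound", "agent"], // the agent introduces itself immediately
  ["inbound", "user"], // the agent waits for the caller to speak
]);
const TEMPERATURE_BY_STYLE = new Map([
  ["task", 0.4], // scheduling, confirmations, data collection (0.3-0.4)
  ["conversational", 0.7], // customer service, sales (0.6-0.8)
]);

function recommendedSettings(direction, style) {
  if (!FIRST_SPEAKER_BY_DIRECTION.has(direction)) {
    throw new Error(`Unknown direction: ${direction}`);
  }
  if (!TEMPERATURE_BY_STYLE.has(style)) {
    throw new Error(`Unknown style: ${style}`);
  }
  return {
    firstSpeaker: FIRST_SPEAKER_BY_DIRECTION.get(direction),
    temperature: TEMPERATURE_BY_STYLE.get(style),
  };
}
```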

Testing:

  • Test agents with real phone calls before deploying to production
  • Review transcripts to identify where the agent struggles
  • Iterate on the system prompt based on actual conversation patterns
