Voice AI Agents

Build conversational AI agents that handle phone calls autonomously. Agents can answer inbound calls, place outbound calls, invoke HTTP tools mid-conversation, and retrieve information from knowledge bases — all powered by Ultravox real-time voice AI.

What are Voice AI Agents?

A Voice AI Agent is a persistent, reusable configuration that defines how an AI handles a phone call. Each agent bundles:

System prompt — instructions that shape the agent's behavior and personality
Voice — the TTS voice used for speech output
Tools — HTTP endpoints the agent can call during a conversation (e.g., look up a CRM record, check inventory)
Knowledge bases — document corpora the agent can search via RAG for factual answers
Call settings — temperature, max duration, first speaker, inactivity handling

Under the hood, Trunx routes calls through SIP to Ultravox (fixie-ai/ultravox-70B), which handles real-time speech-to-text, LLM reasoning, and text-to-speech in a single low-latency pipeline.

You can make one-off AI calls without creating a reusable agent — just pass a systemPrompt to the voice API. Reusable agents are better when you need consistent behavior, attached tools, or knowledge bases.

Making an AI Call

The fastest way to get started is a direct AI call. Include a systemPrompt in your POST /api/voice request and Trunx automatically routes it through the AI pipeline instead of a standard SIP call.

curl -X POST https://api.trunx.io/api/voice \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "from": "+14155559876",
    "to": "+14155551234",
    "systemPrompt": "You are a friendly dental office assistant calling to confirm an appointment for tomorrow at 2pm. If the patient wants to reschedule, collect their preferred date and time. Be concise and professional.",
    "voice": "terrence",
    "firstSpeaker": "agent",
    "temperature": 0.4
  }'

const response = await fetch("https://api.trunx.io/api/voice", {
  method: "POST",
  headers: {
    Authorization: "Bearer tk_live_...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    from: "+14155559876",
    to: "+14155551234",
    systemPrompt:
      "You are a friendly dental office assistant calling to confirm an appointment for tomorrow at 2pm. If the patient wants to reschedule, collect their preferred date and time. Be concise and professional.",
    voice: "terrence",
    firstSpeaker: "agent",
    temperature: 0.4,
  }),
});

const call = await response.json();
console.log(call.id, call.status); // "abc-123" "initiated"

Response:

{
  "id": "abc-123",
  "callId": "uv_call_xyz",
  "status": "initiated"
}

Request Parameters

Parameter	Type	Required	Description
`from`	string	Yes	Caller ID in E.164 format (`+1XXXXXXXXXX`)
`to`	string	Yes	Destination number in E.164 format
`systemPrompt`	string	Yes (for AI)	Agent instructions — triggers the AI pipeline
`voice`	string	No	Voice ID for TTS output
`firstSpeaker`	`"agent"` or `"user"`	No	Who speaks first (default: `"agent"`)
`temperature`	number	No	LLM creativity, 0-1 (default: `0.4`)
`maxDuration`	string	No	Call time limit, e.g. `"300s"` (default: `300s`)
`tools`	array	No	Inline tool definitions (see Tools below)
`model`	string	No	Ultravox model (default: `fixie-ai/ultravox-70B`)
`metadata`	object	No	Custom key-value metadata for the call
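
As a rough sketch of how the table above might be applied in client code, the helper below checks the required fields and fills in the documented defaults before the body is sent to POST /api/voice. The name buildAiCallBody and the specific validation rules (a generic E.164 pattern, a 0 to 1 temperature check) are assumptions made for illustration and are not part of the Trunx API.

```javascript
// Hypothetical helper: validates inputs and applies the documented defaults.
const E164 = /^\+[1-9]\d{1,14}$/;

function buildAiCallBody({ from, to, systemPrompt, ...options }) {
  for (const [field, value] of Object.entries({ from, to })) {
    if (!E164.test(value ?? "")) {
      throw new Error(`${field} must be an E.164 number, got: ${value}`);
    }
  }
  if (typeof systemPrompt !== "string" || systemPrompt.trim() === "") {
    throw new Error("systemPrompt is required for AI calls");
  }
  const request = {
    from,
    to,
    systemPrompt,
    // Defaults copied from the parameter table.
    firstSpeaker: "agent",
    temperature: 0.4,
    maxDuration: "300s",
    ...options, // caller-supplied values override the defaults
  };
  if (request.temperature < 0 || request.temperature > 1) {
    throw new Error("temperature must be between 0 and 1");
  }
  return request;
}

const body = buildAiCallBody({
  from: "+14155559876",
  to: "+14155551234",
  systemPrompt: "Confirm tomorrow's appointment.",
  temperature: 0.3,
});
console.log(JSON.stringify(body, null, 2));
```

The result can be passed to JSON.stringify and used as the body of the fetch call shown earlier.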

Creating Reusable Agents

For production use, create a named agent in the dashboard or via MCP tools. A reusable agent stores the system prompt, voice, tools, and knowledge base references so every call is consistent.

To create an agent in the dashboard:

Define the agent

Navigate to Dashboard > Agents > New Agent and configure:

Name — a descriptive identifier (e.g., "Appointment Scheduler")
System Prompt — detailed instructions for the agent's behavior
Voice — select from available voices
First Message — what the agent says when it starts speaking (for outbound calls)
Language — language code (default: en)
Max Duration — call time limit in seconds

Attach tools (optional)

Go to Agents > Tools to create HTTP tools, then attach them to the agent during creation. Tools let the agent call external APIs mid-conversation — for example, to look up an appointment in your CRM.

Attach knowledge bases (optional)

Go to Agents > Knowledge to create a knowledge base, add source URLs, then attach it to the agent. The agent can search the knowledge base during calls to provide accurate, factual answers.

Agent Configuration

System Prompt

The system prompt is the most important setting. It defines who the agent is, what it should do, and how it should behave.

You are Sarah, a scheduling assistant for Acme Dental.

Your job:
- Confirm the patient's appointment for tomorrow at 2pm
- If they want to reschedule, collect their preferred date and time
- If they want to cancel, ask for the reason

Rules:
- Be friendly and professional
- Keep responses under 2 sentences
- Never discuss pricing or insurance — transfer to a human if asked
- End the call with "Have a great day!"

Keep prompts focused and specific. Avoid embedding dynamic data (like appointment times) directly in the prompt — use tools to fetch that data at call time instead.
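
One way to keep prompts focused is to assemble them from short, structured parts. The sketch below renders a persona, a task list, and a rule list into the layout used in the example prompt. buildSystemPrompt is a hypothetical helper, and the reference to a check_appointment tool assumes that such a tool is attached to the agent.

```javascript
// Hypothetical helper, not part of the Trunx API.
function buildSystemPrompt({ persona, tasks, rules }) {
  const bullets = (items) => items.map((item) => `- ${item}`).join("\n");
  return [
    persona,
    "",
    "Your job:",
    bullets(tasks),
    "",
    "Rules:",
    bullets(rules),
  ].join("\n");
}

const systemPrompt = buildSystemPrompt({
  persona: "You are Sarah, a scheduling assistant for Acme Dental.",
  tasks: [
    // Dynamic data is fetched through a tool instead of being hard-coded here.
    "Look up the patient's appointment with the check_appointment tool, then confirm it",
    "If they want to reschedule, collect their preferred date and time",
  ],
  rules: ["Be friendly and professional", "Keep responses under 2 sentences"],
});
console.log(systemPrompt);
```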

Voice

Trunx provides access to Ultravox's voice library. You can list available voices through the dashboard or API. If you need a custom voice, you can clone one from an audio sample.

First Speaker

Value	Use case
`"agent"`	Outbound calls — the agent speaks first with the configured first message
`"user"`	Inbound calls — the agent waits for the caller to speak

Temperature

Controls how creative vs. deterministic the agent's responses are:

0.0 — very deterministic, sticks closely to the prompt
0.4 — default, balanced (recommended for most use cases)
1.0 — more creative and varied responses

Built-in Behaviors

Every agent automatically includes:

Inactivity handling — asks "Are you still there?" after 8 seconds of silence, hangs up after 15 seconds
End-call phrases — recognizes "goodbye", "have a good day", "bye bye" to end calls naturally
Voice activity detection — tuned for natural turn-taking with 400ms endpoint delay
Recording — all calls are recorded by default

Tools

Tools let agents call HTTP endpoints during a live conversation. When the agent decides it needs external data or wants to trigger an action, it invokes the tool and incorporates the response into the conversation.

Creating a Tool

curl -X POST https://api.trunx.io/api/voice/tools \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "check_appointment",
    "description": "Look up a patient appointment by phone number",
    "parameters": {
      "phone_number": { "type": "string", "description": "Patient phone number" }
    },
    "http_url": "https://your-api.com/appointments/lookup",
    "http_method": "POST"
  }'

const tool = await fetch("https://api.trunx.io/api/voice/tools", {
  method: "POST",
  headers: {
    Authorization: "Bearer tk_live_...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "check_appointment",
    description: "Look up a patient appointment by phone number",
    parameters: {
      phone_number: { type: "string", description: "Patient phone number" },
    },
    http_url: "https://your-api.com/appointments/lookup",
    http_method: "POST",
  }),
}).then((r) => r.json());
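
On the receiving side, the endpoint behind http_url has to accept the tool's parameters and return something the agent can use. The sketch below shows what a handler for the check_appointment tool might look like. It assumes the parameters arrive as a JSON body and that a JSON response is passed back to the agent; this guide does not specify the exact request and response format, so treat both as assumptions. The handler name and the in-memory data are hypothetical.

```javascript
// Hypothetical in-memory data; a real endpoint would query your CRM.
const appointments = new Map([
  ["+14155551234", { date: "2026-03-12", time: "14:00" }],
]);

// Assumption: the request body carries the tool parameters as JSON,
// for example { "phone_number": "+14155551234" }.
function handleAppointmentLookup(body) {
  const phone = body?.phone_number;
  if (typeof phone !== "string" || phone === "") {
    return { status: 400, json: { error: "phone_number is required" } };
  }
  const appointment = appointments.get(phone);
  if (!appointment) {
    // Answer with 200 so the agent can tell the caller nothing was found
    // instead of treating the lookup as a failure (a design choice, not a rule).
    return { status: 200, json: { found: false } };
  }
  return { status: 200, json: { found: true, ...appointment } };
}

console.log(handleAppointmentLookup({ phone_number: "+14155551234" }));
```

The function is kept free of any HTTP framework so it can be wired into whichever server you already run.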

Tool Definition Fields

Field	Type	Required	Description
`name`	string	Yes	Tool name (used by the LLM to decide when to call it)
`description`	string	Yes	What the tool does (helps the LLM understand when to use it)
`parameters`	object	Yes	JSON Schema describing the tool's input parameters
`http_url`	string	Yes	The HTTP endpoint to call
`http_method`	string	Yes	HTTP method (`GET`, `POST`, `PUT`, `DELETE`)
`http_headers`	object	No	Custom headers to include in the request
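
Before sending a tool definition, it can help to check it locally against the table above. The following is a minimal sketch; validateToolDefinition is a hypothetical helper, and the API may apply additional rules that are not listed in this guide.

```javascript
const REQUIRED_FIELDS = ["name", "description", "parameters", "http_url", "http_method"];
const ALLOWED_METHODS = ["GET", "POST", "PUT", "DELETE"];

// Returns a list of problems; an empty list means the definition looks complete.
function validateToolDefinition(tool) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    if (tool[field] === undefined || tool[field] === "") {
      errors.push(`missing required field: ${field}`);
    }
  }
  if (tool.http_method && !ALLOWED_METHODS.includes(tool.http_method)) {
    errors.push(`unsupported http_method: ${tool.http_method}`);
  }
  if (tool.http_url) {
    try {
      new URL(tool.http_url); // throws on malformed URLs
    } catch {
      errors.push("http_url is not a valid URL");
    }
  }
  return errors;
}

console.log(
  validateToolDefinition({
    name: "check_appointment",
    description: "Look up a patient appointment by phone number",
    parameters: { phone_number: { type: "string", description: "Patient phone number" } },
    http_url: "https://your-api.com/appointments/lookup",
    http_method: "POST",
  })
);
```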

Inline Tools

For one-off calls, you can pass tools inline without creating them first:

curl -X POST https://api.trunx.io/api/voice \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "from": "+14155559876",
    "to": "+14155551234",
    "systemPrompt": "You are a weather assistant. Use the get_weather tool to answer questions about the weather.",
    "tools": [
      {
        "modelToolName": "get_weather",
        "description": "Get current weather for a city",
        "dynamicParameters": [
          {
            "name": "city",
            "location": "PARAMETER_LOCATION_BODY",
            "schema": { "type": "string", "description": "City name" },
            "required": true
          }
        ],
        "http": {
          "baseUrlPattern": "https://your-api.com/weather",
          "httpMethod": "POST"
        }
      }
    ]
  }'

Write clear, specific tool descriptions. The LLM uses the description and parameters to decide when and how to invoke the tool. Vague descriptions lead to unreliable tool use.
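
The inline format uses different field names from the stored tool format. If you keep definitions in the stored shape, a small converter can produce the inline shape. The mapping below is inferred from the two examples in this section and is an assumption, not a documented equivalence. In particular, marking every parameter as required and using PARAMETER_LOCATION_QUERY for methods without a body are guesses; that second value does not appear in this guide.

```javascript
// Hypothetical converter from the stored tool shape to the inline shape.
function toInlineTool(tool) {
  const sendsBody = ["POST", "PUT"].includes(tool.http_method);
  return {
    modelToolName: tool.name,
    description: tool.description,
    dynamicParameters: Object.entries(tool.parameters).map(([name, schema]) => ({
      name,
      // Assumption: body for POST/PUT, query string otherwise.
      location: sendsBody ? "PARAMETER_LOCATION_BODY" : "PARAMETER_LOCATION_QUERY",
      schema,
      required: true, // assumption: treat every listed parameter as required
    })),
    http: {
      baseUrlPattern: tool.http_url,
      httpMethod: tool.http_method,
    },
  };
}

const inlineTool = toInlineTool({
  name: "check_appointment",
  description: "Look up a patient appointment by phone number",
  parameters: { phone_number: { type: "string", description: "Patient phone number" } },
  http_url: "https://your-api.com/appointments/lookup",
  http_method: "POST",
});
console.log(JSON.stringify(inlineTool, null, 2));
```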

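Knowledge Bases

Knowledge bases let an agent search your documents during a call so it can give accurate, factual answers.

Create a knowledge base

curl -X POST https://api.trunx.io/api/voice/knowledge \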
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Product FAQ",
    "description": "Frequently asked questions about our products and services"
  }'

const kb = await fetch("https://api.trunx.io/api/voice/knowledge", {
  method: "POST",
  headers: {
    Authorization: "Bearer tk_live_...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Product FAQ",
    description: "Frequently asked questions about our products and services",
  }),
}).then((r) => r.json());

Add source documents

Ingest content by providing a URL. The system crawls the URL and indexes the content for retrieval.

curl -X POST https://api.trunx.io/api/voice/knowledge/{id}/sources \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-site.com/faq"
  }'

You can also upload content directly with a MIME type:

curl -X POST https://api.trunx.io/api/voice/knowledge/{id}/sources \
  -H "Authorization: Bearer $TRUNX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Our return policy allows returns within 30 days of purchase...",
    "mimeType": "text/plain"
  }'
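
The two curl requests above could be wrapped in JavaScript roughly as follows. This is a sketch under assumptions: buildSourcePayload and addSource are hypothetical names, the default MIME type of text/plain and the rule that url and content are mutually exclusive are guesses, and the shape of the response is not documented here, so it is returned unparsed beyond JSON decoding.

```javascript
// Builds the request body for either a URL source or inline content.
function buildSourcePayload(source) {
  const hasUrl = typeof source.url === "string";
  const hasContent = typeof source.content === "string";
  if (hasUrl === hasContent) {
    throw new Error("Provide exactly one of url or content");
  }
  return hasUrl
    ? { url: source.url }
    : { content: source.content, mimeType: source.mimeType ?? "text/plain" };
}

// fetchImpl is injectable so the function can be exercised without a network.
async function addSource(knowledgeBaseId, source, { apiKey, fetchImpl = fetch }) {
  const res = await fetchImpl(
    `https://api.trunx.io/api/voice/knowledge/${encodeURIComponent(knowledgeBaseId)}/sources`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(buildSourcePayload(source)),
    }
  );
  if (!res.ok) {
    throw new Error(`Adding source failed with HTTP ${res.status}`);
  }
  return res.json();
}

console.log(buildSourcePayload({ url: "https://your-site.com/faq" }));
```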

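Retrieving Call Details

Fetch a call by its ID to check its status. For AI calls, the response includes the transcript and end reason.

curl https://api.trunx.io/api/voice/{id} \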
  -H "Authorization: Bearer $TRUNX_API_KEY"

const callId = "abc-123"; // the "id" returned when the call was created
const call = await fetch(`https://api.trunx.io/api/voice/${callId}`, {
  headers: { Authorization: "Bearer tk_live_..." },
}).then((r) => r.json());

console.log(call.ai.transcript);
console.log(call.ai.endReason);

Response for an AI call:

{
  "id": "abc-123",
  "from": "+14155559876",
  "to": "+14155551234",
  "status": "completed",
  "provider": "ultravox",
  "direction": "outbound",
  "duration": 47,
  "createdAt": "2026-03-11T10:00:00.000Z",
  "ai": {
    "status": "ended",
    "transcript": [
      { "role": "agent", "text": "Hi, this is Sarah from Acme Dental..." },
      { "role": "user", "text": "Oh hi, yes I have an appointment tomorrow." },
      { "role": "agent", "text": "Great! I'm calling to confirm your 2pm appointment..." }
    ],
    "endReason": "hangup",
    "duration": "45s"
  }
}
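
Once a call has been retrieved, the ai block can be turned into something easier to read or store. The sketch below formats the transcript as plain text and converts the "45s" style duration into a number. Both helper names are hypothetical, and the code assumes that transcript roles are limited to "agent" and "user" and that durations always use a seconds suffix, as in the sample response.

```javascript
// Converts a duration string such as "45s" into a number of seconds.
function parseDurationSeconds(value) {
  const match = /^(\d+(?:\.\d+)?)s$/.exec(value);
  if (!match) {
    throw new Error(`Unrecognized duration: ${value}`);
  }
  return Number(match[1]);
}

// Renders the transcript as one "Speaker: text" line per turn.
function formatTranscript(call) {
  const turns = call.ai?.transcript ?? [];
  return turns
    .map(({ role, text }) => `${role === "agent" ? "Agent" : "Caller"}: ${text}`)
    .join("\n");
}

// Trimmed copy of the sample response.
const sampleCall = {
  id: "abc-123",
  status: "completed",
  ai: {
    transcript: [
      { role: "agent", text: "Hi, this is Sarah from Acme Dental..." },
      { role: "user", text: "Oh hi, yes I have an appointment tomorrow." },
    ],
    endReason: "hangup",
    duration: "45s",
  },
};

console.log(formatTranscript(sampleCall));
console.log(parseDurationSeconds(sampleCall.ai.duration));
```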

End a Call

Terminate an active call programmatically:

curl -X DELETE https://api.trunx.io/api/voice/{id} \
  -H "Authorization: Bearer $TRUNX_API_KEY"

Real-Time Events

Subscribe to voice events via SSE to track call progress in real time:

curl -N "https://api.trunx.io/api/events?channels=voice" \
  -H "Authorization: Bearer $TRUNX_API_KEY"

AI call events:

event: voice.ai.initiated
data: {"id":"abc-123","from":"+14155559876","to":"+14155551234","status":"initiated","ai":true}

event: voice.ai.completed
data: {"id":"abc-123","status":"completed","duration":47}
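
To consume these events in code, the stream text has to be split into events and each data line decoded. The parser below is a minimal sketch that works on a complete chunk of text and assumes every data payload is JSON, as in the sample above. A real streaming client would also need to buffer partial chunks until a blank line arrives. parseSseEvents is a hypothetical helper name.

```javascript
// Parses a block of SSE text into { event, data } objects.
function parseSseEvents(text) {
  const events = [];
  // Events are separated by a blank line.
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no "event:" field is present
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice("data:".length).trim());
      }
    }
    if (dataLines.length > 0) {
      events.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}

const sampleStream =
  'event: voice.ai.initiated\n' +
  'data: {"id":"abc-123","status":"initiated","ai":true}\n' +
  '\n' +
  'event: voice.ai.completed\n' +
  'data: {"id":"abc-123","status":"completed","duration":47}\n' +
  '\n';

for (const { event, data } of parseSseEvents(sampleStream)) {
  console.log(event, data.id, data.status);
}
```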

Best Practices

Prompt design:

Keep system prompts concise and specific; prompts under 500 words tend to perform best
Define the agent's role, goals, and constraints clearly
Use numbered lists for multi-step workflows
Include explicit instructions for edge cases ("If the caller asks about pricing, say you'll transfer them")

Tools over static data:

Use tools to fetch dynamic information (appointments, inventory, account status) rather than embedding it in the system prompt
This keeps prompts short and ensures the agent always has current data

First speaker settings:

Use firstSpeaker: "agent" for outbound calls so the agent introduces itself immediately
Use firstSpeaker: "user" for inbound calls so the agent waits for the caller to speak

Call duration:

Set maxDuration to fit the use case: a confirmation call typically needs 60-120 seconds, while a support call may need 300 seconds or more
The default is 300 seconds (5 minutes)

Temperature:

Use 0.3-0.4 for task-oriented calls (scheduling, confirmations, data collection)
Use 0.6-0.8 for conversational calls (customer service, sales) where variety helps
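
The first speaker, call duration, and temperature guidance above can be folded into one small helper. This is only a sketch: suggestCallSettings is a hypothetical function, and the exact values it picks (0.4 and 0.7, "120s" and "300s") are arbitrary choices from within the recommended ranges.

```javascript
// Maps a call's direction and style to starting values for its settings.
function suggestCallSettings({ direction, style }) {
  if (!["inbound", "outbound"].includes(direction)) {
    throw new Error(`Unknown direction: ${direction}`);
  }
  if (!["task", "conversational"].includes(style)) {
    throw new Error(`Unknown style: ${style}`);
  }
  return {
    // Outbound: the agent introduces itself. Inbound: wait for the caller.
    firstSpeaker: direction === "outbound" ? "agent" : "user",
    // Task-oriented calls stay near the default; conversational calls vary more.
    temperature: style === "task" ? 0.4 : 0.7,
    // Short cap for confirmations, the 5-minute default for open-ended calls.
    maxDuration: style === "task" ? "120s" : "300s",
  };
}

console.log(suggestCallSettings({ direction: "outbound", style: "task" }));
```

Treat the returned values as a starting point and adjust them after reviewing real transcripts.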

Testing:

Test agents with real phone calls before deploying to production
Review transcripts to identify where the agent struggles
Iterate on the system prompt based on actual conversation patterns