📘 Public beta · Endpoints are stable; OpenAPI specs and SDKs ship monthly. See changelog →

Agents & conversations

Flows are batch. Conversations are multi-turn chat. Use conversations when an analyst (or end-user) interacts iteratively with an LLM that has access to tools.

Agents

An agent is a reusable LLM character: a system prompt, a tool set, and a model choice. Define it once, then instantiate it into many conversations.

POST /api/agents
Auth · API key | Scope · agents:write
{
  "name": "Fraud Investigator",
  "systemPrompt": "You are a fraud-investigation assistant for an Indonesian bank. You can search customer history, look up transactions, and check device flags. Always cite specific transactions when making accusations. Be conservative — when in doubt, recommend human review.",
  "tools": [
    {
      "slug": "search_customer",
      "name": "Search customer by NIK or name",
      "description": "Fetch a customer record by NIK or partial name match.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "nik":   { "type": "string" },
          "name":  { "type": "string" }
        }
      }
    },
    {
      "slug": "list_recent_transactions",
      "name": "List recent transactions",
      "description": "Get the customer's most recent transactions.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "customerId": { "type": "string" },
          "limit": { "type": "integer", "default": 20 }
        },
        "required": ["customerId"]
      }
    }
  ],
  "model": "quantum-ai:default"
}

Tools are abstract: defining a tool gives the LLM permission to call it, but does not implement it. Implementation happens elsewhere: each tool call fires an agent.tool_called event, which your service handles by executing the tool and returning the result to the conversation.
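Because your service executes the tool, it may be worth checking the model-supplied input against the tool's inputSchema before running anything. The following is a minimal sketch that covers only the schema subset used in the agent definition above (object, typed properties, required, default). The names validateToolInput and applyDefaults are hypothetical, and whether the platform performs any validation of its own is not stated on this page.

```typescript
// Subset of JSON Schema matching the inputSchema shapes shown above (assumption:
// nested objects, arrays, enums, etc. are out of scope for this sketch).
type PropertySchema = {
  type: "string" | "integer" | "number" | "boolean";
  default?: unknown;
};

interface ToolInputSchema {
  type: "object";
  properties: Record<string, PropertySchema>;
  required?: string[];
}

// Returns a list of problems; an empty list means the input is acceptable.
function validateToolInput(
  schema: ToolInputSchema,
  input: Record<string, unknown>,
): string[] {
  const problems: string[] = [];
  for (const key of schema.required ?? []) {
    if (input[key] === undefined) problems.push(`missing required field: ${key}`);
  }
  for (const [key, value] of Object.entries(input)) {
    const prop = schema.properties[key];
    if (!prop) continue; // undeclared fields are ignored, as JSON Schema does by default
    const ok =
      prop.type === "integer" ? Number.isInteger(value) : typeof value === prop.type;
    if (!ok) problems.push(`field ${key} should be ${prop.type}`);
  }
  return problems;
}

// Fills in schema defaults for fields the model left out.
function applyDefaults(
  schema: ToolInputSchema,
  input: Record<string, unknown>,
): Record<string, unknown> {
  const result: Record<string, unknown> = { ...input };
  for (const [key, prop] of Object.entries(schema.properties)) {
    if (result[key] === undefined && prop.default !== undefined) {
      result[key] = prop.default;
    }
  }
  return result;
}
```

With the list_recent_transactions schema, an input of { "customerId": "cus_...", "limit": 30 } passes, while an input without customerId is reported as a problem that your handler can return to the conversation instead of executing the tool.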

Conversations

A conversation is a multi-turn session with an agent. Every user message, LLM response, and tool call is stored in the conversation history.

POST /api/conversations
Auth · API key | Scope · conversations:write
{
  "agentId": "agt_01HXY...",
  "context": {
    "investigatorId": "usr_01HXY...",
    "openCases": ["cas_01HXY..."]
  }
}

context is opaque per-conversation state passed into the system prompt.

Response (201):

{
  "data": {
    "conversation": {
      "id": "cnv_01HXY...",
      "agentId": "agt_01HXY...",
      "createdAt": "..."
    }
  }
}
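As a rough illustration of the request and response above, the sketch below creates a conversation and unwraps data.conversation from the 201 body. It assumes the Bearer authorization scheme used in the streaming example on this page; the baseUrl parameter, the FetchLike type, and the createConversation name are hypothetical. The fetch function is passed in so the logic can be exercised with a stub.

```typescript
// Minimal shape of the fetch function this sketch needs, so a stub can be injected.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface CreatedConversation {
  id: string;
  agentId: string;
  createdAt: string;
}

async function createConversation(
  fetchFn: FetchLike,
  baseUrl: string, // hypothetical: the origin your API is served from
  apiKey: string,
  agentId: string,
  context: Record<string, unknown> = {},
): Promise<CreatedConversation> {
  const res = await fetchFn(`${baseUrl}/api/conversations`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ agentId, context }),
  });
  if (!res.ok) {
    throw new Error(`create conversation failed with HTTP ${res.status}`);
  }
  // The response wraps the conversation in data.conversation, as shown above.
  const payload = (await res.json()) as { data: { conversation: CreatedConversation } };
  return payload.data.conversation;
}
```

In a browser or a recent Node.js runtime, the global fetch can be passed as fetchFn.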

Send a message

POST /api/conversations/{id}/messages
Auth · API key | Scope · conversations:write
{
  "role": "user",
  "content": "Customer cus_01HXY... reported an unauthorized IDR 5M debit yesterday. Investigate."
}

The response is the LLM's reply (streamed if you send Accept: text/event-stream; see Streaming):

{
  "data": {
    "message": {
      "id": "msg_01HXY...",
      "role": "assistant",
      "content": "I'll search the customer's recent transactions and check for the IDR 5M debit. Let me look this up.",
      "toolCalls": [
        {
          "slug": "list_recent_transactions",
          "input": { "customerId": "cus_01HXY...", "limit": 30 }
        }
      ],
      "createdAt": "..."
    }
  }
}

When the LLM calls a tool, your service receives the call via the agent.tool_called webhook (or by polling), executes it, and POSTs the result back:

POST /api/conversations/{id}/messages
{
  "role": "tool",
  "toolCallId": "tc_...",
  "content": { "transactions": [...] }
}

The conversation continues — the LLM uses the tool output to compose the next assistant message.
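One way to structure the handling side is a registry keyed by tool slug that turns an incoming event into the tool message body shown above. This is a sketch only: the ToolCalledEvent field names are assumptions (this page does not show the webhook payload), and reporting failures as { "error": ... } in content is an illustrative convention, not a documented one.

```typescript
// Assumed shape of an agent.tool_called event; field names are not confirmed by this page.
interface ToolCalledEvent {
  conversationId: string;
  toolCallId: string;
  slug: string;
  input: Record<string, unknown>;
}

type ToolHandler = (input: Record<string, unknown>) => Promise<unknown>;

interface ToolMessage {
  role: "tool";
  toolCallId: string;
  content: unknown;
}

// Runs the handler registered for the event's slug and wraps the outcome as the
// body for POST /api/conversations/{id}/messages. Failures are also returned as a
// tool message, because the conversation does not continue until a result is posted.
async function buildToolMessage(
  event: ToolCalledEvent,
  handlers: Partial<Record<string, ToolHandler>>,
): Promise<ToolMessage> {
  const base = { role: "tool" as const, toolCallId: event.toolCallId };
  const handler = handlers[event.slug];
  if (handler === undefined) {
    return { ...base, content: { error: `unknown tool: ${event.slug}` } };
  }
  try {
    return { ...base, content: await handler(event.input) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ...base, content: { error: message } };
  }
}
```

A webhook endpoint would call buildToolMessage with the parsed event and then POST the returned object to the conversation identified by event.conversationId.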

Streaming

Pass Accept: text/event-stream on the message send. Server-sent events stream the assistant's tokens as they're generated:

event: token
data: {"text": "I'll "}

event: token
data: {"text": "search "}

event: tool_call
data: {"slug": "list_recent_transactions", "input": {...}}

event: done
data: {"messageId": "msg_..."}
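The three event types in the sample stream can be modelled as a discriminated union, which lets the compiler check that each event's payload is read correctly. The sketch below parses one blank-line-delimited block into that union. The type and function names are hypothetical, and the payload fields are taken only from the sample above; real events may carry more.

```typescript
// Event payloads as they appear in the sample stream above.
type StreamEvent =
  | { event: "token"; data: { text: string } }
  | { event: "tool_call"; data: { slug: string; input: unknown } }
  | { event: "done"; data: { messageId: string } };

// Parses one SSE block (the text between two blank lines). Returns null for
// blocks with no data line or with an event name this sketch does not know.
function parseSseBlock(block: string): StreamEvent | null {
  let event = "";
  const dataLines: string[] = [];
  for (const rawLine of block.split("\n")) {
    // Tolerate CRLF line endings.
    const line = rawLine.endsWith("\r") ? rawLine.slice(0, -1) : rawLine;
    if (line.startsWith("event:")) {
      event = line.slice("event:".length).trim();
    } else if (line.startsWith("data:")) {
      // Leading whitespace is dropped; this is harmless for JSON payloads.
      dataLines.push(line.slice("data:".length).trimStart());
    }
    // Comment lines (starting with ":") and other fields are ignored here.
  }
  if (dataLines.length === 0) return null;
  // Multiple data lines in one block are joined with newlines before parsing.
  const data = JSON.parse(dataLines.join("\n"));
  if (event === "token" || event === "tool_call" || event === "done") {
    return { event, data } as StreamEvent;
  }
  return null;
}
```

A caller can then switch on the event field and get typed access to data.text, data.slug, or data.messageId in each branch.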

Any HTTP client that can read a streaming response body works. The browser EventSource API only supports GET, so use fetch with a streaming reader for the POST:

const res = await fetch(`/api/conversations/${conversationId}/messages`, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type':  'application/json',
    'Accept':        'text/event-stream',
  },
  body: JSON.stringify({ role: 'user', content: text }),
});
if (!res.ok || !res.body) throw new Error(`Message send failed: HTTP ${res.status}`);

const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
for (;;) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // Events end with a blank line. The last piece may be a partial event,
  // so keep it in the buffer until the rest of it arrives.
  const blocks = buffer.split('\n\n');
  buffer = blocks.pop() ?? '';
  for (const block of blocks) {
    const [eventLine, dataLine] = block.split('\n');
    if (!dataLine) continue;
    const event = eventLine.slice('event: '.length);
    const data  = JSON.parse(dataLine.slice('data: '.length));
    if (event === 'token')     ui.appendText(data.text);
    if (event === 'tool_call') ui.showToolBubble(data.slug, data.input);
    if (event === 'done')      ui.finalize(data.messageId);
  }
}

Continuation after a tool call

The assistant message that requested a tool call is incomplete until you post the tool result. The conversation will not auto-resume — your service must POST the tool message:

curl -X POST .../api/conversations/cnv_01HXY.../messages \
  -H "Authorization: Bearer $API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "role": "tool",
    "toolCallId": "tc_01HXY...",
    "content": { "transactions": [...] }
  }'

The server then re-runs inference with the tool result in context and emits the next assistant message (streamed if you re-subscribe to the stream, or returned synchronously).
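Put together, the non-streaming flow is a loop: send the user message, execute any requested tools, post each result, and repeat until the assistant replies without tool calls. The sketch below shows that loop with the HTTP call abstracted behind a post function. Several details are assumptions not confirmed by this page: that each tool call carries an id usable as toolCallId, that every tool-result POST resolves to an assistant message, and that the reply to the last posted result is the one to continue from. The maxToolRounds guard and all names are hypothetical.

```typescript
interface ToolCall {
  id: string; // assumed: the sample response above shows only slug and input
  slug: string;
  input: Record<string, unknown>;
}

interface AssistantMessage {
  id: string;
  role: "assistant";
  content: string;
  toolCalls?: ToolCall[];
}

type OutgoingMessage =
  | { role: "user"; content: string }
  | { role: "tool"; toolCallId: string; content: unknown };

// Posts one message to the conversation and resolves to the assistant's reply.
type PostMessage = (message: OutgoingMessage) => Promise<AssistantMessage>;
type ExecuteTool = (call: ToolCall) => Promise<unknown>;

async function runTurn(
  post: PostMessage,
  execute: ExecuteTool,
  userText: string,
  maxToolRounds = 8, // arbitrary guard against a model that never stops calling tools
): Promise<AssistantMessage> {
  let reply = await post({ role: "user", content: userText });
  let rounds = 0;
  for (;;) {
    const calls = reply.toolCalls ?? [];
    if (calls.length === 0) return reply; // final answer, nothing left to execute
    if (rounds >= maxToolRounds) {
      throw new Error(`still requesting tools after ${maxToolRounds} rounds`);
    }
    rounds += 1;
    for (const call of calls) {
      const content = await execute(call);
      reply = await post({ role: "tool", toolCallId: call.id, content });
    }
  }
}
```

The round limit matters mainly for cost control, since every additional round re-runs inference over the growing history.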

Conversation history

GET /api/conversations/{id}
Auth · API key | Scope · conversations:read
GET /api/conversations
Auth · API key | Scope · conversations:read

GET /api/conversations/{id} returns the full message history. GET /api/conversations?agentId=agt_...&limit=50 lists conversations for an agent.
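A small helper can keep the two read paths and their query parameters in one place. This sketch uses only the agentId and limit parameters mentioned above; the function names are hypothetical, and any further filters or pagination parameters would be assumptions.

```typescript
interface ListConversationsQuery {
  agentId?: string;
  limit?: number;
}

// Builds the path and query string for GET /api/conversations.
function listConversationsPath(query: ListConversationsQuery = {}): string {
  const params = new URLSearchParams();
  if (query.agentId !== undefined) params.set("agentId", query.agentId);
  if (query.limit !== undefined) params.set("limit", String(query.limit));
  const qs = params.toString();
  return qs ? `/api/conversations?${qs}` : "/api/conversations";
}

// Builds the path for GET /api/conversations/{id}, escaping the id defensively.
function conversationPath(id: string): string {
  return `/api/conversations/${encodeURIComponent(id)}`;
}
```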

When to use a conversation vs a flow

Use a conversation                       | Use a flow
-----------------------------------------|--------------------------------------------
Multi-turn (analyst iterating with LLM)  | Single-shot classification / extraction
Free-form input from a user              | Structured input from your service
Tool use is unbounded (LLM decides)      | Tool use is graph-defined (you decide)
Real-time UI (chat widget)               | Async backend (queue → process → callback)

Many integrations use both: a flow handles the deterministic backend pipeline; a conversation handles the analyst-facing investigation UI.

LLM cost on conversations

Each message records its token usage. Long conversations get expensive because the full history goes into every call. Once a conversation exceeds ~30 turns, trim context by summarizing older turns into a single message.
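The trimming step can be sketched as a pure function over a message list: keep the most recent messages verbatim and replace everything older with one summary message. This shows only the local transformation. How the summary is produced (for example, by a separate LLM call) and how a trimmed history is supplied back to the API are not covered on this page, so summarize is a caller-provided function. The sketch counts messages rather than turns, and keepRecent = 10 is an arbitrary assumption.

```typescript
interface HistoryMessage {
  role: "user" | "assistant" | "tool";
  content: unknown;
}

const TURN_THRESHOLD = 30; // from the "~30 turns" guidance; this sketch counts messages

function compactHistory(
  messages: HistoryMessage[],
  summarize: (older: HistoryMessage[]) => HistoryMessage,
  keepRecent = 10, // assumed value; tune for your use case
  threshold = TURN_THRESHOLD,
): HistoryMessage[] {
  if (messages.length <= threshold) return messages;
  let cut = messages.length - keepRecent;
  // Do not start the kept tail on a tool result: move the cut back so the
  // message that precedes the tool result(s) stays together with them.
  while (cut > 0 && messages[cut]?.role === "tool") cut -= 1;
  if (cut <= 0) return messages; // nothing older than the tail to summarize
  return [summarize(messages.slice(0, cut)), ...messages.slice(cut)];
}
```

Keeping tool results attached to the message before them avoids a history in which a result appears without the call that requested it.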