Agent

Multi-agent orchestration with supervisor routing and sub-agents

The Agent service runs multi-step AI workflows where a supervisor agent receives the user request and, at each step, routes to the right sub-agent based on the request intent and the sub-agent triggers (keywords/description). Sub-agents are specialists with their own action profile (persona, goal, response style), optional RAG over a collection, and their own model + temperature. The supervisor and sub-agents share a graph definition (you describe it during create / update). Ideal for customer-facing helper bots, support triage, and any flow that benefits from intent-based routing across multiple specialists.

agent, llm, automation, rag, streaming

Overview

Features

Supervisor + sub-agents with routing

The supervisor inspects each user message, picks an action (a specific sub-agent or supervisor itself for small talk), and runs it. routing_info in the chat response tells you which action was selected and why.

Rich sub-agent definition

Each sub-agent has trigger {keywords, description}, priority, action {persona, goal, response_style}, model_name, temperature, optional output_schema, and optional RAG (use_rag + collection_key).

Per-sub-agent RAG

Sub-agents can be RAG-enabled (use_rag:true + collection_key). When the supervisor routes a question to that sub-agent, it retrieves from the linked collection before answering.

Sync REST and Streaming SSE

Send via POST core-agent-chat for a full response. Add stream:true to the same endpoint for OpenAI-compatible SSE streaming (chat.completion.chunk events ending with [DONE]).

Conversations per agent

Agent calls open or continue conversation_key threads. List, inspect or delete via the agent-conversation endpoints.

Use Cases

Helper bot for your customer base

A single supervisor serves your customers and routes by intent (FAQ, product, billing, technical) to dedicated sub-agents. Consistent tone, with escalation when needed.

Support triage

Incoming requests are classified by intent and routed to billing / technical / general support sub-agents.

RAG over multiple knowledge bases

Different sub-agents can read from different collections — e.g., 'product-docs' for one sub-agent, 'help-center' for another. Supervisor picks the right one per question.

Mixed reasoning + retrieval

Combine sub-agents with RAG (knowledge questions) and sub-agents without RAG (general reasoning). The supervisor routes by intent.

Input / Output

Input

Agent configuration (name, system_message, model_name, temperature, optional collection_ids and per-sub-agent action profiles), and chat messages with optional conversation_key

JSON body; path parameters (agent_key, sub_agent_key, conversation_key)

Output

Agent metadata, sub-agents (with trigger/action), graph state, and chat responses with routing_info and (optionally) sources

JSON; SSE stream when stream:true

Specs

Latency: ~2-5s for simple supervisor decisions; longer when a sub-agent uses RAG or runs a longer prompt
Async: false
Rate Limit: 60 req/min per API key
Max Input: Per sub-agent model context_window
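The 60 req/min limit can be respected on the client side. The following is a minimal sketch of a sliding-window throttle; the class name, the injectable clock, and the choice of a sliding window are all assumptions for illustration, since this page does not say how the gateway counts its window.

```python
import time
from collections import deque


class MinuteThrottle:
    """Client-side sliding-window limiter (hypothetical helper, not part of the API)."""

    def __init__(self, max_calls=60, window_s=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window_s - (now - self.calls[0])

    def record(self):
        """Register that a call was just sent."""
        self.calls.append(self.clock())
```

A caller would sleep for wait_time() seconds, send the request, then call record().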

Quickstart

Prerequisites

  • A CN8 Gateway API key with agent services enabled
  • Optional: a knowledge collection (cl_-prefixed key) for sub-agents that need RAG

1. Create a Supervisor Agent

core-agent-create

Create a supervisor with a system_message describing routing logic. Set requires_collection / collection_ids when sub-agents will use RAG.

POST /v1/proxy/core-agent-create
{
  "name": "Support Supervisor",
  "agent_type": "supervisor",
  "system_message": "You route questions to the right sub-agent: 'docs' for product knowledge, 'billing' for invoices and refunds. Use small-talk only for greetings.",
  "model_name": "gpt-5-mini-2025-08-07",
  "temperature": 0.7,
  "requires_collection": true,
  "collection_ids": ["cl_abc123"],
  "response_language": "auto"
}

Response

{
  "status": "success",
  "data": {
    "agent_key": "ag_xyz789",
    "name": "Support Supervisor",
    "agent_type": "supervisor",
    "model_name": "gpt-5-mini-2025-08-07",
    "requires_collection": true,
    "collection_ids": ["cl_abc123"],
    "response_language": "auto",
    "created_at": "2026-04-27T10:30:00Z"
  }
}

Use the returned agent_key (ag_ prefix) in subsequent calls. Optional fields available on create/update: pre_context_prompt, post_context_prompt, response_template, max_tokens, top_p, frequency_penalty, presence_penalty, response_format, agent_config, context_prompt, examples, required_parameters, graph_definition, state_schema, entry_node.
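As a rough illustration, the create call could be prepared from Python with only the standard library. The host name and the Bearer-style Authorization header are assumptions (this page does not state the base URL or the auth scheme); the path and body fields are taken from the example above.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with your gateway's real base URL.
BASE_URL = "https://gateway.example.com"


def build_create_agent_request(api_key, name, system_message, collection_ids=None):
    """Build (but do not send) a POST request for core-agent-create."""
    body = {
        "name": name,
        "agent_type": "supervisor",
        "system_message": system_message,
        "model_name": "gpt-5-mini-2025-08-07",
        "temperature": 0.7,
        "response_language": "auto",
    }
    if collection_ids:
        # Only set the collection fields when sub-agents will use RAG.
        body["requires_collection"] = True
        body["collection_ids"] = list(collection_ids)
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/proxy/core-agent-create",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON envelope."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling send(build_create_agent_request(...)) would return the envelope shown above, from which data.agent_key can be read.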

2. Add a Sub-Agent

core-agent-subagent-create

Attach a specialist sub-agent. Define trigger (keywords + description), priority, action (persona / goal / response_style), and optional RAG (use_rag + collection_key).

POST /v1/proxy/core-agent-subagent-create/{agent_key}
{
  "name": "Docs Agent",
  "description": "Answers product questions from the docs collection.",
  "trigger": {
    "keywords": ["docs", "documentation", "how to", "feature", "product"],
    "description": "Routes when the user asks about product behavior, features, or setup."
  },
  "priority": 1,
  "action": {
    "persona": { "role": "customer_support", "personality": "helpful" },
    "goal": { "primary": "inform", "instruction": "Answer using the docs collection." },
    "response_style": { "format": "paragraph", "tone": "professional", "language": "match_input" }
  },
  "model_name": "gpt-5-mini-2025-08-07",
  "temperature": 0.7,
  "use_rag": true,
  "collection_key": "cl_abc123",
  "is_active": true
}

Response

{
  "status": "success",
  "data": {
    "sub_agent_key": "sa_def456",
    "name": "Docs Agent",
    "trigger": { "keywords": ["docs", "documentation", "..."], "description": "..." },
    "priority": 1,
    "action": { "persona": {"role":"customer_support","personality":"helpful"}, "goal": {"primary":"inform","instruction":"..."}, "response_style": {"format":"paragraph","tone":"professional","language":"match_input"} },
    "model_name": "gpt-5-mini-2025-08-07",
    "temperature": 0.7,
    "use_rag": true,
    "collection_key": "cl_abc123",
    "is_active": true,
    "created_at": "2026-04-27T10:31:00Z"
  }
}

Use the returned sub_agent_key (sa_ prefix; note the underscore) in update / delete. Add multiple sub-agents — supervisor routes between them by trigger.keywords / trigger.description.
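When several sub-agents are added, a small builder keeps their bodies consistent. This is a sketch only: the function name and its defaults (persona, response style, model) are illustrative choices copied from the example above, not required values.

```python
def build_sub_agent_payload(name, description, keywords, trigger_description,
                            instruction, priority=1, collection_key=None):
    """Assemble a request body for core-agent-subagent-create/{agent_key}."""
    payload = {
        "name": name,
        "description": description,
        "trigger": {"keywords": list(keywords), "description": trigger_description},
        "priority": priority,
        "action": {
            "persona": {"role": "customer_support", "personality": "helpful"},
            "goal": {"primary": "inform", "instruction": instruction},
            "response_style": {"format": "paragraph", "tone": "professional",
                               "language": "match_input"},
        },
        "model_name": "gpt-5-mini-2025-08-07",
        "temperature": 0.7,
        "use_rag": collection_key is not None,  # RAG only when a collection is linked
        "is_active": True,
    }
    if collection_key is not None:
        # Collection keys carry the cl_ prefix according to this page.
        if not collection_key.startswith("cl_"):
            raise ValueError("collection_key should start with 'cl_'")
        payload["collection_key"] = collection_key
    return payload
```

A RAG-enabled docs sub-agent and a non-RAG billing sub-agent can then be built from the same function by passing or omitting collection_key.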

3. Chat with the Agent (sync REST)

core-agent-chat

Send a message. The supervisor returns routing_info to show which action it picked. data.response is the final user-facing text.

POST /v1/proxy/core-agent-chat
{
  "agent_key": "ag_xyz789",
  "message": "How do I export my data?"
}

Response

{
  "status": "success",
  "data": {
    "response": "You can export from Settings → Data → Export. The export contains all your records as JSON.",
    "agent_key": "ag_xyz789",
    "conversation_key": "cv_abc123",
    "routing_info": {
      "selected_action": "SUB_AGENT",
      "action_name": "Docs Agent",
      "reason": "Question about exporting data fits the docs trigger (keywords: how to, export).",
      "intent": "docs_lookup"
    },
    "frontend_trigger": null,
    "data_collection": null,
    "sources": []
  },
  "gateway": { "request_id": "req_abc", "service": "core-agent-chat" },
  "cost": { "units": 1.0, "unit_price": 0.001, "tokens": 0.001, "balance": 99.99 }
}

Pass conversation_key on subsequent calls to continue the same thread. The agent uses agent_key (NOT agent_id) — consistent with the rest of the agent endpoints (and unlike core-chat which uses chatbot_id).
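The response handling and thread continuation described above can be sketched as two small helpers. The function names are hypothetical, and treating any status other than "success" as a failure is an assumption, since the error envelope is not documented on this page.

```python
def summarize_chat_response(envelope):
    """Pull the fields a client usually needs out of a core-agent-chat response."""
    if envelope.get("status") != "success":  # assumed failure signal
        raise RuntimeError(f"chat failed: {envelope}")
    data = envelope["data"]
    routing = data.get("routing_info") or {}
    return {
        "text": data["response"],
        "conversation_key": data["conversation_key"],
        "routed_to": routing.get("action_name"),
        "reason": routing.get("reason"),
        "has_sources": bool(data.get("sources")),
    }


def next_chat_body(agent_key, message, previous=None):
    """Build the next request body, reusing conversation_key when there is one."""
    body = {"agent_key": agent_key, "message": message}
    if previous is not None:
        body["conversation_key"] = previous["conversation_key"]
    return body
```

Passing the summary of the previous turn as previous keeps follow-up messages in the same thread.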

4. Stream Responses (SSE)

core-agent-chat

Add stream:true to the same core-agent-chat endpoint. Tokens arrive as Server-Sent Events in OpenAI-compatible format.

POST /v1/proxy/core-agent-chat
{
  "agent_key": "ag_xyz789",
  "message": "How do I export my data?",
  "conversation_key": "cv_abc123",
  "stream": true
}

Response

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-5-mini-2025-08-07","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-5-mini-2025-08-07","choices":[{"index":0,"delta":{"content":"You can"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-5-mini-2025-08-07","choices":[{"index":0,"delta":{"content":" export"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-5-mini-2025-08-07","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":80,"completion_tokens":24,"total_tokens":104}}

data: [DONE]

  • Same endpoint as sync; only stream:true differs.
  • Concatenate choices[0].delta.content over all chunks to reconstruct the message.
  • The last JSON chunk, sent immediately before `data: [DONE]`, carries usage (OpenAI convention).
  • `data: [DONE]` is the terminator. Billing happens once after the stream ends.
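Reconstructing the message from the stream can be sketched as follows. The helper name is hypothetical, and it assumes each event arrives as a single `data:` line, as in the example above; it takes any iterable of text lines, so it works the same on a live response or on recorded output.

```python
import json


def collect_stream(lines):
    """Fold SSE lines into (full_text, usage). `lines` is any iterable of str."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # terminator
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if choices:
            content = choices[0].get("delta", {}).get("content")
            if content:
                parts.append(content)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage
```

For incremental display, a caller would print each content piece as it arrives instead of joining at the end.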

Create Agent

POST (sync)

Create a supervisor agent. Optional fine-grained tuning fields are supported (see notes).

/v1/proxy/core-agent-create

List Agents

GET (sync)

List all agents for the current account. Response data is a direct array; pagination is at the top level.

/v1/proxy/core-agents

Get Agent Details

GET (sync)

Get the full agent configuration. Returns every tuning field the upstream stores (including null fields for tuners you haven't set).

/v1/proxy/core-agent-details/{agent_key}

Update Agent

PUT (sync)

Update agent configuration fields (any subset of the create fields).

/v1/proxy/core-agent-update/{agent_key}

Delete Agent

DELETE (sync)

Delete an agent (and its sub-agents).

/v1/proxy/core-agent-delete/{agent_key}

Update Agent Graph

PUT (sync)

Bulk update the agent's full graph (graph_definition / state_schema / entry_node) atomically. Use this when you need to change supervisor wiring + sub-agent topology in a single call.

/v1/proxy/core-agent-graph/{agent_key}

List Sub-Agents

GET (sync)

List all sub-agents under a supervisor. Returns sub-agents with their full action / trigger / RAG configuration.

/v1/proxy/core-agent-subagents/{agent_key}

Add Sub-Agent

POST (sync)

Create a sub-agent under a supervisor. Configure trigger, priority, action profile, model, optional RAG.

/v1/proxy/core-agent-subagent-create/{agent_key}

Update Sub-Agent

PUT (sync)

Update a sub-agent's fields (any subset of the create fields).

/v1/proxy/core-agent-subagent-update/{agent_key}/{sub_agent_key}

Delete Sub-Agent

DELETE (sync)

Delete a sub-agent.

/v1/proxy/core-agent-subagent-delete/{agent_key}/{sub_agent_key}

Chat with Agent

POST (sync)

Send a message; the supervisor inspects intent, picks an action (sub-agent or supervisor itself), runs it, and returns the final response. routing_info reports which action was picked. Set stream:true for SSE.

/v1/proxy/core-agent-chat

List Agent Conversations

GET (sync)

List conversations for a single agent. agent_key is REQUIRED — pass it as ?agent_key=... query parameter.

/v1/proxy/core-agent-conversations

Get Agent Conversation Details

GET (sync)

Get a conversation's metadata and full message history.

/v1/proxy/core-agent-conversation-details/{conversation_key}

Delete Agent Conversation

DELETE (sync)

Delete a conversation and all its messages permanently.

/v1/proxy/core-agent-conversation-delete/{conversation_key}
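The endpoint list above can be condensed into a lookup table so a client always pairs each service with its documented verb and path shape. This is a sketch: the table is transcribed from this page, while the helper name and its signature are illustrative.

```python
from urllib.parse import quote, urlencode

# Service name -> HTTP method, transcribed from the endpoint list on this page.
SERVICE_METHODS = {
    "core-agent-create": "POST",
    "core-agents": "GET",
    "core-agent-details": "GET",
    "core-agent-update": "PUT",
    "core-agent-delete": "DELETE",
    "core-agent-graph": "PUT",
    "core-agent-subagents": "GET",
    "core-agent-subagent-create": "POST",
    "core-agent-subagent-update": "PUT",
    "core-agent-subagent-delete": "DELETE",
    "core-agent-chat": "POST",
    "core-agent-conversations": "GET",
    "core-agent-conversation-details": "GET",
    "core-agent-conversation-delete": "DELETE",
}


def build_path(service, *path_params, query=None):
    """Return (method, path) for a service; raises KeyError for unknown services."""
    method = SERVICE_METHODS[service]
    segments = [service] + [quote(p, safe="") for p in path_params]
    path = "/v1/proxy/" + "/".join(segments)
    if query:
        path += "?" + urlencode(query)
    return method, path
```

For example, listing conversations passes agent_key through query rather than as a path parameter: build_path("core-agent-conversations", query={"agent_key": "ag_xyz789"}).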

Pricing

Pay for create/update operations and chat. List, details, and delete are free.

Service | Unit | Price
Create Agent | item | $1.0/agent
Update Agent / Update Graph / Add Sub-Agent / Update Sub-Agent | item | $0.5/operation
Chat (REST or SSE) | token | $0.001/call (currently flat; see note below)
List, Details, Delete | item | Free
  • core-agent-chat is currently billed at a flat 0.001 per call: the upstream chat response does not surface per-call token counts to the gateway, so units defaults to 1.0. Same situation as core-chat. To enable token-accurate billing, the upstream needs to populate usage.cost like the LLM completions endpoint does.
  • Multi-step agent runs (supervisor + multiple sub-agent calls) can consume substantially more underlying tokens than a single LLM call; flat billing currently does not reflect this.
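As a worked example of the price list, the sketch below estimates spend for a given mix of operations. The function and dictionary names are illustrative; the prices are the ones listed above, and the chat price reflects the current flat per-call billing.

```python
# Catalog prices from the table above (USD).
PRICES = {
    "create_agent": 1.0,   # per agent
    "update_op": 0.5,      # Update Agent / Update Graph / Add Sub-Agent / Update Sub-Agent
    "chat_call": 0.001,    # flat per call, REST or SSE
}


def estimate_cost(agents_created=0, update_ops=0, chat_calls=0):
    """Estimated spend; list, details, and delete calls are free and not counted."""
    total = (
        agents_created * PRICES["create_agent"]
        + update_ops * PRICES["update_op"]
        + chat_calls * PRICES["chat_call"]
    )
    return round(total, 6)
```

One supervisor with three sub-agents and 1,000 chat calls comes to 1.0 + 3 × 0.5 + 1000 × 0.001 = 3.5.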

Guides & Tips

When to use Agent vs Chatbot

  • Chatbot: single LLM + RAG. Best for Q&A over a single knowledge base.
  • Agent: supervisor + multiple sub-agents, each with its own action profile and (optional) RAG. Best when intents diverge enough that a single chatbot would overload its instructions, or when different sub-agents should hit different collections.

Routing through trigger and action

  • trigger.keywords: word list the supervisor matches against the user message.
  • trigger.description: free-text intent description; used by the supervisor to disambiguate when keywords overlap.
  • action.persona: persona/role of the answering sub-agent (e.g. customer_support / friendly).
  • action.goal: what the sub-agent should achieve (primary verb + instruction).
  • action.response_style: format/tone/language for the response.
  • Higher priority values are considered first when multiple sub-agents could match.
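Because overlapping keywords force the supervisor to fall back on trigger.description and priority, it can help to check sub-agent definitions for shared keywords before creating them. The helper below is a local lint sketch only; it is not how the supervisor routes, and the case-insensitive comparison is an assumption.

```python
from itertools import combinations


def keyword_overlaps(sub_agents):
    """Return {(name_a, name_b): shared_keywords} for every overlapping pair."""
    overlaps = {}
    for a, b in combinations(sub_agents, 2):
        kw_a = {k.lower() for k in a["trigger"]["keywords"]}
        kw_b = {k.lower() for k in b["trigger"]["keywords"]}
        shared = kw_a & kw_b
        if shared:
            overlaps[(a["name"], b["name"])] = sorted(shared)
    return overlaps
```

Any pair reported here is a candidate for a sharper trigger.description or distinct priority values.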

Per-sub-agent RAG

  • Set use_rag:true and collection_key on a sub-agent to enable retrieval just for that specialist.
  • The supervisor itself does NOT RAG; only the routed sub-agent does. The sub-agent retrieves from its collection_key, builds context, and answers.
  • sources in the chat response is empty when no RAG sub-agent was selected, or when RAG returned no relevant chunks.

Field naming quirks (vs chatbot)

  • agent_key is consistent everywhere: in the path for details/update/delete and in the request body for core-agent-chat. (Chatbot is inconsistent: it uses chatbot_id only in the chat request body. Agent does not have that quirk.)
  • sub_agent_key uses an underscore (matches the upstream JSON convention), not subagent_key.
  • Prefixes: ag_ for agents, sa_ for sub-agents, cv_ for conversations, cl_ for collections.
  • core-agent-graph is PUT (not GET / POST). Calling it with another verb returns 405.
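The prefix convention lends itself to a quick sanity check before a key is placed in a path or body. A minimal sketch, with a hypothetical helper name; it checks only the prefix, since the format of the rest of the key is not specified here.

```python
# Prefixes as listed above.
KEY_PREFIXES = {
    "agent": "ag_",
    "sub_agent": "sa_",
    "conversation": "cv_",
    "collection": "cl_",
}


def key_kind(key):
    """Return which kind of key this is by prefix, or None if unrecognized."""
    for kind, prefix in KEY_PREFIXES.items():
        if key.startswith(prefix) and len(key) > len(prefix):
            return kind
    return None
```

This catches the older agt_ and sub_ prefixes mentioned in the changelog, which return None.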

FAQ

Q: How does the supervisor decide which sub-agent to call?

A: It looks at the user message against each sub-agent's trigger (keywords + description), respecting priority. The decision and reason are returned in routing_info.

Q: Can multiple sub-agents share the same collection?

A: Yes. Each sub-agent has its own collection_key; multiple sub-agents can point to the same one.

Q: Why is core-agent-chat billed flat instead of per-token?

A: The agent chat upstream does not currently report token counts back to the gateway, so units defaults to 1. The fix is to surface usage.cost from the upstream (same pattern as the LLM completions endpoint already does).

Q: Why am I getting 405 on core-agent-graph?

A: core-agent-graph is PUT only. Calling it with GET / POST / DELETE returns 405 Method Not Allowed (and may still consume the catalog price).

Q: Why do I get 403 on core-agent-conversations?

A: By default the conversation list / details / delete services are not in the API key's allowed_services even when the agent CRUD endpoints are. Add them explicitly to the key.
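The two status codes covered in this FAQ can be turned into actionable hints in client code. The sketch below maps only the cases documented on this page; the function name and message wording are illustrative, and other causes of 403 or 405 are deliberately left unexplained.

```python
CONVERSATION_SERVICES = {
    "core-agent-conversations",
    "core-agent-conversation-details",
    "core-agent-conversation-delete",
}


def explain_status(service, status_code):
    """Map the documented 403/405 cases to a short hint; None for anything else."""
    if status_code == 405 and service == "core-agent-graph":
        return "core-agent-graph accepts PUT only; resend the request as PUT."
    if status_code == 403 and service in CONVERSATION_SERVICES:
        return "Add this service to the API key's allowed_services."
    return None
```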

Changelog

1.1 (2026-04-27)

  • Corrected key prefixes: agt_ → ag_ for agents; introduced sub_agent_key (sa_ prefix), previously documented as subagent_key (sub_ prefix).
  • List endpoints: data is a direct array (was data.agents), and pagination is at the top level.
  • core-agent-details: documented all fields the upstream returns (agent_type, model_name, requires_collection, collection_ids, response_language, system_message, plus null tuning fields and graph state).
  • core-agent-subagents: documented the rich response structure (data.sub_agents wrapped + total) and each sub-agent's trigger / priority / action {persona, goal, response_style} / use_rag / collection_key / collection_name / rag_config / is_active fields.
  • core-agent-chat: response shape is response, agent_key, conversation_key, routing_info {selected_action, action_name, reason, intent}, frontend_trigger, data_collection, sources. Documented that agent_key is consistent (no chatbot_id-style quirk). Documented current flat 0.001/call billing behaviour.
  • Added the previously missing services: core-agent-subagent-update, core-agent-subagent-delete, core-agent-conversations, core-agent-conversation-details, core-agent-conversation-delete (5 endpoints).
  • core-agent-graph: documented the PUT-only constraint and 405 behaviour.
  • Updated path patterns to reflect upstream URL templates (path params: /core-agent-details/{agent_key}, /core-agent-subagent-update/{agent_key}/{sub_agent_key}, etc.).

1.0 (2026-01-26)

  • Initial Agent catalog: create, list, details, update, delete, graph, subagents, chat (REST + WS).