Supervisor + sub-agents with routing
The supervisor inspects each user message, picks an action (a specific sub-agent or supervisor itself for small talk), and runs it. routing_info in the chat response tells you which action was selected and why.
Multi-agent orchestration with supervisor routing and sub-agents
The Agent service runs multi-step AI workflows where a supervisor agent receives the user request and, at each step, routes to the right sub-agent based on the request intent and the sub-agent triggers (keywords/description). Sub-agents are specialists with their own action profile (persona, goal, response style), optional RAG over a collection, and their own model + temperature. The supervisor and sub-agents share a graph definition (you describe it during create / update). Ideal for customer-facing helper bots, support triage, and any flow that benefits from intent-based routing across multiple specialists.
Each sub-agent has trigger {keywords, description}, priority, action {persona, goal, response_style}, model_name, temperature, optional output_schema, and optional RAG (use_rag + collection_key).
Sub-agents can be RAG-enabled (use_rag:true + collection_key). When the supervisor routes a question to that sub-agent, it retrieves from the linked collection before answering.
Send via POST core-agent-chat for a full response. Add stream:true to the same endpoint for OpenAI-compatible SSE streaming (chat.completion.chunk events ending with [DONE]).
Each chat call opens a new thread or continues an existing one identified by conversation_key. List, inspect, or delete threads via the agent-conversation endpoints.
A single supervisor serves your customers and routes by intent (FAQ, product, billing, technical) to dedicated sub-agents, keeping a consistent tone and escalating when needed.
Incoming requests are classified by intent and routed to billing / technical / general support sub-agents.
Different sub-agents can read from different collections — e.g., 'product-docs' for one sub-agent, 'help-center' for another. Supervisor picks the right one per question.
Combine sub-agents with RAG (knowledge questions) and sub-agents without RAG (general reasoning). The supervisor routes by intent.
Input
Agent configuration (name, system_message, model_name, temperature, optional collection_ids and per-sub-agent action profiles), and chat messages with optional conversation_key
Output
Agent metadata, sub-agents (with trigger/action), graph state, and chat responses with routing_info and (optionally) sources
Prerequisites
Create a supervisor with a system_message describing routing logic. Set requires_collection / collection_ids when sub-agents will use RAG.
{
"name": "Support Supervisor",
"agent_type": "supervisor",
"system_message": "You route questions to the right sub-agent: 'docs' for product knowledge, 'billing' for invoices and refunds. Use small-talk only for greetings.",
"model_name": "gpt-5-mini-2025-08-07",
"temperature": 0.7,
"requires_collection": true,
"collection_ids": ["cl_abc123"],
"response_language": "auto"
}
Response
{
"status": "success",
"data": {
"agent_key": "ag_xyz789",
"name": "Support Supervisor",
"agent_type": "supervisor",
"model_name": "gpt-5-mini-2025-08-07",
"requires_collection": true,
"collection_ids": ["cl_abc123"],
"response_language": "auto",
"created_at": "2026-04-27T10:30:00Z"
}
}
Use the returned agent_key (ag_ prefix) in subsequent calls. Optional fields available on create/update: pre_context_prompt, post_context_prompt, response_template, max_tokens, top_p, frequency_penalty, presence_penalty, response_format, agent_config, context_prompt, examples, required_parameters, graph_definition, state_schema, entry_node.
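As a rough sketch of the create step, the Python below builds a request for /v1/proxy/core-agent-create and pulls agent_key out of the response. The gateway host, the POST method, and the Bearer authorization header are assumptions made for illustration; this section does not specify them. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Optional

# Assumption: placeholder host; substitute your real gateway base URL.
BASE_URL = "https://gateway.example.com"


def build_create_agent_request(
    api_key: str,
    name: str,
    system_message: str,
    collection_ids: Optional[list] = None,
) -> urllib.request.Request:
    """Build (but do not send) a supervisor-create request."""
    payload = {
        "name": name,
        "agent_type": "supervisor",
        "system_message": system_message,
        "model_name": "gpt-5-mini-2025-08-07",
        "temperature": 0.7,
        "response_language": "auto",
    }
    # Only declare collections when sub-agents will use RAG.
    if collection_ids:
        payload["requires_collection"] = True
        payload["collection_ids"] = list(collection_ids)
    return urllib.request.Request(
        f"{BASE_URL}/v1/proxy/core-agent-create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer auth; the real scheme may differ.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",  # Assumption: create is a POST.
    )


def extract_agent_key(body: dict) -> str:
    """Return data.agent_key from a successful create response."""
    if body.get("status") != "success":
        raise RuntimeError(f"create failed: {body}")
    agent_key = body["data"]["agent_key"]
    if not agent_key.startswith("ag_"):
        raise ValueError(f"unexpected agent_key format: {agent_key!r}")
    return agent_key
```

To send the request, pass it to urllib.request.urlopen, decode the JSON body, and hand the result to extract_agent_key.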
Attach a specialist sub-agent. Define trigger (keywords + description), priority, action (persona / goal / response_style), and optional RAG (use_rag + collection_key).
{
"name": "Docs Agent",
"description": "Answers product questions from the docs collection.",
"trigger": {
"keywords": ["docs", "documentation", "how to", "feature", "product"],
"description": "Routes when the user asks about product behavior, features, or setup."
},
"priority": 1,
"action": {
"persona": { "role": "customer_support", "personality": "helpful" },
"goal": { "primary": "inform", "instruction": "Answer using the docs collection." },
"response_style": { "format": "paragraph", "tone": "professional", "language": "match_input" }
},
"model_name": "gpt-5-mini-2025-08-07",
"temperature": 0.7,
"use_rag": true,
"collection_key": "cl_abc123",
"is_active": true
}
Response
{
"status": "success",
"data": {
"sub_agent_key": "sa_def456",
"name": "Docs Agent",
"trigger": { "keywords": ["docs", "documentation", "..."], "description": "..." },
"priority": 1,
"action": { "persona": {"role":"customer_support","personality":"helpful"}, "goal": {"primary":"inform","instruction":"..."}, "response_style": {"format":"paragraph","tone":"professional","language":"match_input"} },
"model_name": "gpt-5-mini-2025-08-07",
"temperature": 0.7,
"use_rag": true,
"collection_key": "cl_abc123",
"is_active": true,
"created_at": "2026-04-27T10:31:00Z"
}
}
Use the returned sub_agent_key (sa_ prefix; note the underscore) in update / delete calls. Add multiple sub-agents; the supervisor routes between them by trigger.keywords / trigger.description.
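One way to keep several sub-agents consistent is to generate their bodies from a small helper. The sketch below is illustrative only: build_sub_agent_payload and sub_agent_create_path are hypothetical names, the persona and response_style values are copied from the sample request, and reusing the description as goal.instruction is a simplification. It enables RAG only when a collection_key is supplied, so one helper covers both RAG and non-RAG specialists.

```python
from typing import Optional


def build_sub_agent_payload(
    name: str,
    description: str,
    keywords: list,
    trigger_description: str,
    priority: int = 1,
    collection_key: Optional[str] = None,
) -> dict:
    """Build a sub-agent body; RAG is on only when collection_key is given."""
    payload = {
        "name": name,
        "description": description,
        "trigger": {"keywords": list(keywords), "description": trigger_description},
        "priority": priority,
        "action": {
            "persona": {"role": "customer_support", "personality": "helpful"},
            # Simplification: reuse the description as the instruction.
            "goal": {"primary": "inform", "instruction": description},
            "response_style": {
                "format": "paragraph",
                "tone": "professional",
                "language": "match_input",
            },
        },
        "model_name": "gpt-5-mini-2025-08-07",
        "temperature": 0.7,
        "use_rag": collection_key is not None,
        "is_active": True,
    }
    if collection_key is not None:
        payload["collection_key"] = collection_key
    return payload


def sub_agent_create_path(agent_key: str) -> str:
    """Path of the sub-agent create endpoint for one supervisor."""
    return f"/v1/proxy/core-agent-subagent-create/{agent_key}"


# A RAG-backed docs specialist and a billing specialist without RAG.
docs = build_sub_agent_payload(
    "Docs Agent",
    "Answers product questions from the docs collection.",
    ["docs", "documentation", "how to", "feature", "product"],
    "Routes when the user asks about product behavior, features, or setup.",
    collection_key="cl_abc123",
)
billing = build_sub_agent_payload(
    "Billing Agent",
    "Handles invoices and refunds.",
    ["invoice", "refund", "billing"],
    "Routes when the user asks about invoices or refunds.",
    priority=2,
)
```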
Send a message. The supervisor returns routing_info to show which action it picked. data.response is the final user-facing text.
{
"agent_key": "ag_xyz789",
"message": "How do I export my data?"
}
Response
{
"status": "success",
"data": {
"response": "You can export from Settings → Data → Export. The export contains all your records as JSON.",
"agent_key": "ag_xyz789",
"conversation_key": "cv_abc123",
"routing_info": {
"selected_action": "SUB_AGENT",
"action_name": "Docs Agent",
"reason": "Question about exporting data fits the docs trigger (keywords: how to, export).",
"intent": "docs_lookup"
},
"frontend_trigger": null,
"data_collection": null,
"sources": []
},
"gateway": { "request_id": "req_abc", "service": "core-agent-chat" },
"cost": { "units": 1.0, "unit_price": 0.001, "tokens": 0.001, "balance": 99.99 }
}
Pass conversation_key on subsequent calls to continue the same thread. The request identifies the agent by agent_key (not agent_id), consistent with the other agent endpoints and unlike core-chat, which uses chatbot_id.
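A minimal sketch of the request and response handling for this step, using only the field names shown in the sample above. The helper names build_chat_payload and describe_routing are hypothetical, and the response dictionary is a trimmed copy of the sample rather than live output.

```python
from typing import Optional


def build_chat_payload(
    agent_key: str,
    message: str,
    conversation_key: Optional[str] = None,
    stream: bool = False,
) -> dict:
    """Body for core-agent-chat; omit conversation_key to open a new thread."""
    payload = {"agent_key": agent_key, "message": message}
    if conversation_key:
        payload["conversation_key"] = conversation_key
    if stream:
        payload["stream"] = True
    return payload


def describe_routing(body: dict) -> str:
    """Summarize which action the supervisor picked and why."""
    routing = body["data"].get("routing_info") or {}
    return (
        f"{routing.get('selected_action')} -> "
        f"{routing.get('action_name')}: {routing.get('reason')}"
    )


# Trimmed copy of the sample response above.
first_response = {
    "status": "success",
    "data": {
        "response": "You can export from Settings.",
        "conversation_key": "cv_abc123",
        "routing_info": {
            "selected_action": "SUB_AGENT",
            "action_name": "Docs Agent",
            "reason": "Question about exporting data fits the docs trigger.",
        },
    },
}

summary = describe_routing(first_response)

# Reuse the returned conversation_key to continue the same thread.
follow_up = build_chat_payload(
    "ag_xyz789",
    "Which formats are supported?",
    conversation_key=first_response["data"]["conversation_key"],
)
```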
Add stream:true to the same core-agent-chat endpoint. Tokens arrive as Server-Sent Events in OpenAI-compatible format.
{
"agent_key": "ag_xyz789",
"message": "How do I export my data?",
"conversation_key": "cv_abc123",
"stream": true
}
Response
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-5-mini-2025-08-07","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-5-mini-2025-08-07","choices":[{"index":0,"delta":{"content":"You can"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-5-mini-2025-08-07","choices":[{"index":0,"delta":{"content":" export"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","model":"gpt-5-mini-2025-08-07","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":80,"completion_tokens":24,"total_tokens":104}}
data: [DONE]
- Same endpoint as sync; only stream:true is different.
- Concatenate choices[0].delta.content over all chunks to reconstruct the message.
- The last JSON chunk, sent just before `data: [DONE]`, carries usage (OpenAI convention).
- `data: [DONE]` is the terminator. Billing happens once after the stream ends.
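The reconstruction rule can be sketched as a small parser over the raw SSE lines. This is an illustration, not the gateway's client library: collect_stream is a hypothetical helper, and SAMPLE_LINES is a shortened version of the stream shown above (the model field is dropped for brevity).

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join delta.content from each chunk and capture usage if present."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # terminator: nothing after this belongs to the message
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue  # only the first choice is reconstructed here
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


SAMPLE_LINES = [
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"You can"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" export"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":80,"completion_tokens":24,"total_tokens":104}}',
    "data: [DONE]",
]

text, usage = collect_stream(SAMPLE_LINES)
# text is "You can export"; usage["total_tokens"] is 104
```

In a real client the lines would come from the HTTP response body read line by line; the parser only needs an iterable of strings.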
Create a supervisor agent. Optional fine-grained tuning fields are supported (see notes).
/v1/proxy/core-agent-create
List all agents for the current account. Response data is a direct array; pagination is on the top level.
/v1/proxy/core-agents
Get the full agent configuration. Returns every tuning field the upstream stores (including null fields for tuners you haven't set).
/v1/proxy/core-agent-details/{agent_key}
Update agent configuration fields (any subset of the create fields).
/v1/proxy/core-agent-update/{agent_key}
Delete an agent (and its sub-agents).
/v1/proxy/core-agent-delete/{agent_key}
Bulk update the agent's full graph (graph_definition / state_schema / entry_node) atomically. Use this when you need to change supervisor wiring + sub-agent topology in a single call.
/v1/proxy/core-agent-graph/{agent_key}
List all sub-agents under a supervisor. Returns sub-agents with their full action / trigger / RAG configuration.
/v1/proxy/core-agent-subagents/{agent_key}
Create a sub-agent under a supervisor. Configure trigger, priority, action profile, model, optional RAG.
/v1/proxy/core-agent-subagent-create/{agent_key}
Update a sub-agent's fields (any subset of the create fields).
/v1/proxy/core-agent-subagent-update/{agent_key}/{sub_agent_key}
Delete a sub-agent.
/v1/proxy/core-agent-subagent-delete/{agent_key}/{sub_agent_key}
Send a message; the supervisor inspects intent, picks an action (sub-agent or supervisor itself), runs it, and returns the final response. routing_info reports which action was picked. Set stream:true for SSE.
/v1/proxy/core-agent-chat
List conversations for a single agent. agent_key is required: pass it as the ?agent_key=... query parameter.
/v1/proxy/core-agent-conversations
Get a conversation's metadata and full message history.
/v1/proxy/core-agent-conversation-details/{conversation_key}
Delete a conversation and all its messages permanently.
/v1/proxy/core-agent-conversation-delete/{conversation_key}
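The paths above follow one pattern: a service name under /v1/proxy/, optional keys as path segments, and for the conversation list a required agent_key query parameter. A small URL builder, sketched below, captures that pattern. The base URL is a placeholder assumption, and endpoint_url and conversations_url are hypothetical helper names.

```python
from urllib.parse import quote, urlencode

# Assumption: placeholder host; substitute your real gateway base URL.
BASE_URL = "https://gateway.example.com"


def endpoint_url(service: str, *path_keys: str, **query: str) -> str:
    """Build /v1/proxy/{service}[/{key}...][?query] for a gateway service."""
    segments = [f"/v1/proxy/{service}"]
    segments.extend(quote(key, safe="") for key in path_keys)
    url = BASE_URL + "/".join(segments)
    if query:
        url += "?" + urlencode(query)
    return url


def conversations_url(agent_key: str) -> str:
    """The conversation list requires agent_key as a query parameter."""
    if not agent_key:
        raise ValueError("agent_key is required to list conversations")
    return endpoint_url("core-agent-conversations", agent_key=agent_key)


list_url = conversations_url("ag_xyz789")
update_url = endpoint_url("core-agent-subagent-update", "ag_xyz789", "sa_def456")
details_url = endpoint_url("core-agent-conversation-details", "cv_abc123")
```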
Create and update operations and chat calls are billed. List, details, and delete are free.
| Service | Unit | Price |
|---|---|---|
| Create Agent | item | $1.0/agent |
| Update Agent / Update Graph / Add Sub-Agent / Update Sub-Agent | item | $0.5/operation |
| Chat (REST or SSE) | token | $0.001/call (currently flat — see note below) |
| List, Details, Delete | item | Free |
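For budgeting, the table reduces to simple arithmetic. The sketch below copies the three billed prices from the table and assumes chat stays at the flat per-call rate it currently shows; estimate_cost is a hypothetical helper, not part of the service.

```python
from decimal import Decimal

# Prices copied from the table above, in USD.
PRICES = {
    "create_agent": Decimal("1.0"),      # per agent
    "update_operation": Decimal("0.5"),  # update agent/graph, add/update sub-agent
    "chat_call": Decimal("0.001"),       # flat per call at the time of writing
}


def estimate_cost(
    agents_created: int = 0,
    update_operations: int = 0,
    chat_calls: int = 0,
) -> Decimal:
    """Estimate total spend; list, details, and delete calls are free."""
    return (
        PRICES["create_agent"] * agents_created
        + PRICES["update_operation"] * update_operations
        + PRICES["chat_call"] * chat_calls
    )


# One supervisor, three sub-agents added, one graph update, 200 chat calls:
# 1.0 + 4 * 0.5 + 200 * 0.001 = 3.2 USD
total = estimate_cost(agents_created=1, update_operations=4, chat_calls=200)
```

Decimal is used instead of float so that sums of small per-call prices do not pick up binary rounding error.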
A: It looks at the user message against each sub-agent's trigger (keywords + description), respecting priority. The decision and reason are returned in routing_info.
A: Yes. Each sub-agent has its own collection_key; multiple sub-agents can point to the same one.
A: The agent chat upstream does not currently report token counts back to the gateway, so units defaults to 1 and every call is billed at the flat per-call price. The fix is to surface usage.cost from the upstream, the same pattern the LLM completions endpoint already uses.
A: core-agent-graph is PUT only. Calling it with GET / POST / DELETE returns 405 Method Not Allowed (and may still consume the catalog price).
A: By default the conversation list / details / delete services are not in the API key's allowed_services even when the agent CRUD endpoints are. Add them explicitly to the key.
1.1 (2026-04-27)
1.0 (2026-01-26)