Overview
The Chat Completions API is fully compatible with OpenAI's API format, allowing you to use the OpenAI Python SDK, JavaScript SDK, or any OpenAI-compatible client.

Model Parameter (Optional): The `model` parameter is optional and exists primarily for SDK compatibility; your organization's orchestrator agent automatically uses the configured LLM model. If you're using an OpenAI SDK that requires the `model` parameter, you can pass any value (e.g., `"junis-orchestrator"`) for compatibility.

Endpoint
Authentication
Include your API key in the `X-API-Key` header:
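A minimal Python sketch of building the required headers (the environment-variable name `JUNIS_API_KEY` is illustrative, not mandated by the API):

```python
import os

def auth_headers(api_key: str) -> dict:
    """Build request headers: Junis expects the key in X-API-Key."""
    return {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
    }

# Read the key from an environment variable rather than hardcoding it.
headers = auth_headers(os.environ.get("JUNIS_API_KEY", "sk-example"))
```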
Request Format
Basic Request
cURL
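As a stand-in sketch in Python: the payload below assumes the OpenAI-compatible endpoint path `/v1/chat/completions` and a placeholder base URL; only `messages` is required.

```python
import json

# Placeholder base URL; substitute your deployment's actual host.
BASE_URL = "https://your-junis-host"

def build_request(messages):
    """Build a minimal Chat Completions payload. Only `messages` is required."""
    return {"messages": messages}

payload = build_request([{"role": "user", "content": "Hello!"}])
body = json.dumps(payload)
# POST `body` to f"{BASE_URL}/v1/chat/completions" with the X-API-Key header set.
```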
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `messages` | array | Yes | Array of message objects (OpenAI format) |
| `stream` | boolean | No | Enable streaming mode (default: `false`) |
| `temperature` | number | No | Sampling temperature (0.0 - 2.0) |
| `max_tokens` | integer | No | Maximum tokens in the response |
| `top_p` | number | No | Nucleus sampling parameter |
| `frequency_penalty` | number | No | Frequency penalty (-2.0 to 2.0) |
| `presence_penalty` | number | No | Presence penalty (-2.0 to 2.0) |
| `stop` | string or array | No | Stop sequences |
| `n` | integer | No | Number of completions to generate (default: 1) |
| `tools` | array | No | Available tools for function calling (keyword-based detection) |
| `elevenlabs_extra_body` | object | No | ElevenLabs metadata (e.g., `conversation_id` for session continuity) |
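The parameters above can be combined into a single payload. A hedged sketch (the validation helper is illustrative, not part of any SDK):

```python
def build_payload(messages, **options):
    """Combine the required `messages` with any optional parameters from the table."""
    allowed = {"stream", "temperature", "max_tokens", "top_p", "frequency_penalty",
               "presence_penalty", "stop", "n", "tools", "elevenlabs_extra_body"}
    unknown = set(options) - allowed
    if unknown:
        raise ValueError(f"unsupported parameters: {unknown}")
    return {"messages": messages, **options}

payload = build_payload(
    [{"role": "user", "content": "Summarize this call."}],
    temperature=0.3,
    max_tokens=512,
    elevenlabs_extra_body={"conversation_id": "conv_123"},  # session continuity
)
```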
Message Format
Messages follow the OpenAI format:
- `system`: System instructions (optional, rarely needed with Junis)
- `user`: User messages
- `assistant`: Assistant responses (for conversation context)
- `tool`: Tool call results (for function calling)
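For example, a typical multi-turn history in this format (content is illustrative):

```python
# Each message is a dict with a `role` and `content`, in chronological order.
messages = [
    {"role": "user", "content": "What's the weather in Berlin?"},
    {"role": "assistant", "content": "Let me check that for you."},
    {"role": "user", "content": "Thanks, and tomorrow?"},
]

# Every role must be one of the four supported values.
assert all(m["role"] in {"system", "user", "assistant", "tool"} for m in messages)
```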
Response Format
Non-Streaming Response
Status Code: `200 OK`
Response Body:
Response Fields
| Field | Type | Description |
|---|---|---|
| `id` | string | Unique completion ID |
| `object` | string | Object type (always `"chat.completion"`) |
| `created` | integer | Unix timestamp |
| `model` | string | Model used |
| `choices` | array | Array of completion choices |
| `choices[].message` | object | Generated message |
| `choices[].message.role` | string | Always `"assistant"` |
| `choices[].message.content` | string | Response text |
| `choices[].finish_reason` | string | Reason for completion (`stop`, `length`, `tool_calls`) |
| `usage` | object | Token usage statistics |
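Extracting the reply from a response is a matter of walking the fields above. A sketch with an illustrative sample response (all values made up):

```python
# Hedged sample matching the field table above; values are illustrative only.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "junis-orchestrator",
    "choices": [{
        "message": {"role": "assistant", "content": "Hello! How can I help?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16},
}

def extract_text(resp):
    """Pull the assistant text out of the first choice."""
    return resp["choices"][0]["message"]["content"]
```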
Streaming Mode
Enable real-time token-by-token streaming with `stream: true`.
Streaming Request
cURL
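As a Python sketch of the streaming request body (endpoint and transport omitted; the `requests` usage in the comment is one common approach, not a requirement):

```python
import json

payload = {
    "messages": [{"role": "user", "content": "Tell me a story."}],
    "stream": True,  # switches the response to SSE chunks
}

# With the `requests` library, you would stream the body line by line, e.g.:
#   resp = requests.post(url, headers=headers, json=payload, stream=True)
#   for line in resp.iter_lines(): ...
body = json.dumps(payload)
```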
Streaming Response Format
Server-Sent Events (SSE) format:
- Each line starts with `data: `
- First chunk contains `role`
- Subsequent chunks contain `content` deltas
- Final chunk has `finish_reason`
- Stream ends with `data: [DONE]`
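The steps above can be sketched as a small parser (chunk shapes assumed from the OpenAI streaming format; sample lines are illustrative):

```python
import json

def accumulate_sse(lines):
    """Reassemble a streamed completion from SSE `data:` lines."""
    text = []
    finish_reason = None
    for line in lines:
        if not line.startswith("data: "):
            continue  # SSE comments / blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":  # end of stream
            break
        chunk = json.loads(data)
        choice = chunk["choices"][0]
        delta = choice.get("delta", {})
        if "content" in delta:
            text.append(delta["content"])
        finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(text), finish_reason

# Illustrative chunk sequence: role first, then content deltas, then finish.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]
```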
Session Management
Junis automatically manages conversation sessions for you.

How Sessions Work
Session Detection
Junis uses smart session detection with the following priority:
1. ElevenLabs `conversation_id` (in `elevenlabs_extra_body`) → Uses or creates a session with this ID
2. Single message → Creates a new session
3. Multiple messages → Uses a message hash to match existing sessions (allows conversation continuation)
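The priority order can be sketched as follows; note the hashing scheme here is purely illustrative, not Junis's actual implementation:

```python
import hashlib
import json

def detect_session(messages, elevenlabs_extra_body=None):
    """Sketch of the documented priority: conversation_id first, then a new
    session for a single message, then a hash over the message history."""
    if elevenlabs_extra_body and elevenlabs_extra_body.get("conversation_id"):
        return elevenlabs_extra_body["conversation_id"]
    if len(messages) <= 1:
        return None  # signal: create a new session
    # Illustrative hash; the real matching scheme is internal to Junis.
    digest = hashlib.sha256(json.dumps(messages, sort_keys=True).encode()).hexdigest()
    return f"hash:{digest[:16]}"
```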
For ElevenLabs integration: Pass `conversation_id` in `elevenlabs_extra_body` to maintain session continuity across calls.

Viewing Sessions
Retrieve your sessions via the Sessions API:

Function Calling (Tool Use)
When a supported keyword is detected in the assistant's response, the API returns a `tool_calls` object with `finish_reason: "tool_calls"`.
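A client can branch on `finish_reason` to tell a plain reply from a tool call. A sketch with an illustrative response (the tool-call structure is assumed from the OpenAI format):

```python
def handle_choice(choice):
    """Dispatch on finish_reason: return text, or the tool names to invoke."""
    if choice["finish_reason"] == "tool_calls":
        return [tc["function"]["name"] for tc in choice["message"]["tool_calls"]]
    return choice["message"]["content"]

# Illustrative tool-call choice; names and arguments are made up.
choice = {
    "finish_reason": "tool_calls",
    "message": {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {"function": {"name": "get_weather", "arguments": "{\"city\": \"Berlin\"}"}}
        ],
    },
}
```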
Error Handling
Common Errors
Error Types
| Error Type | Status Code | Description | Action |
|---|---|---|---|
| `invalid_request_error` | 400 | Malformed request | Check request format |
| `authentication_error` | 401 | Invalid API key | Verify the API key |
| `permission_error` | 403 | Insufficient permissions | Check API key scopes |
| `rate_limit_error` | 429 | Too many requests | Implement exponential backoff |
| `api_error` | 500 | Server error | Retry with exponential backoff |
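Per the table, only rate limits and server errors are worth retrying; 4xx client errors need a corrected request. A minimal sketch:

```python
# Retryable statuses, per the table above: rate limits and server errors.
RETRYABLE = {429, 500}

def should_retry(status_code: int) -> bool:
    """True for transient errors; client errors (400/401/403) need a fixed request."""
    return status_code in RETRYABLE
```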
Best Practices
- Use Streaming: Enable `stream: true` for long responses to provide real-time feedback
- Handle Rate Limits: Implement exponential backoff when hitting rate limits (see error codes above)
- Keep Message History Concise: Trim old messages to avoid token limits (keep the last 10-20 messages)
- Secure API Keys: Never hardcode API keys; use environment variables
- Log Requests: Log requests and responses for debugging and monitoring
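Two of these practices, backoff and history trimming, can be sketched as small helpers (the jitter factor and defaults are illustrative choices):

```python
import random
import time

def with_backoff(call, max_attempts=5, base_delay=1.0):
    """Retry `call` with jittered exponential backoff (for 429/500-style errors)."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random() / 2))

def trim_history(messages, keep_last=20):
    """Keep any leading system message plus the most recent turns."""
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]
```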
Response Parameters
Temperature Guide
| Use Case | Recommended Temperature |
|---|---|
| Factual Q&A | 0.0 - 0.3 |
| General chat | 0.5 - 0.7 |
| Creative writing | 0.8 - 1.0 |
| Code generation | 0.0 - 0.2 |
