Overview
The Chat Completions API is fully compatible with OpenAI’s API format, allowing you to use the OpenAI Python SDK, JavaScript SDK, or any OpenAI-compatible client.
Model Parameter (Optional): The model parameter is optional and exists primarily for SDK compatibility. Your organization’s orchestrator agent automatically uses the configured LLM model. If you’re using an OpenAI SDK that requires the model parameter, you can pass any value (e.g., "junis-orchestrator") for compatibility.
Prerequisites
Subscription Required: This API is only available to Basic or Pro subscribers. If you don’t have an active subscription, you’ll receive a 403 Forbidden error. Visit Subscription to start a plan.
Endpoint
POST https://api.junis.ai/api/external/v1/chat/completions
Authentication
Include your API key in the X-API-Key header:
X-API-Key: jns_live_YOUR_API_KEY_HERE
Required Scope: orchestrator:invoke
See Authentication for details.
Basic Request
curl -X POST https://api.junis.ai/api/external/v1/chat/completions \
-H "X-API-Key: jns_live_YOUR_API_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "What is Junis?"}
]
}'
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | array | Yes | Array of message objects (OpenAI format) |
| stream | boolean | No | Enable streaming mode (default: false) |
| session_id | string | No | Session ID to continue a conversation (creates a new session if not provided) |
Messages follow the OpenAI format:
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you?"},
    {"role": "user", "content": "Tell me about Junis."}
  ]
}
Role Types:
system: System instructions (optional, rarely needed with Junis)
user: User messages
assistant: Assistant responses (for conversation context)
tool: Tool call results (for function calling)
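Putting the message format together, a small helper can assemble a request body; the history-trimming behavior here is an illustrative client-side convention, not part of the API:

```python
def build_chat_request(prompt, history=None, session_id=None, stream=False, max_history=20):
    """Assemble a Chat Completions request body in OpenAI message format.

    `history` is a list of prior {"role", "content"} dicts; only the most
    recent `max_history` entries are kept to bound token usage.
    """
    messages = list(history or [])[-max_history:]
    messages.append({"role": "user", "content": prompt})
    body = {"messages": messages, "stream": stream}
    if session_id is not None:
        body["session_id"] = session_id  # continue an existing conversation
    return body

body = build_chat_request(
    "Tell me about Junis.",
    history=[
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi! How can I help you?"},
    ],
    session_id="550e8400-e29b-41d4-a716-446655440000",
)
```

The returned dict is what you would POST as the JSON body of the request.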
Non-Streaming Response
Status Code: 200 OK
Response Body:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699896916,
  "model": "junis-orchestrator",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Junis is a production-ready Multi-Agent AI platform..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 42,
    "total_tokens": 57
  },
  "session_id": "550e8400-e29b-41d4-a716-446655440000"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique completion ID |
| object | string | Object type (always "chat.completion") |
| created | integer | Unix timestamp |
| model | string | Model used |
| choices | array | Array of completion choices |
| choices[].message | object | Generated message |
| choices[].message.role | string | Always "assistant" |
| choices[].message.content | string | Response text |
| choices[].finish_reason | string | Reason for completion (stop, length, tool_calls, content_filter) |
| usage | object | Token usage statistics |
| session_id | string | Session ID for continuing the conversation |
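As a sketch, the fields most integrations need can be pulled out of the parsed JSON like this (response shape taken from the example above):

```python
# Sample non-streaming response, as returned by response.json()
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Junis is a..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 15, "completion_tokens": 42, "total_tokens": 57},
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
}

text = response["choices"][0]["message"]["content"]
finish = response["choices"][0]["finish_reason"]
session_id = response["session_id"]  # keep this for the next request
total_tokens = response["usage"]["total_tokens"]
```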
Streaming Mode
Enable real-time token-by-token streaming with stream: true.
Streaming Request
curl -N -X POST https://api.junis.ai/api/external/v1/chat/completions \
-H "X-API-Key: jns_live_YOUR_API_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Tell me a story"}],
"stream": true
}'
Server-Sent Events (SSE) format:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699896916,"model":"junis-orchestrator","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}],"session_id":"550e8400-e29b-41d4-a716-446655440000"}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699896916,"model":"junis-orchestrator","choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}],"session_id":"550e8400-e29b-41d4-a716-446655440000"}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699896916,"model":"junis-orchestrator","choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}],"session_id":"550e8400-e29b-41d4-a716-446655440000"}
...
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699896916,"model":"junis-orchestrator","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"session_id":"550e8400-e29b-41d4-a716-446655440000"}
data: [DONE]
Key Points:
Each line starts with data:
First chunk contains role
Subsequent chunks contain content deltas
Final chunk has finish_reason
Every chunk includes session_id for conversation continuity
Stream ends with data: [DONE]
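The rules above can be captured in a small, transport-agnostic parser; it only handles the framing shown in the example stream:

```python
import json

def parse_sse_line(line):
    """Classify one SSE line as ("done", None), ("chunk", dict), or ("skip", None).

    Follows the framing described above: a "data: " prefix and a "[DONE]"
    sentinel. Lines without the prefix (e.g. blank keep-alives) are skipped.
    """
    if not line.startswith("data: "):
        return ("skip", None)
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return ("done", None)
    return ("chunk", json.loads(payload))

def delta_content(chunk):
    """Extract the text delta from a streaming chunk, if any."""
    choices = chunk.get("choices") or []
    if not choices:
        return ""
    return choices[0].get("delta", {}).get("content", "") or ""

kind, chunk = parse_sse_line(
    'data: {"choices":[{"delta":{"content":"Once"},"finish_reason":null}]}'
)
```

Feed each line from the HTTP response through `parse_sse_line`, appending `delta_content(chunk)` to your output buffer until you see `("done", None)`.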
Multi-Agent Events
For Advanced Integrations: Junis streams additional agent lifecycle events alongside standard OpenAI chunks. These events have no choices field, so standard OpenAI SDKs automatically ignore them, ensuring 100% backward compatibility. Parse these events to build rich UIs that show which agents are processing your request.
Event Types
| Event | Description | Fields |
|---|---|---|
| agent_started | Agent begins processing | agent, agent_type, depth, parent_agent |
| agent_completed | Agent finishes processing | agent, duration_ms |
| agent_routing | Transition between agents | from_agent, to_agent |
| token_agent | Which agent produced the token | agent |
| tool_started | Tool execution begins | tool_name, tool_id, agent_name |
| tool_completed | Tool execution ends | tool_name, tool_id, agent_name |
| payment_required | Payment needed to continue | message, agent_name, nonce, network, total_amount |
SSE Stream Example (Multi-Agent)
data: {"event":"agent_started","agent":"researcher","agent_type":"LLM_AGENT","depth":1,"timestamp":"2025-01-15T10:00:00Z"}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Based on"}}],"session_id":"..."}
data: {"event":"token_agent","agent":"researcher","timestamp":"2025-01-15T10:00:00Z"}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" my research"}}],"session_id":"..."}
data: {"event":"agent_completed","agent":"researcher","duration_ms":1523,"timestamp":"2025-01-15T10:00:01Z"}
data: {"event":"agent_routing","from_agent":"researcher","to_agent":"writer","timestamp":"2025-01-15T10:00:01Z"}
data: {"event":"agent_started","agent":"writer","agent_type":"LLM_AGENT","depth":1,"timestamp":"2025-01-15T10:00:01Z"}
...
data: [DONE]
Parsing Agent Events (Python)
import json
import httpx

async def stream_with_agent_events(prompt: str, api_key: str):
    """Stream response and track agent events."""
    async with httpx.AsyncClient() as client:
        async with client.stream(
            "POST",
            "https://api.junis.ai/api/external/v1/chat/completions",
            headers={"X-API-Key": api_key},
            json={"messages": [{"role": "user", "content": prompt}], "stream": True},
        ) as response:
            current_agent = None
            agent_durations = {}
            async for line in response.aiter_lines():
                if not line.startswith("data: "):
                    continue
                data_str = line[6:]
                if data_str == "[DONE]":
                    break
                data = json.loads(data_str)
                # Agent event (no 'choices' field)
                if "event" in data and "choices" not in data:
                    event = data["event"]
                    if event == "agent_started":
                        current_agent = data["agent"]
                        print(f"🚀 Agent started: {current_agent}")
                    elif event == "agent_completed":
                        agent = data["agent"]
                        duration = data.get("duration_ms", 0)
                        agent_durations[agent] = duration
                        print(f"✅ Agent completed: {agent} ({duration}ms)")
                    elif event == "agent_routing":
                        print(f"🔄 Routing: {data['from_agent']} → {data['to_agent']}")
                # Standard OpenAI chunk (has 'choices' field)
                elif "choices" in data:
                    content = data["choices"][0].get("delta", {}).get("content", "")
                    if content:
                        print(content, end="", flush=True)
            print(f"\n\n📊 Agent performance: {agent_durations}")
Parallel Agent Support
When multiple agents run concurrently (Parallel Agent), use token_agent events to correctly attribute tokens:
agent_buffers = {}  # Per-agent content buffers
last_token_agent = None

for event in stream:
    if event.get("event") == "token_agent":
        last_token_agent = event["agent"]
    elif "choices" in event:
        content = event["choices"][0].get("delta", {}).get("content", "")
        if content and last_token_agent:
            if last_token_agent not in agent_buffers:
                agent_buffers[last_token_agent] = ""
            agent_buffers[last_token_agent] += content

# Result: {"researcher": "Research content...", "analyst": "Analysis content..."}
When to use agent events:
Build agent pipeline visualizations
Track per-agent performance metrics
Handle Parallel Agent outputs separately
Debug multi-agent workflows
Session Management
Junis supports explicit session management via the session_id parameter, allowing you to control conversation context precisely.
How Sessions Work
Session ID Usage
Recommended : Always capture the session_id from the first response and include it in subsequent requests to maintain conversation context.
Behavior:
| Request | Result |
|---|---|
| No session_id | Creates a new session and returns session_id in the response |
| With session_id (existing) | Uses the existing session (same chat room) |
| With session_id (new ID) | Creates a new session with the provided ID |
Example: Continuing a Conversation
import requests

API_KEY = "jns_live_YOUR_API_KEY_HERE"
BASE_URL = "https://api.junis.ai/api/external/v1/chat/completions"

# First request - no session_id
response1 = requests.post(
    BASE_URL,
    headers={"X-API-Key": API_KEY},
    json={"messages": [{"role": "user", "content": "Hello!"}]},
)
session_id = response1.json()["session_id"]  # Capture session_id

# Second request - with session_id (continues conversation)
response2 = requests.post(
    BASE_URL,
    headers={"X-API-Key": API_KEY},
    json={
        "messages": [{"role": "user", "content": "What did I just say?"}],
        "session_id": session_id,  # Use captured session_id
    },
)
# Response will have context from the first message
Viewing Sessions
Retrieve your sessions via the Sessions API:
curl -X GET https://api.junis.ai/api/external/sessions \
-H "X-API-Key: jns_live_YOUR_API_KEY_HERE"
See Sessions API for details.
Tool Calling
Current Implementation: Tool calling uses keyword pattern matching, not LLM-native function calling. Supported tools:
end_call: Detects phrases like “안녕히” (“goodbye”), “감사합니다” (“thank you”), “goodbye”
transfer_call: Detects phrases like “상담원 연결” (“connect me to an agent”), “전화 연결” (“transfer the call”), “connect agent”
Full LLM-native function calling support is in development.
When a supported keyword is detected in the assistant’s response, the API returns a tool_calls object with finish_reason: "tool_calls".
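The exact tool_calls payload is not shown above; assuming it follows the OpenAI response shape (message.tool_calls with a nested function.name), a caller might branch on finish_reason like this. The field names below are an assumption to verify against a real response:

```python
def extract_tool_calls(choice):
    """Return the requested tool names if the response is a tool call, else None.

    Assumes an OpenAI-style `message.tool_calls` shape; confirm these field
    names against an actual API response before relying on them.
    """
    if choice.get("finish_reason") != "tool_calls":
        return None
    calls = choice.get("message", {}).get("tool_calls", [])
    return [c.get("function", {}).get("name") for c in calls]

# Hypothetical choice as it might appear when "goodbye" triggers end_call
choice = {
    "finish_reason": "tool_calls",
    "message": {
        "role": "assistant",
        "tool_calls": [{"function": {"name": "end_call"}}],
    },
}
```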
Error Handling
Common Errors
400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden (No Subscription)
403 Forbidden (Invalid Plan)
429 Too Many Requests (Minute Limit)
429 Too Many Requests (Hour Limit)
500 Internal Server Error
{
  "error": {
    "message": "Invalid request: 'messages' field is required",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "missing_required_field"
  }
}
Error Types
| Error Type | Status Code | Description | Action |
|---|---|---|---|
| invalid_request_error | 400 | Malformed request | Check request format |
| authentication_error | 401 | Invalid API key | Verify API key |
| insufficient_credits | 402 | Credit balance below zero | Add credits at /subscription |
| subscription_required | 403 | No active subscription | Subscribe to a Basic or Pro plan |
| invalid_subscription_plan | 403 | Plan doesn’t include API access | Upgrade to a Basic or Pro plan |
| rate_limit_error | 429 | Too many requests | Implement exponential backoff |
| api_error | 500 | Server error | Retry with exponential backoff |
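A minimal retry policy derived from the table above: only 429 and 500 responses are worth retrying; every other status needs a fix on the caller's side.

```python
# Error types that, per the table above, warrant a retry with backoff
RETRYABLE_TYPES = {"rate_limit_error", "api_error"}

def should_retry(status_code, error_type):
    """Decide whether a failed request should be retried with backoff."""
    return status_code in (429, 500) and error_type in RETRYABLE_TYPES
```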
Best Practices
Use Streaming : Enable stream: true for long responses to provide real-time feedback
Handle Rate Limits : Implement exponential backoff when hitting rate limits (see error codes above)
Keep Message History Concise : Trim old messages to avoid token limits (keep last 10-20 messages)
Secure API Keys : Never hardcode API keys; use environment variables
Log Requests : Log requests and responses for debugging and monitoring
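The backoff advice above can be sketched as a small wrapper; `send_request` here stands in for any callable that raises on a retryable error (429/500) and returns the parsed response on success:

```python
import random
import time

def with_backoff(send_request, max_retries=5, base_delay=1.0):
    """Call send_request(), retrying with jittered exponential backoff.

    Delays grow as base_delay * 2**attempt (1s, 2s, 4s, ...) with a small
    random jitter to avoid synchronized retries; the last failure is re-raised.
    """
    for attempt in range(max_retries):
        try:
            return send_request()
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```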
Response Parameters
Temperature Guide
| Use Case | Recommended Temperature |
|---|---|
| Factual Q&A | 0.0 - 0.3 |
| General chat | 0.5 - 0.7 |
| Creative writing | 0.8 - 1.0 |
| Code generation | 0.0 - 0.2 |
Next Steps