Documentation Index
Fetch the complete documentation index at: https://docs.junis.ai/llms.txt
Use this file to discover all available pages before exploring further.
Overview
The Chat Completions API is fully compatible with OpenAI’s API format, allowing you to use the OpenAI Python SDK, JavaScript SDK, or any OpenAI-compatible client.Model Parameter (Optional): The
model parameter is optional and primarily for SDK compatibility. Your organization’s orchestrator agent automatically uses the configured LLM model.If you’re using an OpenAI SDK that requires the model parameter, you can pass any value (e.g., "junis-orchestrator") for compatibility.Prerequisites
Endpoint
Authentication
Include your API key in theX-API-Key header:
orchestrator:invoke
See Authentication for details.
Request Format
Basic Request
cURL
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
messages | array | Yes | Array of message objects (OpenAI format) |
stream | boolean | No | Enable streaming mode (default: false) |
session_id | string | No | Session ID to continue conversation (creates new if not provided) |
Message Format
Messages follow the OpenAI format:system: System instructions (optional, rarely needed with Junis)user: User messagesassistant: Assistant responses (for conversation context)tool: Tool call results (for function calling)
Response Format
Non-Streaming Response
Status Code:200 OK
Response Body:
Response Fields
| Field | Type | Description |
|---|---|---|
id | string | Unique completion ID |
object | string | Object type (always "chat.completion") |
created | integer | Unix timestamp |
model | string | Model used |
choices | array | Array of completion choices |
choices[].message | object | Generated message |
choices[].message.role | string | Always "assistant" |
choices[].message.content | string | Response text |
choices[].finish_reason | string | Reason for completion (stop, length, tool_calls, content_filter) |
usage | object | Token usage statistics |
session_id | string | Session ID for continuing the conversation |
Streaming Mode
Enable real-time token-by-token streaming withstream: true.
Streaming Request
cURL
Streaming Response Format
Server-Sent Events (SSE) format:- Each line starts with
data: - First chunk contains
role - Subsequent chunks contain
contentdeltas - Final chunk has
finish_reason - Every chunk includes
session_idfor conversation continuity - Stream ends with
data: [DONE]
Multi-Agent Events
For Advanced Integrations: Junis streams additional agent lifecycle events alongside standard OpenAI chunks. These events have no
choices field, so standard OpenAI SDKs automatically ignore them - ensuring 100% backward compatibility.Parse these events to build rich UIs that show which agents are processing your request.Event Types
| Event | Description | Fields |
|---|---|---|
agent_started | Agent begins processing | agent, agent_type, depth, parent_agent |
agent_completed | Agent finishes processing | agent, duration_ms |
agent_routing | Transition between agents | from_agent, to_agent |
token_agent | Which agent produced the token | agent |
tool_started | Tool execution begins | tool_name, tool_id, agent_name |
tool_completed | Tool execution ends | tool_name, tool_id, agent_name |
payment_required | Payment needed to continue | message, agent_name, nonce, network, total_amount |
SSE Stream Example (Multi-Agent)
Parsing Agent Events (Python)
Parallel Agent Support
When multiple agents run concurrently (Parallel Agent), usetoken_agent events to correctly attribute tokens:
Session Management
Junis supports explicit session management viasession_id parameter, allowing you to control conversation context precisely.
How Sessions Work
Session ID Usage
Behavior:| Request | Result |
|---|---|
No session_id | Creates a new session and returns session_id in response |
With session_id (existing) | Uses the existing session (same chat room) |
With session_id (new ID) | Creates a new session with the provided ID |
Example: Continuing a Conversation
Viewing Sessions
Retrieve your sessions via the Sessions API:Function Calling (Tool Use)
When a supported keyword is detected in the assistant’s response, the API returns atool_calls object with finish_reason: "tool_calls".
Error Handling
Common Errors
Error Types
| Error Type | Status Code | Description | Action |
|---|---|---|---|
invalid_request_error | 400 | Malformed request | Check request format |
authentication_error | 401 | Invalid API key | Verify API key |
insufficient_credits | 402 | Credit balance below zero | Add credits at /subscription |
subscription_required | 403 | No active subscription | Subscribe to Basic or Pro plan |
invalid_subscription_plan | 403 | Plan doesn’t include API access | Upgrade to Basic or Pro plan |
rate_limit_error | 429 | Too many requests | Implement exponential backoff |
api_error | 500 | Server error | Retry with exponential backoff |
Best Practices
- Use Streaming: Enable
stream: truefor long responses to provide real-time feedback - Handle Rate Limits: Implement exponential backoff when hitting rate limits (see error codes above)
- Keep Message History Concise: Trim old messages to avoid token limits (keep last 10-20 messages)
- Secure API Keys: Never hardcode API keys; use environment variables
- Log Requests: Log requests and responses for debugging and monitoring
Response Parameters
Temperature Guide
| Use Case | Recommended Temperature |
|---|---|
| Factual Q&A | 0.0 - 0.3 |
| General chat | 0.5 - 0.7 |
| Creative writing | 0.8 - 1.0 |
| Code generation | 0.0 - 0.2 |
Next Steps
Sessions API
Retrieve and manage conversation sessions
Error Codes
Complete error reference
Rate Limits
Understand rate limiting
