
Overview

The Chat Completions API is fully compatible with OpenAI’s API format, allowing you to use the OpenAI Python SDK, JavaScript SDK, or any OpenAI-compatible client.
Model Parameter (Optional): The model parameter is optional and exists primarily for SDK compatibility; your organization’s orchestrator agent automatically uses the configured LLM model. If your OpenAI SDK requires the model parameter, pass any value (e.g., "junis-orchestrator").

Endpoint

POST https://api.junis.ai/api/external/v1/chat/completions

Authentication

Include your API key in the X-API-Key header:
X-API-Key: jns_live_YOUR_API_KEY_HERE
See Authentication for details.

Request Format

Basic Request

cURL
curl -X POST https://api.junis.ai/api/external/v1/chat/completions \
  -H "X-API-Key: jns_live_YOUR_API_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is Junis?"}
    ]
  }'
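The same request can be issued from Python with the `requests` library. A small sketch; the `build_request` helper is illustrative, not part of any SDK:

```python
# Sketch: the cURL request above, translated to Python with `requests`.
import requests

API_URL = "https://api.junis.ai/api/external/v1/chat/completions"

def build_request(messages, **options):
    """Assemble headers and JSON body for a chat completion call."""
    headers = {
        "X-API-Key": "jns_live_YOUR_API_KEY_HERE",
        "Content-Type": "application/json",
    }
    payload = {"messages": messages, **options}
    return headers, payload

headers, payload = build_request([{"role": "user", "content": "What is Junis?"}])
# resp = requests.post(API_URL, headers=headers, json=payload)  # uncomment to send
# print(resp.json()["choices"][0]["message"]["content"])
```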

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | array | Yes | Array of message objects (OpenAI format) |
| stream | boolean | No | Enable streaming mode (default: false) |
| temperature | number | No | Sampling temperature (0.0 to 2.0) |
| max_tokens | integer | No | Maximum tokens in response |
| top_p | number | No | Nucleus sampling parameter |
| frequency_penalty | number | No | Frequency penalty (-2.0 to 2.0) |
| presence_penalty | number | No | Presence penalty (-2.0 to 2.0) |
| stop | string or array | No | Stop sequences |
| n | integer | No | Number of completions to generate (default: 1) |
| tools | array | No | Available tools for function calling (keyword-based detection) |
| elevenlabs_extra_body | object | No | ElevenLabs metadata (e.g., conversation_id for session continuity) |

Message Format

Messages follow the OpenAI format:
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you?"},
    {"role": "user", "content": "Tell me about Junis."}
  ]
}
Role Types:
  • system: System instructions (optional, rarely needed with Junis)
  • user: User messages
  • assistant: Assistant responses (for conversation context)
  • tool: Tool call results (for function calling)
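Multi-turn context is carried by resending the prior messages in OpenAI order, as shown above. A minimal sketch of growing the history between requests:

```python
# Sketch: carrying multi-turn context by appending each exchange
# to the messages list before the next request.
def append_exchange(messages, assistant_reply, next_user_message):
    """Extend the history with the last reply and the next user turn."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]

history = [{"role": "user", "content": "Hello!"}]
history = append_exchange(history, "Hi! How can I help you?", "Tell me about Junis.")
```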

Response Format

Non-Streaming Response

Status Code: 200 OK
Response Body:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699896916,
  "model": "junis-orchestrator",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Junis is a production-ready Multi-Agent AI platform..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 42,
    "total_tokens": 57
  }
}
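Extracting the reply and token usage from the response body above is a matter of indexing into choices and usage. A minimal sketch over the sample payload:

```python
# Sketch: pulling the reply text and token usage out of a
# non-streaming chat.completion response body.
def parse_completion(body: dict):
    """Return (content, finish_reason, total_tokens) from a response body."""
    choice = body["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        body["usage"]["total_tokens"],
    )

sample = {
    "choices": [{"index": 0,
                 "message": {"role": "assistant", "content": "Junis is..."},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 15, "completion_tokens": 42, "total_tokens": 57},
}
content, reason, tokens = parse_completion(sample)
```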

Response Fields

| Field | Type | Description |
|---|---|---|
| id | string | Unique completion ID |
| object | string | Object type (always "chat.completion") |
| created | integer | Unix timestamp |
| model | string | Model used |
| choices | array | Array of completion choices |
| choices[].message | object | Generated message |
| choices[].message.role | string | Always "assistant" |
| choices[].message.content | string | Response text |
| choices[].finish_reason | string | Reason for completion (stop, length, tool_calls) |
| usage | object | Token usage statistics |

Streaming Mode

Enable real-time token-by-token streaming with stream: true.

Streaming Request

cURL
curl -N -X POST https://api.junis.ai/api/external/v1/chat/completions \
  -H "X-API-Key: jns_live_YOUR_API_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'

Streaming Response Format

Server-Sent Events (SSE) format:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699896916,"model":"junis-orchestrator","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699896916,"model":"junis-orchestrator","choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699896916,"model":"junis-orchestrator","choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}]}

...

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1699896916,"model":"junis-orchestrator","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
Key Points:
  • Each line starts with data:
  • First chunk contains role
  • Subsequent chunks contain content deltas
  • Final chunk has finish_reason
  • Stream ends with data: [DONE]
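The points above translate directly into a small parser: strip the data: prefix, stop at [DONE], and concatenate the content deltas. A sketch over the sample chunks shown earlier:

```python
import json

# Sketch: assembling the full reply from SSE lines. Each payload line
# starts with "data: "; the stream ends with "data: [DONE]".
def collect_stream(lines):
    """Concatenate content deltas from raw SSE lines."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

raw = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
story = collect_stream(raw)
```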

Session Management

Junis automatically manages conversation sessions for you.

How Sessions Work

Session Detection

Junis uses smart session detection with the following priority:
  1. ElevenLabs conversation_id (in elevenlabs_extra_body) → Uses or creates session with this ID
  2. Single message → Creates new session
  3. Multiple messages → Uses message hash to match existing sessions (allows conversation continuation)
For ElevenLabs integration: Pass conversation_id in elevenlabs_extra_body to maintain session continuity across calls.
{
  "messages": [...],
  "elevenlabs_extra_body": {
    "conversation_id": "your-unique-conversation-id"
  }
}
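Building that body in code is a one-liner around the messages list. A minimal sketch (the conversation_id value is an illustrative placeholder):

```python
# Sketch: attaching an ElevenLabs conversation_id so repeated calls
# reuse the same Junis session.
def with_session(messages, conversation_id):
    """Build a request body that pins the session via elevenlabs_extra_body."""
    return {
        "messages": messages,
        "elevenlabs_extra_body": {"conversation_id": conversation_id},
    }

body = with_session([{"role": "user", "content": "Hello"}], "conv-2024-001")
```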

Viewing Sessions

Retrieve your sessions via the Sessions API:
curl -X GET https://api.junis.ai/api/external/sessions \
  -H "X-API-Key: jns_live_YOUR_API_KEY_HERE"
See Sessions API for details.

Function Calling (Tool Use)

Current Implementation: Tool calling uses keyword pattern matching, not LLM-native function calling. Supported tools:
  • end_call: Detects phrases like “안녕히” (goodbye), “감사합니다” (thank you), “goodbye”
  • transfer_call: Detects phrases like “상담원 연결” (connect to an agent), “전화 연결” (transfer the call), “connect agent”
Full LLM-native function calling support is in development.
When a supported keyword is detected in the assistant’s response, the API returns a tool_calls object with finish_reason: "tool_calls".

Error Handling

Common Errors

{
  "error": {
    "message": "Invalid request: 'messages' field is required",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "missing_required_field"
  }
}

Error Types

| Error Type | Status Code | Description | Action |
|---|---|---|---|
| invalid_request_error | 400 | Malformed request | Check request format |
| authentication_error | 401 | Invalid API key | Verify API key |
| permission_error | 403 | Insufficient permissions | Check API key scopes |
| rate_limit_error | 429 | Too many requests | Implement exponential backoff |
| api_error | 500 | Server error | Retry with exponential backoff |
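The table recommends exponential backoff for 429 and 500 responses. A minimal retry sketch; the `send` callable, retry limits, and jitter factor are illustrative choices, not part of the API:

```python
import random
import time

# Sketch: retrying 429/500 responses with exponential backoff and jitter.
# `send` stands in for whatever function performs the HTTP call and
# returns (status_code, body).
RETRYABLE = {429, 500}

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Delay before retry `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap, base * (2 ** attempt))

def call_with_retries(send, max_attempts=5):
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        time.sleep(backoff_delay(attempt) * random.uniform(0.5, 1.0))
    return status, body  # give up after the final attempt
```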

Best Practices

  • Use Streaming: Enable stream: true for long responses to provide real-time feedback
  • Handle Rate Limits: Implement exponential backoff when hitting rate limits (see error codes above)
  • Keep Message History Concise: Trim old messages to avoid token limits (keep last 10-20 messages)
  • Secure API Keys: Never hardcode API keys; use environment variables
  • Log Requests: Log requests and responses for debugging and monitoring
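The "keep message history concise" advice can be sketched as a small helper that preserves any leading system message while dropping old turns (the keep_last default is an illustrative choice):

```python
# Sketch: keeping the history under control by preserving a leading
# system message and only the most recent N turns.
def trim_history(messages, keep_last=20):
    """Drop old turns while keeping a leading system message, if any."""
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    return system + rest[-keep_last:]

msgs = [{"role": "system", "content": "Be concise."}] + [
    {"role": "user", "content": f"turn {i}"} for i in range(30)
]
trimmed = trim_history(msgs, keep_last=10)
```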

Response Parameters

Temperature Guide

| Use Case | Recommended Temperature |
|---|---|
| Factual Q&A | 0.0 - 0.3 |
| General chat | 0.5 - 0.7 |
| Creative writing | 0.8 - 1.0 |
| Code generation | 0.0 - 0.2 |

Next Steps