
Overview

The Junis External API implements rate limiting to ensure fair usage and maintain service quality for all users. Rate limits are applied per API key and are configurable when creating or updating an API key.

Base URL: https://api.junis.ai
Rate limits are specific to each API key. If your organization has multiple API keys, each key has independent rate limits.

Rate Limiting Architecture

The External API uses a Redis-based sliding window counter with dual time windows (per-minute and per-hour) for precise, distributed rate limiting across multiple server instances.
Storage: Redis with distributed support across all API servers
Windows: Two independent sliding windows
  • Per-Minute: Rolling 60-second window, resets every minute
  • Per-Hour: Rolling 3600-second window, resets every hour
Algorithm: Sliding window counter with atomic increments
Check Before Request: Rate limits are checked before processing your request
Headers Included: Every response includes current rate limit status
429 on Exceed: Returns 429 Too Many Requests if limit exceeded
Automatic Reset: Counters automatically reset at the end of each window
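
For intuition, the sketch below shows how a sliding window counter over two one-minute Redis buckets could estimate the rolling request count. This is a conceptual illustration only; the API's actual implementation is internal, and the key naming and the 100 requests/minute limit are assumptions for the example.

# Conceptual sketch of a sliding window counter (illustration only).
# Two per-minute buckets are kept; the previous bucket is weighted by how
# much of it still overlaps the rolling 60-second window.
NOW=$(date +%s)
BUCKET=$((NOW / 60))
ELAPSED=$((NOW % 60))
PREFIX="ratelimit:jns_live_YOUR_API_KEY:minute"   # hypothetical key naming

CURRENT=$(redis-cli INCR "${PREFIX}:${BUCKET}")           # atomic increment
redis-cli EXPIRE "${PREFIX}:${BUCKET}" 120 > /dev/null    # keep two windows around
PREVIOUS=$(redis-cli GET "${PREFIX}:$((BUCKET - 1))")
PREVIOUS=${PREVIOUS:-0}

# Weighted estimate of requests seen in the last 60 seconds.
ESTIMATE=$(( CURRENT + PREVIOUS * (60 - ELAPSED) / 60 ))
if [ "$ESTIMATE" -gt 100 ]; then
  echo "would return 429 Too Many Requests"
fi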

Default Rate Limits

When you create a new API key, the following default limits apply:
| Limit Type          | Default Value | Customizable |
|---------------------|---------------|--------------|
| Requests per minute | 100           | Yes ✓        |
| Requests per hour   | 1,000         | Yes ✓        |
You can customize rate limits when creating an API key in Admin → API Keys. Contact your organization admin to adjust limits for existing keys.

How to Set Custom Limits

When creating an API key via the Admin panel:
  1. Navigate to Admin → API Keys
  2. Click “+ Create API Key”
  3. Configure Rate Limits section:
    • Requests per Minute: 1 to 1,000 (default: 100)
    • Requests per Hour: 1 to 100,000 (default: 1,000)
  4. Click “Create”

Rate Limit Headers

Every API response includes headers that show your current rate limit status.

Response Headers

X-RateLimit-Limit-Minute: 60
X-RateLimit-Remaining-Minute: 55
X-RateLimit-Reset-Minute: 1699564800
X-RateLimit-Limit-Hour: 1000
X-RateLimit-Remaining-Hour: 950
X-RateLimit-Reset-Hour: 1699564800
| Header                       | Description                                 |
|------------------------------|---------------------------------------------|
| X-RateLimit-Limit-Minute     | Maximum requests allowed per minute         |
| X-RateLimit-Remaining-Minute | Requests remaining in current minute window |
| X-RateLimit-Reset-Minute     | Unix timestamp when minute window resets    |
| X-RateLimit-Limit-Hour       | Maximum requests allowed per hour           |
| X-RateLimit-Remaining-Hour   | Requests remaining in current hour window   |
| X-RateLimit-Reset-Hour       | Unix timestamp when hour window resets      |
Always monitor X-RateLimit-Remaining-* headers to avoid hitting rate limits. Consider pausing requests when remaining count is low.
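
For example, you can inspect these headers from the command line. The sketch below reuses the endpoint and placeholder API key from the other examples on this page; the grep filtering is just one way to read the values.

# Send one request, discard the body, and print the rate limit headers.
curl -s -D - -o /dev/null \
  https://api.junis.ai/api/external/v1/chat/completions \
  -H "X-API-Key: jns_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"ping"}]}' \
  | grep -i '^x-ratelimit'

If X-RateLimit-Remaining-Minute is close to zero, pause until the timestamp in X-RateLimit-Reset-Minute before sending more requests.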

429 Too Many Requests

When you exceed your rate limit, the API returns a 429 status code.

Response Format

HTTP Status:
HTTP/1.1 429 Too Many Requests
Headers:
X-RateLimit-Limit-Minute: 60
X-RateLimit-Remaining-Minute: 0
X-RateLimit-Reset-Minute: 1699564800
X-RateLimit-Limit-Hour: 1000
X-RateLimit-Remaining-Hour: 950
X-RateLimit-Reset-Hour: 1699564800
Retry-After: 45
Body:
{
  "error": "rate_limit_exceeded",
  "message": "Rate limit exceeded: 60 requests per minute",
  "limit": 60,
  "window": "minute",
  "reset_at": 1699564800,
  "retry_after": 45
}
| Field       | Description                               |
|-------------|-------------------------------------------|
| error       | Error type: rate_limit_exceeded           |
| message     | Human-readable error message              |
| limit       | The rate limit that was exceeded          |
| window      | Which window was exceeded: minute or hour |
| reset_at    | Unix timestamp when the window resets     |
| retry_after | Seconds to wait before retrying           |
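
As a rough sketch of using this body, the example below reads retry_after with jq and waits before retrying once; the jq dependency and placeholder key are assumptions, not requirements of the API.

# Make a request; if it is rate limited, wait retry_after seconds before retrying.
RESPONSE=$(curl -s -w '\n%{http_code}' \
  https://api.junis.ai/api/external/v1/chat/completions \
  -H "X-API-Key: jns_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"test"}]}')

STATUS=$(echo "$RESPONSE" | tail -n1)   # last line is the HTTP status code
BODY=$(echo "$RESPONSE" | sed '$d')     # everything else is the response body

if [ "$STATUS" = "429" ]; then
  WAIT=$(echo "$BODY" | jq -r '.retry_after // 60')   # fall back to 60s if missing
  echo "Rate limited; sleeping ${WAIT}s before retrying"
  sleep "$WAIT"
  # retry the request here
fi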

Handling Rate Limits

Best Practices

  • Monitor Headers: Always check X-RateLimit-Remaining-Minute and X-RateLimit-Remaining-Hour headers to track your usage
  • Implement Exponential Backoff: When you receive a 429 response, wait at least the Retry-After header value before retrying, and increase the delay if 429s persist (see the sketch after this list)
  • Cache Responses: Implement caching (TTL: 5 minutes recommended) to reduce API calls for frequently accessed data
  • Batch Requests: Use paginated endpoints instead of individual requests (e.g., GET /api/external/sessions?limit=100)
  • Use Streaming: For real-time chat, use stream: true parameter instead of polling to avoid repeated API calls
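
A minimal sketch of that retry loop in bash, assuming the same endpoint and placeholder key used in the testing example later on this page; in practice this logic usually lives in your application code.

# Retry with exponential backoff, honoring the Retry-After header when present.
URL="https://api.junis.ai/api/external/v1/chat/completions"
DATA='{"messages":[{"role":"user","content":"test"}]}'
DELAY=1

for ATTEMPT in 1 2 3 4 5; do
  HEADERS=$(mktemp)
  STATUS=$(curl -s -o /dev/null -D "$HEADERS" -w '%{http_code}' \
    "$URL" \
    -H "X-API-Key: jns_live_YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$DATA")

  if [ "$STATUS" != "429" ]; then
    echo "Finished with status $STATUS"
    rm -f "$HEADERS"
    break
  fi

  RETRY_AFTER=$(grep -i '^retry-after:' "$HEADERS" | tr -dc '0-9')
  WAIT=${RETRY_AFTER:-$DELAY}          # prefer the server-provided value
  echo "429 received; waiting ${WAIT}s (attempt $ATTEMPT)"
  sleep "$WAIT"
  DELAY=$((DELAY * 2))                 # exponential backoff fallback
  rm -f "$HEADERS"
done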

Rate Limit Optimization

Identify Bottlenecks

  1. Log Rate Limit Headers: Track X-RateLimit-Remaining-* headers in your application logs
  2. Analyze Request Patterns: Identify high-frequency endpoints that may benefit from caching or batching
    • Repeated session fetches → Add caching
    • Individual message fetches → Use paginated list endpoints
    • Polling for updates → Switch to streaming
  3. Implement Caching: Use Redis or in-memory cache with 5-minute TTL for frequently accessed data
  4. Monitor & Alert: Set up alerts when rate limit usage exceeds 90% (a simple header-based check is sketched below)
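
A rough sketch of that check, driven by the response headers; the endpoint, placeholder key, and the echo standing in for your alerting hook are assumptions.

# Compute per-minute usage from the rate limit headers and warn above 90%.
HEADERS=$(curl -s -D - -o /dev/null \
  https://api.junis.ai/api/external/v1/chat/completions \
  -H "X-API-Key: jns_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"ping"}]}')

LIMIT=$(echo "$HEADERS" | grep -i '^x-ratelimit-limit-minute:' | tr -dc '0-9')
REMAINING=$(echo "$HEADERS" | grep -i '^x-ratelimit-remaining-minute:' | tr -dc '0-9')
USED_PCT=$(( (LIMIT - REMAINING) * 100 / LIMIT ))

if [ "$USED_PCT" -ge 90 ]; then
  echo "WARNING: minute rate limit usage at ${USED_PCT}%"   # hook up alerting here
fi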

Testing Rate Limits

Local Testing

Test your rate limit handling before deploying to production:
# Test minute rate limit (send 65 requests in 1 minute)
# With a 60 requests/minute key (as in the header examples above), the last few responses should return 429.
for i in {1..65}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    https://api.junis.ai/api/external/v1/chat/completions \
    -H "X-API-Key: jns_live_YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":"test"}]}'
  sleep 0.5
done

Increasing Rate Limits

If you need higher rate limits for your use case, you have two options:

Option 1: Update Existing API Key (Admin Only)

Organization admins can update rate limits for existing API keys:
  1. Navigate to Admin → API Keys
  2. Click “Edit” on the API key
  3. Update Rate Limits section
  4. Click “Save”
Only organization admins can modify API key rate limits. Contact your admin if you need higher limits.

Option 2: Create New API Key with Higher Limits

If you’re an admin, create a new API key with custom limits:
  1. Navigate to Admin → API Keys
  2. Click “+ Create API Key”
  3. Set higher limits:
    • Requests per Minute: Up to 1,000
    • Requests per Hour: Up to 100,000
  4. Click “Create”
  5. Copy the API key (shown only once)
  6. Update your application to use the new API key (see the example below)
Set rate limits based on your actual usage patterns. Start conservative and increase as needed.
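
For example, if your application reads the key from an environment variable (the name JUNIS_API_KEY below is just an illustration), switching to the new key is a configuration change rather than a code change:

# JUNIS_API_KEY is a hypothetical variable name; use whatever your app expects.
export JUNIS_API_KEY="jns_live_NEW_API_KEY"

curl -s https://api.junis.ai/api/external/v1/chat/completions \
  -H "X-API-Key: $JUNIS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"test"}]}'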

FAQs

Are rate limits per API key or per organization?

Answer: Rate limits are per API key.
Example: If your organization has 2 API keys:
  • API Key 1: 60 requests/minute
  • API Key 2: 300 requests/minute
  • Total capacity: 360 requests/minute (independent limits)

What happens when I exceed my rate limit?

Response: 429 Too Many Requests with a Retry-After header
Action: Wait for the number of seconds specified in Retry-After or until the next window reset
No penalties: No account suspension or additional charges

Can I increase my rate limits?

Short-term: Yes, organization admins can update rate limits in Admin → API Keys
Long-term: Create new API keys with higher limits as needed

Do streaming responses count as multiple requests?

Answer: Streaming responses count as 1 request, regardless of how long the stream lasts.
Example:
  • Non-streaming: 1 request = 1 complete response
  • Streaming: 1 request = entire conversation stream
Streaming is more efficient for real-time chat.
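
As an illustration that a stream is a single request, the curl sketch below sends one streaming chat request; the stream parameter is mentioned in Best Practices above, and the rest of the request body shape is assumed from the other examples on this page.

# One streaming request counts once against the rate limit.
# -N disables curl's output buffering so chunks print as they arrive.
curl -N -s https://api.junis.ai/api/external/v1/chat/completions \
  -H "X-API-Key: jns_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}],"stream":true}'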

How can I monitor my rate limit usage?

Method 1: Check response headers (X-RateLimit-Remaining-*)
Method 2: Set up logging in your application (see Best Practices)
Method 3: Monitor 429 error rates in your application metrics

Next Steps