Overview
The Junis External API implements rate limiting to ensure fair usage and maintain service quality for all users. Rate limits are applied per API key and are configurable when creating or updating an API key.
Base URL: https://api.junis.ai
Rate limits are specific to each API key. If your organization has multiple API keys, each key has independent rate limits.
Rate Limiting Architecture
The External API uses a Redis-based sliding window counter with dual time windows (per-minute and per-hour) for precise, distributed rate limiting across multiple server instances.
How It Works
Storage: Redis with distributed support across all API servers
Windows: Two independent sliding windows
- Per-Minute: Rolling 60-second window, resets every minute
- Per-Hour: Rolling 3600-second window, resets every hour
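The dual-window check can be sketched in a few lines of Python. This is an illustration only, not the Junis implementation: the service stores its counters in Redis, whereas this sketch keeps a local log of timestamps (a sliding window log), and the `DualWindowLimiter` class and its `allow` method are hypothetical names.

```python
import time
from collections import deque


class DualWindowLimiter:
    """In-memory limiter with rolling per-minute and per-hour windows.

    Hypothetical stand-in for a Redis-backed counter; single process only.
    """

    def __init__(self, per_minute=100, per_hour=1000, clock=time.time):
        # Window name -> (request limit, window length in seconds).
        self.limits = {"minute": (per_minute, 60), "hour": (per_hour, 3600)}
        self.clock = clock
        self.events = deque()  # timestamps of accepted requests, oldest first

    def allow(self):
        """Return (allowed, window); window names the exceeded limit, if any."""
        now = self.clock()
        # Drop timestamps that have left the longest (hourly) window.
        while self.events and self.events[0] <= now - 3600:
            self.events.popleft()
        for name, (limit, seconds) in self.limits.items():
            used = sum(1 for t in self.events if t > now - seconds)
            if used >= limit:
                return False, name
        # Only accepted requests are recorded.
        self.events.append(now)
        return True, None
```

A request is accepted only if both windows have room, which mirrors the "check before request" behaviour described in this section.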
Rate Limit Enforcement
- Check Before Request: Rate limits are checked before processing your request
- Headers Included: Every response includes current rate limit status
- 429 on Exceed: Returns 429 Too Many Requests if the limit is exceeded
- Automatic Reset: Counters automatically reset at the end of each window
Default Rate Limits
When you create a new API key, the following default limits apply:
| Limit Type | Default Value | Customizable |
|---|---|---|
| Requests per minute | 100 | Yes ✓ |
| Requests per hour | 1,000 | Yes ✓ |
How to Set Custom Limits
When creating an API key via the Admin panel:
- Navigate to Admin → API Keys
- Click “+ Create API Key”
- Configure the Rate Limits section:
  - Requests per Minute: 1 to 1000 (default: 100)
  - Requests per Hour: 1 to 100000 (default: 1000)
- Click “Create”
Rate Limit Headers
Every API response includes headers that show your current rate limit status.
Response Headers
| Header | Description |
|---|---|
| X-RateLimit-Limit-Minute | Maximum requests allowed per minute |
| X-RateLimit-Remaining-Minute | Requests remaining in current minute window |
| X-RateLimit-Reset-Minute | Unix timestamp when minute window resets |
| X-RateLimit-Limit-Hour | Maximum requests allowed per hour |
| X-RateLimit-Remaining-Hour | Requests remaining in current hour window |
| X-RateLimit-Reset-Hour | Unix timestamp when hour window resets |
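As a rough illustration of reading these headers on the client side, the sketch below turns a header mapping into one record per window. It assumes the header values arrive as integer strings; `WindowStatus` and `parse_rate_limit_headers` are hypothetical helper names, not part of the API.

```python
from dataclasses import dataclass


@dataclass
class WindowStatus:
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp at which the window resets

    @property
    def used_fraction(self):
        """Share of the window already consumed, between 0 and 1."""
        return 1 - self.remaining / self.limit if self.limit else 1.0


def parse_rate_limit_headers(headers):
    """Return {"minute": WindowStatus, "hour": WindowStatus}."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    status = {}
    for window in ("minute", "hour"):
        status[window] = WindowStatus(
            limit=int(lowered[f"x-ratelimit-limit-{window}"]),
            remaining=int(lowered[f"x-ratelimit-remaining-{window}"]),
            reset_at=int(lowered[f"x-ratelimit-reset-{window}"]),
        )
    return status
```

The function accepts any dict-like mapping, so it can be fed the headers object of whichever HTTP client you use.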
429 Too Many Requests
When you exceed your rate limit, the API returns a 429 status code.
Response Format
HTTP Status: 429 Too Many Requests
| Field | Description |
|---|---|
| error | Error type: rate_limit_exceeded |
| message | Human-readable error message |
| limit | The rate limit that was exceeded |
| window | Which window was exceeded: minute or hour |
| reset_at | Unix timestamp when the window resets |
| retry_after | Seconds to wait before retrying |
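A client can derive its pause from the fields listed above. The following is a minimal sketch that assumes the 429 body is a JSON object with exactly these field names; `seconds_to_wait` is a hypothetical helper, and the fallback to `reset_at` when `retry_after` is absent is a defensive choice of this sketch rather than documented behaviour.

```python
import json
import time


def seconds_to_wait(body_text, now=None):
    """Return how long to pause after a 429, based on the response body."""
    body = json.loads(body_text)
    if body.get("error") != "rate_limit_exceeded":
        raise ValueError("not a rate limit error body")
    now = time.time() if now is None else now
    # Prefer retry_after; fall back to the reset timestamp if it is missing.
    if "retry_after" in body:
        return max(0.0, float(body["retry_after"]))
    return max(0.0, float(body["reset_at"]) - now)
```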
Handling Rate Limits
Best Practices
- Monitor Headers: Always check the X-RateLimit-Remaining-Minute and X-RateLimit-Remaining-Hour headers to track your usage
- Implement Exponential Backoff: When you receive a 429 response, wait for the number of seconds given in the Retry-After header before retrying, and increase the delay if retries keep failing
- Cache Responses: Implement caching (TTL: 5 minutes recommended) to reduce API calls for frequently accessed data
- Batch Requests: Use paginated endpoints instead of individual requests (e.g., GET /api/external/sessions?limit=100)
- Use Streaming: For real-time chat, use the stream: true parameter instead of polling to avoid repeated API calls
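The retry advice above might look like the following sketch. It is transport-agnostic: `send` is a hypothetical zero-argument callable that performs one request and returns `(status, headers, body)`, so it can wrap any HTTP client. The sketch assumes Retry-After carries a number of seconds and falls back to exponential backoff with jitter when the header is absent.

```python
import random
import time


def call_with_retry(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                    sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Plain dict lookup; normalise header names first if your client
        # does not return a case-insensitive mapping.
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            delay = float(retry_after)
        else:
            # Exponential backoff with jitter: roughly 1s, 2s, 4s, ...
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        sleep(min(delay, max_delay))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the function easy to test, since a test can record the requested delays instead of actually waiting.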
Rate Limit Optimization
Identify Bottlenecks
- Log Rate Limit Headers: Track X-RateLimit-Remaining-* headers in your application logs
- Analyze Request Patterns: Identify high-frequency endpoints that may benefit from caching or batching
  - Repeated session fetches → Add caching
  - Individual message fetches → Use paginated list endpoints
  - Polling for updates → Switch to streaming
- Implement Caching: Use Redis or in-memory cache with 5-minute TTL for frequently accessed data
- Monitor & Alert: Set up alerts when rate limit usage exceeds 90%
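The caching and alerting items can be sketched as follows. This is an in-memory illustration only: `TTLCache` and `usage_exceeds` are hypothetical helpers, the 300-second default mirrors the 5-minute TTL recommended above, and the 0.9 threshold mirrors the 90% alert level.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # key -> (time stored, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value, calling fetch() only on a miss or expiry."""
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch()
        self._store[key] = (now, value)
        return value


def usage_exceeds(limit, remaining, threshold=0.9):
    """True when the share of the window already used is above threshold."""
    return limit > 0 and (limit - remaining) / limit > threshold
```

In practice `fetch` would wrap the API call, and `usage_exceeds` would be fed the limit and remaining values read from the rate limit headers.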
Testing Rate Limits
Local Testing
Test your rate limit handling before deploying to production.
Increasing Rate Limits
If you need higher rate limits for your use case, you have two options:
Option 1: Update Existing API Key (Admin Only)
Organization admins can update rate limits for existing API keys:
- Navigate to Admin → API Keys
- Click “Edit” on the API key
- Update Rate Limits section
- Click “Save”
Option 2: Create New API Key with Higher Limits
If you’re an admin, create a new API key with custom limits:
- Navigate to Admin → API Keys
- Click “+ Create API Key”
- Set higher limits:
  - Requests per Minute: Up to 1000
  - Requests per Hour: Up to 100000
- Click “Create”
- Copy the API key (shown only once)
- Update your application to use the new API key
FAQs
Are rate limits per organization or per API key?
Answer: Rate limits are per API key.
Example: If your organization has 2 API keys:
- API Key 1: 60 requests/minute
- API Key 2: 300 requests/minute
- Total capacity: 360 requests/minute (independent limits)
What happens if I exceed rate limits?
Response: 429 Too Many Requests with Retry-After header
Action: Wait for the number of seconds specified in Retry-After or until the next window reset
No penalties: No account suspension or additional charges
Can I request a temporary rate limit increase?
Short-term: Yes, organization admins can update rate limits in Admin → API Keys
Long-term: Create new API keys with higher limits as needed
Do streaming responses count differently?
Answer: Streaming responses count as 1 request, regardless of how long the stream lasts.
Example:
- Non-streaming: 1 request = 1 complete response
- Streaming: 1 request = entire conversation stream
How do I monitor my rate limit usage?
Method 1: Check response headers (X-RateLimit-Remaining-*)
Method 2: Set up logging in your application (see Best Practices)
Method 3: Monitor 429 error rates in your application metrics